R/MainLassoLDATraining.R
training.Rd
Training on half of all cells to find the optimal ElasticNet and LDA models for predicting a subpopulation
training( genes = NULL, cluster_mixedpop1 = NULL, mixedpop1 = NULL, mixedpop2 = NULL, c_selectID = NULL, listData = list(), out_idx = 1, standardize = TRUE, trainset_ratio = 0.5, LDA_run = FALSE, log_transform = FALSE )
Argument | Description |
---|---|
genes | a vector of gene names (for ElasticNet shrinkage); gene symbols must be in the same format as the gene names in subpop2. Genes should be listed in order of importance (e.g. the most significant differentially expressed genes first), because if the list contains too many genes, only the top 500 are used. |
cluster_mixedpop1 | a vector of cluster assignments for the cells in mixedpop1 |
mixedpop1 | a SingleCellExperiment object from the training mixed population |
mixedpop2 | a SingleCellExperiment object from the target mixed population |
c_selectID | a number specifying which subpopulation to use for training |
listData | list to store output in |
out_idx | a number specifying the index at which results are written into the output list. This is needed when running bootstrap. |
standardize | a logical value specifying whether or not to standardize the train matrix |
trainset_ratio | a number specifying the proportion of cells to be part of the training subpopulation |
LDA_run | a logical value specifying whether an LDA run is added for comparison with ElasticNet |
log_transform | a logical value specifying whether a log transform should be applied |
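
To make `trainset_ratio` and the 500-gene cap more concrete, here is a minimal base-R sketch. It illustrates the idea under stated assumptions and is not the package's internal code: the cluster vector, the gene names, and the use of `sample()` and `head()` are hypothetical stand-ins.

```r
set.seed(1)

# Hypothetical cluster assignments for 10 cells (stand-in for cluster_mixedpop1)
clusters <- c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3)
c_selectID <- 1
trainset_ratio <- 0.5

# Cells that belong to the selected subpopulation
subpop_idx <- which(clusters == c_selectID)

# Draw the requested proportion of those cells for training (assumed behaviour)
n_train <- round(length(subpop_idx) * trainset_ratio)
train_idx <- sample(subpop_idx, n_train)

# A ranked gene list longer than 500 entries keeps only the top 500
genes <- paste0("gene", seq_len(600))  # hypothetical names, most important first
genes_used <- head(genes, 500)

length(train_idx)
length(genes_used)
```

Because only the leading entries survive the cap, the order of `genes` matters: rank the list before passing it in.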
a list with prediction results written into the position indexed by out_idx
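
The role of `out_idx` in a bootstrap can be sketched as follows. This is an assumption-laden illustration: `mock_training()` is a hypothetical stand-in that stores placeholder values, not the real `training()` function. It only shows how passing the same list back in with a new `out_idx` accumulates one result slot per iteration.

```r
# Hypothetical stand-in for training(): writes a placeholder value into
# slot out_idx of the Accuracy element and returns the updated list
mock_training <- function(listData = list(), out_idx = 1) {
  if (is.null(listData$Accuracy)) {
    listData$Accuracy <- list()
  }
  # Placeholder only; the real function stores prediction results here
  listData$Accuracy[[out_idx]] <- list(out_idx * 10)
  listData
}

# Reuse the same list across iterations, changing only out_idx
listData <- list()
for (i in seq_len(3)) {
  listData <- mock_training(listData = listData, out_idx = i)
}

length(listData$Accuracy)  # one slot per bootstrap iteration
listData$Accuracy[[2]]     # result written by the second iteration
```

With the real function, the same pattern would apply: pass the list returned by the previous call as `listData` and increment `out_idx` on each iteration.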
Quan Nguyen, 2017-11-25
c_selectID <- 1
out_idx <- 1
day2 <- day_2_cardio_cell_sample
mixedpop1 <- new_scGPS_object(ExpressionMatrix = day2$dat2_counts,
    GeneMetadata = day2$dat2geneInfo, CellMetadata = day2$dat2_clusters)
day5 <- day_5_cardio_cell_sample
mixedpop2 <- new_scGPS_object(ExpressionMatrix = day5$dat5_counts,
    GeneMetadata = day5$dat5geneInfo, CellMetadata = day5$dat5_clusters)
genes <- training_gene_sample
genes <- genes$Merged_unique
listData <- training(genes,
    cluster_mixedpop1 = colData(mixedpop1)[, 1],
    mixedpop1 = mixedpop1, mixedpop2 = mixedpop2, c_selectID,
    listData = list(), out_idx = out_idx, trainset_ratio = 0.5)
#> [1] "Accuracy"        "ElasticNetGenes" "Deviance"        "ElasticNetFit"
#> [5] "LDAFit"          "predictor_S1"
listData$Accuracy
#> [[1]]
#> [[1]][[1]]
#> [[1]][[1]][[1]]
#> [1] 140
#>
#> [[1]][[1]][[2]]
#> [1] 72
#>
#>
#>