Predicts a new (target) mixed population using a model trained for one subpopulation of the first mixed population. Every subpopulation in the target mixed population is predicted, and each one receives a transition score from the original subpopulation to that target subpopulation.

predicting(
  listData = NULL,
  cluster_mixedpop2 = NULL,
  mixedpop2 = NULL,
  out_idx = NULL,
  standardize = TRUE,
  LDA_run = FALSE,
  c_selectID = NULL,
  log_transform = FALSE
)

Arguments

listData

a list object containing trained results for the selected subpopulation in the first mixed population

cluster_mixedpop2

a vector of cluster assignment for mixedpop2

mixedpop2

a SingleCellExperiment object from the target mixed population

out_idx

a number specifying the index at which results are written into the output list. This is needed when running bootstrap.

standardize

a logical of whether to standardize the data

LDA_run

a logical of whether to add an LDA prediction for comparison with ElasticNet. If TRUE, the LDA model must already have been trained in the training step before listData is passed to this prediction step.

c_selectID

a number to specify the trained cluster used for prediction

log_transform

a logical of whether to log-transform the data
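
To make the standardize and log_transform options more concrete, here is a minimal base R sketch of what these two preprocessing steps usually mean for an expression matrix. It is a generic illustration, not the package's own code: the log2(x + 1) transform, the per-gene centring and scaling, and the toy matrix `expr` are all assumptions made for this example.

```r
# Toy expression matrix: 3 genes (rows) x 4 cells (columns); values are made up.
expr <- matrix(c(0, 2, 4, 6,
                 1, 1, 3, 3,
                 10, 20, 30, 40),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               paste0("cell", 1:4)))

# Assumed log transform: log2 with a pseudocount of 1 so zeros stay finite.
expr_log <- log2(expr + 1)

# Assumed standardization: centre and scale each gene across cells.
# scale() works column-wise, so transpose, scale, and transpose back.
expr_std <- t(scale(t(expr_log)))

# After standardization each gene has mean ~0 and standard deviation 1.
rowMeans(expr_std)
apply(expr_std, 1, sd)
```

Whether the function applies these steps in this order, or per gene rather than per cell, should be checked against the package source.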

Value

a list with prediction results written into the index out_idx
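
The role of out_idx during bootstrap can be sketched with a plain list: each bootstrap round writes its result into a different slot of the same output list, so earlier rounds are not overwritten. This is only an illustration in base R. The container `res`, its element name `predictions`, and the random stand-in for a prediction are hypothetical and do not reflect the actual structure returned by predicting().

```r
set.seed(1)
n_boot <- 3

# Hypothetical container; the element name `predictions` is illustrative only.
res <- list(predictions = list())

for (out_idx in seq_len(n_boot)) {
  # Stand-in for one prediction round: percentage of 50 resampled
  # cells that carry label 1.
  labels <- sample(c(1, 2), size = 50, replace = TRUE)
  # Write this round's result into slot out_idx of the output list.
  res$predictions[[out_idx]] <- mean(labels == 1) * 100
}

# One stored result per bootstrap round.
length(res$predictions)
```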

Author

Quan Nguyen, 2017-11-25

Examples

c_selectID <- 1
out_idx <- 1
day2 <- day_2_cardio_cell_sample
mixedpop1 <- new_scGPS_object(ExpressionMatrix = day2$dat2_counts,
    GeneMetadata = day2$dat2geneInfo, CellMetadata = day2$dat2_clusters)
day5 <- day_5_cardio_cell_sample
mixedpop2 <- new_scGPS_object(ExpressionMatrix = day5$dat5_counts,
    GeneMetadata = day5$dat5geneInfo, CellMetadata = day5$dat5_clusters)
genes <- training_gene_sample
genes <- genes$Merged_unique
listData <- training(genes, cluster_mixedpop1 = colData(mixedpop1)[, 1],
    mixedpop1 = mixedpop1, mixedpop2 = mixedpop2, c_selectID,
    listData = list(), out_idx = out_idx)
#> Total 224 cells as source subpop
#> Total 366 cells in remaining subpops
#> subsampling 112 cells for training source subpop
#> subsampling 112 cells in remaining subpops for training
#> use 6 genes for training model
#> use 6 genes 224 cells for testing model
#> rename remaining subpops to 2_3
#> there are 112 cells in class 2_3 and 112 cells in class 1
#> removing 1 genes with no variance
#> standardizing prediction/target dataset
#> performning elasticnet model training...
#> extracting deviance and best gene features...
#> lambda min is at location 16
#> the leave-out cells in the source subpop is 112
#> use 112 target subpops cells for leave-out test set
#> standardizing the leave-out target and source subpops...
#> start ElasticNet prediction for estimating accuracy...
#> evaluation accuracy ElasticNet 0.611111111111111
listData <- predicting(listData = listData, mixedpop2 = mixedpop2,
    out_idx = out_idx, cluster_mixedpop2 = colData(mixedpop2)[, 1],
    c_selectID = c_selectID)
#> standardizing target subpops before prediction...
#> predicting from source to target subpop 1...
#> number of cells in the target subpop 1 is 187
#> Number of genes in the target data, but not in model genes is 4995
#> Number of genes in the model present in the target data is 5
#> There are 0 genes that are in the model, but not in target subpopulations
#> the prediction (target) subop has 5 genes and 187 cells. The trained model has 5 genes
#> first 10 genes in model
#> GATA4VIMSNAI2GJA1TMEM88
#> first 10 genes in target
#> GATA4VIMSNAI2GJA1TMEM88
#> running elasticNet classification...
#> class probability prediction ElasticNet for target subpop 1 is 3.74331550802139
#> predicting from source to target subpop 2...
#> number of cells in the target subpop 2 is 140
#> Number of genes in the target data, but not in model genes is 4995
#> Number of genes in the model present in the target data is 5
#> There are 0 genes that are in the model, but not in target subpopulations
#> the prediction (target) subop has 5 genes and 140 cells. The trained model has 5 genes
#> first 10 genes in model
#> GATA4VIMSNAI2GJA1TMEM88
#> first 10 genes in target
#> GATA4VIMSNAI2GJA1TMEM88
#> running elasticNet classification...
#> class probability prediction ElasticNet for target subpop 2 is 20.7142857142857
#> predicting from source to target subpop 3...
#> number of cells in the target subpop 3 is 133
#> Number of genes in the target data, but not in model genes is 4995
#> Number of genes in the model present in the target data is 5
#> There are 0 genes that are in the model, but not in target subpopulations
#> the prediction (target) subop has 5 genes and 133 cells. The trained model has 5 genes
#> first 10 genes in model
#> GATA4VIMSNAI2GJA1TMEM88
#> first 10 genes in target
#> GATA4VIMSNAI2GJA1TMEM88
#> running elasticNet classification...
#> class probability prediction ElasticNet for target subpop 3 is 7.5187969924812
#> predicting from source to target subpop 4...
#> number of cells in the target subpop 4 is 40
#> Number of genes in the target data, but not in model genes is 4995
#> Number of genes in the model present in the target data is 5
#> There are 0 genes that are in the model, but not in target subpopulations
#> the prediction (target) subop has 5 genes and 40 cells. The trained model has 5 genes
#> first 10 genes in model
#> GATA4VIMSNAI2GJA1TMEM88
#> first 10 genes in target
#> GATA4VIMSNAI2GJA1TMEM88
#> running elasticNet classification...
#> class probability prediction ElasticNet for target subpop 4 is 17.5
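
The "class probability prediction" values in the log above are consistent with a percentage: the share of cells in each target subpopulation that the model assigns to the source class. The base R sketch below checks that arithmetic. The cell totals come from the log, but the per-subpopulation counts in `n_assigned` are inferred so that the ratios reproduce the logged values. They are an assumption and were not reported by the function.

```r
# Cells per target subpopulation, as reported in the log above.
n_cells <- c(subpop1 = 187, subpop2 = 140, subpop3 = 133, subpop4 = 40)

# Assumed counts of cells assigned to the source class (inferred, not logged).
n_assigned <- c(subpop1 = 7, subpop2 = 29, subpop3 = 10, subpop4 = 7)

# Percentage of each target subpopulation assigned to the source class.
score <- n_assigned / n_cells * 100
score
```

Under this reading, target subpopulation 2 has the largest score, about 20.7 percent of its 140 cells, and target subpopulation 1 the smallest, about 3.7 percent of its 187 cells.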