R/CORE_clustering_bagging.R
CORE_bagging.Rd
CORE is an algorithm for generating reproducible clustering. It was first implemented in the ascend R package. Here, CORE V2.0 uses bagging analysis to find a stable clustering result and to detect rare clusters in a mixed population.
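The idea of bagging for clustering stability can be illustrated with a minimal base R sketch. This is not the scGPS implementation: the toy data, the `ward.D2` linkage, and the "largest gap in merge height" rule for picking the number of clusters are all assumptions made for illustration, standing in for CORE's window-based search.

```r
# Illustrative bagging sketch in base R; NOT the scGPS implementation.
set.seed(1)

# Hypothetical toy data: 60 "cells" x 5 "genes", two well-separated groups.
toy <- rbind(
  matrix(rnorm(30 * 5, mean = 0), nrow = 30),
  matrix(rnorm(30 * 5, mean = 6), nrow = 30)
)

bagging_run <- 20            # number of subsampling rounds
subsample_proportion <- 0.8  # fraction of cells kept in each round

# For each round: subsample cells without replacement, build a tree, and
# pick the cluster count implied by the largest jump in merge height.
n_clusters <- vapply(seq_len(bagging_run), function(i) {
  keep <- sample(nrow(toy), size = floor(subsample_proportion * nrow(toy)))
  tree <- hclust(dist(toy[keep, , drop = FALSE]), method = "ward.D2")
  gaps <- diff(tree$height)
  # Cutting just below the largest jump leaves this many clusters.
  as.integer(length(tree$height) - which.max(gaps) + 1L)
}, integer(1))

# The cluster count seen most often across rounds is treated as the stable one.
stable_k <- as.integer(names(which.max(table(n_clusters))))
```

A result that recurs across most subsamples is less likely to be an artifact of a particular set of cells, which is the motivation for running several bagging rounds rather than clustering once.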
CORE_bagging(
  mixedpop = NULL,
  bagging_run = 20,
  subsample_proportion = 0.8,
  windows = seq(from = 0.025, to = 1, by = 0.025),
  remove_outlier = c(0),
  nRounds = 1,
  PCA = FALSE,
  nPCs = 20,
  ngenes = 1500,
  log_transform = FALSE
)
Argument | Description |
---|---|
mixedpop | a SingleCellExperiment object from the training mixed population. |
bagging_run | an integer specifying the number of bagging runs to be computed. |
subsample_proportion | a numeric specifying the proportion of the tree to be chosen in subsampling. |
windows | a numeric vector specifying the ranges of each window. |
remove_outlier | a vector containing the IDs of clusters to be removed; the default vector contains 0, as 0 is the cluster with singletons. |
nRounds | an integer specifying the number of rounds to attempt to remove outliers. |
PCA | a logical specifying whether PCA is used before calculating the distance matrix. |
nPCs | an integer specifying the number of principal components to use. |
ngenes | an integer specifying the number of genes used for clustering calculations. |
log_transform | a logical specifying whether a log transform should be computed. |
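To make the default `windows` argument concrete, the snippet below evaluates the grid given in the usage and builds a coarser alternative. The coarser grid is only an illustrative assumption, not a recommended setting, and the commented call reuses the `mixedpop2` object from the examples section.

```r
# Default window grid, as given in the usage section.
windows <- seq(from = 0.025, to = 1, by = 0.025)
length(windows)  # how many windows are evaluated
range(windows)   # smallest and largest window

# A coarser, hypothetical grid with half as many windows.
coarse_windows <- seq(from = 0.05, to = 1, by = 0.05)

# It could then be passed through the documented `windows` argument, e.g.:
# test <- CORE_bagging(mixedpop2, windows = coarse_windows, bagging_run = 2)
```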
a list with clustering results of all iterations, and a selected optimal resolution
Quan Nguyen, 2018-05-11
day5 <- day_5_cardio_cell_sample
cellnames <- colnames(day5$dat5_counts)
cluster <- day5$dat5_clusters
cellnames <- data.frame('cluster' = cluster, 'cellBarcodes' = cellnames)
# day5$dat5_counts needs to be in a matrix format
mixedpop2 <- new_summarized_scGPS_object(ExpressionMatrix = day5$dat5_counts,
    GeneMetadata = day5$dat5geneInfo, CellMetadata = day5$dat5_clusters)
test <- CORE_bagging(mixedpop2, remove_outlier = c(0), PCA = FALSE,
    bagging_run = 2, subsample_proportion = .7)