Subsamples cells for each bagging run and performs 40 or more clustering runs, depending on the number of windows.

clustering_bagging(
  object = NULL,
  ngenes = 1500,
  bagging_run = 20,
  subsample_proportion = 0.8,
  windows = seq(from = 0.025, to = 1, by = 0.025),
  remove_outlier = c(0),
  nRounds = 1,
  PCA = FALSE,
  nPCs = 20,
  log_transform = FALSE
)

Arguments

object

a SingleCellExperiment object from the training mixed population.

ngenes

number of genes used for clustering calculations.

bagging_run

an integer specifying the number of bagging runs to be computed.

subsample_proportion

a numeric value specifying the proportion of the tree to be chosen in subsampling.

windows

a numeric vector specifying the ranges of each window.

remove_outlier

a vector containing IDs for clusters to be removed. The default vector contains 0, as 0 is the cluster with singletons.

nRounds

an integer specifying the number of rounds to attempt to remove outliers.

PCA

a logical specifying whether PCA is used before calculating the distance matrix.

nPCs

an integer specifying the number of principal components to use.

log_transform

a logical specifying whether a log transform should be computed.
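
To make the defaults above more concrete, the following base R sketch shows how the default windows vector yields 40 values and what drawing one subsample of cells per bagging run might look like. It is only an illustration and not the package's internal implementation: the cell IDs are made up, and both sampling without replacement and rounding the subsample size are assumptions.

```r
# Default windows from the usage above: 0.025, 0.050, ..., 1.000
windows <- seq(from = 0.025, to = 1, by = 0.025)
n_windows <- length(windows)  # one clustering result per window

# Hypothetical cell IDs, used only for illustration
set.seed(1)
cell_ids <- paste0("cell_", seq_len(500))

# Defaults taken from the usage section
bagging_run <- 20
subsample_proportion <- 0.8

# Number of cells kept in each subsample (rounding is an assumption)
n_keep <- round(subsample_proportion * length(cell_ids))

# Draw one subsample per bagging run.
# Assumption: sampling is done without replacement.
subsamples <- lapply(seq_len(bagging_run), function(i) {
  sample(cell_ids, size = n_keep, replace = FALSE)
})

n_windows            # number of windows
lengths(subsamples)  # cells kept in each bagging run
```

Under these assumptions, each of the 20 bagging runs would cluster a different subset of 400 of the 500 cells.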

Value

a list of clustering results containing each bagging run as well as the clustering of the original tree and the tree itself.

Author

Quan Nguyen, 2017-11-25

Examples

day5 <- day_5_cardio_cell_sample
mixedpop2 <- new_summarized_scGPS_object(ExpressionMatrix = day5$dat5_counts,
    GeneMetadata = day5$dat5geneInfo, CellMetadata = day5$dat5_clusters)
test <- clustering_bagging(mixedpop2, remove_outlier = c(0),
    bagging_run = 2, subsample_proportion = .7)
#> Performing 1 round of filtering
#> Identifying top variable genes
#> Calculating distance matrix
#> Performing hierarchical clustering
#> Finding clustering information
#> No more outliers detected in filtering round 1
#> Identifying top variable genes
#> Calculating distance matrix
#> Performing hierarchical clustering
#> Finding clustering information
#> 500 cells left after filtering
#> Running 2 bagging runs, with 0.7 subsampling...
#> Done clustering, moving to stability calculation...