Coarsened mixtures of hierarchical skew normal kernels for flow cytometry analyses
Flow cytometry (FCM) is the standard multi-parameter assay used to measure single cell phenotype and functionality. It is commonly used to quantify the relative frequencies of cell subsets in blood and disaggregated tissues. A typical analysis of FCM data involves cell classification - the identification of cell subgroups in the sample - and comparisons of the cell subgroups across samples. While modern experiments often necessitate the collection and processing of samples in multiple batches, analysis of FCM data across batches is challenging because the locations in the marker space of cell subsets may vary across samples. Differences across samples may occur because of true biological variation or technical reasons such as antibody lot effects or instrument optics. An important step in comparative analyses of multi-sample FCM data is cross-sample calibration, whose goal is to align cell subsets across multiple samples in the presence of variations in locations, so that variation due to technical reasons is minimized and true biological variation can be meaningfully compared. We introduce a Bayesian nonparametric hierarchical modeling approach for accomplishing calibration and cell classification simultaneously in a unified probabilistic manner. Three important features of our method make it particularly effective for analyzing multi-sample FCM data: a nonparametric mixture avoids prespecifying the number of cell clusters; the hierarchical skew normal kernels allow flexibility in the shapes of the cell subsets and cross-sample variation in their locations; and finally the "coarsening" strategy makes inference robust to small departures from the model, a feature that becomes crucial with massive numbers of observations such as those encountered in FCM data. We demonstrate the merits of our approach in simulated examples and carry out a case study in the analysis of two FCM data sets.
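As a rough illustration of two of the ingredients named above, the following minimal sketch evaluates a skew normal kernel and a "coarsened" log-likelihood. It rests on several assumptions that the abstract does not state: a univariate skew normal in the Azzalini parameterization stands in for the multivariate hierarchical kernels, and coarsening is approximated by raising the likelihood to a power zeta = alpha / (alpha + n), a common approximation in the coarsening literature that may differ from what the paper does. The function names and the parameter `alpha` are hypothetical.

```python
import math

def skew_normal_logpdf(x, loc, scale, shape):
    """Log density of a univariate skew normal (Azzalini form).

    Illustrative stand-in for the paper's kernels, which are multivariate
    and hierarchical; shape = 0 recovers the ordinary normal density.
    """
    z = (x - loc) / scale
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    # Standard normal CDF evaluated at shape * z, via the error function.
    cdf = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    # Floor the CDF so that log() stays finite far in the tail.
    cdf = max(cdf, 1e-300)
    return math.log(2.0) - math.log(scale) + log_phi + math.log(cdf)

def coarsened_loglik(data, loc, scale, shape, alpha):
    """Tempered log-likelihood: zeta * sum_i log f(x_i).

    Assumption: zeta = alpha / (alpha + n), the power-likelihood
    approximation to a coarsened posterior. `alpha` is a hypothetical
    robustness parameter, not a value taken from the paper.
    """
    n = len(data)
    zeta = alpha / (alpha + n)
    return zeta * sum(skew_normal_logpdf(x, loc, scale, shape) for x in data)
```

Because zeta shrinks toward zero as n grows with alpha fixed, the tempered likelihood stays bounded in its total weight (zeta * n approaches alpha) even with very many cells, which is one way to see why small model misspecifications stop dominating inference at FCM sample sizes.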