Context-Specific Infinite Mixtures for Clustering Gene Expression Profiles Across Diverse Microarray Dataset
Motivation: Identifying groups of co-regulated genes by monitoring their expression over various experimental conditions is complicated by the fact that such co-regulation is condition-specific. Ignoring the context-specific nature of co-regulation significantly reduces the ability of clustering procedures to detect co-expressed genes due to additional ‘noise’ introduced by non-informative measurements.
Results: We have developed a novel Bayesian hierarchical model and corresponding computational algorithms for clustering gene expression profiles across diverse experimental conditions and studies that accounts for context-specificity of gene expression patterns. The model is based on the Bayesian infinite mixtures framework and does not require a priori specification of the number of clusters. We demonstrate that explicit modeling of context-specificity results in increased accuracy of the cluster analysis by examining the specificity and sensitivity of clusters in microarray data. We also demonstrate that probabilities of co-expression derived from the posterior distribution of clusterings are valid estimates of statistical significance of created clusters.
Preprint, postprint (12-month embargo)
Liu, X.; Sivaganesan, S.; Yeung, Ka Yee; Guo, J.; Bumgarner, R. E.; and Medvedovic, Mario, "Context-Specific Infinite Mixtures for Clustering Gene Expression Profiles Across Diverse Microarray Dataset" (2006). School of Engineering and Technology Publications. 301.