In a study published today in Scientific Reports, Lenz et al. show that the linear intrinsic dimensionality of a global map of human gene expression is higher than previously reported. Link to the paper: Lenz et al., Scientific Reports (2016).
Principal components analysis (PCA) is a common unsupervised method for analyzing gene expression microarray data; it provides information on the overall structure of the analyzed dataset. In recent years, it has been applied to very large datasets involving many different tissues and cell types in order to create a low-dimensional global map of human gene expression.
In this study, Michael Lenz, a former Ph.D. student at JRC-COMBINE and now a Postdoctoral Researcher at the Maastricht Centre for Systems Biology (MaCSBio), together with Franz-Josef Müller (University Hospital Schleswig-Holstein, Kiel, Germany), Martin Zenke (RWTH Aachen University Medical School) and Andreas Schuppert (Joint Research Center for Computational Biomedicine, JRC-COMBINE), reevaluates this approach and shows that the linear intrinsic dimensionality of this global map is higher than previously reported. Furthermore, Lenz et al. analyze in which cases PCA fails to detect biologically relevant information and point to methods that can overcome these limitations. The results refine our current understanding of the overall structure of gene expression spaces and show that the outcome of PCA depends critically on the effect size of the biological signal as well as on the fraction of samples containing this signal.
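The dependence on effect size and sample fraction can be illustrated with a small simulation. The following is a minimal sketch on synthetic Gaussian data, not the analysis performed in the paper: the function name pc1_label_correlation, the matrix dimensions, the number of signal genes and the chosen effect sizes and fractions are all assumptions made for illustration. It adds a constant shift to a subset of genes in a subset of samples and measures how well the first principal component separates the affected samples from the rest.

```python
import numpy as np

def pc1_label_correlation(effect_size, fraction, n_samples=200, n_genes=1000,
                          n_signal_genes=50, seed=0):
    """Absolute correlation between PC1 scores and a hidden group label.

    Hypothetical toy model: unit-variance Gaussian noise plus a constant
    shift of `effect_size` in the first `n_signal_genes` genes for a
    `fraction` of the samples.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_genes))

    # Mark the affected samples and add the signal to the signal genes.
    n_affected = max(1, int(round(fraction * n_samples)))
    labels = np.zeros(n_samples)
    labels[:n_affected] = 1.0
    X[:n_affected, :n_signal_genes] += effect_size

    # PCA via SVD of the gene-wise centered matrix (samples in rows).
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    pc1_scores = U[:, 0] * S[0]

    return abs(np.corrcoef(pc1_scores, labels)[0, 1])

# Large effect present in half of the samples.
strong = pc1_label_correlation(effect_size=2.0, fraction=0.5)
# Same effect size, but present in only 1% of the samples.
rare = pc1_label_correlation(effect_size=2.0, fraction=0.01)
# Signal present in half of the samples, but with a small effect size.
small = pc1_label_correlation(effect_size=0.3, fraction=0.5)

print(f"strong signal: |r| = {strong:.2f}")
print(f"rare signal:   |r| = {rare:.2f}")
print(f"small effect:  |r| = {small:.2f}")
```

In this toy setting, the first component tracks the group label closely only when the signal is both large and widespread; a rare or weak signal contributes too little variance to stand out from the noise in the leading component, even though it is present in the data.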
Publication: Michael Lenz, Franz-Josef Müller, Martin Zenke & Andreas Schuppert. Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data. Scientific Reports (2016), doi:10.1038/srep25696.
Funding: This work was supported by the Ministry for Innovation, Science and Research of the German Federal State of North Rhine-Westphalia, Germany (M. L. and M. Z.); Bayer Technology Services GmbH, Germany (M. L.); the Dutch Province of Limburg, The Netherlands (M. L.); the Federal Ministry of Education and Research (BMBF) through the “PluriTest2” project, grant no. 13GW0128 (F.-J. M.); and the German Research Foundation (DFG) through grant MU 3231/3-1 (F.-J. M.).