Name:
srep08140-s3.xls
Size:
30.5Kb
Format:
Microsoft Excel
Description:
Supplemental File 3
Type
Article
Authors
Alanis Lobato, Gregorio
Cannistraci, Carlo Vittorio
Eriksson, Anders
Manica, Andrea
Ravasi, Timothy

KAUST Department
Biological and Environmental Sciences and Engineering (BESE) Division
Bioscience Program
Computational Bioscience Research Center (CBRC)
Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Integrative Systems Biology Lab
Date
2015-01-30
Online Publication Date
2015-01-30
Print Publication Date
2015-07
Permanent link to this record
http://hdl.handle.net/10754/344117
Abstract
Detecting structure in population genetics and case-control studies is important, as it exposes phenomena such as ecoclines, admixture and stratification. Principal Component Analysis (PCA) is a linear dimension-reduction technique commonly used for this purpose, but it struggles to reveal complex, nonlinear data patterns. In this paper we introduce non-centred Minimum Curvilinear Embedding (ncMCE), a nonlinear method to overcome this problem. Our analyses show that ncMCE can separate individuals into ethnic groups in cases in which PCA fails to reveal any clear structure. This increased discrimination power arises from ncMCE's ability to better capture the phylogenetic signal in the samples, whereas PCA better reflects their geographic relation. We also demonstrate how ncMCE can discover interesting patterns, even when the data has been poorly pre-processed. The juxtaposition of PCA and ncMCE visualisations provides a new standard of analysis with utility for discovering and validating significant linear/nonlinear complementary patterns in genetic data.
Citation
Alanis-Lobato, G., Cannistraci, C. V., Eriksson, A., Manica, A., & Ravasi, T. (2015). Highlighting nonlinear patterns in population genetics datasets. Sci. Rep., 5. doi: 10.1038/srep08140
Publisher
Springer Nature
Journal
Scientific Reports
PubMed ID
25633916
Additional Links
http://www.nature.com/doifinder/10.1038/srep08140
DOI
10.1038/srep08140
Collections
Articles; Biological and Environmental Sciences and Engineering (BESE) Division; Bioscience Program; Integrative Systems Biology Lab; Computer Science Program; Computational Bioscience Research Center (CBRC); Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Related articles
- Nonlinear dimensionality reduction of gene expression data for visualization and clustering analysis of cancer tissue samples.
- Authors: Shi J, Luo Z
- Issue date: 2010 Aug
- Complex-valued neural networks for nonlinear complex principal component analysis.
- Authors: Rattan SS, Hsieh WW
- Issue date: 2005 Jan
- Study of large and highly stratified population datasets by combining iterative pruning principal component analysis and structure.
- Authors: Limpiti T, Intarapanich A, Assawamakin A, Shaw PJ, Wangkumhang P, Piriyapongsa J, Ngamphiw C, Tongsima S
- Issue date: 2011 Jun 23
- Nonlinear Dimensionality Reduction by Minimum Curvilinearity for Unsupervised Discovery of Patterns in Multidimensional Proteomic Data.
- Authors: Alessio M, Cannistraci CV
- Issue date: 2016
- Application of principal component analysis to pharmacogenomic studies in Canada.
- Authors: Visscher H, Ross CJ, Dubé MP, Brown AM, Phillips MS, Carleton BC, Hayden MR
- Issue date: 2009 Dec