Handle URI:
http://hdl.handle.net/10754/601396
Title:
Big Data Analysis of Human Genome Variations
Authors:
Gojobori, Takashi (0000-0001-7850-1743)
Abstract:
Since the draft human genome sequence was first made public in 2000, genomic analyses have been extended intensively to the population level. Three international projects are good examples of large-scale studies of human genome variation: 1) HapMap data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) data (940 individuals) (http://www.hagsc.org/hgdp/files.html), and 3) 1000 Genomes data (2,504 individuals) (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). If all three data sets can be integrated into a single data set, a more detailed analysis of human genome variation becomes possible for a total of 4,861 individuals (= 1,417 + 940 + 2,504). We successfully integrated the three data sets using information on the reference human genome sequence and carried out a big data analysis. In particular, we constructed a genome-level phylogenetic tree of about 5,000 human individuals. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that could not be identified by analyzing any one of the three data sets alone. Here, we report the outcome of this big data analysis and discuss the evolutionary significance of human genomic variation. The present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.
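The abstract does not describe how the three data sets were integrated or how the tree was built, so the following is only a minimal sketch of one plausible approach, under stated assumptions: genotypes are keyed by (chromosome, position) on a common reference assembly, only sites present in every sample are kept, and a simple pairwise distance (mean absolute difference in alternate-allele count) serves as input to a tree-building method such as neighbor joining, which is not shown. All sample IDs, genotype values, and function names (`shared_sites`, `integrate`, `distance_matrix`) are hypothetical.

```python
from itertools import combinations

# Hypothetical toy genotype tables: {sample_id: {(chrom, pos): alt_allele_count}}.
# Assumption: all positions refer to the same reference genome assembly.
hapmap = {
    "HM_1": {("1", 100): 0, ("1", 200): 2, ("2", 50): 1},
    "HM_2": {("1", 100): 1, ("1", 200): 2, ("2", 50): 1},
}
hgdp = {
    "HG_1": {("1", 100): 2, ("1", 200): 0, ("3", 10): 1},
}
kg = {
    "KG_1": {("1", 100): 0, ("1", 200): 2, ("2", 50): 0},
}


def shared_sites(*datasets):
    """Return the sorted sites genotyped in every sample of every data set."""
    site_sets = [set(gt) for ds in datasets for gt in ds.values()]
    return sorted(set.intersection(*site_sets))


def integrate(*datasets):
    """Merge data sets into {sample_id: [genotype at each shared site]}."""
    sites = shared_sites(*datasets)
    if not sites:
        raise ValueError("no sites are shared by all samples")
    merged = {}
    for ds in datasets:
        for sample, gt in ds.items():
            if sample in merged:
                raise ValueError(f"duplicate sample ID: {sample}")
            merged[sample] = [gt[site] for site in sites]
    return sites, merged


def distance_matrix(merged):
    """Mean absolute difference in alt-allele count per site, for each pair."""
    dist = {}
    for a, b in combinations(sorted(merged), 2):
        ga, gb = merged[a], merged[b]
        dist[(a, b)] = sum(abs(x - y) for x, y in zip(ga, gb)) / len(ga)
    return dist


sites, merged = integrate(hapmap, hgdp, kg)
dist = distance_matrix(merged)

# Sample-size arithmetic from the abstract.
total_individuals = 1417 + 940 + 2504

print(sites)
print(dist)
print(total_individuals)
```

Restricting the analysis to sites shared by all samples is the simplest way to make the three sources comparable, at the cost of discarding sites typed in only one or two of them; a real pipeline would also need to reconcile strand orientation and allele coding, which this sketch ignores.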
KAUST Department:
Computational Bioscience Research Center (CBRC); Biological and Environmental Sciences and Engineering (BESE) Division
Conference/Event name:
KAUST Research Conference on Computational and Experimental Interfaces of Big Data and Biotechnology
Issue Date:
25-Jan-2016
Type:
Presentation
Appears in Collections:
Computational Bioscience Research Center (CBRC); KAUST Research Conference on Computational and Experimental Interfaces of Big Data and Biotechnology, January 2016; Biological and Environmental Sciences and Engineering (BESE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Gojobori, Takashi | en
dc.date.accessioned | 2016-03-16T12:53:17Z | en
dc.date.available | 2016-03-16T12:53:17Z | en
dc.date.issued | 2016-01-25 | en
dc.identifier.uri | http://hdl.handle.net/10754/601396 | en
dc.title | Big Data Analysis of Human Genome Variations | en
dc.type | Presentation | en
dc.contributor.department | Computational Bioscience Research Center (CBRC) | en
dc.contributor.department | Biological and Environmental Sciences and Engineering (BESE) Division | en
dc.conference.date | January 25-27, 2016 | en
dc.conference.name | KAUST Research Conference on Computational and Experimental Interfaces of Big Data and Biotechnology | en
dc.conference.location | KAUST, Thuwal, Saudi Arabia | en
kaust.author | Gojobori, Takashi | en