Exploring Massive, Genome Scale Datasets with the GenometriCorr Package
Type
Article
Authors
Favorov, Alexander
Mularoni, Loris
Cope, Leslie M.
Medvedeva, Yulia
Mironov, Andrey A.
Makeev, Vsevolod J.
Wheelan, Sarah J.
KAUST Department
Computational Bioscience Research Center (CBRC)
Date
2012-05-31
Permanent link to this record
http://hdl.handle.net/10754/325275
Abstract
We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets.
Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor.
© 2012 Favorov et al.
Citation
Favorov A, Mularoni L, Cope LM, Medvedeva Y, Mironov AA, et al. (2012) Exploring Massive, Genome Scale Datasets with the GenometriCorr Package. PLoS Comput Biol 8: e1002529. doi:10.1371/journal.pcbi.1002529.
Publisher
Public Library of Science (PLoS)
Journal
PLoS Computational Biology
ISSN
1553734X
PubMed ID
22693437
PubMed Central ID
PMC3364938
DOI
10.1371/journal.pcbi.1002529
Related articles
- Girafe--an R/Bioconductor package for functional exploration of aligned next-generation sequencing reads.
- Authors: Toedling J, Ciaudo C, Voinnet O, Heard E, Barillot E
- Issue date: 2010 Nov 15
- The Integrated Genome Browser: free software for distribution and exploration of genome-scale datasets.
- Authors: Nicol JW, Helt GA, Blanchard SG Jr, Raja A, Loraine AE
- Issue date: 2009 Oct 15
- rtracklayer: an R package for interfacing with genome browsers.
- Authors: Lawrence M, Gentleman R, Carey V
- Issue date: 2009 Jul 15
- GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists.
- Authors: Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z
- Issue date: 2009 Feb 3
- Rintact: enabling computational analysis of molecular interaction data from the IntAct repository.
- Authors: Chiang T, Li N, Orchard S, Kerrien S, Hermjakob H, Gentleman R, Huber W
- Issue date: 2008 Apr 15
Related items
Showing items related by title, author, creator and subject.
- Long- and short-term selective forces on malaria parasite genomes
Nygaard, Sanne; Braunstein, Alexander; Malsen, Gareth; Van Dongen, Stijn; Gardner, Paul P.; Krogh, Anders; Otto, Thomas D.; Pain, Arnab; Berriman, Matthew; McAuliffe, Jon; Dermitzakis, Emmanouil T.; Jeffares, Daniel C. (Public Library of Science (PLoS), 2010-09-09)
Plasmodium parasites, the causal agents of malaria, result in more than 1 million deaths annually. Plasmodium are unicellular eukaryotes with small ~23 Mb genomes encoding ~5200 protein-coding genes. The protein-coding genes comprise about half of these genomes. Although evolutionary processes have a significant impact on malaria control, the selective pressures within Plasmodium genomes are poorly understood, particularly in the non-protein-coding portion of the genome. We use evolutionary methods to describe selective processes in both the coding and non-coding regions of these genomes. Based on genome alignments of seven Plasmodium species, we show that protein-coding, intergenic and intronic regions are all subject to purifying selection and we identify 670 conserved non-genic elements. We then use genome-wide polymorphism data from P. falciparum to describe short-term selective processes in this species and identify some candidate genes for balancing (diversifying) selection. Our analyses suggest that there are many functional elements in the non-genic regions of these genomes and that adaptive evolution has occurred more frequently in the protein-coding regions of the genome. © 2010 Nygaard et al.
- Pivotal role of the muscle-contraction pathway in cryptorchidism and evidence for genomic connections with cardiomyopathy pathways in RASopathies
Cannistraci, Carlo; Ogorevc, Jernej; Zorc, Minja; Ravasi, Timothy; Dovc, Peter; Kunej, Tanja (Springer Nature, 2013-02-14)
Background: Cryptorchidism is the most frequent congenital disorder in male children; however the genetic causes of cryptorchidism remain poorly investigated. Comparative integratomics combined with systems biology approach was employed to elucidate genetic factors and molecular pathways underlying testis descent. Methods: Literature mining was performed to collect genomic loci associated with cryptorchidism in seven mammalian species. Information regarding the collected candidate genes was stored in MySQL relational database. Genomic view of the loci was presented using Flash GViewer web tool (http://gmod.org/wiki/Flashgviewer/). DAVID Bioinformatics Resources 6.7 was used for pathway enrichment analysis. Cytoscape plug-in PiNGO 1.11 was employed for protein-network-based prediction of novel candidate genes. Relevant protein-protein interactions were confirmed and visualized using the STRING database (version 9.0). Results: The developed cryptorchidism gene atlas includes 217 candidate loci (genes, regions involved in chromosomal mutations, and copy number variations) identified at the genomic, transcriptomic, and proteomic level. Human orthologs of the collected candidate loci were presented using a genomic map viewer. The cryptorchidism gene atlas is freely available online: http://www.integratomics-time.com/cryptorchidism/. Pathway analysis suggested the presence of twelve enriched pathways associated with the list of 179 literature-derived candidate genes. Additionally, a list of 43 network-predicted novel candidate genes was significantly associated with four enriched pathways. Joint pathway analysis of the collected and predicted candidate genes revealed the pivotal importance of the muscle-contraction pathway in cryptorchidism and evidence for genomic associations with cardiomyopathy pathways in RASopathies. Conclusions: The developed gene atlas represents an important resource for the scientific community researching genetics of cryptorchidism. The collected data will further facilitate development of novel genetic markers and could be of interest for functional studies in animals and human. The proposed network-based systems biology approach elucidates molecular mechanisms underlying co-presence of cryptorchidism and cardiomyopathy in RASopathies. Such approach could also aid in molecular explanation of co-presence of diverse and apparently unrelated clinical manifestations in other syndromes. © 2013 Cannistraci et al.; licensee BioMed Central Ltd.
- Experimental evolution, genetic analysis and genome re-sequencing reveal the mutation conferring artemisinin resistance in an isogenic lineage of malaria parasites
Hunt, Paul; Martinelli, Axel; Modrzynska, Katarzyna; Borges, Sofia; Creasey, Alison; Rodrigues, Louise; Beraldi, Dario; Loewe, Laurence; Fawcett, Richard; Kumar, Sujai; Thomson, Marian; Trivedi, Urmi; Otto, Thomas D; Pain, Arnab; Blaxter, Mark; Cravo, Pedro (Springer Nature, 2010-09-16)
Background: Classical and quantitative linkage analyses of genetic crosses have traditionally been used to map genes of interest, such as those conferring chloroquine or quinine resistance in malaria parasites. Next-generation sequencing technologies now present the possibility of determining genome-wide genetic variation at single base-pair resolution. Here, we combine in vivo experimental evolution, a rapid genetic strategy and whole genome re-sequencing to identify the precise genetic basis of artemisinin resistance in a lineage of the rodent malaria parasite, Plasmodium chabaudi. Such genetic markers will further the investigation of resistance and its control in natural infections of the human malaria, P. falciparum. Results: A lineage of isogenic in vivo drug-selected mutant P. chabaudi parasites was investigated. By measuring the artemisinin responses of these clones, the appearance of an in vivo artemisinin resistance phenotype within the lineage was defined. The underlying genetic locus was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (Illumina Solexa) defined the point mutations, insertions, deletions and copy-number variations arising in the lineage. Eight point mutations arise within the mutant lineage, only one of which appears on chromosome 2. This missense mutation arises contemporaneously with artemisinin resistance and maps to a gene encoding a de-ubiquitinating enzyme. Conclusions: This integrated approach facilitates the rapid identification of mutations conferring selectable phenotypes, without prior knowledge of biological and molecular mechanisms. For malaria, this model can identify candidate genes before resistant parasites are commonly observed in natural human malaria populations. © 2010 Hunt et al; licensee BioMed Central Ltd.