Species-independent MicroRNA Gene Discovery

Handle URI:
http://hdl.handle.net/10754/263237
Title:
Species-independent MicroRNA Gene Discovery
Authors:
Kamanu, Timothy K.
Abstract:
MicroRNAs (miRNAs) are a class of small endogenous non-coding RNAs that act mainly as negative transcriptional and post-transcriptional regulators in both plants and animals. Recent studies have shown that miRNAs are involved in different types of cancer and in other incurable diseases such as autism and Alzheimer’s. Functional miRNAs are excised from hairpin-like sequences known as miRNA genes. There are about 21,000 known miRNA genes, most of which have been determined using experimental methods. miRNA genes are classified into different groups (miRNA families). This study reports about 19,000 previously unknown miRNA genes in nine species, of which approximately 15,300 predictions were computationally validated to contain at least one experimentally verified functional miRNA product. The predictions are based on a novel computational strategy that relies on miRNA family groupings and exploits the physics and geometry of miRNA genes to unveil the hidden palindromic signals and symmetries in miRNA gene sequences. Unlike conventional computational miRNA gene discovery methods, the algorithm developed here is species-independent: it allows prediction at higher accuracy and resolution from arbitrary RNA/DNA sequences in any species and thus enables examination of repeat-prone genomic regions that are thought to be non-informative or ‘junk’ sequences. The study also demonstrates that uni-directional RNA sequences are informationally non-redundant, whereas bi-directional DNA is informationally redundant, a fact that is overlooked by most pattern discovery algorithms. A novel method for computing upstream and downstream miRNA gene boundaries based on mathematical/statistical functions is suggested, as well as cutoffs for annotation of miRNA genes in different miRNA families. A further tool is proposed for hypothesis generation and for visualization of data matrices and of the intra- and inter-species chromosomal distribution of miRNA genes or miRNA families.
Our results indicate that: miRNAs and miRNA genes are not only species-specific but may also be DNA strand-specific and chromosome-specific; the genomic distribution of miRNA genes is conserved at the chromosomal level across species; miRNAs are conserved; more than one miRNA with different regulatory targets can be excised from one miRNA gene; and repeat-related miRNAs and miRNA genes with palindromic sequences may be the largest subclass of miRNAs that has eluded detection by most computational and experimental methods.
Advisors:
Bajic, Vladimir B. ( 0000-0001-5435-4750 )
Committee Member:
Gao, Xin ( 0000-0002-7108-3574 ) ; Gehring, Christoph A ( 0000-0003-4355-4591 ) ; Narasimhan, Kothandaraman; Pain, Arnab ( 0000-0002-1755-2819 )
KAUST Department:
Biological and Environmental Sciences and Engineering (BESE) Division
Program:
Bioscience
Issue Date:
Dec-2012
Type:
Dissertation
Appears in Collections:
Bioscience Program; Dissertations; Biological and Environmental Sciences and Engineering (BESE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.advisor | Bajic, Vladimir B. | en
dc.contributor.author | Kamanu, Timothy K. | en
dc.date.accessioned | 2012-12-19T11:17:27Z | -
dc.date.available | 2012-12-19T11:17:27Z | -
dc.date.issued | 2012-12 | en
dc.identifier.uri | http://hdl.handle.net/10754/263237 | en
dc.language.iso | en | en
dc.subject | microRNA | en
dc.subject | microRNA families | en
dc.subject | pattern matching | en
dc.subject | kernel methods | en
dc.subject | genetic algorithm | en
dc.subject | visualisation of multiple categories | en
dc.title | Species-independent MicroRNA Gene Discovery | en
dc.type | Dissertation | en
dc.contributor.department | Biological and Environmental Sciences and Engineering (BESE) Division | en
thesis.degree.grantor | King Abdullah University of Science and Technology | en_GB
dc.contributor.committeemember | Gao, Xin | en
dc.contributor.committeemember | Gehring, Christoph A | en
dc.contributor.committeemember | Narasimhan, Kothandaraman | en
dc.contributor.committeemember | Pain, Arnab | en
thesis.degree.discipline | Bioscience | en
thesis.degree.name | Doctor of Philosophy | en
dc.person.id | 102012 | en