Transcriptional landscape of ncRNA and Repeat elements in somatic cells

Handle URI:
http://hdl.handle.net/10754/621938
Title:
Transcriptional landscape of ncRNA and Repeat elements in somatic cells
Authors:
Ghosheh, Yanal ( 0000-0002-3733-2098 )
Abstract:
Advances in nucleic acid (DNA and RNA) sequencing technology have enabled many projects aimed at characterizing the genome structure and transcriptome complexity of organisms. The first conclusions of the human and mouse projects underscored two important, yet unexpected, findings. First, while almost the entire genome is transcribed, only 5% of it encodes proteins; thus, most transcripts are noncoding RNAs (ncRNAs). These include short RNAs (<200 nucleotides (nt)), such as piRNAs, microRNAs (miRNAs), and endogenous short interfering RNAs (siRNAs), among others, as well as long noncoding RNAs (lncRNAs; >200 nt). Second, a significant portion of the mammalian genome (45%) is composed of Repeat Elements (REs). REs are mostly relics of ancestral viruses that invaded the host genome during evolution and produced thousands of copies. Their roles within host genomes have yet to be fully explored, considering that they sometimes produce lncRNAs and have been shown to influence expression at the transcriptional and post-transcriptional levels. Moreover, because some REs can still mobilize within host genomes, hosts have evolved mechanisms, mainly epigenetic, to keep REs under tight control. Recent reports indicate that RE activity is regulated in somatic cells, particularly in the brain, suggesting a physiological role for RE mobilization during normal development. In this thesis, I focus on the analysis of ncRNAs, specifically REs, piRNAs, and lncRNAs, in human and mouse post-mitotic somatic cells. The main aspects of this analysis are as follows. Using sRNA-Seq, I show that piRNAs, a class of ncRNAs responsible for silencing Transposable Elements (TEs) in testes, are also present in the adult mouse brain. Furthermore, only a subset of testes piRNAs is expressed in the brain, and their expression may be controlled by known neurogenesis factors.
To investigate the dynamics of the transcriptome during cellular differentiation, I examined deep RNA-Seq and Cap Analysis of Gene Expression (CAGE) data from a time course of primary human skeletal muscle cell differentiation. I contrasted this program with that of Duchenne Muscular Dystrophy (DMD) donors. I identified novel candidates, both protein-coding genes and lncRNAs, that may be involved in myogenesis, and reaffirmed known myogenic players. Using RNA-Seq data, I designed a novel pipeline to identify possible de novo insertion sites during muscular differentiation, which I also tested on embryonic mouse cerebral cortex.
Advisors:
Ravasi, Timothy ( 0000-0002-9950-465X )
Committee Member:
Orlando, Valerio ( 0000-0002-4906-8511 ) ; Gao, Xin ( 0000-0002-7108-3574 ) ; Forrest, Alistair
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Program:
Computer Science
Issue Date:
1-Dec-2016
Type:
Dissertation
Appears in Collections:
Dissertations

Full metadata record

DC Field | Value | Language
dc.contributor.advisor | Ravasi, Timothy | en
dc.contributor.author | Ghosheh, Yanal | en
dc.date.accessioned | 2016-12-05T09:06:55Z | -
dc.date.available | 2016-12-05T09:06:55Z | -
dc.date.issued | 2016-12-01 | -
dc.identifier.uri | http://hdl.handle.net/10754/621938 | -
dc.description.abstract | Advances in nucleic acid (DNA and RNA) sequencing technology have enabled many projects aimed at characterizing the genome structure and transcriptome complexity of organisms. The first conclusions of the human and mouse projects underscored two important, yet unexpected, findings. First, while almost the entire genome is transcribed, only 5% of it encodes proteins; thus, most transcripts are noncoding RNAs (ncRNAs). These include short RNAs (<200 nucleotides (nt)), such as piRNAs, microRNAs (miRNAs), and endogenous short interfering RNAs (siRNAs), among others, as well as long noncoding RNAs (lncRNAs; >200 nt). Second, a significant portion of the mammalian genome (45%) is composed of Repeat Elements (REs). REs are mostly relics of ancestral viruses that invaded the host genome during evolution and produced thousands of copies. Their roles within host genomes have yet to be fully explored, considering that they sometimes produce lncRNAs and have been shown to influence expression at the transcriptional and post-transcriptional levels. Moreover, because some REs can still mobilize within host genomes, hosts have evolved mechanisms, mainly epigenetic, to keep REs under tight control. Recent reports indicate that RE activity is regulated in somatic cells, particularly in the brain, suggesting a physiological role for RE mobilization during normal development. In this thesis, I focus on the analysis of ncRNAs, specifically REs, piRNAs, and lncRNAs, in human and mouse post-mitotic somatic cells. The main aspects of this analysis are as follows. Using sRNA-Seq, I show that piRNAs, a class of ncRNAs responsible for silencing Transposable Elements (TEs) in testes, are also present in the adult mouse brain. Furthermore, only a subset of testes piRNAs is expressed in the brain, and their expression may be controlled by known neurogenesis factors. To investigate the dynamics of the transcriptome during cellular differentiation, I examined deep RNA-Seq and Cap Analysis of Gene Expression (CAGE) data from a time course of primary human skeletal muscle cell differentiation. I contrasted this program with that of Duchenne Muscular Dystrophy (DMD) donors. I identified novel candidates, both protein-coding genes and lncRNAs, that may be involved in myogenesis, and reaffirmed known myogenic players. Using RNA-Seq data, I designed a novel pipeline to identify possible de novo insertion sites during muscular differentiation, which I also tested on embryonic mouse cerebral cortex. | en
dc.language.iso | en | en
dc.subject | nc RNA | en
dc.subject | lnc RNA | en
dc.subject | NGS | en
dc.subject | PiRNA | en
dc.subject | transcriptome | en
dc.title | Transcriptional landscape of ncRNA and Repeat elements in somatic cells | en
dc.type | Dissertation | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
thesis.degree.grantor | King Abdullah University of Science and Technology | en_GB
dc.contributor.committeemember | Orlando, Valerio | en
dc.contributor.committeemember | Gao, Xin | en
dc.contributor.committeemember | Forrest, Alistair | en
thesis.degree.discipline | Computer Science | en
thesis.degree.name | Doctor of Philosophy | en
dc.person.id | 101744 | en