
dc.contributor.author: Cahill, Matt J.
dc.contributor.author: Köser, Claudio U.
dc.contributor.author: Ross, Nicholas E.
dc.contributor.author: Archer, John A.C.
dc.date.accessioned: 2014-08-27T09:44:52Z
dc.date.available: 2014-08-27T09:44:52Z
dc.date.issued: 2010-07-12
dc.identifier.citation: Cahill MJ, Köser CU, Ross NE, Archer JAC (2010) Read Length and Repeat Resolution: Exploring Prokaryote Genomes Using Next-Generation Sequencing Technologies. PLoS ONE 5: e11518. doi:10.1371/journal.pone.0011518.
dc.identifier.issn: 19326203
dc.identifier.pmid: 20634954
dc.identifier.doi: 10.1371/journal.pone.0011518
dc.identifier.uri: http://hdl.handle.net/10754/325284
dc.description.abstract: Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150 nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. © 2010 Cahill et al.
dc.language.iso: en
dc.publisher: Public Library of Science (PLoS)
dc.rights: Cahill et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
dc.rights: Archived with thanks to PLoS ONE
dc.subject: accuracy
dc.subject: algorithm
dc.subject: bacterial genome
dc.subject: Escherichia coli K 12
dc.subject: genetic variability
dc.subject: Mycoplasma genitalium
dc.subject: nucleotide sequence
dc.subject: prediction
dc.subject: prokaryote
dc.subject: quantitative analysis
dc.subject: sequence analysis
dc.subject: species difference
dc.subject: Streptomyces coelicolor
dc.subject: DNA sequence
dc.subject: genetics
dc.subject: genome
dc.subject: metabolism
dc.subject: methodology
dc.subject: nucleotide repeat
dc.subject: prokaryotic cell
dc.subject: Prokaryota
dc.subject: Algorithms
dc.subject: Genome
dc.subject: Prokaryotic Cells
dc.subject: Repetitive Sequences, Nucleic Acid
dc.subject: Sequence Analysis, DNA
dc.title: Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies
dc.type: Article
dc.contributor.department: Biological and Environmental Sciences and Engineering (BESE) Division
dc.contributor.department: Computational Bioscience Research Center (CBRC)
dc.identifier.journal: PLoS ONE
dc.identifier.pmcid: PMC2902515
dc.eprint.version: Publisher's Version/PDF
dc.contributor.institution: Department of Genetics, University of Cambridge, Cambridge, United Kingdom
dc.contributor.institution: Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, United Kingdom
dc.contributor.affiliation: King Abdullah University of Science and Technology (KAUST)
kaust.person: Archer, John A.C.
refterms.dateFOA: 2018-06-13T14:47:30Z


Files in this item

Name: Article-PLoS_ONE-Read_lengt-2010.pdf
Size: 694.6Kb
Format: PDF
Description: Article - Full Text

Name: Supplement_1_-_PLoS_ONE-Read_lengt-2010.pone.0011518.s001.pdf
Size: 28.95Kb
Format: PDF
Description: Supplemental File 1

Name: Supplement_2_-_PLoS_ONE-Read_lengt-2010.pone.0011518.s002.tif
Size: 693.3Kb
Format: TIFF image
Description: Supplemental File 2

Name: Supplement_3_-_PLoS_ONE-Read_lengt-2010.pone.0011518.s004.tif
Size: 1.014Mb
Format: TIFF image
Description: Supplemental File 3

Name: Supplement_4_-_PLoS_ONE-Read_lengt-2010.pone.0011518.s003.xls
Size: 278.5Kb
Format: Microsoft Excel
Description: Supplemental File 4
