Next-Generation DNA Sequencing Methods

ANRV353-GG09-20 ARI 25 July 2008 14:57

Elaine R. Mardis
Departments of Genetics and Molecular Microbiology and Genome Sequencing Center, Washington University School of Medicine, St. Louis MO 63108; email:

Annu. Rev. Genomics Hum. Genet. :387-402
First published online as a Review in Advance on June 24, 2008
The Annual Review of Genomics and Human Genetics is online at
Copyright 2008 by Annual Reviews. All rights reserved
1527-8204/08/0922-0387$

Key Words
massively parallel sequencing, sequencing-by-synthesis, resequencing

Abstract
Recent scientific discoveries that resulted from the application of next-generation DNA sequencing technologies highlight the striking impact of these massively parallel platforms on genetics.
These new methods have expanded previously focused readouts from a variety of DNA preparation protocols to a genome-wide scale and have fine-tuned their resolution to single base precision. The sequencing of RNA also has transitioned and now includes full-length cDNA analyses, serial analysis of gene expression (SAGE)-based methods, and noncoding RNA discovery. Next-generation sequencing has also enabled novel applications such as the sequencing of ancient DNA samples, and has substantially widened the scope of metagenomic analysis of environmentally derived samples. Taken together, an astounding potential exists for these technologies to bring enormous change in genetic and biological research and to enhance our fundamental biological knowledge.

Annu. Rev. Genom. Human Genet. :387-402. Downloaded from Columbia University on 09/03/10. For personal use only.

INTRODUCTION

The sequencing of the reference human genome was the capstone for many years of hard work spent developing high-throughput, high-capacity production DNA sequencing and associated sequence finishing pipelines. The approach used >20,000 large bacterial artificial chromosome (BAC) clones that each contained an approximately 100-kb fragment of the human genome, which together provided an overlapping set or tiling path through each human chromosome as determined by physical mapping (31).
In BAC-based sequencing, each BAC clone is amplified in bacterial culture, isolated in large quantities, and sheared to produce size-selected pieces of approximately 2-3 kb. These pieces are subcloned into plasmid vectors, amplified in bacterial culture, and the DNA is selectively isolated prior to sequencing. By generating approximately eightfold oversampling (coverage) of each BAC clone in plasmid subclone equivalents, computer-aided assembly can largely recreate the BAC insert sequence in contigs (contiguous stretches of assembled sequence reads). Subsequent refinement, including gap closure and sequence quality improvement (finishing), produces a single contiguous stretch of high-quality sequence (typically with less than 1 error per 40,000 bases).
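The coverage and accuracy figures above can be turned into rough arithmetic. The sketch below is only illustrative: the 100-kb insert size, eightfold coverage, and 1-in-40,000 error rate come from the text, while the 700-base read length is an assumed value (the text does not state one), and expressing the error rate on the Phred scale (Q = -10 log10 of the error probability) is an added convention rather than something the review uses here.

```python
import math


def reads_for_coverage(insert_bp, coverage, read_bp):
    """Reads needed so that total sequenced bases equal coverage * insert length."""
    return math.ceil(insert_bp * coverage / read_bp)


def phred_quality(error_rate):
    """Phred-scaled quality score: Q = -10 * log10(error probability)."""
    return -10 * math.log10(error_rate)


BAC_INSERT_BP = 100_000  # from the text: approximately 100-kb BAC insert
COVERAGE = 8             # from the text: approximately eightfold oversampling
READ_BP = 700            # ASSUMPTION: illustrative read length, not given in the text

n_reads = reads_for_coverage(BAC_INSERT_BP, COVERAGE, READ_BP)
finished_q = phred_quality(1 / 40_000)  # from the text: < 1 error per 40,000 bases

print(f"reads per BAC at {COVERAGE}x coverage: {n_reads}")
print(f"finished-sequence quality: about Q{finished_q:.0f}")
```

Under these assumptions, a single BAC needs on the order of a thousand reads, and the finishing standard corresponds to roughly Q46. Changing READ_BP rescales the read count proportionally.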
Since the completion of the human genome project (HGP) (26, 51), substantive changes have occurred in the approach to genome sequencing that have moved away from BAC-based approaches and toward whole-genome sequencing (WGS), with changes in the accompanying assembly algorithms. In the WGS approach, the genomic DNA is sheared directly into several distinct size classes and placed into plasmid and fosmid subclones. Oversampling the ends of these subclones to generate paired-end sequencing reads provides the necessary linking information to fuel whole-genome assembly algorithms.
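To make the idea of "linking information" concrete, the toy sketch below counts how many read pairs connect each pair of contigs. It is not the algorithm of any particular assembler: the function name, read identifiers, and contig names are all hypothetical, and a real assembler would also use insert-size and orientation constraints that are omitted here.

```python
from collections import Counter


def count_contig_links(read_pairs, read_to_contig):
    """Count paired-end links between distinct contigs.

    read_pairs: iterable of (read_id, mate_id) tuples from the same subclone.
    read_to_contig: dict mapping each placed read to the contig containing it.
    """
    links = Counter()
    for read_id, mate_id in read_pairs:
        a = read_to_contig.get(read_id)
        b = read_to_contig.get(mate_id)
        # Skip pairs with an unplaced read or with both ends in one contig.
        if a is None or b is None or a == b:
            continue
        links[tuple(sorted((a, b)))] += 1
    return links


# Hypothetical toy data: eight reads placed in three contigs.
read_to_contig = {
    "r1": "contigA", "r2": "contigB",
    "r3": "contigA", "r4": "contigB",
    "r5": "contigB", "r6": "contigC",
    "r7": "contigA", "r8": "contigA",
}
read_pairs = [("r1", "r2"), ("r3", "r4"), ("r5", "r6"), ("r7", "r8")]

links = count_contig_links(read_pairs, read_to_contig)
for (left, right), n in sorted(links.items()):
    print(f"{left} <-> {right}: {n} supporting pair(s)")
```

Contig pairs supported by several independent read pairs are the ones an assembler can most safely order and orient relative to each other, which is why oversampling subclone ends matters.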
The net result is that genomes can be sequenced more rapidly and more readily, but highly polymorphic or highly repetitive genomes remain quite fragmented. Despite these dramatic changes in sequencing and assembly approaches, the primary data production for most genome sequencing since the HGP has relied on the same type of capillary sequencing instruments as for the HGP. However, that scenario is rapidly changing owing to the invention and commercial introduction of several revolutionary approaches to DNA sequencing, the so-called next-generation sequencing technologies.
Although these instruments only began to become commercially available in 2004, they already are having a major impact on our ability to explore and answer genome-wide biological questions; more than 100 next-generation sequencing-related manuscripts have appeared to date in the peer-reviewed literature. These technologies are not only changing our genome sequencing approaches and the associated timelines and costs, but also accelerating and altering a wide variety of types of biological inquiry that have historically used a sequencing-based readout, or effecting a transition to this type of readout, as detailed in this review.
Furthermore, next-generation platforms are helping to open entirely new areas of biological inquiry, including the investigation of ancient genomes, the characterization of ecological diversity, and the identification of unknown etiologic agents.

NEXT-GENERATION DNA SEQUENCING

Three platforms for massively parallel DNA sequencing read production are in reasonably widespread use at present: the Roche/454 FLX (30), the Illumina/Solexa Genome Analyzer (7), and the Applied Biosystems SOLiD™ System ( / SolidKnowledge / flash / 102207 ). Recently, two more massively parallel systems were announced: the Helicos Heliscope™ and Pacific Biosciences SMRT instruments.
The Helicos system only recently became commercially available, and the Pacific Biosciences instrument will likely launch commercially in early 2010. Each platform embodies a complex interplay of enzymology, chemistry, high-resolution optics, hardware, and software engineering. These instruments allow highly streamlined sample preparation steps prior to DNA sequencing, which saves significant time and requires minimal associated equipment in comparison to the highly automated, multistep pipelines necessary for clone-based high-throughput sequencing.
By different approaches outlined below, each technology seeks to amplify single strands of a fragment library and perform sequencing reactions on the amplified strands. The fragment libraries are obtained by annealing platform-specific linkers to blunt-ended fragments generated directly from a genome or DNA source of interest. Because the presence of adapter sequences means that the molecules can then be selectively amplified by PCR, no bacterial cloning step is required to amplify the genomic fragment in a bacterial intermediate, as is done in traditional sequencing. Both the Helicos and Pacific Biosciences instruments mentioned above are so-called single molecule sequencers and do not require any amplification of DNA fragments prior to sequencing. Another contrast between these instruments and capillary platforms is the run time required to generate data.