Contribution to Book
Bioinformatics Support of Genome Sequencing Projects
Bioinformatics‐From Genomes to Drugs
  • Xiaoqiu Huang, Iowa State University
Document Type
Book Chapter
Publication Version
Published Version
Publication Date
1-1-2002
Abstract

The genome of an organism is the "book of life": it encodes the complete set of genetic instructions for the development of the organism. Structurally, a genome is a linear sequence of nucleotides, and determining that sequence lays the foundation for understanding biology at the molecular level. With current biotechnology, determining the sequence of a genome is a challenging task. A sequencing machine can read the sequence of a piece of DNA of up to 1000 bp (base pairs), but genomes are far larger. For example, the genome of the bacterium E. coli is about 4 Mb (million base pairs) in size, the genome of the nematode C. elegans is 100 Mb, and the human genome is 3 Gb (billion base pairs). Because sequencing machines cannot produce long sequences directly, long sequences must be reconstructed from short sequence reads. A shotgun sequencing strategy is widely used to determine the sequence of a long segment of DNA. In this strategy, multiple copies of the DNA segment are randomly cut into small pieces, and the sequence of each piece is read by an automated sequencing machine. A computer program then reconstructs the sequence of the large DNA segment from the short reads. The sequence assembly problem is the problem of assembling short reads into long sequences. What makes it non-trivial is that there is no information about how the short reads are ordered with respect to the original DNA segment.
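
To make the assembly problem described above concrete, the following is a minimal sketch in Python. It is not the method presented in the chapter: it assumes error-free reads from a single strand and a repeat-free target sequence, and it uses a simple greedy overlap-merge heuristic chosen only for illustration. The function names (shotgun_reads, overlap, greedy_assemble) and the min_len parameter are hypothetical.

```python
import random


def shotgun_reads(segment, read_len, n_reads, seed=0):
    """Simulate shotgun sampling: cut reads at random positions, then shuffle.

    Assumption for illustration: reads are error-free and all of one length.
    """
    rng = random.Random(seed)
    starts = [rng.randrange(len(segment) - read_len + 1) for _ in range(n_reads)]
    reads = [segment[s:s + read_len] for s in starts]
    rng.shuffle(reads)  # the assembler gets no information about read order
    return reads


def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def drop_contained(seqs):
    """Remove duplicates and sequences fully contained in another sequence."""
    unique = list(dict.fromkeys(seqs))
    return [s for s in unique if not any(s != o and s in o for o in unique)]


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap.

    Returns the list of contigs left when no overlap of at least min_len remains.
    """
    contigs = drop_contained(reads)
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        rest = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs = drop_contained(rest + [merged])


# Three overlapping reads of a 20 bp segment, given in scrambled order.
segment = "ATGCGTACCTGAAGTCCGAT"
reads = ["GAAGTCCGAT", "ATGCGTACCT", "TACCTGAAGT"]
contigs = greedy_assemble(reads, min_len=3)
print(contigs)
```

On this small input the three reads merge back into the original 20 bp segment. With real data the picture is harder: sequencing errors, reads from both strands, and repeated regions can all mislead a greedy merge of this kind, and uneven random coverage (as produced by shotgun_reads) may leave gaps so that several contigs remain.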

Comments

This chapter was published as Huang, Xiaoqiu. "Bioinformatics Support of Genome Sequencing Projects." In Thomas Lengauer (Ed), Bioinformatics‐From Genomes to Drugs (2002): 25‐48. Copyright Wiley-VCH Verlag GmbH & Co. KGaA. Reproduced with permission.

Copyright Owner
Wiley-VCH Verlag GmbH & Co. KGaA
Language
en
File Format
application/pdf
Citation Information
Xiaoqiu Huang. "Bioinformatics Support of Genome Sequencing Projects" Bioinformatics‐From Genomes to Drugs (2002) p. 25 - 48
Available at: http://works.bepress.com/xiaoqiu-huang/27/