DNA sequencing refers to a family of experimental methods used to decipher the information stored in DNA molecules. This information is encoded in the order, or sequence, of the nucleotide bases that make up a DNA molecule. It is usually written as a string of the one-letter abbreviations for these bases: A, C, G and T.
The first complete genome to be sequenced, that of a small virus, was finished in 1977, and the first draft sequences of the human genome followed in 2001. Sequencing the first human genome took more than a decade and cost the publicly funded effort about 3 billion dollars; it was accomplished separately by two large groups, a publicly funded international consortium and a private company. Since then, remarkable advances in sequencing technology and in the computational tools used to analyze the results have made it possible to sequence entire genomes in a matter of days to weeks. These approaches are termed “high-throughput DNA sequencing” or “next-generation sequencing”.
While several methods of high-throughput genome sequencing exist, most rely on an enzyme called a polymerase. DNA polymerase is found in all living cells, where it copies the entire genome so that cell division can take place. It can be purified and used in a test tube together with “labeled” versions of the nucleotide bases that make up DNA (A, C, G and T). This reaction lets scientists track the order in which the bases are incorporated into the new strand that is made as a copy of the DNA segment being studied. For this experiment, each type of base is typically modified to carry a distinct fluorescent molecule, so scientists can “see” the order in which the fluorescent molecules are added to the growing DNA molecule; that order corresponds to the A, C, G and T bases in the sequence.
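The labeling idea described above can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the color names are arbitrary placeholders rather than the dyes of any real instrument, the function names are hypothetical, and the model ignores strand direction and reading errors.

```python
# Toy model of polymerase-based sequencing with fluorescently labeled bases.
# The color assigned to each base is an arbitrary placeholder.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
DYE_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


def synthesize_and_record(template):
    """Return the color observed at each step as the copy strand is built."""
    colors = []
    for base in template:
        incorporated = COMPLEMENT[base]  # the polymerase adds the pairing base
        colors.append(DYE_FOR_BASE[incorporated])
    return colors


def decode(colors):
    """Translate the observed colors into the copy strand and the template."""
    copy = "".join(BASE_FOR_DYE[color] for color in colors)
    template = "".join(COMPLEMENT[base] for base in copy)
    return copy, template


signal = synthesize_and_record("GATTACA")
copy_strand, inferred_template = decode(signal)
```

Here `signal` holds one color per incorporated base, and `decode` recovers both the newly made strand and the segment that was copied.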
Several different technologies use this basic principle to read between 4 million and 750 million nucleotide bases per day. At the higher rates, the entire human genome of just over 3 billion nucleotide bases can be sequenced within days or weeks. Sophisticated computational tools are then used to assemble the sequenced fragments into a completed genome, akin to placing the paragraphs and chapters of a book in the correct order.
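The assembly step can be sketched with a deliberately simple greedy approach that repeatedly joins the two fragments sharing the longest overlap. This is an illustrative assumption, not the method used by any particular sequencing platform; real assemblers must also handle reading errors, repeated regions and both DNA strands. The function names and the example reads are hypothetical.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    fragments = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(fragments) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(fragments):
            for j, right in enumerate(fragments):
                if i != j:
                    size = overlap(left, right, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; leave the fragments separate
        merged = fragments[best_i] + fragments[best_j][best_size:]
        fragments = [f for k, f in enumerate(fragments) if k not in (best_i, best_j)]
        fragments.append(merged)
    return fragments


# Three short, overlapping reads taken from one made-up sequence.
reads = ["ATGGCGTAC", "GTACCTTAG", "TTAGGACCA"]
contigs = greedy_assemble(reads)
```

With these reads, each neighboring pair shares four bases, so the fragments chain into a single assembled sequence. If no overlaps remain, the function returns the unjoined pieces rather than guessing an order.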
The advent of high-throughput genome sequencing has enabled an unprecedented understanding of human diseases that involve DNA mutations, cancer in particular. Several large-scale projects have been completed and more are ongoing, aimed at studying the specific mutations found in individual human cancers, allowing scientists to decode the differences between the DNA sequence of cancer cells and that of healthy cells. This work has revealed remarkable diversity among human cancers, and may soon lead to individualized therapy for cancer patients.
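The comparison of cancer and healthy DNA can be illustrated, in its simplest form, by listing the positions at which two sequences differ. This sketch assumes two already aligned sequences of equal length and only detects single-base substitutions; the function name and the example sequences are made up for illustration.

```python
def point_mutations(normal, tumor):
    """List (position, normal_base, tumor_base) where the sequences differ."""
    if len(normal) != len(tumor):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (pos, n, t)
        for pos, (n, t) in enumerate(zip(normal, tumor), start=1)
        if n != t
    ]


# Made-up example sequences; positions are counted from 1.
normal_dna = "ATGGCGTACCTTAGG"
tumor_dna = "ATGGCATACCTTCGG"
differences = point_mutations(normal_dna, tumor_dna)
```

Each entry in `differences` records where a base in the tumor sequence departs from the healthy one, which is the kind of catalogue the large-scale cancer projects build at genome scale.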