DNA Sequencing: Its History, Applications and Methods

Most gadgets come with an operating manual that contains the instructions required to use the gadget efficiently. DNA works much like such a manual: written in a four-letter alphabet (A, T, G, and C), it provides the instructions for building and running an organism. From the colour of our eyes to the curl of our hair, everything depends upon the code in DNA. The four letters A, T, G, and C represent the nitrogenous bases, an essential part of DNA, that carry the genetic information for life.
The process of determining the exact order of nucleotide bases in a DNA molecule is known as DNA sequencing. It is a tool for deciphering the genetic information contained in DNA, one letter at a time. DNA sequencing has revolutionized numerous fields, including medicine, biology, and phylogenetics. From supporting the development of personalized medicine to reconstructing the genomes of ancient species from fossil remains, DNA sequencing has established itself as a powerful tool central to scientific progress in the 21st century.

1.1 History of DNA sequencing

In 1953, Watson and Crick described the double-helical structure of DNA. Scientists had accepted that DNA is the carrier of genetic information, but they could not yet read DNA sequences. It was only in 1977 that two DNA sequencing techniques were introduced: Sanger sequencing by Frederick Sanger and Maxam-Gilbert sequencing by Allan Maxam and Walter Gilbert.1 In the 1980s and 1990s, Sanger sequencing was widely used in research laboratories. Researchers began by sequencing genes one by one and went on to sequence the genomes of viruses, bacteria, and small eukaryotes.2
The Human Genome Project was launched in 1990. It was an international collaborative project involving scientists around the globe. Its primary objective was to sequence the whole human genome, which contains about three billion base pairs. The project was declared complete in 2003, having sequenced about 99% of the gene-containing regions of the human genome with 99.99% accuracy. In the mid-2000s, new sequencing platforms such as 454 (Roche), Illumina (Solexa), and SOLiD were introduced. They used massively parallel methods to read millions of DNA fragments at once. These technologies were collectively termed next-generation sequencing (NGS). After 2010, third-generation sequencing technologies such as PacBio's SMRT and Oxford Nanopore's MinION were introduced. They offer advantages such as longer reads, portability, and greater accessibility. Today, DNA sequencing is cheaper (less than $100 in some contexts), faster, and more accessible than ever before.
Fig: Steps in DNA sequencing (Source: Bisen P. Laboratory Protocols in Applied Life Sciences. Boca Raton, FL: CRC Press; 2014. Figure 19. doi:10.1201/b16575)

1.2 Methods used for DNA sequencing

Various methods have been developed that make the sequencing process easier, cheaper, and more accessible. Some of these techniques, along with their limitations, are discussed below:

Sanger dideoxynucleotide synthetic method

It was developed by Frederick Sanger in 1977. Sanger sequencing is based on the incorporation of dideoxynucleotides in a DNA polymerization reaction. A dideoxynucleotide is a nucleotide that lacks the OH group at the 3′ position of its sugar. The procedure for classical Sanger sequencing is as follows:
Fig: Structure of Dideoxynucleotide. (Source: Genetic Education. Inc. ddNTP in Sanger Sequencing: What Is It and Why We Use It.)
Four reaction tubes are set up, each containing the single-stranded DNA template to be sequenced. Each tube also requires a primer, all four dNTPs (radioactively labeled), DNA polymerase, and a low concentration of one type of dideoxynucleotide (ddNTP): ddATP, ddGTP, ddCTP, or ddTTP.
Because a ddNTP lacks the 3′ OH group, the chain terminates wherever one is incorporated. This produces DNA fragments of various lengths, each ending at a specific nucleotide (A, T, G, or C).
The DNA fragments are separated on the basis of size by electrophoresis on a high-resolution polyacrylamide gel.
The gel is used for autoradiography to visualize the position of different bands in each lane.1
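
The logic of the four-tube reaction and the gel read-out can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the template sequence is made up, the primer is ignored, and every possible termination position is assumed to produce a visible band. The function names are illustrative and do not come from any sequencing library.

```python
# Toy model of classical Sanger sequencing (illustrative only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template):
    """Return the strand (5'->3') that polymerase builds against the template."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def terminated_fragment_lengths(new_strand, dd_base):
    """Fragment lengths produced in the tube that contains one kind of ddNTP.

    A fragment ends at every position where that ddNTP could be incorporated.
    """
    return [i + 1 for i, base in enumerate(new_strand) if base == dd_base]


def read_gel(lanes):
    """Read the bands of all four lanes from shortest to longest fragment."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


template = "TACGGATC"  # hypothetical template, written 5'->3'
new_strand = synthesized_strand(template)
lanes = {base: terminated_fragment_lengths(new_strand, base) for base in "ACGT"}
recovered = read_gel(lanes)

print(lanes)      # one list of band positions per ddNTP tube
print(recovered)  # sequence of the newly synthesized strand
```

Reading the bands from the shortest fragment to the longest gives the sequence of the newly synthesized strand, which is complementary to the template.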

Limitations of Sanger sequencing:

It is only able to sequence one DNA fragment at a time.
It has a short read length (500 – 900 base pairs).
It is slower than modern sequencing methods.
It is very expensive for large genome sequencing.
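
A rough calculation shows why these limits matter for large genomes. The sketch below uses only the figures quoted in this article (about three billion base pairs, reads of 500 to 900 base pairs) and assumes a single pass over the genome with no overlap between reads, which underestimates what a real project would need.

```python
import math

GENOME_BP = 3_000_000_000  # human genome size quoted above


def reads_for_single_pass(genome_bp, read_length_bp):
    """Minimum number of non-overlapping reads needed to cover the genome once."""
    return math.ceil(genome_bp / read_length_bp)


for read_length in (500, 900):
    reads = reads_for_single_pass(GENOME_BP, read_length)
    print(f"{read_length} bp reads: at least {reads:,} reactions")
```

Even under these optimistic assumptions, millions of separate reactions are needed, each processed one fragment at a time.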

Maxam and Gilbert’s chemical degradation method

It was introduced by Allan Maxam and Walter Gilbert in 1977.1
Maxam and Gilbert's method determines the sequence by chemically cleaving the DNA at specific bases: at G (guanine), at G and A, at C, and at C and T. The procedure is as follows:
Fig: Maxam and Gilbert sequencing (Source: Eidefors A. Strategies for de novo DNA sequencing.)
The 5′ ends of the DNA are labeled with 32P (radioactive labeling).
The two strands of the radioactively labeled DNA are separated.
The mixture is divided into 4 different tubes. Each tube is treated with reagents that cleave the nucleotide chain at specific bases, i.e. G, G and A, C, or C and T.
For example, to cleave at G, guanine is methylated with dimethyl sulfate. When the methylated guanine is treated with hot piperidine, the sugar-phosphate backbone is cleaved at that position.
The resulting fragments are separated by size using high-resolution polyacrylamide gel electrophoresis.
The gel is autoradiographed to determine the sequence from the position of bands in four lanes.2
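
The way the four lanes are combined into a sequence can be illustrated with a small Python sketch. It is a simplified model, not a description of the chemistry: the input sequence is hypothetical, each band is represented by the number of intact nucleotides left on the labeled fragment, and the very first base is skipped because cleaving there would leave no labeled fragment to read.

```python
# Which bases each of the four reactions cleaves at (as listed above).
LANE_TARGETS = {"G": "G", "A+G": "AG", "C": "C", "C+T": "CT"}


def cleavage_bands(sequence):
    """Return, per lane, the lengths of the 5'-labeled fragments.

    Cleaving at index i leaves a labeled fragment of i nucleotides,
    so index 0 is skipped (it would leave nothing to detect).
    """
    return {
        lane: {i for i, base in enumerate(sequence) if base in targets and i > 0}
        for lane, targets in LANE_TARGETS.items()
    }


def read_lanes(lanes):
    """Call one base per band, from the shortest fragment to the longest."""
    calls = []
    for length in sorted(set().union(*lanes.values())):
        if length in lanes["G"]:
            calls.append("G")   # band in both the G and A+G lanes
        elif length in lanes["A+G"]:
            calls.append("A")   # band in the A+G lane only
        elif length in lanes["C"]:
            calls.append("C")   # band in both the C and C+T lanes
        else:
            calls.append("T")   # band in the C+T lane only
    return "".join(calls)


sequence = "GATTCAGC"  # hypothetical labeled strand, 5'->3'
lanes = cleavage_bands(sequence)
recovered = read_lanes(lanes)
print(recovered)
```

The key point is that A and T are never read from a lane of their own: they are inferred from a band that appears in the combined lane but is missing from the single-base lane.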

Limitations of Maxam and Gilbert sequencing:

It involves the use of toxic chemicals (e.g. hydrazine, piperidine).
It has a short read length (100-300 base pairs), which makes it unsuitable for sequencing large genomes.
Radioactivity is involved which requires extra precaution.

Next Generation Sequencing (NGS)

Next Generation Sequencing is also referred to as massively parallel sequencing or high throughput sequencing. It allows millions to billions of DNA fragments to be sequenced simultaneously. The general procedure for NGS is given below:
DNA is fragmented into smaller pieces.
Special adaptors are added to both ends of each DNA fragment. Adaptors are short synthetic sequences that enable the machine to recognize and copy the DNA.
In most cases, NGS platforms amplify each fragment into many copies using PCR-based methods such as bridge amplification.
Different platforms use distinct techniques to detect base incorporation. For example, in Illumina sequencing, each base is attached to a fluorescent dye. The machine adds one base at a time, and the color of the signal identifies which base was added.
Lastly, the high-volume data is analyzed through base calling (the sequencer reads the color signals and translates them into DNA bases), quality control (bad reads are filtered out), read alignment (reads are aligned to a reference genome), variant calling (identifying indels and SNPs), and so on.
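
The analysis steps in the last point can be sketched as a toy pipeline. Everything here is an assumption made for illustration: the reference and reads are invented, quality control is reduced to a mean quality threshold of 20, alignment is a brute-force ungapped search, and only single-base substitutions are called. Real pipelines rely on dedicated aligners and variant callers.

```python
from collections import Counter, defaultdict


def passes_quality(qualities, min_mean=20):
    """Quality control: keep a read only if its mean base quality is high enough."""
    return sum(qualities) / len(qualities) >= min_mean


def align(read, reference):
    """Read alignment: ungapped placement with the fewest mismatches."""
    best_offset, best_mismatches = 0, len(read) + 1
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(r != g for r, g in zip(read, window))
        if mismatches < best_mismatches:
            best_offset, best_mismatches = offset, mismatches
    return best_offset


def call_variants(reads, reference, min_depth=2):
    """Variant calling: report positions where the reads agree on a non-reference base."""
    pileup = defaultdict(Counter)
    for sequence, qualities in reads:
        if not passes_quality(qualities):
            continue
        start = align(sequence, reference)
        for i, base in enumerate(sequence):
            pileup[start + i][base] += 1
    variants = []
    for position in sorted(pileup):
        base, count = pileup[position].most_common(1)[0]
        if base != reference[position] and count >= min_depth:
            variants.append((position, reference[position], base))
    return variants


reference = "ACGTTGCAAGCT"       # hypothetical reference genome
reads = [
    ("ACGTTACA", [30] * 8),      # good read covering the variant
    ("TTACAAGC", [32] * 8),      # second good read covering the variant
    ("ACGATGCA", [8] * 8),       # low-quality read, removed by quality control
]
variants = call_variants(reads, reference)
print(variants)  # list of (0-based position, reference base, observed base)
```

In this example two independent reads support an A where the reference has a G, while the low-quality read with a different error never reaches the variant-calling step.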

Limitations of NGS

The instruments and reagents required in NGS are costly.

It is difficult to detect large structural changes in the genome with short NGS reads.

It is difficult to resolve repetitive regions.
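
The problem with repetitive regions can be seen in a small Python example. The reference, the repeat, and both reads are invented for illustration, and matching is done by exact string comparison rather than by a real aligner.

```python
def find_all(read, reference):
    """Return every position where the read matches the reference exactly."""
    return [
        i
        for i in range(len(reference) - len(read) + 1)
        if reference[i:i + len(read)] == read
    ]


# Hypothetical reference: a CA repeat flanked by unique sequence.
reference = "GGCT" + "CACACACACACA" + "TTGA"

short_read = "CACACA"            # lies entirely inside the repeat
long_read = "CTCACACACACACATT"   # spans the repeat and both flanks

short_hits = find_all(short_read, reference)
long_hits = find_all(long_read, reference)
print(short_hits)  # several equally good placements
print(long_hits)   # a single placement
```

A read that falls entirely inside the repeat fits equally well at several positions, so its true origin cannot be determined. A read long enough to reach the unique flanking sequence has only one possible placement.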

1.3 Applications of DNA sequencing

Genomic research

DNA sequencing is widely used in genomic research, especially for whole genome sequencing of humans, plants, animals, microbes, etc. It helps scientists decipher the genetic organization of an organism.

Clinical application

DNA sequencing is used to create gene maps and identify mutations in the human genome. It makes it possible to diagnose various genetic disorders, such as cystic fibrosis and Huntington's disease, as well as some forms of cancer. Such a diagnosis may provide critical time for planning among the available treatment options. The development of personalized medicine based on an individual's genetic profile further increases the efficacy of treatment.

Forensic science

DNA fingerprinting, which is based on DNA sequencing technology, can be used to identify individuals based on their unique DNA patterns. It is applied in criminal investigations, paternity testing, and the identification of victims of natural disasters.

Evolutionary genomics

Comparing DNA sequences of organisms against each other can reveal insightful evolutionary relationships among various species. Conserved genes can be studied to track the evolution pattern of organisms.
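
One simple way to compare sequences is to count the fraction of aligned positions at which two sequences differ. The sketch below assumes the sequences are already aligned and of equal length; the species names and sequences are made up, and real phylogenetic analyses use more elaborate models.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


# Hypothetical aligned fragments of the same conserved gene.
aligned = {
    "species_A": "ATGCCGTA",
    "species_B": "ATGCCGTT",
    "species_C": "ATGACGAT",
}

distances = {
    (first, second): p_distance(aligned[first], aligned[second])
    for first, second in combinations(aligned, 2)
}
for pair, distance in distances.items():
    print(pair, distance)
```

In this toy data set, species A and B differ at the fewest positions, which would suggest that they are more closely related to each other than either is to species C.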

Agriculture and Crop development

DNA sequencing can identify genes responsible for disease resistance, drought tolerance, or higher yield. This helps to develop genetically modified (GM) animals and crops through genetic engineering.

Metagenomics

DNA sequencing helps to analyze microbial populations in ecological regions such as soil and ocean. It helps to understand the various roles microbes play to maintain balance in the biogeochemical cycle. This enables the development of bioremediation technology and policies supporting it.

REFERENCES

1. Verma PS, Agrawal CK. Cell Biology, Genetics, Molecular Biology, Evolution and Ecology. 14th ed. (Bharatnagar S, Pradhan S, eds.). S. Chand & Company Pvt. Ltd.; 2016.
2. Watson JD, Baker TA, Bell SP, Gann A, Levine M, Losick R. Molecular Biology of the Gene. 7th ed. Pearson Education; 2014.
