The DNA Molecule Is a Double Helix
In 1953 Rosalind Franklin used X‐ray diffraction to show that DNA was a helical (i.e. twisted) polymer. James Watson and Francis Crick demonstrated, by building three‐dimensional models, that the molecule is a double helix (Figure 3.4). Two hydrophilic sugar‐phosphate backbones lie on the outside of the molecule and the purine and pyrimidine bases lie on the inside of the molecule. There is just enough space for one purine and one pyrimidine in the center of the double helix. The Watson–Crick model showed that the purine guanine (G) would fit nicely with the pyrimidine cytosine (C). The purine adenine (A) would fit nicely with the pyrimidine thymine (T). Thus A always base pairs with T and G always base pairs with C.
Hydrogen Bonds Form Between Base Pairs
A hydrogen bond forms when a hydrogen atom is shared between two electron‐grabbing atoms. The hydrogen bonds in an A–T and a G–C base pair (Figure 3.4) form when the hydrogen attached to a nitrogen in one base gets close to an electron‐grabbing oxygen or nitrogen in the other base of the pair. The hydrogen bonds formed between the base pairs hold the DNA helix together. The three hydrogen bonds formed between G and C produce a relatively strong base pair. Because only two hydrogen bonds are formed between A and T, this weaker base pair is more easily broken. The difference in strength between a G–C and an A–T base pair is important in the initiation of DNA replication (page 51) and in the initiation and termination of RNA synthesis (page 69).
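The difference between A–T-rich and G–C-rich stretches can be made concrete by counting hydrogen bonds. The following is a minimal sketch that uses only the bond counts given above (two per A–T pair, three per G–C pair); the function name and the two example sequences are illustrative choices, not taken from the text.

```python
# Hydrogen bonds contributed by each base when paired with its partner,
# as stated in the text: A-T pairs have two, G-C pairs have three.
HBONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def count_hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding `strand` to its complementary partner."""
    return sum(HBONDS[base] for base in strand.upper())

at_rich = "TATAAT"   # illustrative A-T-rich sequence
gc_rich = "GCGCGC"   # illustrative G-C-rich sequence of the same length

print(count_hydrogen_bonds(at_rich))  # 12
print(count_hydrogen_bonds(gc_rich))  # 18
```

For the same length, the A–T-rich stretch is held together by fewer hydrogen bonds, which is consistent with such regions being easier to pull apart.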
DNA Strands Are Antiparallel
The two strands of DNA are said to be antiparallel because they lie in the opposite orientation with respect to one another, with the 3′‐hydroxyl terminus of one strand opposite the 5′‐phosphate terminus of the second strand. The sugar‐phosphate backbones do not completely conceal the bases inside. There are two grooves along the surface of the DNA molecule. One is wide and deep – the major groove – and the other is narrow and shallow – the minor groove (Figure 3.4). DNA‐binding proteins can use the grooves to gain access to the bases and bind to specific sequences. This is important in initiating replication (page 51) and transcription (page 69) and is also used when manipulating DNA in the laboratory.
Example 3.1 Erwin Chargaff's Puzzling Data
In a key discovery of the 1950s, Erwin Chargaff analyzed the purine and pyrimidine content of DNA isolated from many different organisms and found that the amounts of A and T were always the same, as were the amounts of G and C. Such an identity was inexplicable at the time but helped James Watson and Francis Crick build their double‐helix model in which every A on one strand of the DNA helix has a matching T on the other strand and every G on one strand has a matching C on the other.
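Chargaff's observation follows directly from base pairing, and a short sketch can show why. The code below builds the partner strand for an arbitrary, made-up sequence and counts the bases in the resulting duplex; the helper name and the sequence are assumptions for illustration only.

```python
from collections import Counter

# Base-pairing rules from the Watson-Crick model.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def duplex_base_counts(strand: str) -> Counter:
    """Count every base in a double helix built from `strand` and its partner."""
    partner = "".join(PAIR[base] for base in strand)
    return Counter(strand + partner)

single = "GGATCGAAAA"                 # made-up sequence
print(Counter(single))                # one strand alone: A and T counts differ
counts = duplex_base_counts(single)
print(counts["A"], counts["T"])       # 6 6
print(counts["G"], counts["C"])       # 4 4
```

A single strand need not contain equal amounts of A and T, but the duplex always does, because every A on one strand is matched by a T on the other.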
The Two DNA Strands Are Complementary
A consequence of the base pairing that joins the two strands of DNA is that if the base sequence of one strand is known, then that of its partner can be inferred. A G in one strand will always be paired with a C in the other. Similarly an A will always pair with a T. The two strands are therefore said to be complementary.
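Inferring the partner strand is a simple lookup. The sketch below assumes sequences are written as strings of A, C, G, and T in the 5′ to 3′ direction; because the strands are antiparallel, the partner is reversed so that it is also written 5′ to 3′. The function names and the example sequence are illustrative.

```python
# Translation table implementing the pairing rules A-T and G-C.
PAIR = str.maketrans("ACGT", "TGCA")

def complement(strand: str) -> str:
    """Base-by-base partner, read in the same left-to-right order as `strand`."""
    return strand.translate(PAIR)

def reverse_complement(strand: str) -> str:
    """Partner strand written 5' to 3', accounting for antiparallel orientation."""
    return complement(strand)[::-1]

seq = "ATGCCG"                   # illustrative sequence, written 5' to 3'
print(complement(seq))           # TACGGC
print(reverse_complement(seq))   # CGGCAT
```

Applying `reverse_complement` twice returns the original sequence, which reflects the fact that each strand fully determines the other.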
DNA AS THE GENETIC MATERIAL
Deoxyribonucleic acid carries the genetic information encoded in the sequence of the four bases – guanine, adenine, thymine, and cytosine. The information in DNA is transferred to its daughter molecules through replication (the duplication of DNA molecules) and subsequent cell division. DNA directs the synthesis of proteins through the intermediary molecule messenger RNA (mRNA). The DNA code is transferred to mRNA by a process known as transcription (Chapter 5). The mRNA code is then translated into a sequence of amino acids during protein synthesis (Chapter 6). This is the central dogma of molecular biology: DNA makes RNA makes protein.
Retroviruses such as the human immunodeficiency virus, the cause of AIDS, are an exception to this rule. As their name suggests, they reverse the normal order of data transfer. Inside the virus coat is a molecule of RNA plus an enzyme that can make DNA from an RNA template by the process known as reverse transcription.
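The flow of information described above can be sketched as three small string operations. This is a simplified illustration, not a model of the enzymes involved: `transcribe` takes the coding (non-template) strand and swaps T for U as a shortcut, `CODON_TABLE` holds only four entries of the standard genetic code, and all names and the example sequence are assumptions made for this sketch.

```python
# Deliberately partial codon table: four entries from the standard genetic code.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": "Stop"}

# Pairing rules for copying RNA into complementary DNA.
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")

def transcribe(coding_strand: str) -> str:
    """DNA -> mRNA: the mRNA matches the coding strand with U in place of T."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> list[str]:
    """mRNA -> amino acids, read three bases at a time until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

def reverse_transcribe(rna: str) -> str:
    """RNA -> complementary DNA strand, written 5' to 3' (the retroviral direction)."""
    return rna.translate(RNA_TO_DNA)[::-1]

dna = "ATGTTTGGCTAA"                 # illustrative coding strand
mrna = transcribe(dna)
print(mrna)                          # AUGUUUGGCUAA
print(translate(mrna))               # ['Met', 'Phe', 'Gly']
print(reverse_transcribe(mrna))      # TTAGCCAAACAT
```

The DNA produced by `reverse_transcribe` is complementary to the RNA template, so it pairs with the original coding strand rather than repeating it.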
We do not yet know the exact number of genes that encode messenger RNA in the human genome. The current estimate is 19 116. Table 3.1 compares the number of predicted messenger RNA genes in the genomes of different organisms. In each organism, there are also a small number of genes (about 100 in humans) that code for ribosomal RNAs and transfer RNAs. The roles these three types of RNA play in protein synthesis are described in Chapter 6.
PACKAGING OF DNA MOLECULES INTO CHROMOSOMES
Eukaryotic Chromosomes and Chromatin Structure
A human cell contains 46 chromosomes (23 pairs), each of which is a single DNA molecule bundled up with various proteins. On average, each human chromosome contains about 1.3 × 10⁸ base pairs (bp) of DNA. If the DNA in a human chromosome were stretched as far as it would go without breaking, it would be about 5 cm long, so the 46 chromosomes in all represent about 2 m of DNA. The nucleus in which this DNA must be contained has a diameter of only about 10 μm, so large amounts of DNA must be packaged into a small space. This represents a formidable problem that is dealt with by binding the DNA to proteins to form chromatin. As shown in Figure 3.5, the DNA double helix is packaged at both small and larger scales. In the first stage, shown on the right of the figure, the DNA double helix with a diameter of 2 nm is bound to proteins known as histones. Histones are positively charged because they contain large amounts of the amino acids arginine and lysine (page 104) and bind tightly to the negatively charged phosphates on DNA. A 146 bp length of DNA is wound around a protein complex composed of two molecules each of four different histones – H2A, H2B, H3, and H4 – to form a nucleosome. Because each nucleosome is separated from its neighbor by about 50 bp of linker DNA, this unfolded chromatin state looks like beads on a string when viewed in an electron microscope. Nucleosomes undergo further packaging. A fifth type of histone, H1, binds to the linker DNA and pulls the nucleosomes together, helping to further coil the DNA into chromatin fibers 30 nm in diameter, which are referred to as 30‐nm solenoids. The fibers then form loops with the help of a class of proteins known as nonhistones and this further condenses the DNA (panels on left‐hand side of Figure 3.5) into a higher order set of coils in a process of supercoiling.
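The scale of the packaging problem can be checked with a few lines of arithmetic. The sketch below uses the figures quoted above, plus one value that is not given in the text: a spacing of 0.34 nm between successive base pairs along the helix, which is the commonly cited figure for the standard (B) form of DNA and is labeled as an assumption in the code. The variable names are illustrative.

```python
BP_PER_CHROMOSOME = 1.3e8     # average base pairs per human chromosome (from the text)
CHROMOSOMES = 46              # chromosomes per human cell (from the text)
NUCLEOSOME_BP = 146           # DNA wound around one histone complex (from the text)
LINKER_BP = 50                # approximate linker DNA between nucleosomes (from the text)
NUCLEUS_DIAMETER_M = 10e-6    # about 10 micrometers (from the text)
RISE_NM_PER_BP = 0.34         # ASSUMPTION: spacing per base pair, not stated in the text

# Stretched length of one chromosome's DNA (1 nm = 1e-7 cm).
length_cm = BP_PER_CHROMOSOME * RISE_NM_PER_BP * 1e-7
# Total for all chromosomes, converted from centimeters to meters.
total_m = length_cm * CHROMOSOMES / 100
# Nucleosomes needed if each repeat unit is 146 bp plus about 50 bp of linker.
nucleosomes = BP_PER_CHROMOSOME / (NUCLEOSOME_BP + LINKER_BP)
# How many times longer the DNA is than the nucleus is wide.
ratio = total_m / NUCLEUS_DIAMETER_M

print(f"{length_cm:.1f} cm per chromosome")              # 4.4 cm per chromosome
print(f"{total_m:.1f} m per cell")                       # 2.0 m per cell
print(f"{nucleosomes:,.0f} nucleosomes per chromosome")  # 663,265 nucleosomes per chromosome
print(f"{ratio:,.0f} times the nuclear diameter")
```

Under these assumptions the results agree with the rounded values quoted above (about 5 cm per chromosome and about 2 m per cell), and they suggest that an average chromosome carries several hundred thousand nucleosomes.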