With the help of Wilkins (though behind Franklin’s back), Watson and Crick finally managed to get a good look at her latest X-ray pictures of DNA and were quick to recognize the telltale image of a helix. Several days of frantic model-building resulted in their triumphant unveiling of the double helical form of DNA. The first to share in the newly discovered ‘secret of life’ were the regulars of the Eagle, the pub just outside the Cavendish laboratories. But a paper was soon drafted for Nature, which ended with the classic understatement: ‘It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.’ According to Crick, their intention was not to be coy but was born of Watson’s fear that he might ‘make an ass of himself by saying too much too soon.’
HOW DNA WORKS
The key to Watson and Crick’s structure is the order and pairing of the nucleic acid bases. The backbone of each DNA molecule is a string (polymer) of deoxyribose sugars, linked by phosphate groups. Each sugar has a single base attached that can be one of four bases: guanine (G), cytosine (C), thymine (T) or adenine (A). Running down the length of a single DNA strand you can therefore read a linear sequence of bases, such as ATCCGTACCTGAACATAACCGATT… Codes were not unfamiliar in post-war England, particularly to Crick, who during the war had worked as a scientist for the Admiralty. The linear sequence of bases looked like a code: the genetic code. Watson and Crick suggested that the sequence of bases codes for the structure of proteins. By the 1950s it was known that proteins performed nearly all the work of making living cells. In particular, the enzymes which make everything else – DNA, RNA, fats, sugars, polysaccharides, the complete living cell – are proteins. If DNA encoded the information to make proteins, and proteins made everything else, then the problem of how cells know what to make would be solved.
Over the next decade the code was cracked, confirming that DNA sequences did indeed code for proteins. Proteins are linear polymers of amino acids (another group of simple organic acids). There are twenty common amino acids that go into proteins but only four bases that go into DNA. There cannot therefore be a one-to-one coding between a DNA base and an amino acid. It was not long before experiments performed by Marshall Nirenberg, Gobind Khorana and Severo Ochoa established that a triplet of bases, called a codon, encodes each amino acid. The codon GCC, for instance, codes for the amino acid alanine, whilst GGC codes for glycine. A protein made of only 1,000 alanine amino acids would have a genetic code consisting of 1,000 codons (3,000 bases), each codon being the sequence GCC, generating a DNA sequence GCC GCC GCC GCC GCC… All natural proteins are far more complex than this and are encoded by a more complex code; but the principle is the same.
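The triplet principle can be made concrete with a few lines of Python. This is only a minimal sketch: the table below holds just the two codons named above (the real code has sixty-four), and the names CODON_TABLE, split_codons and translate are invented here for illustration.

```python
# Partial codon table: only the two codons mentioned in the text.
# The full genetic code has 64 codons; the rest are deliberately omitted.
CODON_TABLE = {"GCC": "alanine", "GGC": "glycine"}


def split_codons(dna):
    """Cut a DNA sequence into consecutive three-base codons."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Look up each codon in the partial table and return the amino acids."""
    return [CODON_TABLE[codon] for codon in split_codons(dna)]


# The hypothetical all-alanine protein: 1,000 codons, 3,000 bases.
poly_alanine_gene = "GCC" * 1000
protein = translate(poly_alanine_gene)
print(len(poly_alanine_gene), len(protein))
```

A sequence containing any codon outside the two listed would raise a KeyError, which is simply a reminder that the table is incomplete.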
This was the answer to one of life’s great puzzles: how biological information is encoded and stored inside living cells. The DNA sequence encodes the sequence of proteins and the proteins make everything else (even DNA itself – a curious example of self-reference which is one of the intriguing features of life). By directing the synthesis of proteins, the DNA molecule is able to orchestrate all the activities of the entire cell – and thus the entire body. This is how a dog cell knows how to make a dog, how an oak cell knows how to make an oak tree or how a human cell knows how to make us. Each cell carries its own DNA molecule within it with a unique sequence of bases, encoding the essential dogginess, oakiness, or humanness of us all.
In their historic paper’s last line, Watson and Crick suggested that DNA’s structure also provided a solution to the other great mystery of life, how biological information is passed on from one generation to the next – or why you look like your daddy. A major feature of the double helix is that wherever a base occurs on one strand of the DNA, a complementary base is found on the opposite strand: A is paired with T and G is paired with C. The pairs of complementary bases are held together by hydrogen bonds, a type of chemical bond. Chapter Five will look at chemical bonds in more detail, but a hydrogen bond is held together by the electromagnetic force existing between a positively charged proton on one base and the negatively charged electrons on its complementary base.
The information held in the DNA double helix is therefore redundant. The same information is held in two different forms: the coding strand (the strand that codes directly for proteins) and its complement. If one strand is removed, it can be used as a template to direct the synthesis of its complementary strand. This is exactly what happens when a cell replicates its DNA. The strands are pulled apart and each is used as a template to synthesize its complement. The enzymes involved in DNA replication examine the sequence of the single-stranded template and insert only the complementary base into the newly synthesized strand (in fact, fortunately for evolution, the copying isn’t quite perfect – see below). After each strand has been copied, the pair of old and new strands form a duplex DNA molecule again. From a single parental DNA duplex, a pair of daughter duplexes is formed. One of the DNA duplex pair goes into one of the daughter cells and the other duplex goes into the other – biological information is copied. This simple mechanism underlies the replication of all living cells.
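The template idea can be sketched in Python. This is a deliberately simplified illustration: it pairs bases position by position and ignores the fact that the two strands of a real double helix run in opposite directions, and it copies perfectly, with no errors. The names PAIRS, complement and replicate are invented for this sketch.

```python
# Pairing rules from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Build the complementary strand, one base at a time."""
    return "".join(PAIRS[base] for base in strand)


def replicate(duplex):
    """Pull the two strands apart and use each as a template.

    Returns two daughter duplexes, each holding one old strand
    and one newly synthesized strand.
    """
    strand, partner = duplex
    daughter_a = (strand, complement(strand))    # old strand + new partner
    daughter_b = (complement(partner), partner)  # new strand + old partner
    return daughter_a, daughter_b


coding = "ATCCGTACCTGA"
parent = (coding, complement(coding))
daughter_a, daughter_b = replicate(parent)
print(daughter_a == parent and daughter_b == parent)
```

Because either strand determines the other, both daughters come out identical to the parent, which is the redundancy described above.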
HOW DNA TELLS THE CELL WHAT TO DO
DNA encodes proteins but doesn’t make them. That job is performed by structures inside cells called ribosomes, which stitch together single amino acid units into strings that are called peptides if short and proteins if long. Left to their own devices, ribosomes might randomly string together amino acids, making totally random proteins. Ribosomes could make a staggering variety of proteins if allowed to function in this manner. Consider a relatively small protein, say only one hundred amino acids long. For each of the hundred positions in the protein there are twenty possible amino acids that could be inserted. There are therefore 20¹⁰⁰ different ways of putting such a protein together. 20¹⁰⁰ is an immense number. It means the product of 20 x 20 x 20 x 20 x 20 x 20 x … 100 times. For convenience, big numbers like this are usually expressed as a power of 10, so that they can easily be compared. In this system, 20¹⁰⁰ can also be written as roughly 10¹³⁰. For comparison, the number of electrons in the universe is a much smaller number, about 10⁸⁰, so there are not even enough electrons in the entire universe to count the number of possible 100 amino acid proteins! The ribosome’s task is to make only a very tiny fraction of these possible proteins – the proteins the cell needs.
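The arithmetic is easy to check. The short Python sketch below computes 20 to the power 100 exactly and converts it to a power of 10; the variable names are chosen here for illustration, and the electron count is the rough estimate quoted above, not a measured value.

```python
import math

AMINO_ACIDS = 20       # choices available at each position
PROTEIN_LENGTH = 100   # positions in the protein

# Python integers have unlimited size, so this is the exact count.
possible_proteins = AMINO_ACIDS ** PROTEIN_LENGTH

# The same number expressed as a power of 10: 100 x log10(20), about 130.1.
power_of_ten = PROTEIN_LENGTH * math.log10(AMINO_ACIDS)

# Rough electron count quoted in the text (an estimate).
electrons_in_universe = 10 ** 80

print(len(str(possible_proteins)))   # number of decimal digits
print(round(power_of_ten, 1))
print(possible_proteins > electrons_in_universe)
```

The exact value has 131 digits, so it is a little over 10 to the power 130, in line with the rounded figure given above.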
The problem is similar to house-building. The number of possible ways of putting several thousand bricks together is again a staggeringly large number and only a tiny fraction would amount to a functional house. The builder must select from the vast number of possible piles of bricks, one that corresponds to the desired house. He uses a plan that maps each brick (in principle if not in practice) to a specific position in space. The plan provides the builder with the information he needs to build the house. Similarly, the living cell must select from the vast number of all possible proteins the tiny fraction that corresponds to proteins with useful functions. The cell similarly needs to have some kind of plan or template and for this it uses DNA.
There is, however, (at least in animal and plant cells) a physical problem that must be overcome if DNA is to direct protein synthesis. DNA is held within the nucleus (a membrane-bound sac inside cells) of these cells but the ribosomes are located outside it in the cytoplasm (the cellular material outside the nucleus). One possible solution would be for the DNA to pass through the nuclear membrane to the ribosomes where it is needed to direct protein synthesis. However, DNA is a huge molecule, millions or even billions of bases long, and it would not be easy for it to squeeze out through the membrane’s small pores. What actually happens is that the information held in DNA is copied into a smaller, mobile analogue of DNA, known as RNA. RNA has all the same bases as DNA (well, nearly all: it uses a base called uracil instead of thymine) on a sugar phosphate backbone, just like DNA. The only other difference is that the sugar that goes into its backbone is ribose rather than deoxyribose (hence RNA rather than DNA). Since the bases are nearly the same, a single DNA strand can pair with a complementary RNA strand (the RNA uracil pairs with adenine) to form a DNA::RNA hybrid double helix. An enzyme called RNA polymerase then makes RNA copies of DNA genes. The RNA copy,