The 64 codons of the genetic code are shown in Figure 3.9 together with the side chain of the amino acid that each encodes. Amino acids with hydrophilic side chains are shown in green while those with hydrophobic side chains are in black. Glycine, which has a single hydrogen atom as its side chain, is shown in gray. The importance of these distinctions will be discussed in Chapter 7. Although there are 64 possible codons, there are only 20 amino acids. Sixty‐one codons specify an amino acid and the remaining three act as stop signals for protein synthesis (Figure 3.9). Methionine is encoded by a single codon, AUG. Tryptophan is also encoded by a single codon, but each of the other 18 amino acids is encoded by more than one codon, and so the code is degenerate. No triplet codes for more than one amino acid, and so the code is unambiguous. Notice that when two or more codons specify the same amino acid, they usually differ only in the third base of the triplet. Thus a single base substitution in the third position can often leave the amino acid sequence unaltered. Perhaps degeneracy evolved in the triplet system to avoid a situation in which 20 codons each meant one amino acid and 44 specified none. If this were the case, then most mutations would stop protein synthesis dead.
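The counts quoted above can be checked with a short Python sketch. It assumes the standard genetic code written as a 64‐letter string in the conventional U, C, A, G order, with one‐letter amino acid symbols and "*" for a stop codon. The names CODON_TABLE and silent_fraction are illustrative only and do not appear in the figure.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional U, C, A, G order.
# One-letter amino acid symbols; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
single_codon_amino_acids = sorted(
    aa for aa, n in codons_per_amino_acid.items() if n == 1
)


def silent_fraction(position):
    """Fraction of single-base substitutions at `position` (0, 1 or 2)
    in a sense codon that leave the encoded amino acid unchanged."""
    silent = total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # only codons that specify an amino acid are considered
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            silent += CODON_TABLE[mutant] == aa
    return silent / total


print(len(CODON_TABLE), "codons,", len(stop_codons), "stops:", stop_codons)
print("amino acids encoded:", len(codons_per_amino_acid))
print("single-codon amino acids:", single_codon_amino_acids)
for pos in range(3):
    print(f"silent substitutions at base {pos + 1}: {silent_fraction(pos):.2f}")
```

Run as written, the sketch reports 64 codons, three stops, 20 amino acids, and M (methionine) and W (tryptophan) as the only amino acids with a single codon. The silent fraction is much the largest for the third base, which is the pattern described in the text.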
Start and Stop Codons and the Reading Frame
The order of the codons in DNA and the amino acid sequence of a protein are colinear. The start signal for protein synthesis is the codon AUG, specifying the incorporation of methionine. Because the genetic code is read in blocks of three, there are three potential reading frames in any mRNA. Figure 3.10 shows that only one of these results in the synthesis of the correct protein. When we look at a sequence of bases, it is not obvious which of the reading frames should be used to code for the protein. As we shall see later (page 89), the ribosome scans along the mRNA until it encounters an AUG. This defines both the first amino acid of the protein and the reading frame used from that point on. A mutation that inserts or deletes a nucleotide shifts the reading frame for every codon downstream of the change and is called a frameshift mutation (Figure 3.11).
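The effect of the reading frame can be illustrated with a minimal Python sketch. The mRNA sequence below is invented for illustration and is not the one in Figure 3.10, and treating the first AUG in the sequence as the start codon is a simplification of ribosome scanning. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(mrna, frame):
    """Translate every complete codon starting at offset `frame` (0, 1 or 2)."""
    return "".join(
        CODON_TABLE[mrna[i:i + 3]] for i in range(frame, len(mrna) - 2, 3)
    )


def translate_from_first_aug(mrna):
    """Start at the first AUG and read codons until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical mRNA, chosen only for illustration.
MRNA = "CCAUGGAGCUGUGGUAACC"

for frame in range(3):
    print("frame", frame + 1, translate_frame(MRNA, frame))
print("from first AUG:", translate_from_first_aug(MRNA))

# Frameshift: delete the nucleotide immediately after the start codon.
FRAMESHIFT = MRNA[:5] + MRNA[6:]
print("after deletion:", translate_from_first_aug(FRAMESHIFT))
```

Only the frame set by the AUG gives the peptide MELW followed by a stop. After the single‐nucleotide deletion, every codon downstream of the AUG is read differently (MSCGN) and the original stop codon is no longer in frame.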
The codons UAA, UAG, and UGA are stop signals for protein synthesis. A base change that causes an amino acid codon to become a stop codon is known as a nonsense mutation (Figure 3.11). If, for example, the codon for tryptophan UGG changes to UGA, then a premature stop signal will have been introduced into the messenger RNA template. A shortened protein, usually without function, is produced.
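A nonsense mutation can be sketched in the same way. The short open reading frame below is hypothetical. It begins at its start codon and contains one tryptophan codon, UGG, whose third base is changed to A. The helper names are assumptions made for this example.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(mrna):
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def substitute(mrna, index, new_base):
    """Return a copy of `mrna` with the base at `index` replaced."""
    return mrna[:index] + new_base + mrna[index + 1:]


# Hypothetical open reading frame: AUG GAG CUG UGG CCU UAA
NORMAL = "AUGGAGCUGUGGCCUUAA"

# Change the third base of the tryptophan codon: UGG becomes UGA.
MUTANT = substitute(NORMAL, 11, "A")

print(translate_orf(NORMAL))  # full-length peptide
print(translate_orf(MUTANT))  # truncated at the premature stop
```

The normal sequence gives MELWP, whereas the mutant gives only MEL because translation stops at the new UGA.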
The Code Is Nearly Universal
The code shown in Figure 3.9 is the one used by organisms as diverse as E. coli and, for their nuclear‐encoded proteins, humans. It was originally thought that the code would be universal. However, several mitochondrial genes use UGA to mean tryptophan rather than stop, and the nuclear code of some unicellular eukaryotes uses UAA and UAG to code for glutamine rather than stop.
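One way to picture these exceptions is as a small set of overrides applied to the standard table. The Python sketch below models only the reassignments named in the text; real variant codes may differ from the standard code in other codons as well. The table names and test sequences are illustrative assumptions.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Only the reassignments mentioned in the text are modelled here.
MITOCHONDRIAL_LIKE = {**STANDARD, "UGA": "W"}          # UGA read as tryptophan
CILIATE_LIKE = {**STANDARD, "UAA": "Q", "UAG": "Q"}    # UAA, UAG read as glutamine


def translate(mrna, table):
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical sequences chosen to contain the reassigned codons.
print(translate("AUGUGAGCUUAA", STANDARD))            # stops at UGA
print(translate("AUGUGAGCUUAA", MITOCHONDRIAL_LIKE))  # reads through UGA
print(translate("AUGUAAGCUUGA", STANDARD))            # stops at UAA
print(translate("AUGUAAGCUUGA", CILIATE_LIKE))        # reads through UAA
```

The same mRNA therefore yields different proteins depending on which table the translation machinery uses, which is why a gene moved between such systems may not be expressed correctly.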
Missense Mutations
A mutation that changes the codon for one amino acid into the codon for another by substituting one base for another is a missense mutation (Figure 3.11). As shown in Figure 3.9, the second base of each codon shows the most consistency with the chemical nature of the amino acid it encodes. Amino acids with hydrophobic side chains, shown in black in Figure 3.9, have a U or a C (a pyrimidine) in the second position. With two exceptions, serine and threonine, amino acids with hydrophilic side chains, shown in green in Figure 3.9, have a G or an A (a purine) in the second position. This has implications for mutations of the second base. Substitution of a purine for a pyrimidine, or of a pyrimidine for a purine, is very likely to change the chemical nature of the amino acid side chain significantly and can therefore seriously affect the protein. The mutation that causes sickle cell anemia is an example. At position 6 in the β‐globin chain of hemoglobin, the mutation in DNA changes a glutamate residue encoded by GAG to a valine residue encoded by GTG (GUG in RNA). The shorthand notation for this mutation is E6V, meaning that the glutamate (E) at position 6 of the protein becomes a valine (V). This change in amino acid alters the overall charge of the chain and the hemoglobin tends to precipitate in the red blood cells of those affected. The cells adopt a sickle shape and therefore tend to block blood vessels, causing sickle cell anemia with painful cramp‐like symptoms and progressive damage to vital organs.
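The sickle cell substitution can be worked through in a few lines of Python. This is a sketch under stated assumptions: the codons are taken from the coding strand of the DNA, so transcription is modelled simply as replacing T with U, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Coding-strand DNA to mRNA: same sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def classify(old_codon, new_codon):
    """Label a codon change as silent, nonsense or missense."""
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if old_aa == new_aa:
        return "silent"
    if new_aa == "*":
        return "nonsense"
    return "missense"


def shorthand(old_codon, new_codon, position):
    """Protein-level shorthand such as E6V."""
    return f"{CODON_TABLE[old_codon]}{position}{CODON_TABLE[new_codon]}"


def base_class(base):
    """A and G are purines; U and C are pyrimidines."""
    return "purine" if base in "AG" else "pyrimidine"


normal = transcribe("GAG")   # glutamate codon in the normal gene
sickle = transcribe("GTG")   # valine codon in the sickle cell gene

print(normal, "->", sickle, classify(normal, sickle), shorthand(normal, sickle, 6))
print("second base:", base_class(normal[1]), "->", base_class(sickle[1]))
```

The change is classified as missense, the shorthand comes out as E6V, and the second base switches from a purine (A) to a pyrimidine (U), in line with the shift from a hydrophilic to a hydrophobic side chain.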
William Warrick Cardozo.
Source: AAREG. Image from https://aaregistry.org/story/sickle‐cell‐pioneer‐willliam‐w‐cardozo/.
The peculiar shape of red blood cells in patients with sickle cell anemia was first described in 1910, but little experimental investigation had been conducted until William Warrick Cardozo published a paper in 1937 reporting a comprehensive study of the largest number of patients ever tested for the disease. Cardozo was a pediatrician whose research on sickle cell anemia was conducted during a two‐year fellowship in pediatrics at the Children's Memorial Hospital and Provident Hospital in Chicago. Cardozo's findings confirmed the heritability of the disorder and revealed that “the sickling factor remains within the cell, no matter how long preserved, as long as the cell itself remains intact.” He concluded that future therapies would need to act on the cell itself. Today, the only cure for sickle cell anemia is a stem cell or bone marrow transplant, which replaces the patient's blood‐forming stem cells with ones that produce healthy red blood cells.
Medical Relevance 3.2 Osteogenesis Imperfecta
Collagen is the most abundant protein in the body and a major component of the