A horse show provides a good opportunity to compare characteristics that identify individual horses as belonging to a particular breed. The most conspicuous differences among breeds include color, size, gait, and carriage. Horse breeds are developed with very specific goals in mind and horses are often closely related or highly selected for the same genetic traits. Despite the action of selection, variation continues to exist. Judges still manage to award ribbons of different colors at horse shows. Even the casual observer can discern the close relationship among, for example, Arabian horses and distinguish them from groups of Thoroughbred horses, Quarter horses and Friesian horses. Horse breeders had an intuitive sense of genetics guiding them to create the diversity of breeds that exist today. However, the horse is not an ideal model for studying genetics.
In 1866, the Austrian monk Gregor Mendel determined the principles of genetics from work using garden peas. Subsequently, scientists proved that Mendelian genetic principles apply to the inheritance of traits in animals as well as in plants. The aim of these next two chapters is to describe the basic principles of Mendelian genetics using horses, not peas, as examples. But first, let us consider the nature of genes.
What Are Genes?
Until the discovery of DNA structure, the word “gene” was an abstract term. People used the word to imply a mechanism or fundamental unit for hereditary traits, such as hair color, performance, or size. In this way genes were useful concepts, not unlike numbers or musical notation. We cannot see concepts, but we become aware of them through experience or education. Mendel never saw a gene, yet he was able to describe the basic principles of genetics from working with peas. Beginning with domestication approximately 5500 years ago, the first horse breeders recognized that offspring most resembled their parents. If one wanted a gray horse, then one of the parents needed to be gray. This was clear. However, the inheritance patterns for other traits, such as conformation, size, and performance, were more complex, and this confounded breeders. Furthermore, horses breed slowly, usually producing a single offspring at a time, and are thus not well suited for studying the principles of genetics. The genius of Mendel was to study plants, organisms with a short generation interval that produce many seeds, to select a small number of traits and understand them well, and then to extend that concept to all of heredity.
Genes ceased being abstract concepts in 1953 (Watson and Crick, 1953). The accurate description of DNA structure as the basis for heredity created a second avenue for understanding genetics. The structure, replication, modification, and function of DNA provided a concrete basis for what had been abstract concepts.
DNA
Deoxyribonucleic acid (DNA) was shown to be the chemical substance of heredity by scientists working on bacteria (Avery et al., 1944). However, this molecule was poorly characterized, and early observations did not immediately explain how DNA worked. Since then, DNA has become iconic for genetics following the famous description of DNA structure by Watson and Crick (1953), who used chemistry and X-ray crystallography (Franklin and Gosling, 1953; Wilkins et al., 1953) to create their model. In this case, form explained function in a truly elegant fashion.
DNA structure
The nucleus of each horse cell has 64 DNA molecules. These are huge molecules, each composed of millions of units called nucleotide bases (also referred to as bases or simply nucleotides). At the same time, DNA is a simple molecule, as only four types of bases compose it: adenine (referred to as A), guanine (referred to as G), thymine (referred to as T), and cytosine (referred to as C). The bases are joined in a long, single strand that pairs with a second, complementary strand. This second strand acts as a mirror image of the first: throughout the length of this long, doubled molecule, all As in one strand pair with Ts in the other strand, and all Gs in one strand pair with Cs in the other strand. Each such combination is referred to as a “base pair.” The two strands of the DNA molecule wind around each other, with the pitch determined by the angle of the molecular bonds between each base; hence, DNA is referred to as a “double helix.”
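The pairing rule described above can be sketched in a few lines of Python. This is only an illustration: the names `PAIRING`, `complementary_strand`, and `base_pairs` are hypothetical, and the sketch ignores strand direction, which the text does not discuss (in a real molecule the two strands run in opposite directions).

```python
# Pairing rules from the text: A pairs with T, and G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand.

    Simplification: strand direction is ignored, so position i of the
    result is simply the partner of position i of the input.
    """
    return "".join(PAIRING[base] for base in strand.upper())


def base_pairs(strand: str) -> list[tuple[str, str]]:
    """List the base pairs formed between a strand and its complement."""
    return list(zip(strand.upper(), complementary_strand(strand)))


if __name__ == "__main__":
    first = "ATTCGAAGG"
    print(first)
    print(complementary_strand(first))
```

Because the pairing table is symmetric, taking the complement of the complement returns the original strand, which is the property that repair and replication rely on.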
DNA replication
The two-stranded structure of DNA serves two functions. First, the second strand is a mirror image of the first strand, and any damage to one strand can be repaired precisely using the alternate strand as a template. DNA repair enzymes constantly monitor DNA sequences and repair damage. Second, the two complementary DNA strands provide a remarkably simple system for the replication of the DNA molecule. The two strands separate from one another and enzymes, called DNA polymerases, create complementary strands using the original strands as templates. At the end of the process, there are two chemically identical DNA molecules.
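Both functions can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not a description of how the enzymes work: a DNA molecule is represented as a pair of strings, a damaged base is marked with the placeholder letter `N`, and the function names `replicate` and `repair` are hypothetical.

```python
# Pairing rules: A pairs with T, and G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Build the complementary strand, base by base."""
    return "".join(PAIRING[base] for base in strand)


def replicate(molecule: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Separate the two strands and use each as a template for a new partner.

    Returns two daughter molecules, each holding one original strand and
    one newly built strand.
    """
    first, second = molecule
    daughter_one = (first, complement(first))
    daughter_two = (complement(second), second)
    return daughter_one, daughter_two


def repair(damaged: str, template: str) -> str:
    """Restore damaged bases (marked 'N', an assumption of this sketch)
    by reading the partner base on the intact template strand."""
    return "".join(
        PAIRING[partner] if base == "N" else base
        for base, partner in zip(damaged, template)
    )


if __name__ == "__main__":
    original = ("ATTCGAAGG", "TAAGCTTCC")
    copy_one, copy_two = replicate(original)
    print(copy_one == original and copy_two == original)
    print(repair("ATNCGAAGG", original[1]))
```

In this model both daughter molecules compare equal to the original, mirroring the statement that replication ends with two chemically identical DNA molecules.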
The central dogma of genetics (DNA → RNA → protein)
One of the major roles of DNA is to encode proteins. This is important because protein function is determined by the composition and order of amino acids in the polypeptide chain. The process is basically the following: DNA contains a code within its sequence which is “transcribed” into another information molecule called ribonucleic acid (RNA). The process of transferring the information from DNA to RNA is called “transcription.” Transcription is sometimes referred to as “gene expression.” RNA is similar to DNA except that: (i) it is a single-stranded copy of one of the DNA strands; (ii) its structural backbone contains the sugar “ribose” rather than “deoxyribose”; and (iii) it substitutes the base uracil (U) for thymine (T) wherever thymine would have occurred based on the sequence of the DNA molecule. In transcription, one of the DNA strands is used as a template to make a complementary RNA strand such that a sequence “ATTCGAAGG” of DNA, for example, is transcribed to an RNA strand with the sequence “UAAGCUUCC.” The transcribed RNA strand corresponds to only the section of DNA representing the gene of interest. It is therefore short and moves easily through the cell to engage the protein-manufacturing complex called a ribosome. Ribosomes travel down the RNA molecule, reading each set of three nucleotides and adding 1 of 20 amino acids according to the instructions from the genetic code. The term “translation” denotes the process of reading the RNA molecule and producing the protein.
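The two steps can be sketched in Python as follows. This is a simplified illustration rather than a model of the cellular machinery. The function names are hypothetical, the codon table is built from the standard genetic code using one-letter amino acid abbreviations (with `*` marking a stop signal), and reading is assumed to begin at the first base of the RNA and end at the first stop signal.

```python
from itertools import product

# Transcription pairs each template DNA base with an RNA base; RNA uses U
# where DNA would use T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Standard genetic code, listed in the order UUU, UUC, UUA, UUG, UCU, ...
# One letter per triplet; '*' marks a stop signal.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(template_dna: str) -> str:
    """Make the RNA strand complementary to a template DNA strand."""
    return "".join(DNA_TO_RNA[base] for base in template_dna)


def translate(rna: str) -> str:
    """Read the RNA three bases at a time and chain the encoded amino acids.

    Assumption of this sketch: reading starts at the first base and ends
    at the first stop signal or when fewer than three bases remain.
    """
    protein = []
    for start in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(transcribe("ATTCGAAGG"))                  # the example from the text
    print(translate(transcribe("TACCGAAGGATT")))    # a made-up template
```

Note that the RNA in the text's example, “UAAGCUUCC,” happens to begin with the triplet UAA, which is a stop signal in the standard code, so a made-up template is used to show translation producing a short chain.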
Amino acids are small molecules which can be joined in series to create longer molecules called polypeptides, more commonly referred to as proteins. Proteins are linear arrangements of tens to thousands of amino acids drawn from the basic set of 20 different amino acids (Table 4.1). The differences between amino acids reside in the side chains attached to the amino and carboxyl core of the molecule. Some of the side chains repel water, some attract water, some are basic or acidic, and others have the capacity to form attachments with other amino acids (disulfide bonds). Altogether, the combination of amino acids and their side chains causes the folding of the linear peptide and provides clefts, pockets, and receptor sites that make the protein biologically active as a structure or an enzyme. Examples of proteins include hemoglobin, immunoglobulin, and the diverse molecules making up muscle fibers, as well as the liver enzymes which detoxify blood and the blood clotting enzymes which heal wounds. Mammalian genomes contain over 20,000 genes for proteins (Chapter 6).
Table 4.1. The genetic code based on RNA sequences read by the ribosome. The triplet