We could also have introduced you to your genome with a slew of the DNA sequence units—As, Ts, Gs, and Cs—in a string, or we could have shown you a picture of DNA in a test tube or even a picture of a nucleus of one of your cells where the DNA would be visible as dark stringy stuff. There are many ways to visualize the genome and this is part of its beauty.
Figure 1.1 This picture, known as a karyotype, is a photograph of all 46 human chromosomes. With an X and a Y chromosome, this is a male’s karyotype. A female’s karyotype would show two X chromosomes.
Credit: Photo Researchers
Figure 1.2 The nucleus of every human cell (the large purple mass inside the cell) contains DNA. Mitochondria, organelles in cells that produce energy (the smaller purple objects within the cell), also contain some DNA.
Credit: Wiley
Still, to understand function, we do need to learn about basic form. A karyotype, despite its limitations as a representation of the genome, illustrates that almost all the cells in the human body contain 22 pairs of chromosomes and two sex‐determining chromosomes. These cells are called somatic cells, and they make up almost all nonreproductive tissue. The double helices that make up your chromosomes are composed of deoxyribonucleic acid, also known as DNA, on which are found approximately 20,000 genes.
Humans also have cells with 23 nonpaired chromosomes. In these cells, each chromosome is made up of a single double helix of DNA, and together the 23 chromosomes contain approximately 20,000 genes. These cells are called germ cells and are the sperm and egg cells produced for reproduction. Each germ cell carries a single genome’s worth of DNA, or more than 3 billion bases’ worth of nucleic acids.
Chromosomes are somewhat like genetic scaffolding: they hold in place the long, linearly arranged sequences of the nucleotides or base pairs that make up our genetic code. Four different nucleotides make up this code: adenine, thymine, guanine, and cytosine, commonly abbreviated as A, T, G, and C. Found along that scaffolding are our genes, which are made from DNA, the most basic building block of life. These genes code for proteins, which are the structural and machine‐like molecules that underlie our bodies, our physiology, and our mental state. Through the Human Genome Project scientists are not simply learning the order of this DNA sequence, but are also beginning to locate and study the genes that lie on our chromosomes. But not all DNA contains genes.
On average 3 billion base pairs exist in the collection of the chromosomes your mother transmitted to you. Add to that the chromosomes your father gave you, and in your cells there are around 6 billion bases, a complete diploid human genome. There are long stretches of DNA between genes known as intergenic or noncoding regions. And even within genes some DNA may not code for proteins. These areas, when they are found within genes, are called introns. While these genomic regions were once believed to have no products and no function, scientists now understand that both introns and intergenic regions play a role in regulating DNA function. The Encyclopedia of DNA Elements or ENCODE Project estimates, for example, that while only 2.94% of the entire human genome is protein coding, 80.4% of genome sequences might govern the regulation of genes. (1) Unlike the human genome and all other eukaryotic genomes, however, bacterial genomes do not have introns and have very short intergenic regions. Curiously though, the archaea, a third major domain of life (in addition to eukaryotes and bacteria), do have introns, but not necessarily the same kind of introns as eukaryotes.
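The arithmetic behind these figures can be made concrete with a short sketch. It uses only the rounded numbers quoted above (3 billion base pairs per parental set, and the ENCODE estimate of 2.94% protein coding); the constant and function names are illustrative assumptions, not part of any genomics library.

```python
# Rounded figures quoted in the text; real genomes vary slightly.
HAPLOID_BASE_PAIRS = 3_000_000_000  # one parent's contribution
CODING_FRACTION = 0.0294            # ENCODE estimate of the protein-coding share


def diploid_base_pairs(haploid: int = HAPLOID_BASE_PAIRS) -> int:
    """Base pairs in a diploid genome: one set of chromosomes from each parent."""
    return 2 * haploid


def bases_in_fraction(total: int, fraction: float) -> int:
    """Approximate number of base pairs covered by a given fraction of the genome."""
    return round(total * fraction)


if __name__ == "__main__":
    print(f"Diploid genome: {diploid_base_pairs():,} base pairs")
    coding = bases_in_fraction(HAPLOID_BASE_PAIRS, CODING_FRACTION)
    print(f"Protein coding (per haploid set): about {coding:,} base pairs")
```

On these rounded inputs, the protein-coding portion of a single parental set comes to roughly 88 million base pairs, a small slice of the 3 billion.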
Let’s begin our tour of the human genome with a very basic lesson in genetic terminology. For example, what exactly is genetics, and how is it different from genomics? Genetics is the study of the mechanisms of heredity. The distinction between genetics and genomics is one of scale. Geneticists may study single or multiple human traits. In genomics, an organism’s entire collection of genes, or at least many of them, is examined to see how entire networks of genes influence various traits. A genome is the entire set of an organism’s genetic material. The fundamental goal of the Human Genome Project was to sequence all of the DNA in the human genome. Sequencing a genome, whether human or nonhuman, simply means deciphering the linear arrangement of the DNA that makes up that genome. In eukaryotes (plants, animals, fungi, and single‐celled organisms called protists), the vast majority of the genetic material is found in the cell’s nucleus. The Human Genome Project has been primarily interested in the more than 3 billion base pairs of nuclear DNA. A tiny amount of DNA is also found in the mitochondria, cellular structures responsible for the production of energy within a cell. Whereas the human nuclear genome contains more than 3 billion base pairs of DNA and approximately 20,000 genes (that’s nearly 10,000 genes fewer than when the first edition of this book was published in 2005), the reference human mitochondrial genome contains only 16,568 bases and 37 genes. (2) Like bacterial genomes, mitochondrial DNA, or mtDNA, has short intergenic regions and its genes do not contain introns. Another interesting characteristic of mtDNA is that it is always maternally inherited. This has made mtDNA very helpful for tracking female human evolutionary phenomena. Discoveries in this area were made possible, in part, by sequencing mtDNA.
What about heredity? In the most basic sense we should think about heredity as the transmission of traits from one generation to the next. When we talk about heredity in this book we refer to the ways in which traits are passed between generations via genes. The term heredity is also sometimes used to describe the transmission of cultural traits. Such traits are shared through a variety of means including laws, parental guidance, and social institutions. Unlike genetic inheritance, however, this type of transmission is not governed by physical laws.
What are genes? Genes are regions of DNA and are the basic units of inheritance in all living organisms. These words, genes and DNA, are too often used interchangeably. Both genes and DNA are components of heredity, but we identify genes by examining regions of DNA. In other words, DNA is the basic molecular ingredient of life, whereas genes are discrete components of that molecular brew.
If you look at any family you’ll see both shared and unique traits. Family members typically look alike, sharing many features such as eye color and nose shape, but they may also have very different body types and be susceptible to different diseases. This diversity is possible for two reasons. The first reason is that genes come in multiple forms. These alternative forms are known as alleles, and in sexual reproduction they are the staple of organismal diversity. According to the laws of genetics, siblings can inherit different traits from the same biological parents because there is an assortment of alleles that can be randomly passed along. The second reason is that the environment can exert a significant influence on the expression of genes. For example, an individual may inherit a gene that makes him or her susceptible to lung cancer. Such susceptibility is typically revealed, however, only after years of genetic damage caused by cigarette smoking or other lung‐related environmental impacts. (3) Recent advances in the field of epigenetics have brought new complexity to our understanding of how our genes interact with our environments, and how such interactions can be passed between generations (through the germline). Over the past decade epigenetic research has accelerated our understanding of how environmental factors can alter the peripheral structure of DNA—not the DNA sequence itself but the molecular structures that interact with and support the sequence—to elicit changes in the expression of a gene (the gene’s phenotype).
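The random assortment of alleles mentioned above can be illustrated with a small sketch. It assumes the simplest textbook case: a single gene with two alleles, here given the hypothetical labels "B" and "b", where each parent passes on one of their two copies with equal probability. Real traits usually involve many genes plus environmental influences, so this is an illustration rather than a model of any particular trait.

```python
from collections import Counter
from itertools import product


def offspring_genotypes(mother: tuple[str, str], father: tuple[str, str]) -> dict[str, float]:
    """Return the probability of each genotype a child could inherit.

    Each parent contributes one of their two alleles at random, so the four
    mother/father allele pairings are equally likely.
    """
    # Sort each pair so that "Bb" and "bB" count as the same genotype.
    combos = Counter("".join(sorted(pair)) for pair in product(mother, father))
    total = sum(combos.values())
    return {genotype: count / total for genotype, count in combos.items()}


if __name__ == "__main__":
    # Both parents carry one copy of each hypothetical allele.
    for genotype, probability in offspring_genotypes(("B", "b"), ("B", "b")).items():
        print(f"{genotype}: {probability:.0%}")
```

With two parents who each carry one "B" and one "b", the sketch gives three possible genotypes for a child, which is why siblings with the same biological parents can differ.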
So how did science progress from thinking about the mechanisms of heredity to understanding that genes are the basic units of heredity, to deciphering and finally manipulating the DNA code that underlies all life on Earth? The results of the Human Genome Project were the fruits of over a century of struggle by scientists around the globe. Most historians of science would measure this progress beginning with Gregor Mendel’s work on pea plants during the middle of the nineteenth century. Although premodern