Table 3.5 Increases in gene expression that result from altering the codon usage of the wild-type gene (or cDNA) to more closely correspond to the host E. coli cell
Figure 3.7 Two commercially available plasmids may be used to increase the pool of certain rare tRNAs in E. coli. (A) Plasmid pSJS1244 carries 3 and pRARE carries 10 rare E. coli tRNA genes. p15A denotes the replication origin of each plasmid. Spectinomycin and chloramphenicol are the antibiotics for which the plasmids carry resistance genes. (B and C) The expression of foreign proteins, with the concentration of rare tRNAs shown schematically, in a typical E. coli host cell (B) and in an E. coli host cell that has been engineered (by introduction of one of the plasmids shown in panel A) to overexpress several rare tRNAs (C). Sørensen and Mortensen, J. Biotechnol. 115:113-128, 2005.
Increasing Protein Stability
High-level expression of some foreign proteins in bacterial hosts results in the formation of inclusion bodies, which are insoluble aggregates of inactive protein. Foreign proteins may misfold in a cellular environment in which the pH, osmolarity, and redox status differ from those of the natural host. Misfolding exposes hydrophobic amino acids, which leads to aggregation of the proteins, especially at high concentrations. Moreover, there are large differences in the intrinsic stabilities of different proteins. Under normal growth conditions, the half-lives of different proteins range from a few minutes to hours. This differential stability depends on the extent of disulfide bond formation, the presence of certain amino acids at the N terminus, and the susceptibility to cleavage by proteases.
Facilitating Protein Folding
One simple strategy to increase the amount of recoverable active protein is to cultivate recombinant strains at low temperatures, which facilitates proper protein folding. However, mesophilic bacteria such as E. coli grow extremely slowly at low temperatures. In one study, the chaperonin 60 gene (cpn60) and the cochaperonin 10 gene (cpn10) from the psychrophilic bacterium Oleispira antarctica were introduced into a host strain of E. coli, which thereby gained the ability to grow at a high rate at low temperatures (4 to 10°C). This strain was subsequently transformed with a plasmid encoding a target protein, a temperature-sensitive esterase. Expression of the esterase at 4 to 10°C in the E. coli strain carrying the two chaperonin genes yielded a specific activity that was 180-fold higher than that obtained from the parental E. coli strain (without the O. antarctica chaperonins) grown at 37°C. Although very high levels of expression of the cloned esterase were not attained, this work illustrates an expression system for proteins that are sensitive to high temperature and might otherwise be difficult to produce.
While the psychrophile chaperones enhanced the growth of E. coli at low temperatures, they did not directly participate in proper folding of the foreign protein. It is also possible to coexpress the target gene with one or more molecular chaperones that interact with and mediate correct folding of proteins. E. coli produces several chaperones that function in protein folding. The “folding chaperones” utilize ATP cleavage to promote conformational changes that enable refolding of their substrates (Table 3.6). The “holding chaperones” bind to partially folded proteins until the folding chaperones have done their job. The “disaggregating chaperone” promotes the solubilization of proteins that have become aggregated. Protein folding also involves the “trigger factor,” which binds to nascent polypeptide chains, acting as a holding chaperone. The proper folding of proteins has been facilitated by coexpression with some of these chaperones. In a study in which the chaperones DnaK and GroEL (and their cochaperonin protein molecules) were overexpressed, the yields of several target proteins expressed at the same time were increased up to 5-fold.
Table 3.6 E. coli proteins that facilitate the correct folding of recombinant proteins
Correct disulfide bond formation is essential for many proteins to fold properly and achieve an active configuration (Fig. 3.8). Covalent disulfide bonds form by oxidation of the sulfhydryl groups of cysteine residues that are adjacent in the folded protein, a reaction that is catalyzed in E. coli by periplasmic (DsbA and DsbC) and membrane-bound (DsbB and DsbD) enzymes. Disulfide bond formation in the reducing environment of the cytoplasm is rare. Thus, foreign proteins that tend to form inclusion bodies may be directed to the periplasm. Overexpression of the disulfide bond isomerase DsbC also promotes disulfide bond formation. The human therapeutic protein tissue plasminogen activator is a 527-amino-acid serine protease that requires the formation of 17 disulfide bonds to attain an active state. The cDNA for this protein was cloned downstream of a DNA sequence that encodes a signal peptide (described below) to facilitate expression and secretion to the periplasm in E. coli. However, only trace amounts of the protein were produced. Coexpression of high levels of DsbC resulted in more than a 100-fold increase in the production of functional human tissue plasminogen activator. To realize the maximum benefit from DsbC overproduction, it was necessary to induce the synthesis of this protein approximately 30 minutes prior to the induction of human tissue plasminogen activator expression. Alterations in the levels of the other Dsb proteins did not affect the amount of active human tissue plasminogen activator that could be recovered; however, for a different target protein, horseradish peroxidase, overproduction of all four Dsb proteins yielded the greatest amount of properly folded and active protein.
Figure 3.8 Disulfide bonds in a protein. (A) A covalent disulfide bond forms by oxidation of the sulfhydryl (SH) groups on two cysteines. (B) Disulfide bonds between cysteines (represented by brown balls) within a polypeptide (ribbon diagram shown) contribute to the structural stability of the protein.
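Because each disulfide bond joins two cysteines, the number of cysteines in a sequence sets an upper limit on the number of disulfide bonds it can form; a protein that requires 17 disulfide bonds must therefore contain at least 34 cysteines. The short Python sketch below illustrates this arithmetic only. The function names and the toy sequence are hypothetical, and the count says nothing about which cysteines actually pair in the folded protein.

```python
def max_disulfide_bonds(protein_sequence: str) -> int:
    """Upper bound on disulfide bonds: one bond per pair of cysteines (C)."""
    cysteines = protein_sequence.upper().count("C")
    return cysteines // 2


def has_enough_cysteines(protein_sequence: str, required_bonds: int) -> bool:
    """True if the sequence has at least two cysteines per required bond."""
    return max_disulfide_bonds(protein_sequence) >= required_bonds


# Toy one-letter amino acid sequence (hypothetical, not a real protein).
toy_protein = "MCAACGGCKC"
print(max_disulfide_bonds(toy_protein))
print(has_enough_cysteines(toy_protein, required_bonds=17))
```

A check of this kind can only rule out a sequence; whether the possible bonds form correctly depends on the folding environment discussed above.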
Another strategy to avoid formation of inclusion bodies is to express the target protein as a fusion protein. Fusion proteins are constructed at the DNA level by ligating the coding regions, or portions of the coding regions, of two or more genes such that a single polypeptide is synthesized. It is essential that the combined coding sequences be joined in the same reading frame; otherwise, an incomplete or incorrect translation product will result and the protein will not have the desired function. Fusion proteins that contain thioredoxin, a small (12-kilodalton [kDa]), highly soluble protein, as the fusion partner remain soluble even when up to 40% of the cellular protein consists of the fusion protein. With this system, the target gene is cloned just downstream from the thioredoxin gene and both genes are transcribed from a single promoter (Fig. 3.9). Fusion proteins containing thioredoxin accumulate preferentially at the cytoplasmic face of the host E. coli inner membrane at sites known as adhesion zones (regions where the inner and outer membranes adhere). This facilitates selective release of the soluble fusion protein from E. coli cells into the growth medium by osmotic shock. However, the presence of the host protein segment makes most fusion proteins unsuitable for clinical use and may affect the biological functioning of the target protein. In addition, fusion proteins require more extensive testing before being approved by regulatory agencies, such as the U.S. Food and Drug Administration. Thus, strategies have been developed to remove the unwanted amino acid sequence from the target protein following purification (described below).
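The reading-frame requirement for a fusion construct can be checked on the DNA sequence before any cloning is done. The following Python sketch is one hedged way to do so, not a standard tool: the function name, its parameters, and the toy sequences are all hypothetical, and it assumes that the fusion partner is supplied with its start codon and without its stop codon, that the target ends with a stop codon, and that the standard stop codons (TAA, TAG, and TGA) apply.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def check_fusion_frame(partner_cds: str, linker: str, target_cds: str) -> list:
    """Return a list of reading-frame problems (empty if none are found).

    partner_cds: fusion partner coding sequence, start codon included,
                 stop codon removed (assumption of this sketch)
    linker:      any nucleotides placed between the two coding sequences
    target_cds:  target coding sequence, ending with a stop codon
    """
    partner, linker, target = (s.upper() for s in (partner_cds, linker, target_cds))
    fused = partner + linker + target
    problems = []

    if not fused.startswith("ATG"):
        problems.append("fusion does not begin with an ATG start codon")

    # The target stays in frame only if everything upstream of it
    # is a whole number of codons.
    shift = (len(partner) + len(linker)) % 3
    if shift != 0:
        problems.append(f"target is shifted out of frame by {shift} nucleotide(s)")

    if len(fused) % 3 != 0:
        problems.append("fused sequence is not a whole number of codons")

    # Read the fused sequence codon by codon from the first nucleotide.
    usable = len(fused) - len(fused) % 3
    codons = [fused[i:i + 3] for i in range(0, usable, 3)]

    for position, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {position}")
            break

    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("fusion does not end with a stop codon")

    return problems


# Toy sequences (hypothetical, far shorter than real genes).
partner = "ATGAGCGAT"       # ATG AGC GAT
target = "ATGAAACCCTAA"     # ATG AAA CCC TAA

print(check_fusion_frame(partner, "GGTTCT", target))  # 6-nucleotide linker
print(check_fusion_frame(partner, "GGTTC", target))   # 5-nucleotide linker
```

With the 6-nucleotide linker, the target codons are read as intended. Removing a single nucleotide from the linker shifts the frame so that the toy target is read as TCA TGA, creating a premature stop codon, which is the kind of incomplete translation product described above.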