
Studying Proteins and Protein Purification

Why study proteins
The value of protein purification
The Logic of Protein Purification
- Assays
- Sources
The strategy and logic of protein purification
Criteria for purity
The generality of purification to constant specific activity
Fractionation of proteins and the value of SDS gels.
An illustration of strategic decisions and luck in protein purification
Summary

Why study proteins. The study of proteins and their function is central to understanding both cells and organisms. Following are a few of the reasons why proteins are important in biology:

they serve as catalysts that maintain metabolic processes in the cell,
they serve as structural elements both within and outside the cell,
they are signals secreted by one cell or deposited in the extracellular matrix that are recognized by other cells,
they are receptors that convey information about the extracellular milieu to the cell,
they serve as intracellular signaling components that mediate the effects of receptors,
they are key components of the machinery that determines which genes are expressed and whether mRNAs are translated into proteins,
they are involved in the manipulation of DNA and RNA through processes such as DNA replication, DNA recombination, and RNA splicing or editing.

Working in vitro. Because of these important roles, it is often valuable to be able to purify and study a protein in isolation. This ability to isolate and study a purified protein lies at the heart of much modern biochemistry. If a protein can be isolated and purified, it can be studied in isolation from other proteins and its enzymology or signaling capability can be studied in detail. It is this ability to independently isolate proteins and study them individually or in mixtures with complete control of their environment (salt, temperature, pH, etc.) that lies at the heart of working in vitro. Although this in vitro approach can clearly be complemented by genetic approaches and in vivo approaches, there is really no substitute for it in the repertoire of experimental design available to the modern biologist.

The value of protein purification. Beyond this, there are a number of important reasons for purifying proteins, and a brief list of these may serve to emphasize why individuals spend so much time and effort in this pursuit:

By purifying a protein it can be clearly established that a particular biological activity (enzymatic activity, signaling capacity, etc.) actually resides in a unique protein.
Purified proteins serve as extremely valuable biochemical reagents. It is remarkably valuable to be able to obtain things like purified growth factors or hormones, proteases, DNA polymerases, reverse transcriptases, ligases, phosphatases, or antibodies that recognize a particular epitope of interest.
Once a protein is purified, it is possible to study its enzymology, understand its affinity for particular substrates, or dissect its ability to catalyze enzymatic reactions. Such approaches have allowed us to understand how biological molecules can act as catalysts in metabolic processes or as transducers that convert chemical energy into ionic gradients or mechanical force.
The availability of purified proteins allows bio-organic chemists to modify specific residues to help understand how these residues confer particular structures or allow the protein to operate as a catalyst.
Purified proteins can be sequenced either by Edman degradation, which obtains sequence information from the N-terminus of a protein or of a peptide derived from it, or by mass spectrometry, which is rapidly replacing it. Sequence information can be used to design probes to isolate cDNAs, and the cDNA sequence can then be used to deduce the primary sequence of the entire protein. MALDI-TOF (Matrix-Assisted Laser Desorption/Ionization Time-of-Flight) mass spectrometry is a technology that can provide much information about a protein, even when only small quantities are available.
Partially purified proteins can often be identified on the basis of fragments generated by ionizing the proteins or peptides derived from them and performing mass spectrometry. 'PROWL', a site at Rockefeller, includes protocols, software and links to useful databases for the analysis of proteins.
Traditional structure-function studies have become remarkably more powerful because of the ability to isolate and purify not only conventional enzymes but also mutant forms of those proteins. With the advent of site-directed mutagenesis and expression systems, which will be discussed in another section on cDNA cloning, it is possible to produce and purify a protein with essentially any mutation of interest. This allows the testing of a broad spectrum of hypotheses as well as the design of "improved" protein molecules.
Using genetic engineering, it is possible to create novel proteins or combinations of protein elements that can serve a specific function.
- For example, it is possible to combine a region of one protein that is capable of activating transcription with a region of another protein that recognizes a particular site on the DNA, producing a novel DNA-binding protein. This is the basis of the two-hybrid screen, which is discussed in another section.
- It is even possible to modify proteins so that particularly desirable characteristics like heat stability or protease resistance can be enhanced.
If a protein with a particular enzymatic activity or signaling capacity can be purified it serves as an important indication of how biological systems work. Proof that a cell contains a particular metabolic pathway is dramatically improved by isolating enzymes that catalyze components of that pathway. Isolation of a particular polypeptide growth factor can be a powerful indication of how an organism regulates particular cellular or tissue processes like angiogenesis. For example, our understanding of angiogenesis is dramatically enhanced by the isolation of the protein VEGF that stimulates capillary formation. Likewise understanding of how a growth cone can navigate within the body to make appropriate connections is dramatically improved when molecules capable of collapsing the growth cone (collapsin) are isolated and purified.
Once a protein has been isolated, it is possible to produce reagents (antibodies) that can be used to determine the location of that protein in vivo, which can give important support to interesting hypotheses and disprove incorrect speculations.
By understanding the structure-function relationships of proteins, it is often possible to design specific reagents that can be used to test the function of a protein in vivo.
- For example, understanding the ways that G-proteins can be mutated so that the protein can either be active (always on) or be incapable of ever being activated (constitutively off), allows biologists to explore the role of these individual proteins in vivo. (You might already know that G proteins bind GTP and GDP and act as molecular switches and that GTP bound is 'on' while GDP bound is 'off'). Many of these ideas and their structural basis were worked out with a protein called ras, but subsequently many dozens of similar proteins (e.g., rac and rho) were isolated and they all work by a similar mechanism! Thus, understanding one makes it much easier to understand the others and quickly see important questions and make the right experimental decision. From knowledge of ras, one can predict specific mutations that might make rac constitutively on, and these predictions are correct.
- Likewise, understanding how DNA-binding proteins recognize DNA sequences allows us: (1) to understand the significance of particular DNA mutations, (2) to design proteins that can recognize modified DNA sequences or recognize a known DNA sequence and (3) to design mutant proteins that interfere with the activity of proteins normally present in the cell. For example, if a protein has a bHLH motif (basic Helix Loop Helix) it is possible to make a good prediction of which sequences interact with DNA, which mutations might prevent binding of the protein to DNA, and which sequences might compete with an endogenous protein for DNA binding.
Finally, the increasing power of x-ray diffraction* and 2-D NMR to determine protein structure requires the availability of larger quantities of purified proteins. Understanding 3-D protein structure has begun to allow us to understand how protein folding is controlled and it has provided a remarkable insight into the way proteins act in vivo. The ability to visualize an ion channel, even at low resolution, must forever modify the way a scientist understands ion flow through membranes. Likewise, understanding how proteins recognize specific DNA sequences changes the way our community understands transcriptional activation or repression. A 3-D image of a protein cannot help but have a dramatic impact on the way we visualize biological processes.
- Determining a new protein structure is a major undertaking. To use either method, large amounts of pure protein must be available. For crystallography, protein crystals* must be made and diffraction information must be analyzed. Interpreting these data requires one to solve the phase problem, which usually means getting additional data about the diffraction pattern of a protein that contains a heavy metal.
- 2D-NMR avoids the need to make a crystal, but this approach is limited to smaller proteins (less than about 20,000-50,000 daltons) and requires careful analysis of a complex data set.
Understanding how particular amino acid residues are involved in protein function, especially when it is combined with knowledge of the 3-D structure of the protein, helps us understand how particular sequences in proteins are involved in biological functions. The accumulation of this type of knowledge derived from the study of proteins, combined with the tremendous amount of information in DNA databases, has allowed the construction of what might best be called "protein descriptors." These descriptors can consist either of a particular conserved sequence of amino acids in the primary sequence or of a particular pattern of distributed amino acids in the protein. Frequently, these descriptors can be recognized within the primary sequence of a protein, but understanding how a particular descriptor might function in an actual protein can dramatically increase the value of such a descriptor. For example, understanding the enzymology of proteolysis and the importance of a catalytic triad of Asp-His-Ser makes it worthwhile to look for the pattern in novel amino acid sequences. Being able to predict that a particular sequence of amino acids is likely to form an alpha-helix or a random coil can be important to understanding the properties and frequently the function of a protein. Visualizing the distribution of leucine residues in a 'leucine zipper' makes this pattern of leucine repeats of great interest. Recognizing that many of the proteins having the 'zinc finger motif' are transcription factors will often give a clue to the function of a protein.
Although purified proteins are often studied in purified systems (in vitro), the advent of microinjection systems allows purified proteins to be introduced into single cells where their biological activity can be determined. In marked contrast to in vitro systems, microinjection allows a purified protein to be introduced into a cell where it can interact with a vast number of proteins and structures. It is possible to study complex changes (e.g., cell behavior, cell shape, cell division) which can't be easily reconstituted in vitro. It is a quick way of introducing a variety of proteins into different cell types without using an expression vector. This type of study, which lies between the classic in vivo and in vitro studies, might be called 'in vivtro.' It is an important approach to studying protein function. Likewise, it is often possible to study the effect of a particular protein on development or a physiological function by introducing it into an organism.
Purification of a protein can also help purify a nucleic acid of interest. For example, if hnRNA- or snRNA-binding proteins are purified, the associated RNA can be co-purified with them, and the bound RNA can be amplified. Briefly, the steps in the procedure are: make a purified protein-RNA complex, purify the bound RNA, reverse transcribe, add a polynucleotide promoter tail, and PCR amplify. The RNAs which specifically bind to the purified protein can be affinity selected, amplified and sequenced.
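The idea of a "protein descriptor" mentioned above can be made concrete with a small sketch. The Python below scans a primary sequence for one simple descriptor: a leucine at every seventh position, four times in a row, which is the kind of repeat seen in a leucine zipper. The pattern, the function name and the test sequences are illustrative assumptions and are not taken from any particular database. A hit is only a clue that a region deserves a closer look; it does not prove that the region forms a zipper.

```python
import re

# Assumed descriptor: L, six residues of any kind, repeated so that four
# leucines are spaced exactly seven positions apart.
# The lookahead (?=...) lets overlapping candidates be reported.
LEUCINE_REPEAT = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_repeats(sequence):
    """Return the 0-based start positions of candidate leucine heptad repeats."""
    return [m.start() for m in LEUCINE_REPEAT.finditer(sequence.upper())]


# Hypothetical sequences in one-letter amino acid code, for illustration only.
candidate = "RMKQLEDKVEELLSKNYHLENEVARLKKLVGER"
control = "MKTAYIAKQRQISFVKSHFSRQ"

print(find_leucine_repeats(candidate))
print(find_leucine_repeats(control))
```

The same approach works for any descriptor that can be written as a pattern over the primary sequence. Descriptors that depend on 3-D arrangement, such as a catalytic triad whose residues are far apart in the sequence, need structural information as well.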

The Logic of Protein Purification. Despite the obvious value of studying proteins in vitro, it is remarkable how frequently the logic of protein purification is not appreciated. I will begin by briefly describing the logic of protein purification and then will try to illustrate the logic using two specific examples. To isolate a protein two things are absolutely essential:

First, one must have an assay for the protein of interest, and
Second, one must have a reasonable source for that activity.

Assays. An assay for a protein is simply a method of determining in a quantitative fashion the amount of a particular activity present. Assays can be as diverse as biologists are creative. They can be as simple as:

measuring a change in absorbance as NADPH is oxidized in a coupled reaction,
a change in molecular weight of a protein that is being cleaved by a protease,
a shift of a labeled fragment of DNA or RNA on a gel,
the ability of a fraction to retain a labeled DNA on a filter by forming a DNA-protein complex,
the transformation of a radioactively labeled substrate from a chemical with one physical property to a chemical with another physical property,
the ability to stimulate cell death,
the ability to stimulate cell proliferation,
the ability to stimulate cell differentiation,
the effect on development in vivo, or
any other measurable quantity.

As we will see below, whatever assay is chosen, it is essential that it be quantifiable; that is, it must be able to determine whether more or less of the desired activity is present in a particular fraction. Although the number of possible assays is nearly infinite, there are two criteria that should be used to judge the value of any assay:

First, an assay should be experimentally convenient. This means that it should be reasonably sensitive and easy to perform. If an assay is not sensitive, one would need large quantities of an activity in order to determine whether it was present or not, and such large quantities are frequently not available. Likewise, if an assay requires many days to perform, this will be a major barrier to experimental progress. It almost seems to be a rule of nature that if an assay for a protein takes a long time, the protein is almost always unstable or subject to proteolysis, making it impossible to purify!
Second, the assay employed must be specific for the activity of interest. Richard Kolodner, a capable biochemist who spent much of his career purifying proteins involved in bacterial recombination, used to express this by paraphrasing a Rolling Stones song. He used to sing "you can't always get what you want, but if you try real hard, sometimes you get what you assayed for." Thus, a careful consideration of the assay chosen is essential to long-term scientific success. An undergraduate working in the laboratory of a well-known scientist once purified what he thought was a single-stranded DNA-binding protein. His assay was the ability of this protein to bind strongly to single-stranded DNA, which on first appraisal might seem to be a reasonable assay for the protein he was interested in. Unfortunately, a protein whose biological role is to bind to RNA may also have a substantial DNA-binding activity. In his case, that is what was isolated! Thus, it is always important to consider the specificity of a particular assay and to develop a method to determine the function of a protein in vivo rather than extrapolating from its activity in vitro. Here is another example. An enzyme whose primary function in vivo is to oxidize ethanol to acetaldehyde may also have the ability to convert retinol to retinaldehyde in vitro, even though that activity may be of no consequence in vivo.

Sources. Second, protein purification requires a reasonable source of the protein. It is not difficult to appreciate that it is easier to purify a protein from a rich source than from a poor source. Furthermore, a rich source is likely to produce a greater yield of the protein of interest. The choice of starting material will frequently determine the success or failure of a purification. For example, isolation of the acetylcholine receptor, which is present in extremely small concentrations in muscle, is an arduous and extremely difficult task. Purification of the acetylcholine receptor becomes much easier, although it is still a difficult task, if it is purified from the electroplax of an electric ray. One of the surfaces of the cells in the electroplax is densely packed with acetylcholine receptor, making its purification much easier.

Given a reasonable assay and a source of material, it should be possible to purify any protein given enough time, money, and patience. Certainly it is possible that the task may prove difficult because the protein of interest may be extremely unstable. It may be extremely sensitive to proteolysis or it may be easily denatured, but attention to detail will frequently allow these problems to be overcome. Frequently, the problem of identifying a rich source of protein can even be obviated by substituting an artificial source for the protein. If the cDNA encoding a protein can be obtained, the protein can usually be expressed at high levels, giving an experimentalist an immediate advantage. Of course, in many cases, the gene for a protein of interest is not yet available.

The strategy and logic of protein purification. To purify a protein one must begin with the starting material and fractionate it using any one of a large number of physical or biochemical approaches. The initial material can be separated into fractions using centrifugation, precipitation with salt, binding to ionic columns (ion-exchange chromatography*), velocity sedimentation*, equilibrium density sedimentation*, binding to affinity columns*, separation by sizing columns (see gel filtration chromatography*), fractionation by isoelectric focusing, association with known ligands or binding partners, or any other method available to the experimentalist. Once such a fractionation is accomplished, the activity of interest should be found in some fractions but not others. Assuming that the activity is not destroyed by the fractionation process (an assumption that is almost always less than completely correct), the fractionation serves two purposes. First, it removes contaminating material and, second, it enriches the fraction containing the activity of interest. The enriched fraction can then be subjected to yet another round of fractionation and the process can be repeated. If the amount of activity can be accurately determined (assayed) and the total amount of material (usually protein) present in the fraction can be determined, then the specific activity of the preparation (i.e., the activity present in the fraction divided by the total amount of material in the fraction) can be determined at each stage of the preparation. In a well-designed purification procedure, each fractionation step should result in the removal of contaminating material (proteins, nucleic acids, lipids, and anything else) and a progressive increase in specific activity. Of course, in the real world, there is not only an increase in specific activity, but there is almost always a loss of total activity.
Maximizing enrichment while minimizing loss is the goal of any purification scheme, and developing a good purification scheme is often a series of compromises. A purification scheme can be summarized in a purification table* that keeps track of all these numbers. Such a table is important not only for making formal arguments in a paper, but also for making practical decisions about whether a particular step in a purification scheme makes sense (i.e., is it a good combination of enrichment and yield?). This process can obviously be reiterated indefinitely, but at what point is it possible to say that a protein has been purified?
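The bookkeeping behind a purification table can be sketched in a few lines of Python. This is a minimal illustration rather than a standard tool: the step names and all of the numbers are hypothetical, and "units" stands for whatever the chosen assay measures. Specific activity is total activity divided by total protein, fold purification compares each step's specific activity with that of the starting material, and yield is the percentage of the starting activity that survives.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    total_protein_mg: float
    total_activity_units: float


def purification_table(steps):
    """Return one row per step, all measured relative to the first step."""
    first = steps[0]
    first_sa = first.total_activity_units / first.total_protein_mg
    rows = []
    for s in steps:
        sa = s.total_activity_units / s.total_protein_mg  # units per mg
        rows.append({
            "step": s.name,
            "specific_activity": sa,
            "fold_purification": sa / first_sa,
            "yield_percent": 100.0 * s.total_activity_units
                             / first.total_activity_units,
        })
    return rows


# Hypothetical purification, for illustration only.
steps = [
    Step("Crude extract", 1000.0, 10000.0),
    Step("Salt precipitation", 250.0, 8000.0),
    Step("Ion-exchange column", 20.0, 6000.0),
    Step("Affinity column", 1.0, 4000.0),
]

table = purification_table(steps)
for row in table:
    print(f"{row['step']:<22}"
          f"{row['specific_activity']:>10.1f} units/mg"
          f"{row['fold_purification']:>8.1f}x"
          f"{row['yield_percent']:>8.1f}%")
```

Reading such a table row by row is what allows the practical decision described above: a step that gives a large increase in fold purification but costs most of the yield may or may not be worth keeping, depending on how much starting material is available.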

Criteria for purity. The classic answer to this question is provided by the idea of purifying a protein to constant specific activity or purifying to homogeneity. Purification to constant specific activity means that the specific activity of a fraction cannot be increased by further purification. In a rigorous sense this means that further attempts at purification are futile since the only material left in the fraction is the material that actually is responsible for the activity being assayed. In the final stages of purification, a measurement of specific activity across the fractions of a column should result in a constant number; that is, as there is an increase in the level of protein there should be a corresponding increase in the level of activity. Thus, this classic criterion is in essence a negative one: it is a suggestion that there is little or no extraneous material present in the fraction that we assert to be pure or purified to a constant specific activity. Realistically, one never achieves purity. One can only argue that the material is 90%, 99%, or 99.99% pure. This fact must always be remembered in making statements about a particular material, but it can be used in a powerful way to argue that a particular activity resides in a particular polypeptide or combination of polypeptides or that two different enzymatic activities (e.g., a DNA-polymerase activity and an exonuclease activity) reside in the same protein. If one can show that two activities co-fractionate, that suggests that they may be in the same protein. If two activities can't be separated after repeated attempts, the argument becomes stronger. If they both co-fractionate with a peptide or a group of peptides (subunits of a protein complex), then the argument becomes very strong. Of course, purity can only be determined in terms of what can be detected. If nucleic acid is not measured, a protein may be pure in terms of protein, but have a contaminating nucleic acid or contaminating carbohydrate. 
In some cases, the presence of a carbohydrate or nucleic acid may actually be an essential component of an enzymatic activity and this can be rigorously established by showing that the ratio of a protein to nucleic acid is also constant across a fractionation step.
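The test for constant specific activity across a column can also be written down as a short calculation. The sketch below is one possible way to do it, under stated assumptions: the fraction values are hypothetical, and the 10% tolerance is an arbitrary choice standing in for the real measurement error of the protein and activity assays. The same function could be applied to a protein-to-nucleic-acid ratio.

```python
def specific_activities(fractions):
    """fractions: list of (protein_mg, activity_units) across a peak.

    Fractions with no measurable protein are skipped.
    """
    return [activity / protein for protein, activity in fractions if protein > 0]


def is_constant(values, tolerance=0.10):
    """True if every value lies within +/- tolerance (as a fraction) of the mean."""
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= tolerance * mean for v in values)


# Hypothetical peak from a late purification step: activity tracks protein.
late_step = [(0.10, 410.0), (0.40, 1600.0), (0.80, 3150.0),
             (0.35, 1420.0), (0.08, 330.0)]

# Hypothetical peak from an early step: a contaminant overlaps the edges.
early_step = [(0.30, 300.0), (0.50, 1600.0), (0.80, 3150.0), (0.60, 1200.0)]

print(is_constant(specific_activities(late_step)))
print(is_constant(specific_activities(early_step)))
```

A constant ratio across the peak is consistent with a single species, but, as argued above, it remains a negative criterion: it shows only that nothing detectable is separating from the activity at this step.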

One will often hear the statement that a protein is pure if there is a single band on an SDS gel or polyacrylamide gel* (see SDS gel*). There is some truth and some falsity in this assertion. If only a single protein can be detected on the polyacrylamide gel, that simply means that there are no other proteins of equivalent abundance present in that fraction. If one loads 10x, 100x, or 10,000x more protein, eventually contaminating bands will be found. Again, purification is a quantitative question. Likewise, some activities are actually a combination of more than one protein (an enzyme can have many subunits), and so once the activity is purified to homogeneity the purified preparation should contain the combination of polypeptides, or even polypeptides and nucleic acids, required for enzymatic activity in a particular stoichiometry.

The generality of purification to constant specific activity. Finally, it should be noted that this kind of logic applies not only to protein purification. It is equally true for the fractionation of any other substance whether it is a lipid, an alkaloid, a psychotropic substance, an anticancer drug, a carbohydrate, or a vitamin. The ability to fractionate, measure the abundance of total material, and identify a particular activity is powerful in all aspects of biological and biomedical research.

Fractionation of proteins and the value of SDS gels. It is beyond the scope of this review to describe protein fractionation. Skill in this area requires knowledge of protein chemistry and physical biochemistry. To understand the logic of protein purification it is important to know that there is a broad range of approaches that can separate proteins on the basis of their properties, including size, shape, sedimentation velocity, ability to bind to various ionic groups, affinity for substrates or pseudo-substrates, solubility, stability, etc. These properties are different for different proteins. Thus, there is much greater variability among proteins than among nucleic acids.

SDS polyacrylamide gels provide a simple, easily accessible, analytical tool that can be easily used to determine what proteins are present in a mixture. It is important because:

this technique dissociates multimeric proteins and separates peptides by molecular weight (see dictionary entry on SDS gels)
it allows estimation of the molecular weight of each peptide
it is sensitive (about 0.1 micrograms can be detected with staining by Coomassie Brilliant Blue and 'silver staining' is about 100-fold more sensitive)
it is sensitive to protein modification (glycosylation & phosphorylation frequently change mobility)
it can resolve and allow visualization of 50 or more proteins
it can be combined with isotopic labeling to study synthesis and turnover of proteins
it can be modified to follow antigens (see western blotting)
it is fast (can be done in part of a day)
it is easy to learn and is done in the same way for all protein mixtures
of course, it denatures proteins so biological activity is usually (but not always) lost.
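The estimation of molecular weight mentioned in the list above is usually done by comparison with marker proteins run on the same gel. The sketch below assumes the common working approximation that log10(molecular weight) falls roughly linearly with relative mobility (Rf) over the resolving range of a gel; that approximation, the marker values and the function names are assumptions made for illustration, not measurements.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw(rf, standards):
    """Estimate molecular weight (daltons) from relative mobility.

    standards: list of (rf, molecular_weight) for markers on the same gel.
    """
    xs = [marker_rf for marker_rf, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    slope, intercept = fit_line(xs, ys)
    return 10 ** (slope * rf + intercept)


# Hypothetical markers: (relative mobility, molecular weight in daltons).
standards = [(0.20, 97000), (0.35, 66000), (0.55, 45000),
             (0.75, 30000), (0.90, 20000)]

unknown_rf = 0.60  # hypothetical band of interest
print(round(estimate_mw(unknown_rf, standards)))
```

Because modifications such as glycosylation and phosphorylation frequently change mobility, a value obtained this way is an apparent molecular weight and should be treated as an estimate.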

An illustration of strategic decisions and luck in protein purification and the transition to cloning

An excellent story illustrating the importance of the considerations outlined above (and a bridge to cloning) is the purification of three members of the neurotrophin family: NGF (nerve growth factor), BDNF (brain-derived neurotrophic factor), and NT-3 (neurotrophin 3). This story covers a period of over half a century, but it illustrates the importance of choosing an appropriate assay, developing convenient assays, and choosing an appropriate source to purify a protein.

The biological insight. The first evidence for the existence of nerve growth factor, which is now known to be a diffusible, secreted polypeptide that acts to enhance neural survival and neural differentiation, begins with the insight of Dr. Rita Levi-Montalcini, who was studying neuroembryology in the laboratory of Victor Hamburger. That laboratory had been studying the effects of transplanting or removing chick limbs on the development of the nervous system. Transplanted limbs could not only develop, but they could also be innervated, and neuroanatomical work had shown a match between the size of the target to be innervated (which could be increased by transplantation or decreased by amputation) and the number of neurons in the appropriate location to innervate the target. One of the important contributions was the observation that transplantation of extra limbs increased the number of neurons not only in the region of the nervous system that would innervate that target but also in regions that would not innervate the target. This led to the suggestion that there may be a secreted diffusible factor that could act at a distance to increase the number of neurons. A clever experiment confirmed this hypothesis. Addition of tissue (even tumor tissue) on top of the chorioallantoic membrane of a developing chick egg resulted in an increase in the number of neurons in the developing embryo. (The chorioallantoic membrane is the membrane that separates the developing embryo from the air pocket in the egg. It is permeable to gases and small molecules but is not permeable to cells, so there is no cell-cell contact.) This demonstrated that there must be a factor of some type that controls the fate of the neurons, and Dr. Levi-Montalcini set out to purify that factor.

Lesson 1. At the time that this story began it was really unclear whether the factor to be purified was a protein or some other form of molecule. More importantly, it was not known whether the real biological activity of this factor was to increase the number of neurons produced, which would be the simplest explanation, or to promote the survival of neurons. Ultimately, it was demonstrated that the effect of nerve growth factor was to control the amount of cell death that occurred at late periods in embryogenesis, and it was this effect on cell death that eventually caused an increase in the number of neurons and the size of the neuron clusters (called ganglia). Establishing this mechanism of NGF action was one of the most important conceptual developments of modern neurobiology and it provided a tremendous insight into the understanding of apoptosis. Only much later was it demonstrated that nerve growth factor actually has multiple other activities. In early neuroblasts, it can act to promote division and in that sense it is a classic mitogenic factor. Current work is demonstrating effects of nerve growth factor on Schwann cells, which are non-neural support cells that myelinate nerves in the peripheral nervous system. Presumably this biological activity could also have been an avenue to purify nerve growth factor. In that case we would be referring to nerve growth factor as Schwann cell factor or some other related name. This is not at all an unusual situation. The fibroblast growth factor family has multiple effects on a vast number of different tissues, and so the first two members of this family, aFGF and bFGF, were rediscovered and renamed by multiple labs and the early literature includes no fewer than 27 different names for these growth factors. Thus, you 'get what you assay for,' but what you get may hold some unexpected surprises.

Lesson 2. The neuroanatomical analysis required to determine the size of ganglia and the number of surviving neurons in a ganglion in a developing chick is obviously time-consuming and cumbersome. Presumably such a phenomenon could serve as the basis of an assay for a growth factor, but work would certainly proceed quite slowly.

The difficulty of using the developing chick as an assay system was fully appreciated by Dr. Levi-Montalcini, who decided to take time to visit the laboratory of Dr. Carlos Chagas at the University of Brazil in Rio de Janeiro to learn tissue culture. This laboratory had succeeded in developing methods to isolate dorsal root ganglia from young animals and to allow these ganglia to survive in culture. The surgery required to isolate dorsal root ganglia (DRG) is relatively straightforward and many dorsal root ganglia can be isolated in a short procedure. As a visiting scientist, Dr. Levi-Montalcini made the monumental discovery that culturing dorsal root ganglia with a tumor type called sarcoma-180 (or S-180) resulted in a dramatic morphological change of the DRG. The DRG produced a 'halo' of processes that upon further analysis were clearly demonstrated to be extensions produced by the neurons. This was an important breakthrough not only because it indicated something about the molecule that was being studied in a biological sense, but also, and perhaps more importantly, because it provided a convenient assay that could be used in the purification of the protein.

Lesson 3. The switch to a DRG assay is important. First, it is convenient and allowed assays to be done quickly (one of the hallmarks of a good assay), but it is also important because it is a subtle change in the type of response that is being assayed. This assay is now a search for a neurite extension factor, and such a factor may or may not be involved in increasing the number of neurons during development. Ultimately it proved true that NGF has effects both on survival and on the expression of differentiated properties like neurite extension, so the switch in the assay system proved fortuitous. Certainly history could have taken a different course. There might have been two factors, one involved in neurite extension and a second involved in cell survival. In that case the switch in assay systems would have resulted in isolation and purification of only one of the two factors. Even then, the decision to use an easy assay might have been the more sensible one, but it was the good luck of Dr. Levi-Montalcini and her collaborator that a single molecule was responsible for both activities.

With this assay in hand, Dr. Levi-Montalcini set out to purify the nerve growth factor from sarcoma-180. Because it was a tumor, S-180 had the advantage of being a reasonably homogeneous source of the factor that could be obtained in relatively large quantities, an important consideration in protein purification. Unfortunately, the level of NGF in sarcoma-180 is quite low, so the losses encountered during a purification from this source would probably have been insurmountable in the early 1950s, when the technology of protein purification was less sophisticated than it is today. Fortuitously, or perhaps as an indication that great minds identify and follow productive lines of research, this problem was overcome in a fascinating series of events. In the early 1950s one approach to protein purification was to do a partial proteolysis of tissue to help solubilize the proteins. At that time, snake venom was frequently used as a source of proteases, and so extracts from S-180 were treated with proteases derived from snake venom. In doing the bookkeeping required to monitor protein purification (i.e., following the level of protein and of biological activity), it was discovered that proteolysis did indeed result in a substantial increase in biological activity. The controls, however, demonstrated that this increase came from the snake venom itself, which was actually a richer source of the factor than S-180. This led to a decision to switch the source for purification from S-180 to snake venom. Snake venom was a rich source of the factor, but it was expensive and not easy to obtain in large quantities. A decision was therefore taken to explore other tissues that might be related to the gland responsible for producing venom. It turned out that the mouse salivary gland was an abundant source of nerve growth factor that could be obtained much more easily than snake venom.
Thus, the experimental strategy was again changed to purify the growth factor from salivary gland. It is easier to purify from a rich source.

Lesson 4. The identification of a rich source of growth factor activity was an important landmark in the story, since it put in place the two essential components of a protein purification. Dr. Levi-Montalcini and her collaborators now had both a reasonably convenient assay and a rich source of NGF, and this combination allowed them to purify the factor to homogeneity. There are two interesting ancillary stories that provide important lessons in protein purification.

Two interesting stories.

An interesting question about the switch to mouse salivary glands as a source for NGF purification is whether the same molecule would have been identified if efforts to purify from S-180 had continued. During the 1980s, we (John Wagner and Pat D'Amore) discovered that acidic FGF was capable of causing neurite extension in PC12 cells that was indistinguishable from that caused by NGF. We decided to revisit the question of whether sarcoma-180 might actually produce acidic FGF rather than NGF. Extracts of sarcoma-180 were fractionated on heparin affinity columns and two peaks of equivalent size were isolated, one that eluted at the position of acidic FGF and a second that eluted at the position of NGF! It is interesting to think that, if sarcoma-180 had remained the source of material for NGF purification and the former peak had been chosen rather than the latter, the molecule purified by Dr. Levi-Montalcini and her collaborators might have been the molecule we now know as acidic FGF.

Another interesting story that comes from this line of research is the discovery of epidermal growth factor (EGF). Once NGF was purified, antibodies to this factor were needed. Nerve growth factor preparations were injected into rabbits to produce antiserum, but one of the effects of this injection was that babies born to injected rabbits opened their eyes prematurely. Dr. Cohen decided to use this as an assay to monitor biological activity, and this was one of the first clues that led to the identification and purification of EGF. EGF was the first purified growth factor to be studied extensively as a classic mitogen. Once again, you get what you assay for.

Lesson 5. It is instructive to read some of the initial papers on the purification of nerve growth factor (Varon et al., Biochemistry 6:2202 (1967), or Bocchini and Angeletti, PNAS 64:787 (1969), or the references in these papers). These papers are examples of classic protein purification, and both discuss the evidence that the biological activity resides in a single protein. The initial purification required many steps and was quite laborious, but it did allow the structure of NGF to be established.

Understanding the structure of NGF allowed other workers to develop extremely quick and efficient purification schemes. Mobley et al. developed a rapid, two-step purification procedure that yields high concentrations of nerve growth factor from salivary glands (Biochemistry 15:5543). This purification relies on the knowledge that the form of nerve growth factor initially secreted by salivary glands is a complex between the active form of the growth factor (called beta NGF) and several other protein molecules. This complex is known as 7S NGF. If extracts are made from salivary gland and debris is removed, the extracts can be passed over a CM-cellulose column (CM stands for carboxymethyl) in a step that removes essentially no protein (perhaps 2%) and so results in no enrichment for NGF. If the fraction that passes through this column is then adjusted to low pH, some material precipitates and the 7S NGF dissociates to release beta NGF. If the solution is then returned to the same conditions of pH and salt used in the first column fractionation and passed over a second column, the beta NGF, in contrast to the 7S NGF, now sticks to the column while the vast majority of proteins pass through. (Those proteins that would stick to the column under these conditions were removed during the first column step.) Only two proteins stick to the second column, and these can be isolated by sequential washes with high pH and high salt. The second of these proteins is NGF that is greater than 99% pure.

The purification scheme developed by Mobley and his colleagues is quick because it takes advantage of the known structure of the protein in a clever way. It also relies on a tissue in which NGF is abundant. As a result, it is possible to purify milligram quantities of a nearly homogeneous protein in less than a week. This is much quicker and easier than the many weeks of work that were required for the initial NGF purifications.

Lesson 6. The purification of NGF stands in marked contrast to the purification of BDNF, which was not accomplished until the early 1980s. The decision to try to isolate another protein with activity similar to NGF was based on a sound biological argument. At the time that the purification of BDNF was initiated, it was known that there must be other factors that had activities similar to NGF but a different structure. This was known because many neuronal populations underwent a period of naturally occurring cell death, while only some of those populations were responsive to NGF. To many, it seemed obvious that there must be other factors that control the extent of the cell death. Silvio Varon (the same one who was first author on the NGF purification paper discussed above) appreciated that in the peripheral nervous system, the activity of nerve growth factor was restricted to cells that secrete catecholamines (epinephrine and norepinephrine), while it had no effect on cells that secrete acetylcholine. He therefore decided to use a strategy similar to that of Dr. Levi-Montalcini but, rather than using DRGs as the test ganglion, he chose to study the effects of protein extracts on the ciliary ganglion. This insight eventually led to the purification of CNTF (ciliary neurotrophic factor), which is in itself an extremely interesting story, but I won't present it here.

BDNF (Brain Derived Neurotrophic Factor) is the name given to a neurotrophic factor that is present in brain. Brain has the advantage of being obtainable in relatively large quantities and, from a biochemist's point of view, could be an excellent source of growth factor activity. A young scientist, Dr. Barde, and his colleagues decided to use brain as a source to purify a neurotrophic factor and used the effect of brain extracts on a neuroblastoma line as an assay. This was a convenient assay but, of course, it made the assumption that neuroblastoma, which is a transformed neural line, would be a good assay for a factor that had the biological activity of interest in vivo. As a young scientist, Dr. Barde had chosen a project that was extraordinarily difficult. There was an activity present in brain, but in order to purify it to homogeneity he had to employ a relatively large number of steps, each of which involved substantial losses. He had to purify BDNF over a million fold before it could be identified as a single band after fractionation on a 2-D gel! He obtained only a small amount of material that would have been useless for most biochemical studies (EMBO J. 1:549), but a wave of discovery had introduced recombinant technology into biological research. Rather than continuing to purify BDNF from brain to study its biological activity, this group decided to isolate a cDNA for BDNF and purify the molecule from recombinant sources. This decision serves as an introduction to our next section, where we will talk about the importance of cDNA cloning; in this chapter, however, the importance of cloning BDNF lies in the fact that expression of BDNF in artificial systems gave these investigators an abundant source in which BDNF was present at relatively high concentrations. It allowed them to rapidly purify BDNF and study its biological effects.
Furthermore, cloning BDNF and showing that the expressed protein had biological activity was strong evidence that the right molecule had been purified and cloned. This is a non-traditional proof that a biological activity can be found in a particular protein, but it is persuasive. To be rigorous it must be combined with classical proofs (e.g., it could be that the cloned molecule is only a part of the biological activity or it could be a regulatory molecule that activates expression of the real activity, etc.). Because of these technological breakthroughs the history of BDNF has certainly moved much more rapidly than the history of NGF.
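
The arithmetic behind the contrast between a rare natural source and an abundant recombinant one can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the 50% recovery per step and the step counts are assumed values, not figures from the BDNF or NGF papers.

```python
def required_fold_purification(fraction_of_total_protein):
    """Fold enrichment needed to reach homogeneity from a given starting abundance."""
    if not 0 < fraction_of_total_protein <= 1:
        raise ValueError("fraction must be greater than 0 and at most 1")
    return 1.0 / fraction_of_total_protein


def cumulative_yield(step_yields):
    """Overall recovery of activity after a series of steps (each yield between 0 and 1)."""
    overall = 1.0
    for step_yield in step_yields:
        overall *= step_yield
    return overall


# Assumed abundances, for illustration only.
rare_in_tissue = 1e-6   # protein is one millionth of the total protein
overexpressed = 0.10    # protein is 10% of the total protein in an extract

fold_rare = required_fold_purification(rare_in_tissue)
fold_rich = required_fold_purification(overexpressed)

# Assumed 50% recovery at every step: six steps versus two steps.
yield_many_steps = cumulative_yield([0.5] * 6)
yield_few_steps = cumulative_yield([0.5] * 2)

print(f"rare source: {fold_rare:.0f}-fold needed, {yield_many_steps:.1%} recovered")
print(f"rich source: {fold_rich:.0f}-fold needed, {yield_few_steps:.1%} recovered")
```

Under these assumptions the rich source needs far less enrichment and, because fewer steps are required, far more of the starting activity survives to the end.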

A final lesson and a bridge to cDNA cloning strategy. The cDNA encoding BDNF predicted a protein that was very homologous to NGF. Thus BDNF could be considered a member of an NGF family of growth factors. Comparison of NGF and BDNF showed that there are regions of strong homology and regions where the proteins are less conserved. On the assumption that these regions of strong homology might be present in other members of the family, if indeed such other members existed, it is possible to design a strategy to isolate a cDNA for other family members based purely on the primary structure of the hypothetical protein and its cDNA (by this time the cDNA for NGF had been isolated using an oligonucleotide probe generated on the basis of known sequence, as described elsewhere).

By picking regions of homology and doing PCR between them, products corresponding to both NGF and BDNF were formed, but there was also another product, which could be used as a probe to screen for a new cDNA. This strategy led to the isolation of a novel cDNA and the corresponding protein in extremely rapid fashion and with dramatically less effort than had been required to get NGF or BDNF (Hohn et al., Nature 344:339 (1990)). This third member of the neurotrophin family was christened NT-3. Because it was possible to isolate a cDNA for NT-3, it was never necessary to go through the laborious process of purifying it from naturally occurring sources.
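
The first step of such a strategy, locating stretches that two aligned family members share, can be sketched as follows. This is a toy illustration, not the procedure used by Hohn et al.: the function name is hypothetical and the two sequences are invented strings, not real NGF or BDNF sequences. In practice, primers for PCR would then be designed against the nucleotide sequences that encode the conserved blocks.

```python
def conserved_blocks(seq_a, seq_b, min_len=4):
    """Return (start, end, residues) for each run of identical residues
    of at least min_len positions in two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    blocks = []
    start = None
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        match = a == b and a != "-"      # a gap never counts as conserved
        if match and start is None:
            start = i                    # a new run of identity begins
        elif not match and start is not None:
            if i - start >= min_len:
                blocks.append((start, i, seq_a[start:i]))
            start = None                 # the run has ended
    # Close a run that extends to the end of the alignment.
    if start is not None and len(seq_a) - start >= min_len:
        blocks.append((start, len(seq_a), seq_a[start:]))
    return blocks


# Invented toy sequences, used only to exercise the function.
toy_a = "MKCAWTQLRSGEHYFVDP"
toy_b = "MRCAWTALKTGEHYFIDP"

blocks = conserved_blocks(toy_a, toy_b)
for start, end, residues in blocks:
    print(start, end, residues)
```

Two well-separated conserved blocks are what the strategy needs: one block anchors the forward primer, the other anchors the reverse primer, and the less conserved sequence between them distinguishes one family member from another.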

Summary:

Protein purification requires an investigator to have a well-conceived assay that is reasonably convenient.
Protein purification requires a series of carefully monitored fractionation steps from initial starting material, and careful monitoring requires bookkeeping and the generation of a purification table*. The development of the purification procedure is guided by consideration of the stability of the protein, the ability of particular steps to remove contaminating material, and the ability of the fractionation steps to maintain the biological activity of the proteins of interest.
The purification can be much easier if the investigator can identify a rich source of the protein, both because purification from a rich source requires a less dramatic increase in the purity of the protein (smaller fold purification and thus fewer steps and fewer losses) and because it should result in a larger amount of protein to be studied. The advent of modern cloning technology has meant that it is often preferable to choose an artificial rather than a natural source of a protein for purification. Expressed proteins can frequently be 10% or more of the total protein in an extract, which is a tremendous experimental advantage. Nevertheless, the basic logic of protein purification remains the same in all cases.
Fractionation techniques not only enrich proteins but also provide criteria to determine whether a fraction is contaminated by extraneous material and to establish the degree of this contamination. Different approaches require different degrees of purity, but the value of having a preparation that is nearly homogeneous cannot be overstated.
Generally, it is dangerous to waste clean thoughts on dirty proteins.
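
The bookkeeping behind a purification table can be sketched in a short script. This is a minimal illustration under stated assumptions: the class and function names are hypothetical, the step names are generic, and the protein and activity values are invented rather than taken from any real purification.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    total_protein_mg: float
    total_activity_units: float


def purification_table(steps):
    """Return one row per step with specific activity, fold purification and yield.

    Fold purification and yield are both expressed relative to the first step."""
    if not steps:
        return []
    first = steps[0]
    base_specific = first.total_activity_units / first.total_protein_mg
    rows = []
    for s in steps:
        specific = s.total_activity_units / s.total_protein_mg
        rows.append({
            "step": s.name,
            "specific_activity": specific,      # units per mg protein
            "fold": specific / base_specific,   # enrichment over starting material
            "yield_pct": 100.0 * s.total_activity_units / first.total_activity_units,
        })
    return rows


# Invented values, for illustration only.
steps = [
    Step("Crude extract", 1000.0, 10000.0),
    Step("Step 1", 100.0, 8000.0),
    Step("Step 2", 4.0, 4000.0),
]

table = purification_table(steps)
print(f"{'step':<15}{'units/mg':>10}{'fold':>8}{'yield %':>9}")
for row in table:
    print(f"{row['step']:<15}{row['specific_activity']:>10.1f}"
          f"{row['fold']:>8.1f}{row['yield_pct']:>9.1f}")
```

A step that raises specific activity while keeping the yield high is worth retaining. When further fractionation no longer increases the specific activity, that constancy is one of the criteria for purity.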