Protein folding

Protein folding is the physical process by which a polypeptide folds from a random coil into its characteristic and functional three-dimensional structure.^[1] Each protein exists as an unfolded polypeptide or random coil when it is translated from a sequence of mRNA into a linear chain of amino acids. This polypeptide lacks any developed three-dimensional structure (the left-hand side of the neighboring figure). Amino acids interact with each other to produce a well-defined three-dimensional structure, the folded protein (the right-hand side of the figure), known as the native state. The resulting three-dimensional structure is determined by the amino acid sequence.^[2]

For many proteins the correct three-dimensional structure is essential to function.^[3] Failure to fold into the intended shape usually produces inactive proteins, and misfolded proteins can acquire different properties, as in the case of toxic prions. Several neurodegenerative and other diseases are believed to result from the accumulation of misfolded (incorrectly folded) proteins.^[4]

Known facts

Relationship between folding and amino acid sequence

The amino-acid sequence (or primary structure) of a protein defines its native conformation. A protein molecule folds spontaneously during or after synthesis. While these macromolecules may be regarded as "folding themselves", the process also depends on the solvent (water or lipid bilayer),^[5] the concentration of salts, the temperature, and the presence of molecular chaperones.

Folded proteins usually have a hydrophobic core in which side chain packing stabilizes the folded state, while charged or polar side chains occupy the solvent-exposed surface, where they interact with surrounding water. Minimizing the number of hydrophobic side chains exposed to water is an important driving force behind the folding process.^[6] Formation of intramolecular hydrogen bonds provides another important contribution to protein stability.^[7] The strength of hydrogen bonds depends on their environment; thus, H-bonds enveloped in a hydrophobic core contribute more to the stability of the native state than H-bonds exposed to the aqueous environment.^[8]

The process of folding in vivo often begins co-translationally, so that the N-terminus of the protein begins to fold while the C-terminal portion of the protein is still being synthesized by the ribosome. Specialized proteins called chaperones assist in the folding of other proteins.^[9] A well-studied example is the bacterial GroEL system, which assists in the folding of globular proteins. In eukaryotic organisms many chaperones are known as heat shock proteins. Although most globular proteins are able to assume their native state unassisted, chaperone-assisted folding is often necessary in the crowded intracellular environment to prevent aggregation; chaperones also prevent the misfolding and aggregation that may occur as a consequence of exposure to heat or other changes in the cellular environment.

For the most part, scientists have been able to study many identical molecules folding together en masse. At the coarsest level, it appears that in transitioning to the native state, a given amino acid sequence takes roughly the same route and proceeds through roughly the same intermediates and transition states. Folding often involves first the establishment of regular secondary and supersecondary structures, particularly alpha helices and beta sheets, and afterwards tertiary structure. Formation of quaternary structure usually involves the "assembly" or "coassembly" of subunits that have already folded. The regular alpha helix and beta sheet structures fold rapidly because they are stabilized by intramolecular hydrogen bonds, as was first characterized by Linus Pauling. Protein folding may involve covalent bonding in the form of disulfide bridges formed between two cysteine residues, or the formation of metal clusters. Shortly before settling into their more energetically favourable native conformation, molecules may pass through an intermediate "molten globule" state.

The essential fact of folding, however, remains that the amino acid sequence of each protein contains the information that specifies both the native structure and the pathway to attain that state. This is not to say that nearly identical amino acid sequences always fold similarly.^[10] Conformations also differ based on environmental factors; similar proteins fold differently depending on where they are found. Folding is a spontaneous process independent of energy inputs from nucleoside triphosphates. The passage to the folded state is mainly guided by hydrophobic interactions, formation of intramolecular hydrogen bonds, and van der Waals forces, and it is opposed by conformational entropy.

Disruption of the native state

Under some conditions proteins will not fold into their biochemically functional forms. Temperatures above or below the range that cells tend to live in will cause thermally unstable proteins to unfold or "denature" (this is why boiling makes an egg white turn opaque). High concentrations of solutes, extremes of pH, mechanical forces, and the presence of chemical denaturants can do the same. Protein thermal stability is far from constant, however. For example, a hyperthermophilic methanogen has been found to grow at temperatures as high as 122 °C,^[11] which requires that its full complement of vital proteins and protein assemblies be stable at that temperature or above.

A fully denatured protein lacks both tertiary and secondary structure, and exists as a so-called random coil. Under certain conditions some proteins can refold; however, in many cases denaturation is irreversible.^[12] Cells sometimes protect their proteins against the denaturing influence of heat with proteins known as chaperones or heat shock proteins, which assist other proteins both in folding and in remaining folded. Some proteins never fold in cells at all except with the assistance of chaperone molecules, which either isolate individual proteins so that their folding is not interrupted by interactions with other proteins, or help to unfold misfolded proteins, giving them a second chance to refold properly. This function is crucial in preventing precipitation into insoluble amorphous aggregates.

Incorrect protein folding and neurodegenerative disease

Aggregated proteins are associated with prion-related illnesses such as Creutzfeldt-Jakob disease and bovine spongiform encephalopathy (mad cow disease), amyloid-related illnesses such as Alzheimer's disease and familial amyloid cardiomyopathy or polyneuropathy, and intracytoplasmic aggregation diseases such as Huntington's and Parkinson's disease.^[4]^[13] These age-onset degenerative diseases are associated with the multimerization of misfolded proteins into insoluble, extracellular aggregates and/or intracellular inclusions, including cross-beta sheet amyloid fibrils; it is not clear whether the aggregates are the cause or merely a reflection of the loss of protein homeostasis, the balance between synthesis, folding, aggregation and protein turnover. Misfolding and excessive degradation, instead of folding and function, lead to a number of proteopathy diseases such as antitrypsin-associated emphysema, cystic fibrosis and the lysosomal storage diseases, where loss of function is the origin of the disorder. While protein replacement therapy has historically been used to correct the latter disorders, an emerging approach is to use pharmaceutical chaperones to fold mutated proteins and so render them functional.

Kinetics and the Levinthal Paradox

The duration of the folding process varies dramatically depending on the protein of interest. When studied outside the cell, the slowest-folding proteins require many minutes or hours to fold, primarily due to proline isomerization, and must pass through a number of intermediate states, like checkpoints, before the process is complete.^[14] On the other hand, very small single-domain proteins with lengths of up to a hundred amino acids typically fold in a single step.^[15] Time scales of milliseconds are the norm, and the very fastest known protein folding reactions are complete within a few microseconds.^[16]

The Levinthal paradox^[17] observes that if a protein were to fold by sequentially sampling all possible conformations, it would take an astronomical amount of time to do so, even if the conformations were sampled at a rapid rate (on the nanosecond or picosecond scale). Based upon the observation that proteins fold much faster than this, Levinthal then proposed that a random conformational search does not occur, and the protein must, therefore, fold through a series of meta-stable intermediate states.
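
The scale of the argument can be illustrated with a back-of-the-envelope calculation. The sketch below is only an illustration: the chain length (100 residues), the number of conformations per residue (3), and the sampling rate (10^13 conformations per second) are assumed values chosen for the example, not figures taken from the text above, and the function name is hypothetical.

```python
# Rough Levinthal-style estimate. All numeric inputs are illustrative assumptions.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_years(n_residues, states_per_residue, samples_per_second):
    """Years needed to visit every conformation once, one after another."""
    # Each residue is assumed to choose its conformation independently,
    # so the number of chain conformations grows exponentially with length.
    n_conformations = states_per_residue ** n_residues
    return n_conformations / samples_per_second / SECONDS_PER_YEAR


# Assumed example: 100 residues, 3 states each, one conformation per 0.1 ps.
years = exhaustive_search_years(
    n_residues=100, states_per_residue=3, samples_per_second=1e13
)
print(f"{years:.1e} years")
```

With these assumed inputs the estimate comes out on the order of 10^27 years, which is why a sequential random search cannot account for folding times of milliseconds or microseconds.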

Techniques for studying protein folding

Circular Dichroism

Circular dichroism is one of the most general and basic tools to study protein folding. Circular dichroism spectroscopy measures the difference in absorption of left- and right-circularly polarized light. In proteins, structures such as alpha helices and beta sheets are chiral, and thus absorb the two polarizations differently. This differential absorption acts as a marker of the degree of foldedness of the protein ensemble. The technique can be used to measure equilibrium unfolding of the protein by measuring the change in this absorption as a function of denaturant concentration or temperature. A denaturant melt measures the free energy of unfolding as well as the protein's m value, or denaturant dependence. A temperature melt measures the melting temperature (T_m) of the protein. This type of spectroscopy can also be combined with fast-mixing devices, such as stopped flow, to measure protein folding kinetics and to generate chevron plots.
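
How a denaturant melt relates the observed signal to the free energy of unfolding and the m value can be sketched with a simple two-state model. The code below assumes the commonly used linear extrapolation, in which the unfolding free energy decreases linearly with denaturant concentration, and assumes only folded and unfolded populations contribute to the signal. The function names, the parameter values (20 kJ/mol, 5 kJ/(mol·M)) and the signal levels are hypothetical and chosen only for illustration.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def unfolding_free_energy(denaturant_molar, dg_water, m_value):
    """Linear extrapolation (assumed): dG_unfold = dG(H2O) - m * [denaturant]."""
    return dg_water - m_value * denaturant_molar


def fraction_folded(denaturant_molar, dg_water, m_value, temperature_k=298.15):
    """Two-state folded population at a given denaturant concentration."""
    dg = unfolding_free_energy(denaturant_molar, dg_water, m_value)
    k_unfold = math.exp(-dg / (R * temperature_k))  # equilibrium constant [U]/[F]
    return 1.0 / (1.0 + k_unfold)


def cd_signal(denaturant_molar, dg_water, m_value,
              folded_signal, unfolded_signal, temperature_k=298.15):
    """Observed signal as a population-weighted average of the two states."""
    f = fraction_folded(denaturant_molar, dg_water, m_value, temperature_k)
    return f * folded_signal + (1.0 - f) * unfolded_signal


# Hypothetical protein: dG(H2O) = 20 kJ/mol, m = 5 kJ/(mol*M).
for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(conc, round(fraction_folded(conc, 20.0, 5.0), 3))
```

In this model the folded fraction is exactly 0.5 at the midpoint concentration dg_water / m_value (4 M for the hypothetical values above). In practice the two parameters are obtained by fitting such a curve to the measured signal rather than being known in advance.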

Vibrational circular dichroism of proteins

The more recent developments of vibrational circular dichroism (VCD) techniques for proteins, currently involving Fourier transform (FT) instruments, provide powerful means for determining protein conformations in solution, even for very large protein molecules. Such VCD studies of proteins are often combined with X-ray diffraction of protein crystals, FT-IR data for protein solutions in heavy water (D₂O), or ab initio quantum computations to provide unambiguous structural assignments that are unobtainable from CD.

Modern studies of folding with high time resolution

The study of protein folding has been greatly advanced in recent years by the development of fast, time-resolved techniques. These are experimental methods for rapidly triggering the folding of a sample of unfolded protein, and then observing the resulting dynamics. Fast techniques in widespread use include neutron scattering^[18], ultrafast mixing of solutions, photochemical methods, and laser temperature jump spectroscopy. Among the many scientists who have contributed to the development of these techniques are Jeremy Cook, Heinrich Roder, Harry Gray, Martin Gruebele, Brian Dyer, William Eaton, Sheena Radford, Chris Dobson, Sir Alan R. Fersht and Bengt Nölting.

Energy landscape theory of protein folding

The study of protein folding was largely an experimental endeavor until the formulation of energy landscape theory by Joseph Bryngelson and Peter Wolynes in the late 1980s and early 1990s. This approach introduced the principle of minimal frustration, which asserts that evolution has selected the amino acid sequences of natural proteins so that interactions between side chains largely favor the molecule's acquisition of the folded state. Interactions that do not favor folding are selected against, although some residual frustration is expected to exist. A consequence of these evolutionarily selected sequences is that proteins are generally thought to have globally "funneled energy landscapes" (a term coined by José Onuchic [reference needed]) that are largely directed towards the native state. This "folding funnel" landscape allows the protein to fold to the native state through any of a large number of pathways and intermediates, rather than being restricted to a single mechanism. The theory is supported by both computational simulations of model proteins and numerous experimental studies, and it has been used to improve methods for protein structure prediction and design [reference needed]. The description of protein folding by the leveling free-energy landscape is also consistent with the second law of thermodynamics.^[19]

Computational prediction of protein tertiary structure

De novo or ab initio techniques for computational protein structure prediction are related to, but strictly distinct from, studies involving protein folding. Molecular dynamics (MD) is an important tool for studying protein folding and dynamics in silico. Because of computational cost, ab initio MD folding simulations with explicit water are limited to peptides and very small proteins.^[20]^[21] MD simulations of larger proteins remain restricted to dynamics of the experimental structure or its high-temperature unfolding. To simulate long-time folding processes (beyond about 1 microsecond), such as the folding of small proteins (about 50 residues) or larger, some approximations or simplifications in protein models need to be introduced. An approach using a reduced protein representation (in which pseudo-atoms represent groups of atoms) and a statistical potential is not only useful in protein structure prediction, but is also capable of reproducing the folding pathways.^[22]
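
The idea of a reduced representation can be sketched as follows. This is a minimal illustration and not the method of the cited work: each residue's atoms are collapsed into one pseudo-atom at their centroid, and a toy contact-counting energy stands in for a statistical potential. The function names, the 6.5 distance cutoff, the sequence-separation threshold and the coordinates are all assumptions made for the example.

```python
import math
from collections import defaultdict


def coarse_grain(atoms):
    """Collapse each residue's atoms into one pseudo-atom at their centroid.

    `atoms` is a list of (residue_index, x, y, z) tuples.
    Returns a list of (residue_index, (x, y, z)) sorted by residue index.
    """
    groups = defaultdict(list)
    for res, x, y, z in atoms:
        groups[res].append((x, y, z))
    beads = []
    for res in sorted(groups):
        coords = groups[res]
        centroid = tuple(sum(c[i] for c in coords) / len(coords) for i in range(3))
        beads.append((res, centroid))
    return beads


def contact_energy(beads, cutoff=6.5, min_separation=3, well_depth=-1.0):
    """Toy potential: every non-local bead pair within `cutoff` adds `well_depth`."""
    energy = 0.0
    for i in range(len(beads)):
        for j in range(i + 1, len(beads)):
            res_i, pos_i = beads[i]
            res_j, pos_j = beads[j]
            if res_j - res_i < min_separation:
                continue  # skip neighbours along the chain
            if math.dist(pos_i, pos_j) <= cutoff:
                energy += well_depth
    return energy


# Hypothetical four-residue fragment; coordinates are made up for illustration.
atoms = [
    (0, 0.0, 0.0, 0.0), (0, 2.0, 0.0, 0.0),
    (1, 4.0, 0.0, 0.0),
    (2, 4.0, 3.0, 0.0),
    (3, 1.0, 4.0, 0.0), (3, 1.0, 2.0, 0.0),
]
beads = coarse_grain(atoms)
print(beads)
print(contact_energy(beads))
```

Reducing six atoms to four beads is trivial here, but for a real protein the same step removes most degrees of freedom, which is what makes longer simulated time scales affordable at the price of lost atomic detail.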

Distributed computing projects such as Folding@Home use idle CPU or GPU time of personal computers to work on problems such as protein folding or the prediction of protein structure. Volunteers can run these programs on a personal computer or PlayStation 3 to support them.

Experimental techniques of protein structure determination

Folded structures of proteins are routinely determined by X-ray crystallography and NMR.

References

^ Alberts, Bruce (2002). "The Shape and Structure of Proteins". Molecular Biology of the Cell; Fourth Edition. New York and London: Garland Science. ISBN 0-8153-3218-1.
^ Anfinsen C (1972). "The formation and stabilization of protein structure". Biochem. J. 128 (4): 737–49. PMID 4565129.
^ Jeremy M. Berg, John L. Tymoczko, Lubert Stryer; Web content by Neil D. Clarke (2002). "3. Protein Structure and Function". Biochemistry. San Francisco: W.H. Freeman. ISBN 0-7167-4684-0.
^ ^a ^b Selkoe (2003).
^ van den Berg B, Wain R, Dobson CM, Ellis RJ (2000). "Macromolecular crowding perturbs protein refolding kinetics: implications for folding inside the cell". EMBO J. 19 (15): 3870–5. doi:10.1093/emboj/19.15.3870. PMC 306593. PMID 10921869.
^ Pace C, Shirley B, McNutt M, Gajiwala K (1 January 1996). "Forces contributing to the conformational stability of proteins". FASEB J. 10 (1): 75–83. PMID 8566551.
^ Rose G, Fleming P, Banavar J, Maritan A (2006). "A backbone-based theory of protein folding". Proc. Natl. Acad. Sci. U.S.A. 103 (45): 16623–33. doi:10.1073/pnas.0606843103. PMID 17075053.
^ Deechongkit S, Nguyen H, Dawson PE, Gruebele M, Kelly JW (2004). "Context Dependent Contributions of Backbone H-Bonding to β-Sheet Folding Energetics". Nature. 430: 101–5.
^ Lee S, Tsai F (2005). "Molecular chaperones in protein quality control". J. Biochem. Mol. Biol. 38 (3): 259–65. PMID 15943899.
^ Alexander PA, He Y, Chen Y, Orban J, Bryan PN (2007). "The design and characterization of two proteins with 88% sequence identity but different structure and function". Proc. Natl. Acad. Sci. U.S.A. 104 (29): 11963–8. doi:10.1073/pnas.0700922104. PMC 1906725. PMID 17609385.
^ Takai K, Nakamura K, Toki T, Tsunogai U, Miyazaki M, Miyazaki J, Hirayama H, Nakagawa S, Nunoura T, Horikoshi K (2008). "Cell proliferation at 122°C and isotopically heavy CH4 production by a hyperthermophilic methanogen under high-pressure cultivation". Proc. Natl. Acad. Sci. U.S.A. 105: 10949–54. doi:10.1073/pnas.0712334105.
^ Shortle D (1 January 1996). "The denatured state (the other half of the folding equation) and its role in protein stability". FASEB J. 10 (1): 27–34. PMID 8566543.
^ Chiti F, Dobson CM (2006). "Protein misfolding, functional amyloid, and human disease". Annu. Rev. Biochem. 75. doi:10.1146/annurev.biochem.75.101304.123901.
^ Kim PS, Baldwin RL (1990). "Intermediates in the folding reactions of small proteins". Annu. Rev. Biochem. 59: 631–60. doi:10.1146/annurev.bi.59.070190.003215. PMID 2197986.
^ Jackson SE (1998). "How do small single-domain proteins fold?". Fold Des. 3 (4): R81–91. doi:10.1016/S1359-0278(98)00033-9. PMID 9710577.
^ Kubelka J, Hofrichter J, Eaton WA (2004). "The protein folding 'speed limit'". Curr. Opin. Struct. Biol. 14 (1): 76–88. doi:10.1016/j.sbi.2004.01.013. PMID 15102453.
^ C. Levinthal (1968). "Are there pathways for protein folding?" (PDF). J. Chim. Phys. 65: 44–5.
^ Bu Z, Cook J, Callaway DJ (2001). "Dynamic regimes and correlated structural dynamics in native and denatured alpha-lactalbumin". J. Mol. Biol. 312 (4): 865–873. doi:10.1006/jmbi.2001.5006. PMID 11575938.
^ Sharma V, Kaila VRI, Annila A (2009). "Protein folding as an evolutionary process". Physica A. 388 (6): 851–862. doi:10.1016/j.physa.2008.12.004.
^ "Fragment-based Protein Folding Simulations".
^ "Protein folding" (by Molecular Dynamics).
^ Kmiecik S and Kolinski A (2007). "Characterization of protein-folding pathways by reduced-space modeling". Proc. Natl. Acad. Sci. U.S.A. 104 (30): 12330–5. doi:10.1073/pnas.0702265104. PMID 17636132.

