Talk:Protein structure prediction

comparative structure prediction

In the third paragraph, should comparative structure prediction be linked to comparative protein modeling? 98.192.58.208 21:23, 17 October 2007 (UTC)

chaperonins

I have a couple of suggested amendments to the text.

Davjon, be bold and make the changes! Stewart Adcock 19:05, 21 Mar 2004 (UTC)

Firstly the statement about chaperonins is not correct. The discovery of chaperonins (and the small chaperone proteins) did not affect the statement that amino acid sequence entirely encodes the fold of a protein (i.e. the conclusion reached by Anfinsen). All chaperonins have been shown to do is to assist (i.e. increase the yield of) the folding process - mainly to prevent aggregation of the protein when hydrophobic surface area is exposed during folding. In other words, chaperonins allow a protein to fold into the conformation encoded by its amino acid sequence more efficiently. There is still no reason to doubt that all the information required to reproduce the fold of a protein is in the amino acid sequence, particularly in the context of protein structure prediction where inter-molecular aggregation is clearly not an issue. See the recent review by Saibil & Ranson (Trends Biochem Sci. 2002 Dec;27(12):627-632) for a more thorough discussion of what is currently known about chaperonins. User:Davjon 08:35, 20 March 2004

Yes, the article is not very clear on this point. Stewart Adcock 19:05, 21 Mar 2004 (UTC)

conditional conformation

Having said that, an additional complication is that some proteins are only able to fold into their biologically useful conformation under particular conditions (e.g. high pH or low temperature) or in the presence of other molecules (e.g. metals) or even other protein chains (in the formation of certain protein complexes). User:Davjon 08:35, 20 March 2004

You are, of course, correct again. But the problem is deciding how much information you want to put in the article while still keeping it simple and understandable. Most readers will have very little background knowledge about proteins. Stewart Adcock 19:05, 21 Mar 2004 (UTC)

I have modified the article to reflect the lack of evidence that primary sequence does not determine protein folding. I noted that chaperones and glycosylation can be important in protein folding. --Antelan 20:52, 4 October 2006 (UTC)

ab initio / de novo

Lastly, the correct term is ab initio protein structure prediction rather than de novo modelling. Although both Latin terms have a similar meaning, de novo has historically been used in the phrase "de novo protein design" and not in the context of protein structure prediction. User:Davjon 08:35, 20 March 2004

Try arguing with a quantum chemist about this point ;-) When I wrote that, I was almost certainly introducing my own POV. I agree that ab initio is the common term, but not necessarily the correct term, being semantically incorrect. Again, this is probably just my POV and you should be bold. Stewart Adcock 19:05, 21 Mar 2004 (UTC)

PS Davjon, please sign your posts on talk pages. (You can use three or four tilde characters to sign without or with a date stamp, respectively).

crystallography terms

I reverted two recent changes. Here's why:

1. restored link to "X-ray crystallography" over "crystallography" because very few protein structures are solved by electron/neutron diffraction (not surprising since they destroy the sample).

2. first sentence. "Computational biology" is something very different to "Computational molecular biology". The most appropriate article on Wikipedia seems to be bioinformatics so I've pointed the link at that.

Stewart Adcock 00:04, 2 Feb 2004 (UTC)

I've added theoretical chemistry because ab initio calculations belong to that field. --Zivilverteidigung

Back in the pre-genome pre-proteomics era, quite a few structures were determined by neutrons. Electrons are essential to do 2D crystallography of membrane proteins, which is all you can do with most of them. The sentence as you have it now is reasonable with "typically" and X-ray crystallography, but I consider crystallography a much better article. Really I think the two should be merged. 168... 00:14, 2 Feb 2004 (UTC)

I agree that they should be merged (but as they currently stand, the X-ray crystallography one is more relevant to this article). I don't think either article is particularly brilliant though. Stewart Adcock 01:50, 3 Feb 2004 (UTC)

NP-hard modeling

I just reverted User:Micha's changes. Here is the reverted version:

De novo protein modelling methods seek to build three-dimensional protein models "from scratch". There are many possible procedures that either attempt to mimic protein folding or apply some stochastic method to search possible solutions. It has been shown that the complete computation of the structure of a protein is actually NP-hard [1]. Accordingly, these procedures require vast computational resources. Blue Gene is a powerful supercomputer designed to push the frontier to larger structures. Today, approaches that use a Monte Carlo method are most successful. Starting from a randomly assembled protein, a set of new structures is generated by making small random changes. From this set, the best structure is selected with an energy function, and used as a starting point for a new iteration. With many iterations and many initial structures, good structural proposals can be obtained within short times.
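For readers following this thread, the iteration described in the reverted text (random start, small random changes, keep the best candidate by an energy function, repeat) can be sketched roughly as below. This is only a toy illustration, not Rosetta or any published method: `toy_energy`, `perturb`, `iterative_search` and all parameter values are hypothetical names and numbers chosen for the sketch, the "conformation" is just a list of angles, and keeping the current structure in the candidate pool is an added assumption so that the energy never increases.

```python
import random


def toy_energy(angles):
    # Hypothetical stand-in for a real energy function: lower is better,
    # with the minimum when every angle is zero.
    return sum(a * a for a in angles)


def perturb(angles, step, rng):
    # Make a small random change to one randomly chosen angle.
    new = list(angles)
    i = rng.randrange(len(new))
    new[i] += rng.uniform(-step, step)
    return new


def iterative_search(n_angles=10, n_iter=200, n_candidates=20, step=0.5, seed=0):
    rng = random.Random(seed)
    # Start from a randomly assembled conformation.
    current = [rng.uniform(-3.14, 3.14) for _ in range(n_angles)]
    start_energy = toy_energy(current)
    for _ in range(n_iter):
        # Generate a set of new structures by small random changes.
        candidates = [perturb(current, step, rng) for _ in range(n_candidates)]
        # Assumption: keep the current structure in the pool, so the
        # selected structure is never worse than the previous one.
        candidates.append(current)
        # Select the best structure as the start of the next iteration.
        current = min(candidates, key=toy_energy)
    return start_energy, toy_energy(current)
```

Calling `iterative_search()` returns the starting and final toy energies. Note that this greedy selection is only one stochastic scheme; a Metropolis-style Monte Carlo search would instead sometimes accept a worse structure, which the quoted description does not mention.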

I dispute that "computation of the structure of a protein" is NP-hard. The reference given is to a paper (that I know very well) about protein design, which is an NP-hard problem. Protein fold prediction is much more complex.
I also dispute that Monte Carlo methods are the most successful approaches used today (although they are probably amongst the most successful). This would be easy to confirm by looking at the methods applied during the most recent CASP competition. Monte Carlo methods are included in the set of "stochastic methods" already mentioned. I agree that more detail is probably warranted here.

Stewart Adcock 07:20, 6 Feb 2004 (UTC)

Okay, I was perhaps a bit rash about the NP-hard description. However, I'm sure about the Monte Carlo methods: In CASP5 and CASP4 the Rosetta method of Dr. David Baker (UW) was most successful. [2] The description I wrote applies to the Rosetta method. [3]

Micha 05:00, 8 Feb 2004 (UTC)

Hi Micha. You are probably correct in saying that the single best approach to date is the Baker group's Rosetta, which does use a Monte Carlo algorithm. However, the fact that Rosetta uses MC isn't what makes it so good. Feel free to specifically mention Rosetta, but I don't think it is fair to say approaches that use a Monte Carlo method are most successful just because Rosetta does. If I was describing Rosetta, I'd make sure that I mentioned the fact that it starts by deriving a set of short fragments, and then assembles these. One reason why I didn't review some of the more interesting approaches is that I was trying to keep the article as simple as possible for anyone to understand. Do you think we should add more specific details? Stewart Adcock 18:55, 8 Feb 2004 (UTC)

Hello Stewart. I see your point in keeping the general article as simple as possible. Perhaps it would be better to create De novo protein structure prediction, and go into more detail there. Then the current article would stay as it is, and in the other article one could address CASP and other things. (I know CASP really addresses comparative modelling as well, so we could also mention/link it in the general article.) I don't have time to do it right now, but I'd do it eventually. Micha 21:06, 8 Feb 2004 (UTC)

That seems like a good idea to me. CASP is probably significant enough to warrant its own article, even. Stewart Adcock 22:49, 8 Feb 2004 (UTC)

Okay. Let's start with CASP first. Micha 02:06, 9 Feb 2004 (UTC)

Ligand binding

Hi everyone, I was wondering whether it is appropriate to mention that another method to determine a partial protein structure is the study of ligand binding sites; in particular, the use of ligand analogues to determine the binding site's spatial features. I know it's only a partial structure, but should it be mentioned here? Volantares 12:25, 7 December 2006 (UTC)

That would be of interest if you have any particular references on the subject. Information gathered from the literature is often used as a hint by the CASP competitors, but in an empirical way. If there is a methodology to retrieve structural information from ligand binding, that would be worth mentioning; if it's just personal research, no. Blastwizard 18:45, 7 December 2006 (UTC)

The software section needs a major revamp

The software for structure prediction could use a major overhaul, or should just link to the dedicated software page. The preceding unsigned comment was added by Mndoci 20:55, 5 February 2007 (UTC).

adding a new reference to "protein structure prediction"

Within the Software section, reference might be made to a new prediction engine called LOMETS from Wu and Zhang at U of Kansas. The announcement article is

Wu S, Zhang Y. LOMETS: a local meta-threading-server for protein structure prediction. Nucleic Acids Res. 2007;35(10):3375-82. Epub 2007 May 3.

LOMETS is a tuned consensus of 9 engines, all supported in a server cluster by the authors. 152.2.146.67 22:19, 17 June 2007 (UTC)

I think we need to add some textbooks in the reference list.

Now More Important Than Ever

"The practical role of protein structure prediction is now more important than ever." The phrase sounds like an advertisement. Perhaps something more concrete... "Protein structure prediction bridges the increasing divide between the faster modern sequence gathering technologies and the slower protein analysis [...etc]" JeramieHicks (talk) 23:12, 21 November 2008 (UTC)

How does homochirality factor into modeling?

I am not a scientist, so pardon if this is a stupid question. How does homochirality factor into modeling algorithms? Since every molecule is capable of being arranged into an identical mirror image, but the mirror image is often inactive or toxic, then it would appear that half the search space can be thrown away and not computed. DMahalko (talk) 20:59, 11 May 2009 (UTC)