User:Niksab/Molecular similarity: Difference between revisions
Revision as of 01:49, 1 December 2008
The notion of molecular similarity (or chemical similarity) is one of the most important concepts in chemoinformatics. It plays an important role in modern approaches to predicting the properties of chemical compounds, designing chemicals with a predefined set of properties and, especially, in conducting drug design studies by screening large databases containing structures of available (or potentially available) chemicals. These studies are based on the similar property principle of Johnson and Maggiora, which states: similar compounds have similar properties.
Molecular similarity in virtual screening
Similarity-based virtual screening (a kind of ligand-based virtual screening) assumes that all compounds in a database that are similar to a query compound have similar biological activity. Although this hypothesis is not always valid, quite often the set of retrieved compounds is considerably enriched with actives. To screen databases containing millions of compounds efficiently, molecular structures are usually represented by screens (structural keys) or by fixed-size or variable-size fingerprints. Screens and fingerprints can contain both 2D and 3D information. However, 2D fingerprints, which are a kind of binary fragment descriptor, dominate in this area. Fragment-based structural keys, like MDL keys, are sufficiently good for handling small and medium-sized chemical databases, whereas large databases are processed with fingerprints, which have a much higher information density. Fragment-based Daylight, BCI, and UNITY 2D fingerprints are the best-known examples. The most popular similarity measure for comparing chemical structures represented by fingerprints is the Tanimoto (or Jaccard) coefficient T. Two structures are usually considered similar if T > 0.85 (for Daylight fingerprints).
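For binary fingerprints, the Tanimoto coefficient is the number of bits set in both fingerprints divided by the number of bits set in at least one of them, T = c / (a + b - c), where a and b are the bit counts of the two fingerprints and c is the count of shared bits. The following is a minimal sketch of a threshold-based similarity search under stated assumptions: fingerprints are stored as plain Python integers used as bit vectors (this is not the Daylight, BCI, or UNITY format), the function names tanimoto and similarity_search are hypothetical, and the toy bit patterns are invented for illustration only.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto (Jaccard) coefficient of two binary fingerprints.

    Fingerprints are Python ints treated as bit vectors (an illustrative
    choice, not the storage format of any particular toolkit).
    """
    common = bin(fp_a & fp_b).count("1")  # bits set in both fingerprints
    union = bin(fp_a | fp_b).count("1")   # bits set in at least one
    # Returning 0.0 for two empty fingerprints is an assumed convention.
    return common / union if union else 0.0


def similarity_search(query: int, database: dict, threshold: float = 0.85):
    """Return (name, T) pairs with T > threshold, most similar first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in database.items())
    hits = [(name, t) for name, t in scored if t > threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy 7-bit fingerprints, invented for illustration only.
query = 0b1111111
database = {
    "mol_a": 0b1111111,  # identical to the query
    "mol_b": 0b0111111,  # shares 6 of 7 bits with the query
    "mol_c": 0b0001111,  # shares 4 of 7 bits with the query
}

hits = similarity_search(query, database)
for name, t in hits:
    print(f"{name}: T = {t:.3f}")
```

With the 0.85 threshold quoted above for Daylight fingerprints, only mol_a and mol_b are retrieved in this toy example. The threshold is left as a parameter because the appropriate cut-off depends on the fingerprint type.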