Power graph analysis: Difference between revisions
Line 91: | Line 91: | ||
<ref name="powergraphs"> |
<ref name="powergraphs"> |
||
{{cite journal |
{{cite journal |
||
| author = Royer, |
| author = Royer, Loïc; Reimann, Matthias; Andreopoulos, Bill; Schroeder, Michael; |
||
| title = Unraveling Protein Networks with Power Graph Analysis |
| title = Unraveling Protein Networks with Power Graph Analysis |
||
| journal = PLoS Computational Biology |
| journal = PLoS Computational Biology |
||
Line 122: | Line 122: | ||
<ref name="net_comp"> |
<ref name="net_comp"> |
||
{{cite journal |
{{cite journal |
||
| author = Royer, |
| author = Royer, Loïc; Reimann, Matthias; Stewart, Francis A.; Schroeder, Michael; |
||
| title = Network compression as a quality measure for protein interaction networks. |
| title = Network compression as a quality measure for protein interaction networks. |
||
| journal = Plos One |
| journal = Plos One |
Revision as of 09:46, 19 September 2012
In computational biology, power graph analysis is a method for the analysis and representation of complex networks. Power graph analysis is the computation, analysis and visual representation of a power graph from a graphs (networks).
Power graph analysis can be thought of as a lossless compression algorithm for graphs. It extends graph syntax with representations of cliques, bicliques and stars. Compression levels of up to 95% have been obtained for complex biological networks.
Hypergraphs are a generalization of graphs in which edges are not just couples of nodes but arbitrary n-tuples. Power Graphs are not another generalization of graphs, but instead a novel representation of graphs that proposes a shift from the "node and edge" language to an using cliques, bicliques and stars as primitives.
Power graphs
Graphical representation
Graphs are drawn with circles or points that represent nodes and lines connecting pairs of nodes that represent edges. Power graphs extend the syntax of graphs with power nodes, which are drawn as a circle enclosing nodes or other power nodes, and power edges, which are lines between power nodes.
Bicliques are two sets of nodes with an edge between every member of one set and every member of the other set. In a power graph, a biclique is represented as an edge between two power nodes.
Cliques are a set of nodes with an edge between every pair of nodes. In a power graph, a clique is represented by a power node with a loop.
Stars are a set of nodes with an edge between every member of that set and a single node outside the set. In a power graph, a star is represented by a power edge between a regular node and a power node.
Formal definition
Given a graph where is the set of nodes and is the set of edges, a power graph is a graph defined on the power set of power nodes connected to each other by power edges: . Hence power graphs are defined on the power set of nodes as well as on the power set of edges of the graph .
The semantics of power graphs are as follows: if two power nodes are connected by a power edge, this means that all nodes of the first power node are connected to all nodes of the second power node. Similarly, if a power node is connected to itself by a power edge, this signifies that all nodes in the power node are connected to each other by edges.
The following two conditions are required:
- Power node hierarchy condition: Any two power nodes are either disjoint, or one is included in the other.
- Power edge disjointness condition: There is an onto mapping from edges of the original graph to power edges.[citation needed]
Analogy to Fourier analysis
The Fourier analysis of a function can be seen as a rewriting of the function in terms of harmonic functions instead of pairs. This transformation changes the point of view from time domain to frequency domain and enables many interesting applications in signal analysis, data compression, and filtering. Similarly, Power Graph Analysis is a rewriting or decomposition of a network using bicliques, cliques and stars as primitive elements (just as harmonic functions for Fourier analysis). It can be used to analyze, compress and filter networks. There are, however, several key differences. First, in Fourier analysis the two spaces (time and frequency domains) are the same function space - but stricto sensu, power graphs are not graphs. Second, there is not a unique power graph representing a given graph. Yet a very interesting class of power graphs are minimal power graphs which have the least number of power edges and power nodes necessary to represent a given graph.
Minimal power graphs
In general, there is no unique minimal power graph for a given graph. In this example (right) a graph of four nodes and five edges admits two minimal power graphs of two power edges each. The main difference between these two minimal power graphs is the higher nesting level of the second power graph as well as a loss of symmetry with respect to the underlying graph. Loss of symmetry is only a problem in small toy examples since complex networks rarely exhibit such symmetries in the first place. Additionally, one can minimize the nesting level but even then, there is in general not a unique minimal power graph of minimal nesting level.
Power graph greedy algorithm
The power graph greedy algorithm relies on two simple steps to perform the decomposition:
The first step identifies candidate power nodes through a hierarchical clustering of the nodes in the network based on the similarity of their neighboring nodes. The similarity of two sets of neighbors is taken as the Jaccard index of the two sets.
The second step performs a greedy search for possible power edges between candidate power nodes. Power edges abstracting the most edges in the original network are added first to the power graph. Thus bicliques, cliques and stars are incrementally replaced with power edges, until all remaining single edges are also added. Candidate power nodes that are not the end point of any power edge are ignored.
Modular decomposition
Modular decomposition can be used to compute a power graph by using the strong modules of the modular decomposition. Modules in modular decomposition are groups of nodes in a graph that have identical neighbors. A Strong Module is a module that does not overlap with another module. However, in complex networks strong modules are more the exception than the rule. Therefore the power graphs obtained through modular decomposition are far from minimality. The main difference between modular decomposition and power graph analysis is the emphasis of power graph analysis in decomposing graphs not only using modules of nodes but also modules of edges (cliques, bicliques). Indeed, power graph analysis can be seen as a loss-less simultaneous clustering of both nodes and edges.
Applications
Protein-protein interaction networks
Power Graph Analysis has been showed to be useful for the analysis of several types of biological networks such as Protein-protein interaction networks[1], domain-peptide binding motifs, Gene regulatory networks[2] and Homology/Paralogy networks. Network compression, a new measure derived from Power Graphs, has been proposed as a quality measure for protein interaction networks[3].
Drug repositioning
Power Graphs have been also applied to the analysis of drug-target-disease networks for Drug repositioning[4].
Community mining
Power Graphs have been applied to large-scale data in social networks, for the identification of communities sharing similar properties.
References
- ^
Royer, Loïc; Reimann, Matthias; Andreopoulos, Bill; Schroeder, Michael; (11 Jul 2008). Berg, Johannes (ed.). "Unraveling Protein Networks with Power Graph Analysis" (PDF). PLoS Computational Biology. 4 (7): e1000108. doi:10.1371/journal.pcbi.1000108. PMC 2424176. PMID 18617988.
{{cite journal}}
: CS1 maint: extra punctuation (link) CS1 maint: multiple names: authors list (link) CS1 maint: unflagged free DOI (link) - ^
Martina Maisel, Hans-Jörg Habisch, Loïc Royer, Alexander Herr, Javorina Milosevic, Andreas Hermann, Stefan Liebau, Rolf Brenner, Johannes Schwarz, Michael Schroeder, Alexander Storch (15 Oct 2010). "Genome-wide expression profiling and functional network analysis upon neuroectodermal conversion of human mesenchymal stem cells suggest HIF-1 and miR-124a as important regulators" (PDF). Experimental Cell Research. 16 (17). doi:10.1016/j.yexcr.2010.06.012. PMID 20599952.
{{cite journal}}
: CS1 maint: multiple names: authors list (link) - ^
Royer, Loïc; Reimann, Matthias; Stewart, Francis A.; Schroeder, Michael; (18 Jun 2012). "Network compression as a quality measure for protein interaction networks" (PDF). Plos One. 7 (6). doi:10.1371/journal.pone.0035729. PMC 3377704. PMID 22719828.
{{cite journal}}
: CS1 maint: extra punctuation (link) CS1 maint: multiple names: authors list (link) CS1 maint: unflagged free DOI (link) - ^
Daminelli, Simone; Haupt, Joachim V. (26 Apr 2012). "Drug repositioning through incomplete bi-cliques in an integrated drug–target–disease network" (PDF). Integrative Biology (Camb.) (7). doi:10.1039/C2IB00154C. PMID 22538435.
{{cite journal}}
: Unknown parameter|coauthor=
ignored (|author=
suggested) (help)CS1 maint: extra punctuation (link)