Graph neural network: Difference between revisions
→Applications: ordinary caps (MOS:HEAD) |
Viktor Guer (talk | contribs) |
||
(46 intermediate revisions by 33 users not shown) | |||
Line 1: | Line 1: | ||
⚫ | |||
{{Short description|Class of artificial neural networks}} |
{{Short description|Class of artificial neural networks}} |
||
⚫ | |||
A '''graph neural network''' ('''GNN''') |
A '''graph neural network''' ('''GNN''') belongs to a class of [[artificial neural network]]s for processing data that can be represented as [[Graph (abstract data type)|graphs]].<ref name="wucuipeizhao2022" /><ref name="scarselli2009" /><ref name="micheli2009" /><ref name="sanchez2021" /><ref name="daigavane2021" /> |
||
[[File:GNN building blocks.png|thumb|Basic building blocks of a graph neural network (GNN). <math>(1)</math> Permutation equivariant layer. <math>(2)</math> Local pooling layer. <math>(3)</math> Global pooling (or readout) layer. Colors indicate [[Feature (machine learning)|features]].]] |
[[File:GNN building blocks.png|thumb|Basic building blocks of a graph neural network (GNN). <math>(1)</math> Permutation equivariant layer. <math>(2)</math> Local pooling layer. <math>(3)</math> Global pooling (or readout) layer. Colors indicate [[Feature (machine learning)|features]].]] |
||
In the more general subject of "geometric [[deep learning]]", certain existing neural network architectures can be interpreted as GNNs operating on suitably defined graphs.<ref name=bronstein2021 /> [[ |
In the more general subject of "geometric [[deep learning]]", certain existing neural network architectures can be interpreted as GNNs operating on suitably defined graphs.<ref name=bronstein2021 /> A [[convolutional neural network]] layer, in the context of [[computer vision]], can be considered a GNN applied to graphs whose nodes are [[pixel]]s and only adjacent pixels are connected by edges in the graph. A [[Transformer (machine learning model)|transformer]] layer, in [[natural language processing]], can be considered a GNN applied to [[complete graph]]s whose nodes are [[words]] or tokens in a passage of [[natural language]] text. |
||
The key design element of GNNs is the use of ''pairwise message passing'', such that graph nodes iteratively update their representations by exchanging information with their neighbors. |
The key design element of GNNs is the use of ''pairwise message passing'', such that graph nodes iteratively update their representations by exchanging information with their neighbors. Several GNN architectures have been proposed,<ref name="scarselli2009" /><ref name="micheli2009" /><ref name="kipf2016" /><ref name="hamilton2017" /><ref name="velickovic2018" /> which implement different flavors of message passing,<ref name=bronstein2021 /><ref name=hajij2022></ref> started by recursive<ref name="scarselli2009" /> or convolutional constructive<ref name="micheli2009" /> approaches. {{As of|2022}}, it is an open question whether it is possible to define GNN architectures "going beyond" message passing, or instead every GNN can be built on message passing over suitably defined graphs.<ref name=velickovic2022 /> |
||
Relevant application domains for GNNs include [[social |
Relevant application domains for GNNs include [[Natural Language Processing|natural language processing]],<ref name="wuchen2023" /> [[social networks]],<ref name="ying2018" /> [[Citation graph|citation networks]],<ref name="stanforddata" /> [[molecular biology]],<ref>{{cite journal |last1=Zhang |first1=Weihang |last2=Cui |first2=Yang |last3=Liu |first3=Bowen |last4=Loza |first4=Martin |last5=Park |first5=Sung-Joon |last6=Nakai |first6=Kenta |date=5 April 2024 |title=HyGAnno: Hybrid graph neural network-based cell type annotation for single-cell ATAC sequencing data |url=https://academic.oup.com/bib/article/25/3/bbae152/7641197 |journal=Briefings in Bioinformatics |volume=25 |issue=3 |pages=bbae152 |doi=10.1093/bib/bbae152|pmid=38581422 |pmc=10998639 }}</ref> chemistry,<ref name="gilmer2017" /><ref>{{Cite journal |last1=Coley |first1=Connor W. |last2=Jin |first2=Wengong |last3=Rogers |first3=Luke |last4=Jamison |first4=Timothy F. |last5=Jaakkola |first5=Tommi S. |last6=Green |first6=William H. |last7=Barzilay |first7=Regina |last8=Jensen |first8=Klavs F. |date=2019-01-02 |title=A graph-convolutional neural network model for the prediction of chemical reactivity |journal=Chemical Science |language=en |volume=10 |issue=2 |pages=370–377 |doi=10.1039/C8SC04228D |pmid=30746086 |pmc=6335848 |issn=2041-6539|doi-access=free }}</ref> [[physics]]<ref name=qasim2019 /> and [[NP-hard]] [[combinatorial optimization]] problems.<ref name="li2018" /> |
||
[[Open source]] [[Library (computing)|libraries]] implementing GNNs include PyTorch Geometric<ref name=fey2019 /> ([[PyTorch]]), TensorFlow GNN<ref name=tfgnn2022 /> ([[TensorFlow]]), Deep Graph Library<ref>{{Cite web |last= |title=Deep Graph Library (DGL) |url=https://www.dgl.ai/ |access-date=2024-09-12 |website=}}</ref> (framework agnostic), jraph<ref name=jraph2022/> ([[Google JAX]]), and GraphNeuralNetworks.jl<ref name=Lucibello2021GNN/>/GeometricFlux.jl<ref>{{Citation |title=FluxML/GeometricFlux.jl |date=2024-01-31 |url=https://github.com/FluxML/GeometricFlux.jl |access-date=2024-02-03 |publisher=FluxML}}</ref> ([[Julia (programming language)|Julia]], [[Flux (machine-learning framework)|Flux]]). |
|||
== Architecture == |
== Architecture == |
||
Line 18: | Line 18: | ||
The architecture of a generic GNN implements the following fundamental [[Layer (deep learning)|layers]]:<ref name=bronstein2021 /> |
The architecture of a generic GNN implements the following fundamental [[Layer (deep learning)|layers]]:<ref name=bronstein2021 /> |
||
# <em>Permutation equivariant</em>: a permutation equivariant layer [[Map (mathematics)|maps]] a representation of a graph into an updated representation of the same graph. In the literature, permutation equivariant layers are implemented via pairwise message passing between graph nodes.<ref name="bronstein2021" /><ref name="velickovic2022" /> Intuitively, in a message passing layer, nodes <em>update</em> their representations by <em>aggregating</em> the <em>messages</em> received from their immediate neighbours. As such, each message passing layer increases the receptive field of the GNN by one hop. |
# <em>Permutation equivariant</em>: a permutation equivariant layer [[Map (mathematics)|maps]] a representation of a graph into an updated representation of the same graph. In the literature, permutation equivariant layers are implemented via pairwise message passing between graph nodes.<ref name="bronstein2021" /><ref name="velickovic2022" /> Intuitively, in a message passing layer, nodes <em>update</em> their representations by <em>aggregating</em> the <em>messages</em> received from their immediate neighbours. As such, each message passing layer increases the receptive field of the GNN by one hop. |
||
# <em>Local pooling</em>: a local pooling layer coarsens the graph via [[Downsampling (signal processing)|downsampling]]. Local pooling is used to increase the receptive field of a GNN, in a similar fashion to pooling layers in [[convolutional neural network]]s. Examples include [[Nearest neighbor graph|k-nearest neighbours pooling]], top-k pooling,<ref name="gao2019" /> and self-attention pooling.<ref name="lee2019" /> |
# <em>Local pooling</em>: a local [[pooling layer]] coarsens the graph via [[Downsampling (signal processing)|downsampling]]. Local pooling is used to increase the receptive field of a GNN, in a similar fashion to pooling layers in [[convolutional neural network]]s. Examples include [[Nearest neighbor graph|k-nearest neighbours pooling]], top-k pooling,<ref name="gao2019" /> and self-attention pooling.<ref name="lee2019" /> |
||
# <em>Global pooling</em>: a global pooling layer, also known as ''readout'' layer, provides fixed-size representation of the whole graph. The global pooling layer must be permutation invariant, such that permutations in the ordering of graph nodes and edges do not alter the final output.<ref name="lui2022" /> Examples include element-wise sum, mean or maximum. |
# <em>Global pooling</em>: a global pooling layer, also known as ''readout'' layer, provides fixed-size representation of the whole graph. The global pooling layer must be permutation invariant, such that permutations in the ordering of graph nodes and edges do not alter the final output.<ref name="lui2022" /> Examples include element-wise sum, mean or maximum. |
||
It has been demonstrated that GNNs cannot be more expressive than the |
It has been demonstrated that GNNs cannot be more expressive than the [[Weisfeiler Leman graph isomorphism test|Weisfeiler–Leman Graph Isomorphism Test]].<ref name="douglas2011" /><ref name="xu2019" /> In practice, this means that there exist different graph structures (e.g., [[molecules]] with the same [[atoms]] but different [[Chemical bond|bonds]]) that cannot be distinguished by GNNs. More powerful GNNs operating on higher-dimension geometries such as [[simplicial complex]]es can be designed.<ref name=bronstein2021-2 /><ref name=grady2011discrete /><ref name=hajij2022></ref> {{As of|2022}}, whether or not future architectures will overcome the message passing primitive is an open research question.<ref name=velickovic2022 /> |
||
[[File:GNN representational limits.png|thumb|[[Graph isomorphism|Non-isomorphic]] graphs that cannot be distinguished by a GNN due to the limitations of the Weisfeiler-Lehman Graph Isomorphism Test. Colors indicate node [[Feature (machine learning)|features]].]] |
[[File:GNN representational limits.png|thumb|[[Graph isomorphism|Non-isomorphic]] graphs that cannot be distinguished by a GNN due to the limitations of the Weisfeiler-Lehman Graph Isomorphism Test. Colors indicate node [[Feature (machine learning)|features]].]] |
||
== Message passing layers == |
== Message passing layers == |
||
[[File:Message Passing Neural Network.png|thumb|Node representation update in a Message Passing Neural Network (MPNN) layer. Node <math>\mathbf{x}_0</math> receives messages sent by all of |
[[File:Message Passing Neural Network.png|thumb|Node representation update in a Message Passing Neural Network (MPNN) layer. Node <math>\mathbf{x}_0</math> receives messages sent by all of its immediate neighbours <math>\mathbf{x}_1</math> to <math>\mathbf{x}_4</math>. Messages are computing via the message function <math>\psi</math>, which accounts for the features of both senders and receiver.]] |
||
Message passing layers are permutation-equivariant layers mapping a graph into an updated representation of the same graph. Formally, they can be expressed as message passing neural networks (MPNNs).<ref name="bronstein2021" /> |
Message passing layers are permutation-equivariant layers mapping a graph into an updated representation of the same graph. Formally, they can be expressed as message passing neural networks (MPNNs).<ref name="bronstein2021" /> |
||
Line 34: | Line 34: | ||
:<math>\mathbf{h}_u = \phi \left( \mathbf{x}_u, \bigoplus_{v \in N_u} \psi(\mathbf{x}_u, \mathbf{x}_v, \mathbf{e}_{uv}) \right)</math> |
:<math>\mathbf{h}_u = \phi \left( \mathbf{x}_u, \bigoplus_{v \in N_u} \psi(\mathbf{x}_u, \mathbf{x}_v, \mathbf{e}_{uv}) \right)</math> |
||
where <math>\phi</math> and <math>\psi</math> are [[differentiable functions]] (e.g., [[artificial neural network]]s), and <math>\bigoplus</math> is a [[permutation]] [[Invariant (mathematics)|invariant]] aggregation operator that can accept an arbitrary number of inputs (e.g., element-wise sum, mean, or max). In particular, <math>\phi</math> and <math>\psi</math> are referred to as ''update'' and ''message'' functions, respectively. Intuitively, in an MPNN computational block, graph nodes ''update'' their representations by ''aggregating'' the ''messages'' received from their neighbours. |
where <math>\phi</math> and <math>\psi</math> are [[differentiable functions]] (e.g., [[artificial neural network]]s), and <math>\bigoplus</math> is a [[permutation]] [[Invariant (mathematics)|invariant]] [[aggregation operator]] that can accept an arbitrary number of inputs (e.g., element-wise sum, mean, or max). In particular, <math>\phi</math> and <math>\psi</math> are referred to as ''update'' and ''message'' functions, respectively. Intuitively, in an MPNN computational block, graph nodes ''update'' their representations by ''aggregating'' the ''messages'' received from their neighbours. |
||
The outputs of one or more MPNN layers are node representations <math>\mathbf{h}_u</math> for each node <math>u \in V</math> in the graph. Node representations can be employed for any downstream task, such as node/graph [[Statistical classification|classification]] or edge prediction. |
The outputs of one or more MPNN layers are node representations <math>\mathbf{h}_u</math> for each node <math>u \in V</math> in the graph. Node representations can be employed for any downstream task, such as node/graph [[Statistical classification|classification]] or edge prediction. |
||
Line 77: | Line 77: | ||
Normalized attention coefficients are computed as follows: |
Normalized attention coefficients are computed as follows: |
||
:<math>\alpha_{uv} = \frac{\exp(\text{LeakyReLU}\left(\mathbf{a}^T [\mathbf{W} \mathbf{ |
:<math>\alpha_{uv} = \frac{\exp(\text{LeakyReLU}\left(\mathbf{a}^T [\mathbf{W} \mathbf{x}_u \Vert \mathbf{W} \mathbf{x}_v \Vert \mathbf{e}_{uv}]\right))}{\sum_{z \in N_u}\exp(\text{LeakyReLU}\left(\mathbf{a}^T [\mathbf{W} \mathbf{x}_u \Vert \mathbf{W} \mathbf{x}_z \Vert \mathbf{e}_{uz}]\right))}</math> |
||
where <math>\mathbf{a}</math> is a vector of learnable weights, <math>\cdot^T</math> indicates [[Transpose|transposition]], and <math>\text{LeakyReLU}</math> is a [[Rectifier (neural networks)|modified ReLU]] activation function. Attention coefficients are normalized to make them easily comparable across different nodes.<ref name=velickovic2018 /> |
where <math>\mathbf{a}</math> is a vector of learnable weights, <math>\cdot^T</math> indicates [[Transpose|transposition]], <math>\mathbf{e}_{uv}</math> are the edge features (if present), and <math>\text{LeakyReLU}</math> is a [[Rectifier (neural networks)|modified ReLU]] activation function. Attention coefficients are normalized to make them easily comparable across different nodes.<ref name=velickovic2018 /> |
||
A GCN can be seen as a special case of a GAT where attention coefficients are not learnable, but fixed and equal to the edge weights <math>w_{uv}</math>. |
A GCN can be seen as a special case of a GAT where attention coefficients are not learnable, but fixed and equal to the edge weights <math>w_{uv}</math>. |
||
Line 95: | Line 95: | ||
== Local pooling layers == |
== Local pooling layers == |
||
Local pooling layers coarsen the graph via downsampling. We present here several learnable local pooling strategies that have been proposed.<ref name=lui2022 /> For each |
Local pooling layers coarsen the graph via downsampling. We present here several learnable local pooling strategies that have been proposed.<ref name=lui2022 /> For each case, the input is the initial graph is represented by a matrix <math>\mathbf{X}</math> of node features, and the graph adjacency matrix <math>\mathbf{A}</math>. The output is the new matrix <math>\mathbf{X}'</math>of node features, and the new graph adjacency matrix <math>\mathbf{A}'</math>. |
||
=== Top-k pooling === |
=== Top-k pooling === |
||
We first set |
We first set |
||
<math>\mathbf{y} = \frac{\mathbf{X}\mathbf{p}}{\Vert\mathbf{p}\Vert}</math> |
<math>\mathbf{y} = \frac{\mathbf{X}\mathbf{p}}{\Vert\mathbf{p}\Vert}</math> |
||
Line 117: | Line 117: | ||
where <math>\text{GNN}</math> is a generic permutation equivariant GNN layer (e.g., GCN, GAT, MPNN). |
where <math>\text{GNN}</math> is a generic permutation equivariant GNN layer (e.g., GCN, GAT, MPNN). |
||
The Self-attention pooling layer |
The Self-attention pooling layer<ref name="lee2019" /> can then be formalised as follows: |
||
:<math>\mathbf{X}' = (\mathbf{X} \odot \mathbf{y})_{\mathbf{i}}</math> |
:<math>\mathbf{X}' = (\mathbf{X} \odot \mathbf{y})_{\mathbf{i}}</math> |
||
Line 123: | Line 123: | ||
:<math>\mathbf{A}' = \mathbf{A}_{\mathbf{i}, \mathbf{i}}</math> |
:<math>\mathbf{A}' = \mathbf{A}_{\mathbf{i}, \mathbf{i}}</math> |
||
where <math>\mathbf{i} = \text{top}_k(\mathbf{y})</math> is the subset of nodes with the top-k highest projection scores, <math>\odot</math> denotes element-wise [[matrix multiplication]]. |
where <math>\mathbf{i} = \text{top}_k(\mathbf{y})</math> is the subset of nodes with the top-k highest projection scores, <math>\odot</math> denotes element-wise [[matrix multiplication]]. |
||
The self-attention pooling layer can be seen as an extension of the top-k pooling layer. Differently from top-k pooling, the self-attention scores computed in self-attention pooling account both for the graph features and the graph topology. |
The self-attention pooling layer can be seen as an extension of the top-k pooling layer. Differently from top-k pooling, the self-attention scores computed in self-attention pooling account both for the graph features and the graph topology. |
||
Line 139: | Line 139: | ||
=== Combinatorial optimization === |
=== Combinatorial optimization === |
||
{{See also|Combinatorial |
{{See also|Combinatorial optimization}} |
||
GNNs are used as fundamental building blocks for several combinatorial optimization algorithms.<ref name=cappart2021 /> Examples include computing [[Shortest path problem|shortest paths]] or [[Eulerian path|Eulerian circuits]] for a given graph,<ref name=li2016/> deriving [[Placement (electronic design automation)|chip placements]] superior or competitive to handcrafted human solutions,<ref name=mirhoseini2021 /> and improving expert-designed branching rules in [[ |
GNNs are used as fundamental building blocks for several combinatorial optimization algorithms.<ref name=cappart2021 /> Examples include computing [[Shortest path problem|shortest paths]] or [[Eulerian path|Eulerian circuits]] for a given graph,<ref name=li2016/> deriving [[Placement (electronic design automation)|chip placements]] superior or competitive to handcrafted human solutions,<ref name=mirhoseini2021 /> and improving expert-designed branching rules in [[branch and bound]].<ref name=gasse2019 /> |
||
=== Cyber security === |
=== Cyber security === |
||
{{See also|Intrusion detection system}} |
{{See also|Intrusion detection system}} |
||
When viewed as a graph, a network of computers can be analyzed with GNNs for anomaly detection. Anomalies within provenance graphs often correlate to malicious activity within the network. GNNs have been used to identify these anomalies on individual nodes<ref>{{Cite journal |last1=Wang |first1=Su |last2=Wang |first2=Zhiliang |last3=Zhou |first3=Tao |last4=Sun |first4=Hongbin |last5=Yin |first5=Xia |last6=Han |first6=Dongqi |last7=Zhang |first7=Han |last8=Shi |first8=Xingang |last9=Yang |first9=Jiahai |date=2022 |title=Threatrace: Detecting and Tracing Host-Based Threats in Node Level Through Provenance Graph Learning |url=https://ieeexplore.ieee.org/document/9899459/;jsessionid=NzAXdLahhjEX-xmrFzOROk4qxoaz40aJFvKcZRgjck8-zCOucJi7!380715771 |journal=IEEE Transactions on Information Forensics and Security |volume=17 |pages=3972–3987 |doi=10.1109/TIFS.2022.3208815 |issn=1556-6021|arxiv=2111.04333 |s2cid=243847506 }}</ref> and within paths<ref>{{Cite journal |last1=Wang |first1=Qi |last2=Hassan |first2=Wajih Ul |last3=Li |first3=Ding |last4=Jee |first4=Kangkook |last5=Yu |first5=Xiao |date=2020 |title=You Are What You Do: Hunting Stealthy Malware via Data Provenance Analysis. |journal=Network and Distributed Systems Security (NDSS) Symposium|doi=10.14722/ndss.2020.24167 |isbn=978-1-891562-61-7 |s2cid=211267791 }}</ref> to detect malicious processes, or on the edge level<ref>{{Cite journal |last1=King |first1=Isaiah J. |last2=Huang |first2=H. Howie |date=2022 |title=Euler: Detecting Network Lateral Movement via Scalable Temporal Link Prediction |url=https://www.ndss-symposium.org/wp-content/uploads/2022-107A-paper.pdf |journal=In Proceedings of the 29th Network and Distributed Systems Security Symposium (NDSS)|doi=10.14722/ndss.2022.24107 |s2cid=248221601 }}</ref> to detect [[Network Lateral Movement|lateral movement]]. |
When viewed as a graph, a network of computers can be analyzed with GNNs for anomaly detection. Anomalies within provenance graphs often correlate to malicious activity within the network. GNNs have been used to identify these anomalies on individual nodes<ref>{{Cite journal |last1=Wang |first1=Su |last2=Wang |first2=Zhiliang |last3=Zhou |first3=Tao |last4=Sun |first4=Hongbin |last5=Yin |first5=Xia |last6=Han |first6=Dongqi |last7=Zhang |first7=Han |last8=Shi |first8=Xingang |last9=Yang |first9=Jiahai |date=2022 |title=Threatrace: Detecting and Tracing Host-Based Threats in Node Level Through Provenance Graph Learning |url=https://ieeexplore.ieee.org/document/9899459/;jsessionid=NzAXdLahhjEX-xmrFzOROk4qxoaz40aJFvKcZRgjck8-zCOucJi7!380715771 |journal=IEEE Transactions on Information Forensics and Security |volume=17 |pages=3972–3987 |doi=10.1109/TIFS.2022.3208815 |issn=1556-6021|arxiv=2111.04333 |s2cid=243847506 }}</ref> and within paths<ref>{{Cite journal |last1=Wang |first1=Qi |last2=Hassan |first2=Wajih Ul |last3=Li |first3=Ding |last4=Jee |first4=Kangkook |last5=Yu |first5=Xiao |date=2020 |title=You Are What You Do: Hunting Stealthy Malware via Data Provenance Analysis. |journal=Network and Distributed Systems Security (NDSS) Symposium|doi=10.14722/ndss.2020.24167 |isbn=978-1-891562-61-7 |s2cid=211267791 |doi-access=free }}</ref> to detect malicious processes, or on the edge level<ref>{{Cite journal |last1=King |first1=Isaiah J. |last2=Huang |first2=H. Howie |date=2022 |title=Euler: Detecting Network Lateral Movement via Scalable Temporal Link Prediction |url=https://www.ndss-symposium.org/wp-content/uploads/2022-107A-paper.pdf |journal=In Proceedings of the 29th Network and Distributed Systems Security Symposium (NDSS)|doi=10.14722/ndss.2022.24107 |s2cid=248221601 }}</ref> to detect [[Network Lateral Movement|lateral movement]]. |
||
=== Water distribution networks === |
|||
{{See also|Water distribution system}} |
|||
Water distribution systems can be modelled as graphs, being then a straightforward application of GNN. This kind of algorithm has been applied to water demand forecasting,<ref>{{cite journal |url=https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2022WR032299|title=Graph Convolutional Recurrent Neural Networks for Water Demand Forecasting|last=Zanfei|first=Ariele |display-authors=etal |date=2022|journal=Water Resources Research|volume=58 |issue=7 |publisher=AGU|doi=10.1029/2022WR032299 |bibcode=2022WRR....5832299Z |access-date=June 11, 2024}}</ref> interconnecting District Measuring Areas to improve the forecasting capacity. Other application of this algorithm on water distribution modelling is the development of metamodels.<ref>{{cite journal |url=https://www.sciencedirect.com/science/article/abs/pii/S0043135423007005|title=Shall we always use hydraulic models? A graph neural network metamodel for water system calibration and uncertainty assessment|last=Zanfei|first=Ariele |journal=Water Research |display-authors=etal |date=2023|volume=242 |doi=10.1016/j.watres.2023.120264 |pmid=37393807 |bibcode=2023WatRe.24220264Z |access-date=June 11, 2024}}</ref> |
|||
==References== |
==References== |
||
{{Reflist|refs= |
{{Reflist|refs= |
||
<ref name="wuchen2023">{{Cite journal|last1=Wu|first1=Lingfei|last2=Chen|first2=Yu|last3=Shen |first3=Kai|last4=Guo|first4=Xiaojie|last5=Gao|first5=Hanning|last6=Li|first6=Shucheng|last7=Pei|first7=Jian|last8=Long|first8=Bo|date=2023|title=Graph Neural Networks for Natural Language Processing: A Survey |
|||
|url=https://www.nowpublishers.com/article/Details/MAL-096|journal=Foundations and Trends in Machine Learning|volume=16|issue=2|pages=119–328|doi=10.1561/2200000096 |pmid=19068426|s2cid=206756462|issn=1941-0093|arxiv=2106.06090}}</ref> |
|||
<ref name="wucuipeizhao2022">{{Cite journal|last1=Wu|first1=Lingfei|last2=Cui|first2=Peng|last3=Pei |first3=Jian|last4=Zhao|first4=Liang|date=2022|title=Graph Neural Networks: Foundations, Frontiers, and Applications|url=https://graph-neural-networks.github.io/|journal=Springer Singapore|pages=725|url-access=<!--WP:URLACCESS-->}}</ref> |
|||
<ref name="scarselli2009">{{Cite journal|last1=Scarselli|first1=Franco|last2=Gori|first2=Marco|last3=Tsoi |first3=Ah Chung|last4=Hagenbuchner|first4=Markus|last5=Monfardini|first5=Gabriele|date=2009|title=The Graph Neural Network Model|url=https://ieeexplore.ieee.org/document/4700287|journal=IEEE Transactions on Neural Networks|volume=20|issue=1|pages=61–80|doi=10.1109/TNN.2008.2005605 |pmid=19068426|s2cid=206756462|issn=1941-0093}}</ref> |
<ref name="scarselli2009">{{Cite journal|last1=Scarselli|first1=Franco|last2=Gori|first2=Marco|last3=Tsoi |first3=Ah Chung|last4=Hagenbuchner|first4=Markus|last5=Monfardini|first5=Gabriele|date=2009|title=The Graph Neural Network Model|url=https://ieeexplore.ieee.org/document/4700287|journal=IEEE Transactions on Neural Networks|volume=20|issue=1|pages=61–80|doi=10.1109/TNN.2008.2005605 |pmid=19068426|s2cid=206756462|issn=1941-0093}}</ref> |
||
<ref name="micheli2009">{{Cite journal|last1=Micheli|first1=Alessio|title=Neural Network for Graphs: A Contextual Constructive Approach|url=https://ieeexplore.ieee.org/document/4700287|journal=IEEE Transactions on Neural Networks|volume=20|issue=3|pages= |
<ref name="micheli2009">{{Cite journal|last1=Micheli|first1=Alessio|title=Neural Network for Graphs: A Contextual Constructive Approach|url=https://ieeexplore.ieee.org/document/4700287|journal=IEEE Transactions on Neural Networks|year=2009 |volume=20|issue=3|pages=498–511|doi=10.1109/TNN.2008.2010350 |pmid=19193509|s2cid=17486263|issn=1045-9227}}</ref> |
||
<ref name="sanchez2021">{{Cite journal|last1=Sanchez-Lengeling|first1=Benjamin|last2=Reif|first2=Emily |last3=Pearce|first3=Adam|last4=Wiltschko|first4=Alex|date=2021-09-02|title=A Gentle Introduction to Graph Neural Networks|url=https://distill.pub/2021/gnn-intro|journal=Distill|volume=6|issue=9|pages=e33 |doi=10.23915/distill.00033|issn=2476-0757|doi-access=free}}</ref> |
<ref name="sanchez2021">{{Cite journal|last1=Sanchez-Lengeling|first1=Benjamin|last2=Reif|first2=Emily |last3=Pearce|first3=Adam|last4=Wiltschko|first4=Alex|date=2021-09-02|title=A Gentle Introduction to Graph Neural Networks|url=https://distill.pub/2021/gnn-intro|journal=Distill|volume=6|issue=9|pages=e33 |doi=10.23915/distill.00033|issn=2476-0757|doi-access=free}}</ref> |
||
<ref name="daigavane2021">{{Cite journal|last1=Daigavane|first1=Ameya|last2=Ravindran|first2=Balaraman |last3=Aggarwal|first3=Gaurav|date=2021-09-02|title=Understanding Convolutions on Graphs |url=https://distill.pub/2021/understanding-gnns|journal=Distill|volume=6|issue=9|pages=e32 |doi=10.23915/distill.00032|s2cid=239678898|issn=2476-0757|doi-access=free}}</ref> |
<ref name="daigavane2021">{{Cite journal|last1=Daigavane|first1=Ameya|last2=Ravindran|first2=Balaraman |last3=Aggarwal|first3=Gaurav|date=2021-09-02|title=Understanding Convolutions on Graphs |url=https://distill.pub/2021/understanding-gnns|journal=Distill|volume=6|issue=9|pages=e32 |doi=10.23915/distill.00032|s2cid=239678898|issn=2476-0757|doi-access=free}}</ref> |
||
Line 157: | Line 165: | ||
<ref name="velickovic2018">{{Cite arXiv|last1=Veličković|first1=Petar|last2=Cucurull|first2=Guillem |last3=Casanova|first3=Arantxa|last4=Romero|first4=Adriana|last5=Liò|first5=Pietro|last6=Bengio |first6=Yoshua|date=2018-02-04 |title=Graph Attention Networks|eprint=1710.10903 |class=stat.ML}}</ref> |
<ref name="velickovic2018">{{Cite arXiv|last1=Veličković|first1=Petar|last2=Cucurull|first2=Guillem |last3=Casanova|first3=Arantxa|last4=Romero|first4=Adriana|last5=Liò|first5=Pietro|last6=Bengio |first6=Yoshua|date=2018-02-04 |title=Graph Attention Networks|eprint=1710.10903 |class=stat.ML}}</ref> |
||
<ref name=stanforddata>{{Cite web|title=Stanford Large Network Dataset Collection |url=https://snap.stanford.edu/data/|access-date=2021-07-05|website=snap.stanford.edu}}</ref> |
<ref name=stanforddata>{{Cite web|title=Stanford Large Network Dataset Collection |url=https://snap.stanford.edu/data/|access-date=2021-07-05|website=snap.stanford.edu}}</ref> |
||
<ref name="li2018">{{cite journal |last1=Li |first1=Zhuwen |last2=Chen |first2=Qifeng |last3=Koltun |first3=Vladlen |title=Combinatorial optimization with graph convolutional networks and guided tree search |journal=Neural Information Processing Systems |date=2018 |volume=31 |pages=537–546 |arxiv=1810.10659}}</ref> |
<ref name="li2018">{{cite journal |last1=Li |first1=Zhuwen |last2=Chen |first2=Qifeng |last3=Koltun |first3=Vladlen |title=Combinatorial optimization with graph convolutional networks and guided tree search |journal=Neural Information Processing Systems |date=2018 |volume=31 |pages=537–546 |doi=10.1007/978-3-030-04221-9_48 |arxiv=1810.10659}}</ref> |
||
<ref name="bronstein2021">{{cite arXiv |last1=Bronstein |first1=Michael M. |last2=Bruna |first2=Joan |last3=Cohen |first3=Taco |last4=Veličković |first4=Petar |title=Geometric Deep Learning: Grids, Groups, Graphs Geodesics and Gauges |date=May 4, 2021 |class=cs.LG |eprint=2104.13478}}</ref> |
<ref name="bronstein2021">{{cite arXiv |last1=Bronstein |first1=Michael M. |last2=Bruna |first2=Joan |last3=Cohen |first3=Taco |last4=Veličković |first4=Petar |title=Geometric Deep Learning: Grids, Groups, Graphs Geodesics and Gauges |date=May 4, 2021 |class=cs.LG |eprint=2104.13478}}</ref> |
||
<ref name=douglas2011>{{cite arXiv|last=Douglas|first=B. L.|date=2011-01-27|title=The Weisfeiler–Lehman Method and Graph Isomorphism Testing|class=math.CO|eprint=1101.5211}}</ref> |
<ref name=douglas2011>{{cite arXiv|last=Douglas|first=B. L.|date=2011-01-27|title=The Weisfeiler–Lehman Method and Graph Isomorphism Testing|class=math.CO|eprint=1101.5211}}</ref> |
||
<ref name=xu2019>{{Cite arXiv|last1=Xu|first1=Keyulu|last2=Hu|first2=Weihua|last3=Leskovec|first3=Jure |last4=Jegelka|first4=Stefanie|date=2019-02-22|title=How Powerful are Graph Neural Networks? |eprint=1810.00826 |class=cs.LG}}</ref> |
<ref name=xu2019>{{Cite arXiv|last1=Xu|first1=Keyulu|last2=Hu|first2=Weihua|last3=Leskovec|first3=Jure |last4=Jegelka|first4=Stefanie|author4-link=Stefanie Jegelka|date=2019-02-22|title=How Powerful are Graph Neural Networks? |eprint=1810.00826 |class=cs.LG}}</ref> |
||
<ref name=velickovic2022>{{cite arXiv |last1=Veličković |first1=Petar |title=Message passing all the way up |year=2022 |class=cs.LG |eprint=2202.11097}}</ref> |
<ref name=velickovic2022>{{cite arXiv |last1=Veličković |first1=Petar |title=Message passing all the way up |year=2022 |class=cs.LG |eprint=2202.11097}}</ref> |
||
<ref name=qasim2019>{{cite journal |last1=Qasim |first1=Shah Rukh |last2=Kieseler |first2=Jan |last3=Iiyama |first3=Yutaro |last4=Pierini |first4=Maurizio Pierini |title=Learning representations of irregular particle-detector geometry with distance-weighted graph networks |journal=The European Physical Journal C |date=2019 |volume=79 |issue=7 |doi=10.1140/epjc/s10052-019-7113-9|s2cid=88518244 |doi-access=free }}</ref> |
<ref name=qasim2019>{{cite journal |last1=Qasim |first1=Shah Rukh |last2=Kieseler |first2=Jan |last3=Iiyama |first3=Yutaro |last4=Pierini |first4=Maurizio Pierini |title=Learning representations of irregular particle-detector geometry with distance-weighted graph networks |journal=The European Physical Journal C |date=2019 |volume=79 |issue=7 |page=608 |doi=10.1140/epjc/s10052-019-7113-9|s2cid=88518244 |doi-access=free |arxiv=1902.07987 |bibcode=2019EPJC...79..608Q }}</ref> |
||
<ref name=lui2022>{{cite arXiv |last1=Liu |first1=Chuang |last2=Zhan |first2=Yibing |last3=Li |first3=Chang |last4=Du |first4=Bo |last5=Wu |first5=Jia |last6=Hu |first6=Wenbin |last7=Liu |first7=Tongliang |last8=Tao |first8=Dacheng |title=Graph Pooling for Graph Neural Networks: Progress, Challenges, and Opportunities |date=2022 |class=cs.LG |eprint=2204.07321}}</ref> |
<ref name=lui2022>{{cite arXiv |last1=Liu |first1=Chuang |last2=Zhan |first2=Yibing |last3=Li |first3=Chang |last4=Du |first4=Bo |last5=Wu |first5=Jia |last6=Hu |first6=Wenbin |last7=Liu |first7=Tongliang |last8=Tao |first8=Dacheng |title=Graph Pooling for Graph Neural Networks: Progress, Challenges, and Opportunities |date=2022 |class=cs.LG |eprint=2204.07321}}</ref> |
||
<ref name=hajij2022>{{cite arXiv |last1=Hajij |first1=M. |last2=Zamzmi |first2=G. |last3=Papamarkou |first3=T. |last4=Miolane |first4=N. |last5=Guzmán-Sáenz |first5=A. |last6=Ramamurthy |first6=K. N. |last7=Schaub |first7=M. T. |title=Topological deep learning: Going beyond graph data |date=2022 |class=cs.LG |eprint=2206.00606}}</ref> |
|||
<ref name=bronstein2021-2>{{cite arXiv |last1=Bodnar |first1=Christian |last2=Frasca |first2=Fabrizio |last3=Guang Wang |first3=Yu |last4=Otter |first4=Nina |last5=Montúfar |first5=Guido |last6=Liò |first6=Pietro |last7=Bronstein |first7=Michael |title=Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks |date=2021 |class=cs.LG |eprint=2103.03212}}</ref> |
<ref name=bronstein2021-2>{{cite arXiv |last1=Bodnar |first1=Christian |last2=Frasca |first2=Fabrizio |last3=Guang Wang |first3=Yu |last4=Otter |first4=Nina |last5=Montúfar |first5=Guido |last6=Liò |first6=Pietro |last7=Bronstein |first7=Michael |title=Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks |date=2021 |class=cs.LG |eprint=2103.03212}}</ref> |
||
<ref name=gao2019>{{cite arXiv |last1=Gao |first1=Hongyang |last2=Ji |first2=Shuiwang Ji |title=Graph U-Nets |date=2019 |class=cs.LG |eprint=1905.05178}}</ref> |
<ref name=gao2019>{{cite arXiv |last1=Gao |first1=Hongyang |last2=Ji |first2=Shuiwang Ji |title=Graph U-Nets |date=2019 |class=cs.LG |eprint=1905.05178}}</ref> |
||
<ref name=lee2019>{{cite arXiv |last1=Lee |first1=Junhyun |last2=Lee |first2=Inyeop |last3=Kang |first3=Jaewoo |title=Self-Attention Graph Pooling |date=2019 |class=cs.LG |eprint=1904.08082}}</ref> |
<ref name=lee2019>{{cite arXiv |last1=Lee |first1=Junhyun |last2=Lee |first2=Inyeop |last3=Kang |first3=Jaewoo |title=Self-Attention Graph Pooling |date=2019 |class=cs.LG |eprint=1904.08082}}</ref> |
||
<ref name=alon2021>{{cite arXiv |last1=Alon |first1=Uri |last2=Yahav |first2=Eran |title=On the Bottleneck of Graph Neural Networks and its Practical Implications |date=2021 |class=cs.LG |eprint=2006.05205}}</ref> |
<ref name=alon2021>{{cite arXiv |last1=Alon |first1=Uri |last2=Yahav |first2=Eran |title=On the Bottleneck of Graph Neural Networks and its Practical Implications |date=2021 |class=cs.LG |eprint=2006.05205}}</ref> |
||
<ref name=chen2021>{{cite journal |last1=Chen |first1=Deli |last2=Lin |first2=Yankai |last3=Li |first3=Wei |last4=Li |first4=Peng |last5=Zhou |first5=Jie |last6=Sun |first6=Xu |title=Measuring and Relieving the Over- |
<ref name=chen2021>{{cite journal |last1=Chen |first1=Deli |last2=Lin |first2=Yankai |last3=Li |first3=Wei |last4=Li |first4=Peng |last5=Zhou |first5=Jie |last6=Sun |first6=Xu |title=Measuring and Relieving the Over-Smoothing Problem for Graph Neural Networks from the Topological View |journal=Proceedings of the AAAI Conference on Artificial Intelligence |date=2020 |volume=34 |issue=4 |pages=3438–3445 |doi=10.1609/aaai.v34i04.5747 |s2cid=202539008 |arxiv=1909.03211}}</ref> |
||
<ref name=guardian2018>{{cite news|last1=Sample|first1=Ian|date=2 December 2018|title=Google's DeepMind predicts 3D shapes of proteins|work=The Guardian|url=https://www.theguardian.com/science/2018/dec/02/google-deepminds-ai-program-alphafold-predicts-3d-shapes-of-proteins|access-date=30 November 2020}}</ref> |
<ref name=guardian2018>{{cite news|last1=Sample|first1=Ian|date=2 December 2018|title=Google's DeepMind predicts 3D shapes of proteins|work=The Guardian|url=https://www.theguardian.com/science/2018/dec/02/google-deepminds-ai-program-alphafold-predicts-3d-shapes-of-proteins|access-date=30 November 2020}}</ref> |
||
<ref name=mit2020>{{cite web |title=DeepMind's protein-folding AI has solved a 50-year-old grand challenge of biology |url=https://www.technologyreview.com/2020/11/30/1012712/deepmind-protein-folding-ai-solved-biology-science-drugs-disease/ |website=MIT Technology Review |access-date=30 November 2020 |language=en}}</ref> |
<ref name=mit2020>{{cite web |title=DeepMind's protein-folding AI has solved a 50-year-old grand challenge of biology |url=https://www.technologyreview.com/2020/11/30/1012712/deepmind-protein-folding-ai-solved-biology-science-drugs-disease/ |website=MIT Technology Review |access-date=30 November 2020 |language=en}}</ref> |
||
Line 175: | Line 184: | ||
<ref name=gasse2019>{{cite arXiv |last1=Gasse |first1=Maxime |last2=Chételat |first2=Didier |last3=Ferroni |first3=Nicola |last4=Charlin |first4=Laurent |last5=Lodi |first5=Andrea |title=Exact Combinatorial Optimization with Graph Convolutional Neural Networks |date=2019 |class=cs.LG |eprint=1906.01629}}</ref> |
<ref name=gasse2019>{{cite arXiv |last1=Gasse |first1=Maxime |last2=Chételat |first2=Didier |last3=Ferroni |first3=Nicola |last4=Charlin |first4=Laurent |last5=Lodi |first5=Andrea |title=Exact Combinatorial Optimization with Graph Convolutional Neural Networks |date=2019 |class=cs.LG |eprint=1906.01629}}</ref> |
||
<ref name=cappart2021>{{cite arXiv |last1=Cappart |first1=Quentin |last2=Chételat |first2=Didier |last3=Khalil |first3=Elias |last4=Lodi |first4=Andrea |last5=Morris |first5=Christopher |last6=Veličković |first6=Petar |title=Combinatorial optimization and reasoning with graph neural networks |year=2021 |class=cs.LG |eprint=2102.09544}}</ref> |
<ref name=cappart2021>{{cite arXiv |last1=Cappart |first1=Quentin |last2=Chételat |first2=Didier |last3=Khalil |first3=Elias |last4=Lodi |first4=Andrea |last5=Morris |first5=Christopher |last6=Veličković |first6=Petar |title=Combinatorial optimization and reasoning with graph neural networks |year=2021 |class=cs.LG |eprint=2102.09544}}</ref> |
||
<ref name=mirhoseini2021>{{cite journal |last1=Mirhoseini |first1=Azalia |last2=Goldie |first2=Anna |last3=Yazgan |first3=Mustafa |last4=Jiang |first4=Joe Wenjie |last5=Songhori |first5=Ebrahim |last6=Wang |first6=Shen |last7=Lee |first7=Young-Joon |last8=Johnson |first8=Eric |last9=Pathak |first9=Omkar |last10=Nazi |first10=Azade |last11=Pak |first11=Jiwoo |last12=Tong |first12=Andy |last13=Srinivasa |first13=Kavya |last14=Hang |first14=William |last15=Tuncer |first15=Emre |last16=Le |first16=Quoc V. |last17=Laudon |first17=James |last18=Ho |first18=Richard |last19=Carpenter |first19=Roger |last20=Dean |first20=Jeff |title=A graph placement methodology for fast chip design |journal=Nature |date=2021 |volume=594 |issue=7862 |pages=207–212 |doi=10.1038/s41586-021-03544-w|pmid=34108699 |s2cid=235395490 }}</ref> |
<ref name=mirhoseini2021>{{cite journal |last1=Mirhoseini |first1=Azalia |last2=Goldie |first2=Anna |last3=Yazgan |first3=Mustafa |last4=Jiang |first4=Joe Wenjie |last5=Songhori |first5=Ebrahim |last6=Wang |first6=Shen |last7=Lee |first7=Young-Joon |last8=Johnson |first8=Eric |last9=Pathak |first9=Omkar |last10=Nazi |first10=Azade |last11=Pak |first11=Jiwoo |last12=Tong |first12=Andy |last13=Srinivasa |first13=Kavya |last14=Hang |first14=William |last15=Tuncer |first15=Emre |last16=Le |first16=Quoc V. |last17=Laudon |first17=James |last18=Ho |first18=Richard |last19=Carpenter |first19=Roger |last20=Dean |first20=Jeff |title=A graph placement methodology for fast chip design |journal=Nature |date=2021 |volume=594 |issue=7862 |pages=207–212 |doi=10.1038/s41586-021-03544-w|pmid=34108699 |bibcode=2021Natur.594..207M |s2cid=235395490 }}</ref> |
||
<ref name=fey2019>{{cite arXiv |last1=Matthias |first1=Fey |last2=Lenssen |first2=Jan E. |title=Fast Graph Representation Learning with PyTorch Geometric |date=2019 |class=cs.LG |eprint=1903.02428}}</ref> |
<ref name=fey2019>{{cite arXiv |last1=Matthias |first1=Fey |last2=Lenssen |first2=Jan E. |title=Fast Graph Representation Learning with PyTorch Geometric |date=2019 |class=cs.LG |eprint=1903.02428}}</ref> |
||
<ref name=tfgnn2022>{{cite web |title=Tensorflow GNN |website=[[GitHub]] |url=https://github.com/tensorflow/gnn |access-date=30 June 2022}}</ref> |
<ref name=tfgnn2022>{{cite web |title=Tensorflow GNN |website=[[GitHub]] |url=https://github.com/tensorflow/gnn |access-date=30 June 2022}}</ref> |
||
<ref name=jraph2022>{{cite web |title=jraph |website=[[GitHub]] |url=https://github.com/deepmind/jraph |access-date=30 June 2022}}</ref> |
<ref name=jraph2022>{{cite web |title=jraph |website=[[GitHub]] |url=https://github.com/deepmind/jraph |access-date=30 June 2022}}</ref> |
||
<ref name=xu2021>{{cite arXiv |last1=Xu |first1=Keyulu |last2=Zhang |first2=Mozhi |last3=Jegelka |first3=Stephanie |last4=Kawaguchi |first4=Kenji |title=Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth |date=2021 |class=cs.LG |eprint=2105.04550}}</ref> |
<ref name=xu2021>{{cite arXiv |last1=Xu |first1=Keyulu |last2=Zhang |first2=Mozhi |last3=Jegelka |first3=Stephanie|author3-link=Stefanie Jegelka |last4=Kawaguchi |first4=Kenji |title=Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth |date=2021 |class=cs.LG |eprint=2105.04550}}</ref> |
||
<ref name=li2016>{{cite arXiv |last1=Li |first1=Yujia |last2=Tarlow |first2=Daniel |last3=Brockschmidt |first3=Mark |last4=Zemel |first4=Richard |title=Gated Graph Sequence Neural Networks |date=2016 |class=cs.LG |eprint=1511.05493 }}</ref> |
<ref name=li2016>{{cite arXiv |last1=Li |first1=Yujia |last2=Tarlow |first2=Daniel |last3=Brockschmidt |first3=Mark |last4=Zemel |first4=Richard |title=Gated Graph Sequence Neural Networks |date=2016 |class=cs.LG |eprint=1511.05493 }}</ref> |
||
<ref name= |
<ref name=grady2011discrete>{{cite book |last1=Grady |first1=Leo |last2=Polimeni |first2=Jonathan |title=Discrete Calculus: Applied Analysis on Graphs for Computational Science |url=http://leogrady.net/wp-content/uploads/2017/01/grady2010discrete.pdf |date=2011 |publisher=Springer }}</ref> |
||
<ref name=xu2018>{{cite arXiv |last1=Xu |first1=Keyulu |last2=Li |first2=Chengtao |last3=Tian |first3=Yonglong |last4=Sonobe |first4=Tomohiro |last5=Kawarabayashi |first5=Ken-ichi |last6=Jegelka |first6=Stefanie|author6-link=Stefanie Jegelka |title=Representation Learning on Graphs with Jumping Knowledge Networks |date=2018 |class=cs.LG |eprint=1806.03536}}</ref> |
|||
<ref name=Lucibello2021GNN>{{cite web |last=Lucibello |first=Carlo |title=GraphNeuralNetworks.jl |website=[[GitHub]] |url=https://github.com/CarloLucibello/GraphNeuralNetworks.jl |year=2021 |access-date=2023-09-21}}</ref> |
|||
}} |
}} |
||
== External |
== External links == |
||
https://distill.pub/2021/gnn-intro/ |
* https://distill.pub/2021/gnn-intro/ |
||
{{Artificial intelligence (AI)}} |
|||
[[Category:Semisupervised learning]] |
[[Category:Semisupervised learning]] |
||
[[Category:Supervised learning]] |
[[Category:Supervised learning]] |
Latest revision as of 22:20, 14 November 2024
Part of a series on |
Machine learning and data mining |
---|
A graph neural network (GNN) belongs to a class of artificial neural networks for processing data that can be represented as graphs.[1][2][3][4][5]
In the more general subject of "geometric deep learning", certain existing neural network architectures can be interpreted as GNNs operating on suitably defined graphs.[6] A convolutional neural network layer, in the context of computer vision, can be considered a GNN applied to graphs whose nodes are pixels and only adjacent pixels are connected by edges in the graph. A transformer layer, in natural language processing, can be considered a GNN applied to complete graphs whose nodes are words or tokens in a passage of natural language text.
The key design element of GNNs is the use of pairwise message passing, such that graph nodes iteratively update their representations by exchanging information with their neighbors. Several GNN architectures have been proposed,[2][3][7][8][9] which implement different flavors of message passing,[6][10] started by recursive[2] or convolutional constructive[3] approaches. As of 2022[update], it is an open question whether it is possible to define GNN architectures "going beyond" message passing, or instead every GNN can be built on message passing over suitably defined graphs.[11]
Relevant application domains for GNNs include natural language processing,[12] social networks,[13] citation networks,[14] molecular biology,[15] chemistry,[16][17] physics[18] and NP-hard combinatorial optimization problems.[19]
Open source libraries implementing GNNs include PyTorch Geometric[20] (PyTorch), TensorFlow GNN[21] (TensorFlow), Deep Graph Library[22] (framework agnostic), jraph[23] (Google JAX), and GraphNeuralNetworks.jl[24]/GeometricFlux.jl[25] (Julia, Flux).
Architecture
[edit]The architecture of a generic GNN implements the following fundamental layers:[6]
- Permutation equivariant: a permutation equivariant layer maps a representation of a graph into an updated representation of the same graph. In the literature, permutation equivariant layers are implemented via pairwise message passing between graph nodes.[6][11] Intuitively, in a message passing layer, nodes update their representations by aggregating the messages received from their immediate neighbours. As such, each message passing layer increases the receptive field of the GNN by one hop.
- Local pooling: a local pooling layer coarsens the graph via downsampling. Local pooling is used to increase the receptive field of a GNN, in a similar fashion to pooling layers in convolutional neural networks. Examples include k-nearest neighbours pooling, top-k pooling,[26] and self-attention pooling.[27]
- Global pooling: a global pooling layer, also known as readout layer, provides fixed-size representation of the whole graph. The global pooling layer must be permutation invariant, such that permutations in the ordering of graph nodes and edges do not alter the final output.[28] Examples include element-wise sum, mean or maximum.
It has been demonstrated that GNNs cannot be more expressive than the Weisfeiler–Leman Graph Isomorphism Test.[29][30] In practice, this means that there exist different graph structures (e.g., molecules with the same atoms but different bonds) that cannot be distinguished by GNNs. More powerful GNNs operating on higher-dimension geometries such as simplicial complexes can be designed.[31][32][10] As of 2022[update], whether or not future architectures will overcome the message passing primitive is an open research question.[11]
Message passing layers
[edit]Message passing layers are permutation-equivariant layers mapping a graph into an updated representation of the same graph. Formally, they can be expressed as message passing neural networks (MPNNs).[6]
Let be a graph, where is the node set and is the edge set. Let be the neighbourhood of some node . Additionally, let be the features of node , and be the features of edge . An MPNN layer can be expressed as follows:[6]
where and are differentiable functions (e.g., artificial neural networks), and is a permutation invariant aggregation operator that can accept an arbitrary number of inputs (e.g., element-wise sum, mean, or max). In particular, and are referred to as update and message functions, respectively. Intuitively, in an MPNN computational block, graph nodes update their representations by aggregating the messages received from their neighbours.
The outputs of one or more MPNN layers are node representations for each node in the graph. Node representations can be employed for any downstream task, such as node/graph classification or edge prediction.
Graph nodes in an MPNN update their representation aggregating information from their immediate neighbours. As such, stacking MPNN layers means that one node will be able to communicate with nodes that are at most "hops" away. In principle, to ensure that every node receives information from every other node, one would need to stack a number of MPNN layers equal to the graph diameter. However, stacking many MPNN layers may cause issues such as oversmoothing[33] and oversquashing.[34] Oversmoothing refers to the issue of node representations becoming indistinguishable. Oversquashing refers to the bottleneck that is created by squeezing long-range dependencies into fixed-size representations. Countermeasures such as skip connections[8][35] (as in residual neural networks), gated update rules[36] and jumping knowledge[37] can mitigate oversmoothing. Modifying the final layer to be a fully-adjacent layer, i.e., by considering the graph as a complete graph, can mitigate oversquashing in problems where long-range dependencies are required.[34]
Other "flavours" of MPNN have been developed in the literature,[6] such as graph convolutional networks[7] and graph attention networks,[9] whose definitions can be expressed in terms of the MPNN formalism.
Graph convolutional network
[edit]The graph convolutional network (GCN) was first introduced by Thomas Kipf and Max Welling in 2017.[7]
A GCN layer defines a first-order approximation of a localized spectral filter on graphs. GCNs can be understood as a generalization of convolutional neural networks to graph-structured data.
The formal expression of a GCN layer reads as follows:
where is the matrix of node representations , is the matrix of node features , is an activation function (e.g., ReLU), is the graph adjacency matrix with the addition of self-loops, is the graph degree matrix with the addition of self-loops, and is a matrix of trainable parameters.
In particular, let be the graph adjacency matrix: then, one can define and , where denotes the identity matrix. This normalization ensures that the eigenvalues of are bounded in the range , avoiding numerical instabilities and exploding/vanishing gradients.
A limitation of GCNs is that they do not allow multidimensional edge features .[7] It is however possible to associate scalar weights to each edge by imposing , i.e., by setting each nonzero entry in the adjacency matrix equal to the weight of the corresponding edge.
Graph attention network
[edit]The graph attention network (GAT) was introduced by Petar Veličković et al. in 2018.[9]
Graph attention network is a combination of a graph neural network and an attention layer. The implementation of attention layer in graphical neural networks helps provide attention or focus to the important information from the data instead of focusing on the whole data.
A multi-head GAT layer can be expressed as follows:
where is the number of attention heads, denotes vector concatenation, is an activation function (e.g., ReLU), are attention coefficients, and is a matrix of trainable parameters for the -th attention head.
For the final GAT layer, the outputs from each attention head are averaged before the application of the activation function. Formally, the final GAT layer can be written as:
Attention in Machine Learning is a technique that mimics cognitive attention. In the context of learning on graphs, the attention coefficient measures how important is node to node .
Normalized attention coefficients are computed as follows:
where is a vector of learnable weights, indicates transposition, are the edge features (if present), and is a modified ReLU activation function. Attention coefficients are normalized to make them easily comparable across different nodes.[9]
A GCN can be seen as a special case of a GAT where attention coefficients are not learnable, but fixed and equal to the edge weights .
Gated graph sequence neural network
[edit]The gated graph sequence neural network (GGS-NN) was introduced by Yujia Li et al. in 2015.[36] The GGS-NN extends the GNN formulation by Scarselli et al.[2] to output sequences. The message passing framework is implemented as an update rule to a gated recurrent unit (GRU) cell.
A GGS-NN can be expressed as follows:
where denotes vector concatenation, is a vector of zeros, is a matrix of learnable parameters, is a GRU cell, and denotes the sequence index. In a GGS-NN, the node representations are regarded as the hidden states of a GRU cell. The initial node features are zero-padded up to the hidden state dimension of the GRU cell. The same GRU cell is used for updating representations for each node.
Local pooling layers
[edit]Local pooling layers coarsen the graph via downsampling. We present here several learnable local pooling strategies that have been proposed.[28] For each case, the input is the initial graph is represented by a matrix of node features, and the graph adjacency matrix . The output is the new matrix of node features, and the new graph adjacency matrix .
Top-k pooling
[edit]We first set
where is a learnable projection vector. The projection vector computes a scalar projection value for each graph node.
The top-k pooling layer [26] can then be formalised as follows:
where is the subset of nodes with the top-k highest projection scores, denotes element-wise matrix multiplication, and is the sigmoid function. In other words, the nodes with the top-k highest projection scores are retained in the new adjacency matrix . The operation makes the projection vector trainable by backpropagation, which otherwise would produce discrete outputs.[26]
Self-attention pooling
[edit]We first set
where is a generic permutation equivariant GNN layer (e.g., GCN, GAT, MPNN).
The Self-attention pooling layer[27] can then be formalised as follows:
where is the subset of nodes with the top-k highest projection scores, denotes element-wise matrix multiplication.
The self-attention pooling layer can be seen as an extension of the top-k pooling layer. Differently from top-k pooling, the self-attention scores computed in self-attention pooling account both for the graph features and the graph topology.
Applications
[edit]Protein folding
[edit]Graph neural networks are one of the main building blocks of AlphaFold, an artificial intelligence program developed by Google's DeepMind for solving the protein folding problem in biology. AlphaFold achieved first place in several CASP competitions.[38][39][37]
Social networks
[edit]Social networks are a major application domain for GNNs due to their natural representation as social graphs. GNNs are used to develop recommender systems based on both social relations and item relations.[40][13]
Combinatorial optimization
[edit]GNNs are used as fundamental building blocks for several combinatorial optimization algorithms.[41] Examples include computing shortest paths or Eulerian circuits for a given graph,[36] deriving chip placements superior or competitive to handcrafted human solutions,[42] and improving expert-designed branching rules in branch and bound.[43]
Cyber security
[edit]When viewed as a graph, a network of computers can be analyzed with GNNs for anomaly detection. Anomalies within provenance graphs often correlate to malicious activity within the network. GNNs have been used to identify these anomalies on individual nodes[44] and within paths[45] to detect malicious processes, or on the edge level[46] to detect lateral movement.
Water distribution networks
[edit]Water distribution systems can be modelled as graphs, being then a straightforward application of GNN. This kind of algorithm has been applied to water demand forecasting,[47] interconnecting District Measuring Areas to improve the forecasting capacity. Other application of this algorithm on water distribution modelling is the development of metamodels.[48]
References
[edit]- ^ Wu, Lingfei; Cui, Peng; Pei, Jian; Zhao, Liang (2022). "Graph Neural Networks: Foundations, Frontiers, and Applications". Springer Singapore: 725.
- ^ a b c d Scarselli, Franco; Gori, Marco; Tsoi, Ah Chung; Hagenbuchner, Markus; Monfardini, Gabriele (2009). "The Graph Neural Network Model". IEEE Transactions on Neural Networks. 20 (1): 61–80. doi:10.1109/TNN.2008.2005605. ISSN 1941-0093. PMID 19068426. S2CID 206756462.
- ^ a b c Micheli, Alessio (2009). "Neural Network for Graphs: A Contextual Constructive Approach". IEEE Transactions on Neural Networks. 20 (3): 498–511. doi:10.1109/TNN.2008.2010350. ISSN 1045-9227. PMID 19193509. S2CID 17486263.
- ^ Sanchez-Lengeling, Benjamin; Reif, Emily; Pearce, Adam; Wiltschko, Alex (2021-09-02). "A Gentle Introduction to Graph Neural Networks". Distill. 6 (9): e33. doi:10.23915/distill.00033. ISSN 2476-0757.
- ^ Daigavane, Ameya; Ravindran, Balaraman; Aggarwal, Gaurav (2021-09-02). "Understanding Convolutions on Graphs". Distill. 6 (9): e32. doi:10.23915/distill.00032. ISSN 2476-0757. S2CID 239678898.
- ^ a b c d e f g Bronstein, Michael M.; Bruna, Joan; Cohen, Taco; Veličković, Petar (May 4, 2021). "Geometric Deep Learning: Grids, Groups, Graphs Geodesics and Gauges". arXiv:2104.13478 [cs.LG].
- ^ a b c d Kipf, Thomas N; Welling, Max (2016). "Semi-supervised classification with graph convolutional networks". IEEE Transactions on Neural Networks. 5 (1): 61–80. arXiv:1609.02907. doi:10.1109/TNN.2008.2005605. PMID 19068426. S2CID 206756462.
- ^ a b Hamilton, William; Ying, Rex; Leskovec, Jure (2017). "Inductive Representation Learning on Large Graphs" (PDF). Neural Information Processing Systems. 31. arXiv:1706.02216 – via Stanford.
- ^ a b c d Veličković, Petar; Cucurull, Guillem; Casanova, Arantxa; Romero, Adriana; Liò, Pietro; Bengio, Yoshua (2018-02-04). "Graph Attention Networks". arXiv:1710.10903 [stat.ML].
- ^ a b Hajij, M.; Zamzmi, G.; Papamarkou, T.; Miolane, N.; Guzmán-Sáenz, A.; Ramamurthy, K. N.; Schaub, M. T. (2022). "Topological deep learning: Going beyond graph data". arXiv:2206.00606 [cs.LG].
- ^ a b c Veličković, Petar (2022). "Message passing all the way up". arXiv:2202.11097 [cs.LG].
- ^ Wu, Lingfei; Chen, Yu; Shen, Kai; Guo, Xiaojie; Gao, Hanning; Li, Shucheng; Pei, Jian; Long, Bo (2023). "Graph Neural Networks for Natural Language Processing: A Survey". Foundations and Trends in Machine Learning. 16 (2): 119–328. arXiv:2106.06090. doi:10.1561/2200000096. ISSN 1941-0093. PMID 19068426. S2CID 206756462.
- ^ a b Ying, Rex; He, Ruining; Chen, Kaifeng; Eksombatchai, Pong; Hamilton, William L.; Leskovec, Jure (2018). Graph Convolutional Neural Networks for Web-Scale Recommender Systems. pp. 974–983. arXiv:1806.01973. doi:10.1145/3219819.3219890. ISBN 9781450355520. S2CID 46949657.
- ^ "Stanford Large Network Dataset Collection". snap.stanford.edu. Retrieved 2021-07-05.
- ^ Zhang, Weihang; Cui, Yang; Liu, Bowen; Loza, Martin; Park, Sung-Joon; Nakai, Kenta (5 April 2024). "HyGAnno: Hybrid graph neural network-based cell type annotation for single-cell ATAC sequencing data". Briefings in Bioinformatics. 25 (3): bbae152. doi:10.1093/bib/bbae152. PMC 10998639. PMID 38581422.
- ^ Gilmer, Justin; Schoenholz, Samuel S.; Riley, Patrick F.; Vinyals, Oriol; Dahl, George E. (2017-07-17). "Neural Message Passing for Quantum Chemistry". Proceedings of Machine Learning Research: 1263–1272. arXiv:1704.01212.
- ^ Coley, Connor W.; Jin, Wengong; Rogers, Luke; Jamison, Timothy F.; Jaakkola, Tommi S.; Green, William H.; Barzilay, Regina; Jensen, Klavs F. (2019-01-02). "A graph-convolutional neural network model for the prediction of chemical reactivity". Chemical Science. 10 (2): 370–377. doi:10.1039/C8SC04228D. ISSN 2041-6539. PMC 6335848. PMID 30746086.
- ^ Qasim, Shah Rukh; Kieseler, Jan; Iiyama, Yutaro; Pierini, Maurizio Pierini (2019). "Learning representations of irregular particle-detector geometry with distance-weighted graph networks". The European Physical Journal C. 79 (7): 608. arXiv:1902.07987. Bibcode:2019EPJC...79..608Q. doi:10.1140/epjc/s10052-019-7113-9. S2CID 88518244.
- ^ Li, Zhuwen; Chen, Qifeng; Koltun, Vladlen (2018). "Combinatorial optimization with graph convolutional networks and guided tree search". Neural Information Processing Systems. 31: 537–546. arXiv:1810.10659. doi:10.1007/978-3-030-04221-9_48.
- ^ Matthias, Fey; Lenssen, Jan E. (2019). "Fast Graph Representation Learning with PyTorch Geometric". arXiv:1903.02428 [cs.LG].
- ^ "Tensorflow GNN". GitHub. Retrieved 30 June 2022.
- ^ "Deep Graph Library (DGL)". Retrieved 2024-09-12.
- ^ "jraph". GitHub. Retrieved 30 June 2022.
- ^ Lucibello, Carlo (2021). "GraphNeuralNetworks.jl". GitHub. Retrieved 2023-09-21.
- ^ FluxML/GeometricFlux.jl, FluxML, 2024-01-31, retrieved 2024-02-03
- ^ a b c Gao, Hongyang; Ji, Shuiwang Ji (2019). "Graph U-Nets". arXiv:1905.05178 [cs.LG].
- ^ a b Lee, Junhyun; Lee, Inyeop; Kang, Jaewoo (2019). "Self-Attention Graph Pooling". arXiv:1904.08082 [cs.LG].
- ^ a b Liu, Chuang; Zhan, Yibing; Li, Chang; Du, Bo; Wu, Jia; Hu, Wenbin; Liu, Tongliang; Tao, Dacheng (2022). "Graph Pooling for Graph Neural Networks: Progress, Challenges, and Opportunities". arXiv:2204.07321 [cs.LG].
- ^ Douglas, B. L. (2011-01-27). "The Weisfeiler–Lehman Method and Graph Isomorphism Testing". arXiv:1101.5211 [math.CO].
- ^ Xu, Keyulu; Hu, Weihua; Leskovec, Jure; Jegelka, Stefanie (2019-02-22). "How Powerful are Graph Neural Networks?". arXiv:1810.00826 [cs.LG].
- ^ Bodnar, Christian; Frasca, Fabrizio; Guang Wang, Yu; Otter, Nina; Montúfar, Guido; Liò, Pietro; Bronstein, Michael (2021). "Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks". arXiv:2103.03212 [cs.LG].
- ^ Grady, Leo; Polimeni, Jonathan (2011). Discrete Calculus: Applied Analysis on Graphs for Computational Science (PDF). Springer.
- ^ Chen, Deli; Lin, Yankai; Li, Wei; Li, Peng; Zhou, Jie; Sun, Xu (2020). "Measuring and Relieving the Over-Smoothing Problem for Graph Neural Networks from the Topological View". Proceedings of the AAAI Conference on Artificial Intelligence. 34 (4): 3438–3445. arXiv:1909.03211. doi:10.1609/aaai.v34i04.5747. S2CID 202539008.
- ^ a b Alon, Uri; Yahav, Eran (2021). "On the Bottleneck of Graph Neural Networks and its Practical Implications". arXiv:2006.05205 [cs.LG].
- ^ Xu, Keyulu; Zhang, Mozhi; Jegelka, Stephanie; Kawaguchi, Kenji (2021). "Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth". arXiv:2105.04550 [cs.LG].
- ^ a b c Li, Yujia; Tarlow, Daniel; Brockschmidt, Mark; Zemel, Richard (2016). "Gated Graph Sequence Neural Networks". arXiv:1511.05493 [cs.LG].
- ^ a b Xu, Keyulu; Li, Chengtao; Tian, Yonglong; Sonobe, Tomohiro; Kawarabayashi, Ken-ichi; Jegelka, Stefanie (2018). "Representation Learning on Graphs with Jumping Knowledge Networks". arXiv:1806.03536 [cs.LG].
- ^ Sample, Ian (2 December 2018). "Google's DeepMind predicts 3D shapes of proteins". The Guardian. Retrieved 30 November 2020.
- ^ "DeepMind's protein-folding AI has solved a 50-year-old grand challenge of biology". MIT Technology Review. Retrieved 30 November 2020.
- ^ Fan, Wenqi; Ma, Yao; Li, Qing; He, Yuan; Zhao, Eric; Tang, Jiliang; Yin, Dawei (2019). Graph Neural Networks for Social Recommendation. pp. 417–426. arXiv:1902.07243. doi:10.1145/3308558.3313488. hdl:10397/81232. ISBN 9781450366748. S2CID 67769538.
- ^ Cappart, Quentin; Chételat, Didier; Khalil, Elias; Lodi, Andrea; Morris, Christopher; Veličković, Petar (2021). "Combinatorial optimization and reasoning with graph neural networks". arXiv:2102.09544 [cs.LG].
- ^ Mirhoseini, Azalia; Goldie, Anna; Yazgan, Mustafa; Jiang, Joe Wenjie; Songhori, Ebrahim; Wang, Shen; Lee, Young-Joon; Johnson, Eric; Pathak, Omkar; Nazi, Azade; Pak, Jiwoo; Tong, Andy; Srinivasa, Kavya; Hang, William; Tuncer, Emre; Le, Quoc V.; Laudon, James; Ho, Richard; Carpenter, Roger; Dean, Jeff (2021). "A graph placement methodology for fast chip design". Nature. 594 (7862): 207–212. Bibcode:2021Natur.594..207M. doi:10.1038/s41586-021-03544-w. PMID 34108699. S2CID 235395490.
- ^ Gasse, Maxime; Chételat, Didier; Ferroni, Nicola; Charlin, Laurent; Lodi, Andrea (2019). "Exact Combinatorial Optimization with Graph Convolutional Neural Networks". arXiv:1906.01629 [cs.LG].
- ^ Wang, Su; Wang, Zhiliang; Zhou, Tao; Sun, Hongbin; Yin, Xia; Han, Dongqi; Zhang, Han; Shi, Xingang; Yang, Jiahai (2022). "Threatrace: Detecting and Tracing Host-Based Threats in Node Level Through Provenance Graph Learning". IEEE Transactions on Information Forensics and Security. 17: 3972–3987. arXiv:2111.04333. doi:10.1109/TIFS.2022.3208815. ISSN 1556-6021. S2CID 243847506.
- ^ Wang, Qi; Hassan, Wajih Ul; Li, Ding; Jee, Kangkook; Yu, Xiao (2020). "You Are What You Do: Hunting Stealthy Malware via Data Provenance Analysis". Network and Distributed Systems Security (NDSS) Symposium. doi:10.14722/ndss.2020.24167. ISBN 978-1-891562-61-7. S2CID 211267791.
- ^ King, Isaiah J.; Huang, H. Howie (2022). "Euler: Detecting Network Lateral Movement via Scalable Temporal Link Prediction" (PDF). In Proceedings of the 29th Network and Distributed Systems Security Symposium (NDSS). doi:10.14722/ndss.2022.24107. S2CID 248221601.
- ^ Zanfei, Ariele; et al. (2022). "Graph Convolutional Recurrent Neural Networks for Water Demand Forecasting". Water Resources Research. 58 (7). AGU. Bibcode:2022WRR....5832299Z. doi:10.1029/2022WR032299. Retrieved June 11, 2024.
- ^ Zanfei, Ariele; et al. (2023). "Shall we always use hydraulic models? A graph neural network metamodel for water system calibration and uncertainty assessment". Water Research. 242. Bibcode:2023WatRe.24220264Z. doi:10.1016/j.watres.2023.120264. PMID 37393807. Retrieved June 11, 2024.