Knowledge graph: Difference between revisions
Tag: Reverted |
m v2.05b - Bot T20 CW#61 - Fix errors for CW project (Reference before punctuation) |
||
(81 intermediate revisions by 43 users not shown) | |||
Line 1: | Line 1: | ||
{{short description|Type of knowledge base}} |
{{short description|Type of knowledge base}} |
||
[[File:Conceptual Diagram - Example.svg|thumb|Example conceptual diagram]] |
[[File:Conceptual Diagram - Example.svg|thumb|Example conceptual diagram]] |
||
In [[knowledge representation and reasoning]], '''knowledge graph''' is a [[knowledge base]] that uses a graph-structured [[data model]] or [[topology]] to |
In [[knowledge representation and reasoning]], a '''knowledge graph''' is a [[knowledge base]] that uses a [[Graph (discrete mathematics)|graph]]-structured [[data model]] or [[topology]] to represent and operate on [[data]]. Knowledge graphs are often used to store interlinked descriptions of [[Named entity|entities]]{{snd}} objects, events, situations or abstract concepts{{snd}} while also encoding the free-form [[semantics]] or relationships underlying these entities.<ref>{{Cite web|date=2018|title=What is a Knowledge Graph?|url=https://ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph}}</ref><ref>{{Cite web|date=2020|title=What defines a knowledge graph?|url=https://www.atulhost.com/what-is-knowledge-graph}}</ref> |
||
Since the development of the [[Semantic Web]], knowledge graphs |
Since the development of the [[Semantic Web]], knowledge graphs have often been associated with [[linked data|linked open data]] projects, focusing on the connections between [[concept]]s and entities.<ref name="Ref1">{{cite conference|last1=Ehrlinger|first1=Lisa|last2=Wöß|first2=Wolfram|year=2016|title=Towards a Definition of Knowledge Graphs|url=http://ceur-ws.org/Vol-1695/paper4.pdf|conference=SEMANTiCS2016|location=Leipzig|publisher=Joint Proceedings of the Posters and Demos Track of 12th International Conference on Semantic Systems – SEMANTiCS2016 and 1st International Workshop on Semantic Change & Evolving Semantics (SuCCESS16)|pages=13–16}}</ref><ref>{{Cite book|last=Soylu|first=Ahmet|title=The Semantic Web – ISWC 2020 |chapter=Enhancing Public Procurement in the European Union Through Constructing and Exploiting an Integrated Knowledge Graph |date=2020|chapter-url=https://doi.org/10.1007/978-3-030-62466-8_27|series=Lecture Notes in Computer Science|volume=12507|language=en|pages=430–446|doi=10.1007/978-3-030-62466-8_27|isbn=978-3-030-62465-1|s2cid=226229398}}</ref> They are also historically associated with and used by [[search engine]]s such as [[Google Knowledge Graph|Google]], [[Bing (search engine)|Bing]], [[Yext]] and [[Yahoo]]; [[Knowledge Engine (Wikimedia Foundation)|knowledge-engines]] and question-answering services such as [[WolframAlpha]], Apple's [[Siri]], and Amazon [[Amazon Alexa|Alexa]]; and [[social network]]s such as [[LinkedIn]] and [[Facebook]]. |
||
Recent developments in data science and machine learning, particularly in graph neural networks and representation learning, have broadened the scope of knowledge graphs beyond their traditional use in search engines and recommender systems. They are increasingly used in scientific research, with notable applications in fields such as genomics, proteomics, and systems biology.<ref>{{Cite journal |last1=Mohamed |first1=Sameh K. |last2=Nounu |first2=Aayah |last3=Nováček |first3=Vít |date=2021 |title=Biological applications of knowledge graph embedding models |journal=Briefings in Bioinformatics |volume=22 |issue=2 |pages=1679–1693 |doi=10.1093/bib/bbaa012 |pmid=32065227 |via=Oxford Academic|doi-access=free |hdl=1983/919db5c6-6e10-4277-9ff9-f86bbcedcee8 |hdl-access=free }}</ref> |
|||
== History == |
== History == |
||
The term was coined as early as 1972, in a discussion of how to build modular instructional systems for courses.<ref>Edward W. Schneider. 1973. Course Modularization Applied: The Interface System and Its Implications For Sequence Control and Data Analysis. In Association for the Development of Instructional Systems (ADIS), Chicago, Illinois, April 1972</ref> In the late 1980s, [[University of Groningen]] and [[University of Twente]] jointly began a project called Knowledge Graphs, focusing on the design of [[semantic network]]s with edges restricted to a limited set of relations, to facilitate [[algebra |
The term was coined as early as 1972 by Edward W. Schneider, in a discussion of how to build modular instructional systems for courses.<ref>Edward W. Schneider. 1973. Course Modularization Applied: The Interface System and Its Implications For Sequence Control and Data Analysis. In Association for the Development of Instructional Systems (ADIS), Chicago, Illinois, April 1972</ref> In the late 1980s, the [[University of Groningen]] and [[University of Twente]] jointly began a project called Knowledge Graphs, focusing on the design of [[semantic network]]s with edges restricted to a limited set of relations, to facilitate [[graph algebra|algebras on the graph]]. In subsequent decades, the distinction between semantic networks and knowledge graphs was blurred. |
||
Some early knowledge graphs were topic-specific. |
Some early knowledge graphs were topic-specific. In 1985, [[WordNet]] was founded, capturing semantic relationships between words and meanings{{snd}} an application of this idea to language itself. In 2005, Marc Wick founded [[GeoNames]] to capture relationships between different geographic names and locales and associated entities. In 1998, Andrew Edmonds of Science in Finance Ltd in the UK created a system called ThinkBase that offered [[Fuzzy logic|fuzzy-logic]]-based reasoning in a graphical context.<ref>{{cite web| title=US Trademark no 75589756 | url= http://tmsearch.uspto.gov/bin/showfield?f=doc&state=4809:rjqm9h.2.1}}</ref> ThinkBase LLC<ref>{{cite web|title=ThinkBase|url=https://thinkbase.ai/kgraphs/ |access-date=25 December 2024}}</ref> |
||
In 2007, both [[DBpedia]] and [[Freebase (database)|Freebase]] were founded as graph-based knowledge [[Repository (version control)|repositories]] for general-purpose knowledge. DBpedia focused exclusively on data extracted from Wikipedia, while Freebase also included a range of public datasets. Neither described themselves as a 'knowledge graph' but developed and described related concepts. |
In 2007, both [[DBpedia]] and [[Freebase (database)|Freebase]] were founded as graph-based knowledge [[Repository (version control)|repositories]] for general-purpose knowledge. DBpedia focused exclusively on data extracted from Wikipedia, while Freebase also included a range of public datasets. Neither described themselves as a 'knowledge graph' but developed and described related concepts. |
||
In 2012, Google introduced their [[Knowledge Graph]],<ref name=" |
In 2012, Google introduced their [[Knowledge Graph]],<ref name="Singhal-2012">{{Cite web|last=Singhal|first=Amit|date=May 16, 2012|title=Introducing the Knowledge Graph: things, not strings|url=https://googleblog.blogspot.com/2012/05/introducing-knowledge-graph-things-not.html|access-date=21 March 2017|website=Official Google Blog}}</ref> building on DBpedia and Freebase among other sources. They later incorporated [[RDFa]], [[Microdata (HTML)|Microdata]], and [[JSON-LD]] content extracted from indexed web pages, including the ''[[The World Factbook|CIA World Factbook]]'', [[Wikidata]], and [[Wikipedia]].<ref name="Singhal-2012" /><ref>{{cite web|last=Schwartz|first=Barry|date=December 17, 2014|title=Google's Freebase To Close After Migrating To Wikidata: Knowledge Graph Impact?|url=https://www.seroundtable.com/google-freebase-wikidata-knowledge-graph-19591.html|access-date=December 10, 2017|website=[[Search Engine Roundtable]]}}</ref> Entity and relationship types associated with this knowledge graph have been further organized using terms from the [[schema.org]]<ref name="McCusker">{{Cite web|last1=McCusker|first1=James P.|last2=McGuiness|first2=Deborah L.|title=What is a Knowledge Graph?|url=https://www.authorea.com/users/6341/articles/107281-what-is-a-knowledge-graph/_show_article|access-date=21 March 2017|website=www.authorea.com}}</ref> vocabulary. The Google Knowledge Graph became a successful complement to string-based search within Google, and its popularity online brought the term into more common use.<ref name="McCusker" /> |
||
Since then, several large multinationals have advertised their use of knowledge graphs, further popularising the term. These include Facebook, LinkedIn, [[Airbnb]], [[Microsoft]], [[Amazon.com|Amazon]], [[Uber]] and [[eBay]].<ref>{{Cite web|date=2020|title=Knowledge Graph Enterprises|url=https://kgkg.factnexus.com/@3782~167.html}}</ref> |
Since then, several large multinationals have advertised their use of knowledge graphs, further popularising the term. These include Facebook, LinkedIn, [[Airbnb]], [[Microsoft]], [[Amazon.com|Amazon]], [[Uber]] and [[eBay]].<ref>{{Cite web|date=2020|title=Knowledge Graph Enterprises|url=https://kgkg.factnexus.com/@3782~167.html}}</ref> |
||
Line 24: | Line 25: | ||
There is no single commonly accepted definition of a knowledge graph. Most definitions view the topic through a Semantic Web lens and include these features:<ref>{{cite journal|last1=Hogan|first1=Aidan|last2=Blomqvist|first2=Eva|last3=Cochez|first3=Michael|last4=d'Amato|first4=Claudia|last5=de Melo|first5=Gerard|last6=Gutierrez|first6=Claudio|last7=Labra Gayo|first7=José Emilio|last8=Kirrane|first8=Sabrina|last9=Neumaier|first9=Sebastian|last10=Polleres|first10=Axel|last11=Navigli|first11=Roberto|last12=Ngonga Ngomo|first12=Axel-Cyrille|last13=Rashid|first13=Sabbir M.|last14=Rula|first14=Anisa|last15=Schmelzeisen|first15=Lukas|last16=Sequeda|first16=Juan|last17=Staab|first17=Steffen|last18=Zimmermann|first18=Antoine|date=2021-01-24|title=Knowledge Graphs|journal=ACM Computing Surveys|volume=54|issue=4|pages=1–37|doi=10.1145/3447772| issn=0360-0300|arxiv=2003.02320|s2cid=235716181}}</ref> |
There is no single commonly accepted definition of a knowledge graph. Most definitions view the topic through a Semantic Web lens and include these features:<ref>{{cite journal|last1=Hogan|first1=Aidan|last2=Blomqvist|first2=Eva|last3=Cochez|first3=Michael|last4=d'Amato|first4=Claudia|last5=de Melo|first5=Gerard|last6=Gutierrez|first6=Claudio|last7=Labra Gayo|first7=José Emilio|last8=Kirrane|first8=Sabrina|last9=Neumaier|first9=Sebastian|last10=Polleres|first10=Axel|last11=Navigli|first11=Roberto|last12=Ngonga Ngomo|first12=Axel-Cyrille|last13=Rashid|first13=Sabbir M.|last14=Rula|first14=Anisa|last15=Schmelzeisen|first15=Lukas|last16=Sequeda|first16=Juan|last17=Staab|first17=Steffen|last18=Zimmermann|first18=Antoine|date=2021-01-24|title=Knowledge Graphs|journal=ACM Computing Surveys|volume=54|issue=4|pages=1–37|doi=10.1145/3447772| issn=0360-0300|arxiv=2003.02320|s2cid=235716181}}</ref> |
||
*''Flexible relations among knowledge in topical domains'': A knowledge graph (i) defines [[ |
*''Flexible relations among knowledge in topical domains'': A knowledge graph (i) defines [[abstract class]]es and relations of entities in a schema, (ii) mainly describes real world entities and their interrelations, organized in a graph, (iii) allows for potentially interrelating arbitrary entities with each other, and (iv) covers various topical domains.<ref>{{cite journal|last1=Paulheim|first1=Heiko|date=2017|title=Knowledge Graph Refinement: A Survey of Approaches and Evaluation Methods|url=http://www.semantic-web-journal.net/system/files/swj1083.pdf|journal=Semantic Web|pages=489–508|access-date=21 March 2017}}</ref> |
||
* ''General structure'': A network of entities, their semantic types, properties, and relationships.<ref>{{cite journal|last1=Krötsch|first1=Markus|last2=Weikum|first2=Gerhard|title=Editorial of the Special Issue on Knowledge Graphs|journal=Journal of Web Semantics|date=March 2016|volume=37-38|pages=53–54|doi=10.1016/j.websem.2016.04.002|url=https://doi.org/10.1016/j.websem.2016.04.002|access-date=10 February 2021}}</ref><ref>{{Cite web|title=What is a Knowledge Graph?{{!}}Ontotext|url=https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph|access-date=2020-07-01|website=Ontotext|language=en-US}}</ref> |
* ''General structure'': A network of entities, their semantic types, properties, and relationships.<ref>{{cite journal|last1=Krötsch|first1=Markus|last2=Weikum|first2=Gerhard|title=Editorial of the Special Issue on Knowledge Graphs|journal=Journal of Web Semantics|date=March 2016|volume=37-38|pages=53–54|doi=10.1016/j.websem.2016.04.002|url=https://doi.org/10.1016/j.websem.2016.04.002|access-date=10 February 2021}}</ref><ref>{{Cite web|title=What is a Knowledge Graph?{{!}}Ontotext|url=https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph|access-date=2020-07-01|website=Ontotext|language=en-US}}</ref> To represent properties, categorical or numerical values are often used. |
||
* ''Supporting reasoning over inferred ontologies'': A knowledge graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge.<ref name="Ref1" /> |
* ''Supporting reasoning over inferred ontologies'': A knowledge graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge.<ref name="Ref1" /> |
||
There are, however, many knowledge graph representations for which some of these features are not relevant. For those knowledge graphs this simpler definition may be more useful: |
There are, however, many knowledge graph representations for which some of these features are not relevant. For those knowledge graphs, this simpler definition may be more useful: |
||
* A digital structure that represents knowledge as concepts and the relationships between them (facts). A knowledge graph can include an ontology that allows both humans and machines to understand and reason about its contents.<ref>{{Cite web|date=2020|title=The Knowledge Graph about Knowledge Graphs|url=https://kgkg.factnexus.com/@3782~6.html}}</ref> |
* A digital structure that represents knowledge as concepts and the relationships between them (facts). A knowledge graph can include an ontology that allows both humans and machines to understand and reason about its contents.<ref>{{cite journal|last1=Peng|first1=Ciyuan|last2=Feng|first2=Xia|last3=Naseriparsa|first3=Mehdi|last4=Osborne|first4=Francesco|date=2023|title=Knowledge Graphs: Opportunities and Challenges|url=https://doi.org/10.1007/s10462-023-10465-9| journal=Artificial Intelligence Review|volume=56|issue=11 |pages=13071–13102|doi=10.1007/s10462-023-10465-9|pmid=37362886 |pmc=10068207 | issn=1573-7462|arxiv=2303.13948}}</ref><ref>{{Cite web|date=2020|title=The Knowledge Graph about Knowledge Graphs|url=https://kgkg.factnexus.com/@3782~6.html}}</ref> |
||
=== Implementations === |
=== Implementations === |
||
In addition to the above examples, the term has been used to describe open knowledge projects such as [[YAGO (database)|YAGO]] and Wikidata; federations like the Linked Open Data cloud;<ref>{{Cite web|title=The Linked Open Data Cloud|url=https://lod-cloud.net/|access-date=2020-06-30|website=lod-cloud.net}}</ref> a range of commercial search tools, including Yahoo's semantic search assistant Spark, Google's [[Knowledge Graph]], and Microsoft's Satori; and the LinkedIn and Facebook entity graphs.<ref name="Ref1" /> |
In addition to the above examples, the term has been used to describe open knowledge projects such as [[YAGO (database)|YAGO]] and Wikidata; federations like the Linked Open Data cloud;<ref>{{Cite web|title=The Linked Open Data Cloud|url=https://lod-cloud.net/|access-date=2020-06-30|website=lod-cloud.net}}</ref> a range of commercial search tools, including Yahoo's semantic search assistant Spark, Google's [[Knowledge Graph]], and Microsoft's Satori; and the LinkedIn and Facebook entity graphs.<ref name="Ref1" /> |
||
The term is also used in the context of [[note-taking software]] applications that allow a user to build a [[personal knowledge graph]].<ref>{{cite journal |last1=Pyne |first1=Yvette |last2=Stewart |first2=Stuart |date=March 2022 |title=Meta-work: how we research is as important as what we research |journal=[[British Journal of General Practice]] |volume=72 |issue=716 |pages=130–131 |pmid=35210247 |pmc=8884432 |doi=10.3399/bjgp22X718757}}</ref> |
The term is also used in the context of [[note-taking software]] applications that allow a user to build a [[personal knowledge graph]].<ref>{{cite journal |last1=Pyne |first1=Yvette |last2=Stewart |first2=Stuart |date=March 2022 |title=Meta-work: how we research is as important as what we research |journal=[[British Journal of General Practice]] |volume=72 |issue=716 |pages=130–131 |pmid=35210247 |pmc=8884432 |doi=10.3399/bjgp22X718757}}</ref> |
||
The popularization of knowledge graphs and their accompanying methods has led to the development of graph databases such as Neo4j<ref>{{Cite web |title=Neo4j Graph Database & Analytics {{!}} Graph Database Management System |url=https://neo4j.com/ |access-date=8 November 2023 |website=Neo4j}}</ref> and GraphDB.<ref>{{Cite web |title=Ontotext GraphDB |url=https://www.ontotext.com/products/graphdb/ |access-date=8 November 2023 |website=Ontotext}}</ref> These graph databases allow users to easily store data as entities and their interrelationships, and facilitate operations such as data reasoning, node embedding, and ontology development on knowledge bases. |
|||
== Using a knowledge graph for reasoning over data == |
== Using a knowledge graph for reasoning over data == |
||
{{main|Ontology (information science)}} |
{{main|Ontology (information science)}} |
||
A knowledge graph formally represents semantics by describing entities and their relationships.<ref>{{Cite web|date=2022-04-05|title=How do knowledge graphs work?|url=https://www.stardog.com/knowledge-graph/|access-date=2022-04-05|website=Stardog|language=en-US}}</ref> Knowledge graphs may make use of [[Ontology (information science)|ontologies]] as a schema layer. By doing this, they allow [[Inference|logical inference]] for retrieving [[implicit knowledge]] rather than only allowing queries requesting explicit knowledge.<ref>{{Cite web|date= |
A knowledge graph formally represents semantics by describing entities and their relationships.<ref>{{Cite web|date=2022-04-05|title=How do knowledge graphs work?|url=https://www.stardog.com/knowledge-graph/|access-date=2022-04-05|website=Stardog|language=en-US}}</ref> Knowledge graphs may make use of [[Ontology (information science)|ontologies]] as a schema layer. By doing this, they allow [[Inference|logical inference]] for retrieving [[implicit knowledge]] rather than only allowing queries requesting explicit knowledge.<ref>{{Cite web |date=2023-09-01 |title=Unlocking the Power of Google Knowledge Panel: How to Obtain and Claim Yours in 2023 – RH Razu |url=https://rhrazu.com/google-knowledge-panel-obtain-and-claim-yours-in-2023/ |access-date=2023-09-05 |website=rhrazu.com |language=en-US}}</ref> |
||
In order to allow the use of knowledge graphs in various machine learning tasks, several methods for deriving latent feature representations of entities and relations have been devised. These knowledge graph embeddings allow them to be connected to machine learning methods that require feature vectors like [[word embedding]]s. This can complement other estimates of conceptual similarity.<ref>{{Cite |
To allow the use of knowledge graphs in various machine learning tasks, several methods for deriving latent feature representations of entities and relations have been devised. These knowledge graph embeddings allow knowledge graphs to be connected to machine learning methods that require feature vectors like [[word embedding]]s. This can complement other estimates of conceptual similarity.<ref>{{Cite book|author=Hongwei Wang|title=Proceedings of the 27th ACM International Conference on Information and Knowledge Management |chapter=RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems |date=October 2018|pages=417–426|doi=10.1145/3269206.3271739|arxiv=1803.03467|isbn=9781450360142 |s2cid=3766110}}</ref><ref>{{Citation |last1=Ristoski |first1=Petar |pages=498–514 |year=2016 |last2=Paulheim |first2=Heiko |chapter=RDF2Vec: RDF Graph Embeddings for Data Mining |title=The Semantic Web – ISWC 2016 |series=Lecture Notes in Computer Science |volume=9981 |doi=10.1007/978-3-319-46523-4_30|isbn=978-3-319-46522-7 |chapter-url=https://madoc.bib.uni-mannheim.de/41307/1/Ristoski_RDF2Vec.pdf |doi-access=free }}</ref> |
||
Models for generating useful knowledge graph embeddings are commonly the domain of graph neural networks (GNNs).<ref>{{Cite journal |last1=Zhou |first1=Jie |last2=Cui |first2=Ganqu |display-authors=1 |date=2020 |title=Graph neural networks: A review of methods and applications. |journal=AI Open |volume=1 |issue=1 |pages=57–81 |doi=10.1016/j.aiopen.2021.01.001 |s2cid=56517517 |via=Elsevier Science Direct|doi-access=free |arxiv=1812.08434 }}</ref> GNNs are deep learning architectures that operate on graphs of nodes and edges, which correspond well to the entities and relationships of knowledge graphs. The topology and data structures afforded by GNNs provide a convenient domain for semi-supervised learning, wherein the network is trained to predict the value of a node embedding (given a group of adjacent nodes and their edges) or an edge (given a pair of nodes). These tasks serve as fundamental abstractions for more complex tasks such as knowledge graph reasoning and alignment.<ref>{{Cite journal |last1=Ye |first1=Zi |last2=Kumar |first2=Yogan Jaya |last3=Sing |first3=Goh Ong |last4=Song |first4=Fengyan |last5=Wang |first5=Junsong |date=2022 |title=A comprehensive survey of graph neural networks for knowledge graphs. |journal=IEEE Access |volume=10 |pages=75729–7574 |doi=10.1109/ACCESS.2022.3191784 |bibcode=2022IEEEA..1075729Y |s2cid=250654689 |via=IEEE Xplore|doi-access=free }}</ref>
|||
=== Entity alignment === |
|||
[[File:Knowledge graph entity alignment.png|thumb|Two hypothetical knowledge graphs representing disparate topics contain a node that corresponds to the same entity in the real world. Entity alignment is the process of identifying such nodes across multiple graphs.|upright 2]] |
|||
As new knowledge graphs are produced across a variety of fields and contexts, the same entity will inevitably be represented in multiple graphs. However, because no single standard for the construction or representation of knowledge graphs exists, resolving which entities from disparate graphs correspond to the same real-world subject is a non-trivial task. This task is known as ''knowledge graph entity alignment'', and is an active area of research.<ref>{{Cite conference |last1=Berrendorf |first1=Max |last2=Faerman |first2=Evgeniy |last3=Melnychuk |first3=Valentyn |last4=Tresp |first4=Volker |last5=Seidl |first5=Thomas |date=April 14–17, 2020 |title=Knowledge graph entity alignment with graph convolutional networks: lessons learned |conference=Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal |series=Lecture Notes in Computer Science |volume=Proceedings, Part II |pages=3–11 |doi=10.1007/978-3-030-45442-5_1 |arxiv=1911.08342 |isbn=978-3-030-45441-8 |s2cid=208158314 |via=Springer International Publishing}}</ref>
|||
Strategies for entity alignment generally seek to identify similar substructures, semantic relationships, shared attributes, or combinations of all three between two distinct knowledge graphs. Entity alignment methods use these structural similarities between generally non-isomorphic graphs to predict which nodes correspond to the same entity.<ref>{{Cite arXiv |last1=Chaurasiya |first1=Deepak |last2=Surisetty |first2=Anil |last3=Kumar |first3=Nitish |last4=Singh |first4=Alok |last5=Dey |first5=Vikrant |last6=Malhotra |first6=Aakarsh |last7=Dhama |first7=Gaurav |last8=Arora |first8=Ankur |date=2022 |title=Entity alignment for knowledge graphs: progress, challenges, and empirical studies |class=cs.AI |eprint=2205.08777 }}</ref>
|||
The recent successes of large language models (LLMs), in particular their effectiveness at producing syntactically meaningful embeddings, have spurred the use of LLMs in the task of entity alignment.<ref>{{Cite journal |last1=Hogan |first1=Aidan |last2=Lippolis |first2=Anna Sofia |last3=Klironomos |first3=Antonis |last4=Milon-Flores |first4=Daniela F. |last5=Zheng |first5=Heng |last6=Jouglar |first6=Alexane |last7=Norouzi |first7=Ebrahim |date=2023 |title=Enhancing Entity Alignment Between Wikidata and ArtGraph using LLMs |url=https://aidanhogan.com/docs/art_wikidata_kgs_llms.pdf |journal=Proceedings of the International Workshop on Semantic Web and Ontology Design for Cultural Heritage |via=International Workshop on Semantic Web and Ontology Design for Cultural Heritage (SWODCH), Athens, Greece}}</ref>
|||
As the amount of data stored in knowledge graphs grows, developing dependable methods for knowledge graph entity alignment becomes an increasingly crucial step in the integration and cohesion of knowledge graph data. |
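Of the strategies described in this section, matching on shared attributes is the simplest to sketch. The example below is only an illustration under stated assumptions: the two small graphs, the node identifiers (`a1`, `b2` and so on), the Jaccard similarity measure, and the 0.5 threshold are all choices made here, not a method taken from the cited literature, which generally also exploits graph structure and learned embeddings.

```python
# Hypothetical graphs: node id -> set of (attribute, value) pairs.
GRAPH_A = {
    "a1": {("name", "ada lovelace"), ("born", 1815), ("field", "mathematics")},
    "a2": {("name", "alan turing"), ("born", 1912)},
}
GRAPH_B = {
    "b1": {("name", "alan turing"), ("born", 1912), ("field", "computing")},
    "b2": {("name", "ada lovelace"), ("born", 1815)},
}


def jaccard(x, y):
    """Share of attribute pairs that two nodes have in common."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def align(graph_a, graph_b, threshold=0.5):
    """Map each node of graph_a to its most similar node in graph_b.

    A node is left unmatched when its best score is below `threshold`
    (an arbitrary cut-off chosen for this sketch).
    """
    matches = {}
    for node_a, attrs_a in graph_a.items():
        best = max(graph_b, key=lambda node_b: jaccard(attrs_a, graph_b[node_b]))
        if jaccard(attrs_a, graph_b[best]) >= threshold:
            matches[node_a] = best
    return matches


print(align(GRAPH_A, GRAPH_B))
```

Each node in the first graph is paired with the node in the second graph whose attribute set overlaps most. A purely attribute-based matcher like this fails when graphs use different property names or value formats, which is one reason the task remains non-trivial.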
|||
== See also == |
== See also == |
||
* [[Concept map]] |
|||
* {{Annotated link |Concept map}} |
|||
* {{Annotated link |Formal semantics (natural language)}} |
|||
* {{Annotated link |Graph database}} |
|||
* [[Vadalog]] |
|||
* {{Annotated link |Logical graph}} |
|||
* {{Annotated link |Semantic integration}} |
|||
* {{Annotated link |Topic map}} |
|||
* {{Annotated link |Vadalog}} |
|||
== References == |
== References == |
||
Line 54: | Line 76: | ||
==External links== |
==External links== |
||
{{subject bar|d=y|auto=y}} |
|||
*{{cite news|url=https://www.technologyreview.com/2020/09/04/1008156/knowledge-graph-ai-reads-web-machine-learning-natural-language-processing/ | title= This know-it-all AI learns by reading the entire web nonstop | quote=Diffbot is building the biggest-ever knowledge graph by applying image recognition and natural-language processing to billions of web pages. | work = MIT Technology Review | author = Will Douglas Heaven | date = 4 September 2020 | access-date = 5 September 2020}} |
*{{cite news|url=https://www.technologyreview.com/2020/09/04/1008156/knowledge-graph-ai-reads-web-machine-learning-natural-language-processing/ | title= This know-it-all AI learns by reading the entire web nonstop | quote=Diffbot is building the biggest-ever knowledge graph by applying image recognition and natural-language processing to billions of web pages. | work = MIT Technology Review | author = Will Douglas Heaven | date = 4 September 2020 | access-date = 5 September 2020}} |
||
{{Scholia|topic}} |
{{Scholia|topic}} |
||
{{ |
{{Authority control}} |
||
[[Category:Knowledge graphs| ]] |
[[Category:Knowledge graphs| ]] |
||
[[Category:Ontology (information science)]] |
[[Category:Ontology (information science)]] |
||
[[Category:Formal semantics (natural language)]] |
|||
[[Category:Information science]] |
[[Category:Information science]] |
Latest revision as of 10:17, 28 December 2024
In knowledge representation and reasoning, a knowledge graph is a knowledge base that uses a graph-structured data model or topology to represent and operate on data. Knowledge graphs are often used to store interlinked descriptions of entities – objects, events, situations or abstract concepts – while also encoding the free-form semantics or relationships underlying these entities.[1][2]
Since the development of the Semantic Web, knowledge graphs have often been associated with linked open data projects, focusing on the connections between concepts and entities.[3][4] They are also historically associated with and used by search engines such as Google, Bing, Yext and Yahoo; knowledge-engines and question-answering services such as WolframAlpha, Apple's Siri, and Amazon Alexa; and social networks such as LinkedIn and Facebook.
Recent developments in data science and machine learning, particularly in graph neural networks and representation learning, have broadened the scope of knowledge graphs beyond their traditional use in search engines and recommender systems. They are increasingly used in scientific research, with notable applications in fields such as genomics, proteomics, and systems biology.[5]
History
The term was coined as early as 1972 by Edward W. Schneider, in a discussion of how to build modular instructional systems for courses.[6] In the late 1980s, the University of Groningen and University of Twente jointly began a project called Knowledge Graphs, focusing on the design of semantic networks with edges restricted to a limited set of relations, to facilitate algebras on the graph. In subsequent decades, the distinction between semantic networks and knowledge graphs was blurred.
Some early knowledge graphs were topic-specific. In 1985, WordNet was founded, capturing semantic relationships between words and meanings – an application of this idea to language itself. In 2005, Marc Wick founded GeoNames to capture relationships between different geographic names and locales and associated entities. In 1998, Andrew Edmonds of Science in Finance Ltd in the UK created a system called ThinkBase that offered fuzzy-logic-based reasoning in a graphical context.[7] ThinkBase LLC[8]
In 2007, both DBpedia and Freebase were founded as graph-based knowledge repositories for general-purpose knowledge. DBpedia focused exclusively on data extracted from Wikipedia, while Freebase also included a range of public datasets. Neither described themselves as a 'knowledge graph' but developed and described related concepts.
In 2012, Google introduced their Knowledge Graph,[9] building on DBpedia and Freebase among other sources. They later incorporated RDFa, Microdata, and JSON-LD content extracted from indexed web pages, including the CIA World Factbook, Wikidata, and Wikipedia.[9][10] Entity and relationship types associated with this knowledge graph have been further organized using terms from the schema.org[11] vocabulary. The Google Knowledge Graph became a successful complement to string-based search within Google, and its popularity online brought the term into more common use.[11]
Since then, several large multinationals have advertised their use of knowledge graphs, further popularising the term. These include Facebook, LinkedIn, Airbnb, Microsoft, Amazon, Uber and eBay.[12]
In 2019, IEEE combined its annual international conferences on "Big Knowledge" and "Data Mining and Intelligent Computing" into the International Conference on Knowledge Graph.[13]
Definitions
There is no single commonly accepted definition of a knowledge graph. Most definitions view the topic through a Semantic Web lens and include these features:[14]
- Flexible relations among knowledge in topical domains: A knowledge graph (i) defines abstract classes and relations of entities in a schema, (ii) mainly describes real world entities and their interrelations, organized in a graph, (iii) allows for potentially interrelating arbitrary entities with each other, and (iv) covers various topical domains.[15]
- General structure: A network of entities, their semantic types, properties, and relationships.[16][17] To represent properties, categorical or numerical values are often used.
- Supporting reasoning over inferred ontologies: A knowledge graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge.[3]
There are, however, many knowledge graph representations for which some of these features are not relevant. For those knowledge graphs, this simpler definition may be more useful:
- A digital structure that represents knowledge as concepts and the relationships between them (facts). A knowledge graph can include an ontology that allows both humans and machines to understand and reason about its contents.[18][19]
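As a minimal sketch of this simpler definition, the facts of a small knowledge graph can be held as subject, predicate, object triples. Everything named below (`alice`, `acme`, `works_at`, the helper functions) is invented for illustration and does not come from any real dataset or library.

```python
from collections import defaultdict

# Illustrative facts only: every entity and relation name here is made up.
TRIPLES = [
    ("alice", "type", "Person"),           # semantic type of an entity
    ("acme", "type", "Company"),
    ("alice", "works_at", "acme"),         # relationship between two entities
    ("acme", "located_in", "springfield"),
    ("alice", "age", 34),                  # property with a numerical value
]


def build_index(triples):
    """Group (predicate, object) pairs by subject for quick lookup."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def objects(triples, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


index = build_index(TRIPLES)
print(index["alice"])
print(objects(TRIPLES, "alice", "works_at"))
```

Concepts become nodes, and each triple is one labelled edge (a fact). A flat list of triples is chosen here for brevity; production systems typically store the same information in a graph database or triple store.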
Implementations
In addition to the above examples, the term has been used to describe open knowledge projects such as YAGO and Wikidata; federations like the Linked Open Data cloud;[20] a range of commercial search tools, including Yahoo's semantic search assistant Spark, Google's Knowledge Graph, and Microsoft's Satori; and the LinkedIn and Facebook entity graphs.[3]
The term is also used in the context of note-taking software applications that allow a user to build a personal knowledge graph.[21]
The popularization of knowledge graphs and their accompanying methods has led to the development of graph databases such as Neo4j[22] and GraphDB.[23] These graph databases allow users to easily store data as entities and their interrelationships, and facilitate operations such as data reasoning, node embedding, and ontology development on knowledge bases.
Using a knowledge graph for reasoning over data
A knowledge graph formally represents semantics by describing entities and their relationships.[24] Knowledge graphs may make use of ontologies as a schema layer. By doing this, they allow logical inference for retrieving implicit knowledge rather than only allowing queries requesting explicit knowledge.[25]
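As a rough illustration of retrieving implicit knowledge, the sketch below forward-chains two RDFS-style rules over a small set of triples. The predicate names `instance_of` and `subclass_of`, the `infer` function, and the toy facts are assumptions made for this example; real reasoners support far richer ontology languages.

```python
def infer(triples):
    """Apply two schema rules repeatedly until no new facts appear.

    Rule 1: (A subclass_of B) and (B subclass_of C)  =>  (A subclass_of C)
    Rule 2: (x instance_of B) and (B subclass_of C)  =>  (x instance_of C)
    """
    facts = set(triples)
    while True:
        new = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c or q != "subclass_of":
                    continue
                if p in ("subclass_of", "instance_of"):
                    new.add((a, p, d))
        if new <= facts:  # fixpoint reached: nothing left to derive
            return facts
        facts |= new


# Explicit facts: one instance assertion plus a two-level class hierarchy.
explicit = {
    ("rex", "instance_of", "Dog"),
    ("Dog", "subclass_of", "Mammal"),
    ("Mammal", "subclass_of", "Animal"),
}

# Facts that were never stated but follow from the schema layer.
implicit = infer(explicit) - explicit
```

A query for all instances of `Animal` would miss `rex` when run against `explicit` alone, but finds it once the inferred facts in `implicit` are included.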
To allow the use of knowledge graphs in various machine learning tasks, several methods for deriving latent feature representations of entities and relations have been devised. These knowledge graph embeddings allow knowledge graphs to be connected to machine learning methods that require feature vectors like word embeddings. This can complement other estimates of conceptual similarity.[26][27]
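One way to picture such embeddings is a translational scoring function in the style of TransE, a model the text above does not name and which is used here only as an example. In this sketch a fact (head, relation, tail) is considered plausible when the head vector plus the relation vector lands near the tail vector. The two-dimensional vectors are hand-picked for illustration, not learned, and all function names are hypothetical.

```python
import math

# Hand-picked 2-D vectors (an assumption for illustration; real embeddings
# are learned from the graph and usually have many more dimensions).
entity_vec = {
    "paris": [1.0, 0.0],
    "france": [1.0, 1.0],
    "berlin": [3.0, 0.0],
    "germany": [3.0, 1.0],
}
relation_vec = {"capital_of": [0.0, 1.0]}


def transe_distance(head, relation, tail):
    """Euclidean distance between (head + relation) and tail; lower is more plausible."""
    h, r, t = entity_vec[head], relation_vec[relation], entity_vec[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Order all other entities from most to least plausible tail."""
    candidates = [e for e in entity_vec if e != head]
    return sorted(candidates, key=lambda tail: transe_distance(head, relation, tail))


ranking = rank_tails("paris", "capital_of")
```

Because every entity is now a plain feature vector, the same vectors can be passed to any downstream method that expects numeric input, for example a similarity measure or a classifier.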
Models for generating useful knowledge graph embeddings are commonly the domain of graph neural networks (GNNs).[28] GNNs are deep learning architectures that comprise edges and nodes, which correspond well to the entities and relationships of knowledge graphs. The topology and data structures afforded by GNNs provide a convenient domain for semi-supervised learning, wherein the network is trained to predict the value of a node embedding (provided a group of adjacent nodes and their edges) or edge (provided a pair of nodes). These tasks serve as fundamental abstractions for more complex tasks such as knowledge graph reasoning and alignment.[29]
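The neighbourhood aggregation at the core of message-passing GNNs can be sketched as follows. This is a deliberately stripped-down version: it averages feature vectors over each node and its neighbours and scores a candidate edge with a dot product, while the learned weight matrices, nonlinearities, and training loop of a real GNN are omitted. The function names and toy graph are assumptions made for this example.

```python
def aggregate(features, edges):
    """One message-passing round: each node averages its own and its neighbours' features."""
    neighbours = {node: [node] for node in features}  # include a self-loop
    for u, v in edges:  # edges are treated as undirected
        neighbours[u].append(v)
        neighbours[v].append(u)

    updated = {}
    for node, group in neighbours.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][i] for n in group) / len(group) for i in range(dim)
        ]
    return updated


def edge_score(embeddings, u, v):
    """Dot-product score for a candidate edge between two nodes; higher means more similar."""
    return sum(a * b for a, b in zip(embeddings[u], embeddings[v]))


# Toy graph: a path a - b - c with 2-D input features.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = [("a", "b"), ("b", "c")]

embeddings = aggregate(features, edges)
score_ac = edge_score(embeddings, "a", "c")
```

In a trained model, the aggregated vector of a node would be compared against a target (node prediction), or a score such as `score_ac` would be compared against whether the edge actually exists (edge prediction).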
Entity alignment
As new knowledge graphs are produced across a variety of fields and contexts, the same entity will inevitably be represented in multiple graphs. However, because no single standard for the construction or representation of knowledge graphs exists, resolving which entities from disparate graphs correspond to the same real world subject is a non-trivial task. This task is known as knowledge graph entity alignment, and is an active area of research.[30]
Strategies for entity alignment generally seek to identify similar substructures, semantic relationships, shared attributes, or combinations of all three between two distinct knowledge graphs. Entity alignment methods use these structural similarities between generally non-isomorphic graphs to predict which nodes correspond to the same entity.[31]
The recent successes of large language models (LLMs), in particular their effectiveness at producing syntactically meaningful embeddings, have spurred the use of LLMs in the task of entity alignment.[32]
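The shared-attribute strategy mentioned above can be sketched with a simple set-overlap measure. This example covers only that one strategy: it compares the (property, value) pairs of entities from two graphs using Jaccard similarity and keeps the best match above a threshold. The identifiers, the threshold of 0.5, and the function names are assumptions for illustration; published methods typically combine structural and semantic signals as well.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def align(graph_a, graph_b, threshold=0.5):
    """Map each entity in graph_a to its most similar entity in graph_b, if similar enough."""
    matches = {}
    for ent_a, attrs_a in graph_a.items():
        best, best_score = None, 0.0
        for ent_b, attrs_b in graph_b.items():
            score = jaccard(attrs_a, attrs_b)
            if score > best_score:
                best, best_score = ent_b, score
        if best is not None and best_score >= threshold:
            matches[ent_a] = (best, best_score)
    return matches


# Two toy graphs with different identifiers and partly different attributes.
graph_a = {
    "A:1": {("label", "Leipzig"), ("country", "Germany"), ("type", "city")},
    "A:2": {("label", "Lisbon"), ("country", "Portugal"), ("type", "city")},
}
graph_b = {
    "B:x": {("label", "Leipzig"), ("country", "Germany"), ("type", "city"), ("state", "Saxony")},
    "B:y": {("label", "Athens"), ("country", "Greece"), ("type", "city")},
}

matches = align(graph_a, graph_b)
```

In this toy data `A:1` and `B:x` share three of four distinct attributes and are aligned, while `A:2` has no counterpart above the threshold and is left unmatched.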
As the amount of data stored in knowledge graphs grows, developing dependable methods for knowledge graph entity alignment becomes an increasingly crucial step in the integration and cohesion of knowledge graph data.
See also
- Concept map – Diagram showing relationships among concepts
- Formal semantics (natural language) – Study of meaning in natural languages
- Graph database – Database using graph structures for queries
- Knowledge graph embedding – Dimensionality reduction of graph-based semantic data objects [machine learning task]
- Logical graph – Type of diagrammatic notation for propositional logic
- Semantic integration – Interrelating info from diverse sources
- Semantic technology – Technology to help machines understand data
- Topic map – Knowledge organization system
- Vadalog – Type of Knowledge Graph Management System
- YAGO (database) – Open-source information repository
References
- ^ "What is a Knowledge Graph?". 2018.
- ^ "What defines a knowledge graph?". 2020.
- ^ a b c Ehrlinger, Lisa; Wöß, Wolfram (2016). Towards a Definition of Knowledge Graphs (PDF). SEMANTiCS2016. Leipzig: Joint Proceedings of the Posters and Demos Track of 12th International Conference on Semantic Systems – SEMANTiCS2016 and 1st International Workshop on Semantic Change & Evolving Semantics (SuCCESS16). pp. 13–16.
- ^ Soylu, Ahmet (2020). "Enhancing Public Procurement in the European Union Through Constructing and Exploiting an Integrated Knowledge Graph". The Semantic Web – ISWC 2020. Lecture Notes in Computer Science. Vol. 12507. pp. 430–446. doi:10.1007/978-3-030-62466-8_27. ISBN 978-3-030-62465-1. S2CID 226229398.
- ^ Mohamed, Sameh K.; Nounu, Aayah; Nováček, Vít (2021). "Biological applications of knowledge graph embedding models". Briefings in Bioinformatics. 22 (2): 1679–1693. doi:10.1093/bib/bbaa012. hdl:1983/919db5c6-6e10-4277-9ff9-f86bbcedcee8. PMID 32065227 – via Oxford Academic.
- ^ Edward W. Schneider. 1973. Course Modularization Applied: The Interface System and Its Implications For Sequence Control and Data Analysis. In Association for the Development of Instructional Systems (ADIS), Chicago, Illinois, April 1972
- ^ "US Trademark no 75589756".
- ^ "ThinkBase". Retrieved 25 December 2024.
- ^ a b Singhal, Amit (May 16, 2012). "Introducing the Knowledge Graph: things, not strings". Official Google Blog. Retrieved 21 March 2017.
- ^ Schwartz, Barry (December 17, 2014). "Google's Freebase To Close After Migrating To Wikidata: Knowledge Graph Impact?". Search Engine Roundtable. Retrieved December 10, 2017.
- ^ a b McCusker, James P.; McGuiness, Deborah L. "What is a Knowledge Graph?". www.authorea.com. Retrieved 21 March 2017.
- ^ "Knowledge Graph Enterprises". 2020.
- ^ "2021 IEEE International Conference on Knowledge Graph (ICKG)*". KMedu Hub. 2017-07-09. Retrieved 2021-03-22.
- ^ Hogan, Aidan; Blomqvist, Eva; Cochez, Michael; d'Amato, Claudia; de Melo, Gerard; Gutierrez, Claudio; Labra Gayo, José Emilio; Kirrane, Sabrina; Neumaier, Sebastian; Polleres, Axel; Navigli, Roberto; Ngonga Ngomo, Axel-Cyrille; Rashid, Sabbir M.; Rula, Anisa; Schmelzeisen, Lukas; Sequeda, Juan; Staab, Steffen; Zimmermann, Antoine (2021-01-24). "Knowledge Graphs". ACM Computing Surveys. 54 (4): 1–37. arXiv:2003.02320. doi:10.1145/3447772. ISSN 0360-0300. S2CID 235716181.
- ^ Paulheim, Heiko (2017). "Knowledge Graph Refinement: A Survey of Approaches and Evaluation Methods" (PDF). Semantic Web: 489–508. Retrieved 21 March 2017.
- ^ Krötsch, Markus; Weikum, Gerhard (March 2016). "Editorial of the Special Issue on Knowledge Graphs". Journal of Web Semantics. 37–38: 53–54. doi:10.1016/j.websem.2016.04.002. Retrieved 10 February 2021.
- ^ "What is a Knowledge Graph?|Ontotext". Ontotext. Retrieved 2020-07-01.
- ^ Peng, Ciyuan; Feng, Xia; Naseriparsa, Mehdi; Osborne, Francesco (2023). "Knowledge Graphs: Opportunities and Challenges". Artificial Intelligence Review. 56 (11): 13071–13102. arXiv:2303.13948. doi:10.1007/s10462-023-10465-9. ISSN 1573-7462. PMC 10068207. PMID 37362886.
- ^ "The Knowledge Graph about Knowledge Graphs". 2020.
- ^ "The Linked Open Data Cloud". lod-cloud.net. Retrieved 2020-06-30.
- ^ Pyne, Yvette; Stewart, Stuart (March 2022). "Meta-work: how we research is as important as what we research". British Journal of General Practice. 72 (716): 130–131. doi:10.3399/bjgp22X718757. PMC 8884432. PMID 35210247.
- ^ "Neo4j Graph Database & Analytics | Graph Database Management System". Neo4j. Retrieved 8 November 2023.
- ^ "Ontotext GraphDB". Ontotext. Retrieved 8 November 2023.
- ^ "How do knowledge graphs work?". Stardog. 2022-04-05. Retrieved 2022-04-05.
- ^ "Unlocking the Power of Google Knowledge Panel: How to Obtain and Claim Yours in 2023 – RH Razu". rhrazu.com. 2023-09-01. Retrieved 2023-09-05.
- ^ Hongwei Wang (October 2018). "RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems". Proceedings of the 27th ACM International Conference on Information and Knowledge Management. pp. 417–426. arXiv:1803.03467. doi:10.1145/3269206.3271739. ISBN 9781450360142. S2CID 3766110.
- ^ Ristoski, Petar; Paulheim, Heiko (2016), "RDF2Vec: RDF Graph Embeddings for Data Mining" (PDF), The Semantic Web – ISWC 2016, Lecture Notes in Computer Science, vol. 9981, pp. 498–514, doi:10.1007/978-3-319-46523-4_30, ISBN 978-3-319-46522-7
- ^ Zhou, Jie; et al. (2020). "Graph neural networks: A review of methods and applications". AI Open. 1 (1): 57–81. arXiv:1812.08434. doi:10.1016/j.aiopen.2021.01.001. S2CID 56517517 – via Elsevier Science Direct.
- ^ Ye, Zi; Kumar, Yogan Jaya; Sing, Goh Ong; Song, Fengyan; Wang, Junsong (2022). "A comprehensive survey of graph neural networks for knowledge graphs". IEEE Access. 10: 75729–7574. Bibcode:2022IEEEA..1075729Y. doi:10.1109/ACCESS.2022.3191784. S2CID 250654689 – via IEEE Xplore.
- ^ Berrendorf, Max; Faerman, Evgeniy; Melnychuk, Valentyn; Tresp, Volker; Seidl, Thomas (April 14–17, 2020). Knowledge graph entity alignment with graph convolutional networks: lessons learned. Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal. Lecture Notes in Computer Science. Vol. Proceedings, Part II. pp. 3–11. arXiv:1911.08342. doi:10.1007/978-3-030-45442-5_1. ISBN 978-3-030-45441-8. S2CID 208158314 – via Springer International Publishing.
- ^ Chaurasiya, Deepak; Surisetty, Anil; Kumar, Nitish; Singh, Alok; Dey, Vikrant; Malhotra, Aakarsh; Dhama, Gaurav; Arora, Ankur (2022). "Entity alignment for knowledge graphs: progress, challenges, and empirical studies". arXiv:2205.08777 [cs.AI].
- ^ Hogan, Aidan; Lippolis, Anna Sofia; Klironomos, Antonis; Milon-Flores, Daniela F.; Zheng, Heng; Jouglar, Alexane; Norouzi, Ebrahim (2023). "Enhancing Entity Alignment Between Wikidata and ArtGraph using LLMs" (PDF). Proceedings of the International Workshop on Semantic Web and Ontology Design for Cultural Heritage – via International Workshop on Semantic Web and Ontology Design for Cultural Heritage (SWODCH), Athens, Greece.
External links
- Will Douglas Heaven (4 September 2020). "This know-it-all AI learns by reading the entire web nonstop". MIT Technology Review. Retrieved 5 September 2020.
Diffbot is building the biggest-ever knowledge graph by applying image recognition and natural-language processing to billions of web pages.