
GermaNet: Difference between revisions

From Wikipedia, the free encyclopedia
Added content to description, as well as re-worded complicated explanations and added licensing information.
Added citations.
 
(16 intermediate revisions by 11 users not shown)
Line 1: Line 1:
{{primary sources|date=November 2011}}
{{primary sources|date=November 2011}}
'''GermaNet''' is a lexical-semantic net for the [[German language]] that relates [[noun]]s, [[verb]]s, and [[adjective]]s semantically by grouping lexical units that express the same concept into ''[[synset]]s'' and by defining [[semantic]] relations between these synsets.<ref name="Storjohann2010">{{cite book|author=Petra Storjohann|title=Lexical-semantic relations: theoretical and practical perspectives|url=https://books.google.com/books?id=OYBWObJ547AC&pg=PA165|accessdate=16 November 2011|date=23 June 2010|publisher=John Benjamins Publishing Company|isbn=978-90-272-3138-3|pages=165–}}</ref> GermaNet has much in common with the English [[WordNet]] and can be viewed as an on-line [[thesaurus]] or a light-weight [[ontology (information science)|ontology]]. GermaNet has been developed and maintained within various projects at the research group for General and Computational Linguistics, [[University of Tübingen]] since 1997. It has been integrated into the [[EuroWordNet]], a multilingual lexical-semantic database.<ref name="homepage">[http://www.sfs.uni-tuebingen.de/lsd/index.shtml GermaNet homepage]</ref>
'''GermaNet''' is a [[semantic network]] for the [[German language]]. It relates [[noun]]s, [[verb]]s, and [[adjective]]s semantically by grouping lexical units that express the same concept into ''[[synset]]s'' and by defining [[semantic]] relations between these synsets.<ref name="Storjohann2010">{{cite book|author=Petra Storjohann|title=Lexical-semantic relations: theoretical and practical perspectives|url=https://books.google.com/books?id=OYBWObJ547AC&pg=PA165|accessdate=16 November 2011|date=23 June 2010|publisher=John Benjamins Publishing Company|isbn=978-90-272-3138-3|pages=165–}}</ref> GermaNet is free for academic use, after signing a license. GermaNet shares much in common with the English [[WordNet]] and can be viewed as an online [[thesaurus]] or a light-weight [[ontology]].<ref name="Kunze">{{Cite journal|last1=Kunze|first1=Claudia|last2=Lemnitzer|first2=Lothar|title=GermaNet representation, visualization, application|journal=Proceedings of LREC 2002|year=2002|url=https://aclanthology.org/L02-1073/|access-date=1 January 2025}}</ref> GermaNet has been developed and maintained at the [[University of Tübingen]] since 1997 within the research group for General and Computational Linguistics. It has been integrated into the [[EuroWordNet]], a multilingual lexical-semantic database.<ref name="homepage">{{Cite web|url=https://uni-tuebingen.de/en/142806|title=GermaNet - an Introduction|website=uni-tuebingen.de|accessdate=October 1, 2020}}</ref>


==Database==
==Database==


===Contents===
===Contents===
GermaNet partitions the lexical space into a set of concepts that are interlinked by semantic relations. A semantic concept is modeled by a ''[[synset]]''. A synset is a set of words (called lexical units) where all the words are taken to have the same or almost the same meaning.Thus a synset is a set of synonyms grouped under one definition, or "gloss".
GermaNet partitions the lexical space into a set of concepts that are interlinked by semantic relations. A semantic concept is modeled by a ''[[synset]]''. A synset is a set of words (called lexical units) where all the words are taken to have the same or almost the same meaning. Thus, a synset is a set of synonyms grouped under one definition, or "gloss".<ref name="Kunze"></ref>


In addition to the gloss, synsets are labeled with their syntactic function and accompanied by example sentences for each distinct meaning in the synset.<ref name="GernEdiT">V. Henrich, E. Hinrichs. 2010. [http://www.lrec-conf.org/proceedings/lrec2010/pdf/264_Paper.pdf GernEdiT - The GermaNet Editing Tool]. In: ''Proceedings of the Seventh Conference on International Language Resources and Evaluation''.</ref>
In addition to the gloss, synsets are labeled with their syntactic function and accompanied by example sentences for each distinct meaning in the synset.<ref name="GernEdiT">V. Henrich, E. Hinrichs. 2010. [http://www.lrec-conf.org/proceedings/lrec2010/pdf/264_Paper.pdf GernEdiT - The GermaNet Editing Tool]. In: ''Proceedings of the Seventh Conference on International Language Resources and Evaluation''.</ref> Just as in [[WordNet]], for each word category the semantic space is divided into a number of [[semantic field]]s closely related to major nodes in the semantic network: ''Ort'', or "location", ''Körper'', or "body", etc.<ref name="homepage" />
Just as in WordNet, for each word category the semantic space is divided into a number of [[semantic field]]s closely related to major nodes in the semantic network: ''Ort'', or "location", ''Körper'', or "body", etc.<ref name="homepage" />


The following is an up-to-date statistics of GermaNet's version 11.0 contents (release May 2016):
As of version 15.0 (released May 2020), GermaNet contains:<ref name="homepage" />

* Synsets: 144113
*Number of synsets: 110167
*Number of Lexical Units: 142814
* Lexical Units: 185000
*Number of Literals: 126348
* Literals: 169521
*Number of Conceptual Relations: 123678
* Conceptual Relations: 157921
*Number of Lexical Relations (synonymy excluded): 4203
* Lexical Relations (synonymy excluded): 12203
*Number of Split Compounds: 66047
* Split Compounds: 98905
*Number of Interlingual Index (ILI) Records: 28567
* Interlingual Index (ILI) Records: 28564
*Number of Wikitionary Sense Descriptions: 28552<ref name="homepage" />
* Wiktionary Sense Descriptions: 29548


===Format===
===Format===
All GermaNet data is stored in a relational [[PostgreSQL]] 5 database. The database model follows the internal structure of GermaNet: there are tables to store synsets, lexical units, conceptual and lexical relations, etc.<ref name="GernEdiT"/> The distribution format of all GermaNet data is [[XML]]. The two types of files, one for synsets and the other for relations, represent all data that is available in the GermaNet database.
All GermaNet data is stored in a [[PostgreSQL]] [[relational database]]. The database schema follows the internal structure of GermaNet: there are tables to store synsets, lexical units, conceptual and lexical relations, etc.<ref name="GernEdiT"/> GermaNet data is distributed both in this database format and as [[XML]] files. In the XML data, two types of files, one for synsets and the other for relations, represent all data available in the GermaNet database.<ref name="dataformat">{{Cite web|url=https://uni-tuebingen.de/en/142817|title=Data format|accessdate=October 1, 2020}}</ref>


==Interfaces==
==Interfaces==
There are several [[Application Programming Interface]]s (API) available for [[Java (programming language)|Java]]<ref name="api">[http://www.sfs.uni-tuebingen.de/lsd/tools.shtml GermaNet APIs in Java]</ref> and for [[Perl]]. These APIs are distributed freely and provide easy access to all information in various versions of GermaNet.
There are software libraries and [[Application Programming Interface|APIs]] available for [[Java (programming language)|Java]], [[Python (programming language)|Python]], [[JavaScript]], and [[Perl]].<ref name="api">{{Cite web|url=https://uni-tuebingen.de/en/142818|title=Applications and Tools|website=uni-tuebingen.de|accessdate=October 1, 2020}}</ref><ref>{{Cite web|url=https://metacpan.org/pod/GermaNet::Flat|title=GermaNet::Flat|website=metacpan.org|accessdate=October 1, 2020}}</ref> These programs are distributed under [[free-software license]]s and provide easy access to all information in various versions of GermaNet.

[https://weblicht.sfs.uni-tuebingen.de/rover GermaNet Rover] is an on-line application that can be used to search for synsets in GermaNet, explore the data associated with them, and calculate the [[semantic similarity]] of pairs of synsets. It features visualizations of the [[Hyponymy and hypernymy|hypernym]] relation and advanced filtering options for synset searching.


==Licenses==
==Licenses==
GermaNet 11.0 (released May 2016) is free for academic. It can be distributed under one of the following types of [[software license agreement|license agreements]]:
GermaNet 15.0 (released May 2020) can be distributed under one of the following types of [[software license agreement|license agreements]]:<ref>{{Cite web|url=https://uni-tuebingen.de/en/142828|title=Licenses|website=uni-tuebingen.de|access-date=October 1, 2020}}</ref>
* ''Academic Research Agreement'': free for the research purposes of academic institutions. Licenses are not given to individuals, and those seeking a license are required to talk to an academic advisor.


* ''Research and Development Agreement'': applies to non-academic institutions and research consortia. To be used strictly for technology development and internal research.
* ''Academic Research License Agreement'': for the purpose of research at academic institutions. There is no license fee for academic use. Licenses are not given to individual students, and those seeking a license are required to talk to an academic advisor.


* ''Commercial Agreement'': applies to non-academic institutions and commercial enterprises. It permits technology development and internal research, as well as giving the non-exclusive right to distribute and market any derived product or service.<ref>{{Cite web|url=http://www.sfs.uni-tuebingen.de/lsd/licenses.shtml|title=Licenses|website=www.sfs.uni-tuebingen.de|access-date=2017-03-26}}</ref>
* ''Research and Development License Agreement'': applies to non-academic institutions and research consortia. To be used strictly for technology development and internal research.


* ''Commercial License Agreement'': applies to non-academic institutions and commercial enterprises. It permits technology development and internal research, as well as giving the non-exclusive right to distribute and market any derived product or service.
==Applications==

GermaNet has been used for a variety of applications, including semantic analysis, shallow recognition of implicit document structure, compound analysis;<ref>Manuela Kunze and Dietmar Rösner. 2004. Issues in Exploiting GermaNet as a Resource in Real Applications.</ref> for analyzing sectional preferences,<ref>Sabine Schulte im Walde, 2004. GermaNet Synsets as Selectional Preferences in Semantic Verb Clustering.</ref> for word sense disambiguation,<ref>Saito et al., 2002. Evaluation of GermanNet: Problems Using GermaNet for Automatic Word Sense Disambiguation.</ref> etc.
== See also==
==Alternatives==
Open-de-WordNet is a freely available alternative to GermaNet which is compatible with [[WordNet]].<ref name="odenet">{{Cite web|url=https://github.com/hdaSprachtechnologie/odenet|title=GitHub - hdaSprachtechnologie/odenet: Open German WordNet|date=November 14, 2019|accessdate=November 20, 2019|via=GitHub}}</ref>

==Linguistic Applications==
GermaNet has been used for a variety of applications, including:

* semantic analysis<ref name="KunzeRoesner2004">Manuela Kunze and Dietmar Rösner. 2004. Issues in Exploiting GermaNet as a Resource in Real Applications.</ref>
* shallow recognition of implicit document structure<ref name="KunzeRoesner2004"/>
* compound analysis<ref name="KunzeRoesner2004"/>
* analyzing selectional preferences<ref name="Schulte2004">Sabine Schulte im Walde, 2004. GermaNet Synsets as Selectional Preferences in Semantic Verb Clustering.</ref>
* word sense disambiguation<ref>Saito et al., 2002. Evaluation of GermaNet: Problems Using GermaNet for Automatic Word Sense Disambiguation.</ref>

==See also==
* [[Hyponym]]
* [[Hyponym]]
* [[Is-a]]
* [[Is-a]]
Line 46: Line 58:
* [[Synonym Ring]]
* [[Synonym Ring]]
* [[Taxonomy (general)|Taxonomy]]
* [[Taxonomy (general)|Taxonomy]]
* [[ThoughtTreasure]]
* [[UBY-LMF]]
* [[UBY-LMF]]
* [[Word sense disambiguation]]
* [[Word sense disambiguation]]
Line 52: Line 63:
==References==
==References==
{{Reflist}}
{{Reflist}}

== External links ==
* {{Official website|https://uni-tuebingen.de/en/142806}}
* [https://weblicht.sfs.uni-tuebingen.de/rover/ GermaNet Rover online browser]


{{Authority control}}
{{Authority control}}

Latest revision as of 14:05, 2 January 2025

GermaNet is a semantic network for the German language. It relates nouns, verbs, and adjectives semantically by grouping lexical units that express the same concept into synsets and by defining semantic relations between these synsets.[1] It is free for academic use once a license agreement has been signed. GermaNet has much in common with the English WordNet and can be viewed as an online thesaurus or a lightweight ontology.[2] It has been developed and maintained since 1997 within the research group for General and Computational Linguistics at the University of Tübingen. It has been integrated into EuroWordNet, a multilingual lexical-semantic database.[3]

Database


Contents


GermaNet partitions the lexical space into a set of concepts that are interlinked by semantic relations. A semantic concept is modeled by a synset. A synset is a set of words (called lexical units) where all the words are taken to have the same or almost the same meaning. Thus, a synset is a set of synonyms grouped under one definition, or "gloss".[2]

In addition to the gloss, synsets are labeled with their syntactic function and accompanied by example sentences for each distinct meaning in the synset.[4] Just as in WordNet, for each word category the semantic space is divided into a number of semantic fields closely related to major nodes in the semantic network: Ort, or "location", Körper, or "body", etc.[3]
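
The structure described above can be sketched as a small data model. This is only an illustration under stated assumptions: the class and field names (`LexicalUnit`, `Synset`, `orth_form`, `relate`) are hypothetical and are not taken from any GermaNet API, and the example words and glosses are invented for demonstration rather than copied from GermaNet data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class LexicalUnit:
    """One word belonging to a synset (hypothetical field name)."""
    orth_form: str


@dataclass
class Synset:
    """A set of lexical units that share one meaning."""
    gloss: str            # the shared definition
    word_category: str    # "noun", "verb" or "adjective"
    semantic_field: str   # e.g. "Ort" (location) or "Körper" (body)
    lexical_units: list[LexicalUnit] = field(default_factory=list)
    # relation name -> target synsets (semantic relations between synsets)
    relations: dict[str, list[Synset]] = field(default_factory=dict)

    def words(self) -> list[str]:
        """Return the synonyms grouped in this synset."""
        return [unit.orth_form for unit in self.lexical_units]

    def relate(self, relation: str, target: Synset) -> None:
        """Record a semantic relation from this synset to another one."""
        self.relations.setdefault(relation, []).append(target)


# Invented example entries, not real GermaNet records.
koerperteil = Synset("part of a body", "noun", "Körper",
                     [LexicalUnit("Körperteil")])
kopf = Synset("uppermost part of the body", "noun", "Körper",
              [LexicalUnit("Kopf"), LexicalUnit("Haupt")])
kopf.relate("hypernym", koerperteil)

print(kopf.words())
```

Keeping relations on the synset rather than on the individual words mirrors the description above, in which semantic relations are defined between synsets.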

As of version 15.0 (released May 2020), GermaNet contains:[3]

  • Synsets: 144113
  • Lexical Units: 185000
  • Literals: 169521
  • Conceptual Relations: 157921
  • Lexical Relations (synonymy excluded): 12203
  • Split Compounds: 98905
  • Interlingual Index (ILI) Records: 28564
  • Wiktionary Sense Descriptions: 29548
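
A few ratios can be derived from these counts to get a feel for the density of the network. The short calculation below uses only the figures listed above; the dictionary keys and the choice of ratios are illustrative, not official GermaNet statistics.

```python
# Counts copied from the version 15.0 list above.
STATS = {
    "synsets": 144113,
    "lexical_units": 185000,
    "literals": 169521,
    "conceptual_relations": 157921,
}

# Average number of lexical units (synonyms) grouped in one synset.
units_per_synset = STATS["lexical_units"] / STATS["synsets"]

# Average number of conceptual relations attached to one synset.
relations_per_synset = STATS["conceptual_relations"] / STATS["synsets"]

# Ratio of lexical units to literals.
units_per_literal = STATS["lexical_units"] / STATS["literals"]

print(f"lexical units per synset:        {units_per_synset:.2f}")      # 1.28
print(f"conceptual relations per synset: {relations_per_synset:.2f}")  # 1.10
print(f"lexical units per literal:       {units_per_literal:.2f}")     # 1.09
```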

Format


All GermaNet data is stored in a PostgreSQL relational database. The database schema follows the internal structure of GermaNet: there are tables to store synsets, lexical units, conceptual and lexical relations, etc.[4] GermaNet data is distributed both in this database format and as XML files. In the XML data, two types of files, one for synsets and the other for relations, represent all data available in the GermaNet database.[5]
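
As a rough sketch of how the two kinds of XML file could be read, the example below parses one synset document and one relation document with Python's standard library. The element and attribute names (`synset`, `lexUnit`, `orthForm`, `con_rel`, `from`, `to`) and the sample records are assumptions made for illustration; the official data format documentation should be consulted for the real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical synset file: synsets with their lexical units.
SYNSET_XML = """
<synsets>
  <synset id="s1" category="nomen" class="Körper">
    <lexUnit id="l1"><orthForm>Kopf</orthForm></lexUnit>
    <lexUnit id="l2"><orthForm>Haupt</orthForm></lexUnit>
  </synset>
  <synset id="s2" category="nomen" class="Körper">
    <lexUnit id="l3"><orthForm>Körperteil</orthForm></lexUnit>
  </synset>
</synsets>
"""

# Hypothetical relation file: conceptual relations between synset ids.
RELATIONS_XML = """
<relations>
  <con_rel name="has_hypernym" from="s1" to="s2"/>
</relations>
"""


def load_synsets(xml_text):
    """Map each synset id to the list of words it contains."""
    root = ET.fromstring(xml_text.strip())
    return {
        synset.get("id"): [unit.findtext("orthForm")
                           for unit in synset.iter("lexUnit")]
        for synset in root.iter("synset")
    }


def load_relations(xml_text):
    """Return (relation name, source id, target id) triples."""
    root = ET.fromstring(xml_text.strip())
    return [(rel.get("name"), rel.get("from"), rel.get("to"))
            for rel in root.iter("con_rel")]


synsets = load_synsets(SYNSET_XML)
relations = load_relations(RELATIONS_XML)

for name, source, target in relations:
    print(synsets[source], name, synsets[target])
```

Because relations refer to synsets only by identifier, the synset files can be loaded first and the relation file joined against them afterwards.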

Interfaces


There are software libraries and APIs available for Java, Python, JavaScript, and Perl.[6][7] These programs are distributed under free-software licenses and provide easy access to all information in various versions of GermaNet.

GermaNet Rover is an online application that can be used to search for synsets in GermaNet, explore the data associated with them, and calculate the semantic similarity of pairs of synsets. It features visualizations of the hypernym relation and advanced filtering options for synset searching.
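
The text does not say which similarity measures Rover implements. As one simple illustration of how semantic similarity can be derived from the hypernym relation, the sketch below computes a path-based score on a tiny invented taxonomy. The graph, the function names, and the formula 1 / (1 + path length) are assumptions chosen for the example; they are not Rover's method and the entries are not GermaNet data.

```python
from collections import deque

# Invented toy taxonomy: each node maps to its direct hypernyms.
HYPERNYMS = {
    "Kopf": ["Körperteil"],
    "Hand": ["Körperteil"],
    "Körperteil": ["Objekt"],
    "Stadt": ["Ort"],
    "Ort": ["Objekt"],
    "Objekt": [],
}


def path_length(a, b, hypernyms=HYPERNYMS):
    """Shortest number of hypernym/hyponym edges between two nodes."""
    # Treat hypernym links as undirected edges.
    neighbours = {node: set(parents) for node, parents in hypernyms.items()}
    for node, parents in hypernyms.items():
        for parent in parents:
            neighbours.setdefault(parent, set()).add(node)

    # Breadth-first search from a until b is reached.
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, distance = queue.popleft()
        if node == b:
            return distance
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, distance + 1))
    return None  # no connecting path


def path_similarity(a, b):
    """Score in (0, 1]; identical nodes score 1.0, distant nodes less."""
    distance = path_length(a, b)
    return None if distance is None else 1 / (1 + distance)


print(path_similarity("Kopf", "Hand"))   # two edges apart
print(path_similarity("Kopf", "Stadt"))  # four edges apart
```

In this toy graph, "Kopf" and "Hand" share a direct hypernym and therefore score higher than "Kopf" and "Stadt", which meet only at the root.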

Licenses


GermaNet 15.0 (released May 2020) can be distributed under one of the following types of license agreements:[8]

  • Academic Research License Agreement: for the purpose of research at academic institutions. There is no license fee for academic use. Licenses are not given to individual students, and those seeking a license are required to talk to an academic advisor.
  • Research and Development License Agreement: applies to non-academic institutions and research consortia. The data may be used strictly for technology development and internal research.
  • Commercial License Agreement: applies to non-academic institutions and commercial enterprises. It permits technology development and internal research, as well as giving the non-exclusive right to distribute and market any derived product or service.

Alternatives


Open-de-WordNet is a freely available alternative to GermaNet which is compatible with WordNet.[9]

Linguistic Applications


GermaNet has been used for a variety of applications, including:

  • semantic analysis[10]
  • shallow recognition of implicit document structure[10]
  • compound analysis[10]
  • analyzing selectional preferences[11]
  • word sense disambiguation[12]

See also


References

  1. ^ Petra Storjohann (23 June 2010). Lexical-semantic relations: theoretical and practical perspectives. John Benjamins Publishing Company. pp. 165–. ISBN 978-90-272-3138-3. Retrieved 16 November 2011.
  2. ^ a b Kunze, Claudia; Lemnitzer, Lothar (2002). "GermaNet – representation, visualization, application". Proceedings of LREC 2002. Retrieved 1 January 2025.
  3. ^ a b c "GermaNet - an Introduction". uni-tuebingen.de. Retrieved October 1, 2020.
  4. ^ a b V. Henrich, E. Hinrichs. 2010. GernEdiT - The GermaNet Editing Tool. In: Proceedings of the Seventh Conference on International Language Resources and Evaluation.
  5. ^ "Data format". Retrieved October 1, 2020.
  6. ^ "Applications and Tools". uni-tuebingen.de. Retrieved October 1, 2020.
  7. ^ "GermaNet::Flat". metacpan.org. Retrieved October 1, 2020.
  8. ^ "Licenses". uni-tuebingen.de. Retrieved October 1, 2020.
  9. ^ "GitHub - hdaSprachtechnologie/odenet: Open German WordNet". November 14, 2019. Retrieved November 20, 2019 – via GitHub.
  10. ^ a b c Manuela Kunze and Dietmar Rösner. 2004. Issues in Exploiting GermaNet as a Resource in Real Applications.
  11. ^ Sabine Schulte im Walde, 2004. GermaNet Synsets as Selectional Preferences in Semantic Verb Clustering.
  12. ^ Saito et al., 2002. Evaluation of GermaNet: Problems Using GermaNet for Automatic Word Sense Disambiguation.