Structured learning: Difference between revisions
Revision as of 17:11, 21 March 2013

Structured learning is the subfield of [[machine learning]] concerned with computer programs that learn to map inputs to outputs, both from arbitrarily complex sets. This stands in contrast to the simpler approaches of [[statistical classification|classification]], where input data (instances) are mapped to "atomic" labels, i.e. symbols without any internal structure, and [[regression analysis|regression]], where inputs are mapped to scalar numbers or vectors.[1]
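One informal way to picture this contrast is as function signatures. The sketch below is illustrative only; the names ''classify'', ''regress'' and ''tag'' are hypothetical, not an established API:

<syntaxhighlight lang="python">
from typing import List, Tuple

# An informal sketch of the contrast described above; the function
# names and types are illustrative assumptions, not a fixed API.

def classify(x: str) -> str:
    """Classification: each input receives one atomic label, e.g. "spam"."""
    ...

def regress(x: str) -> float:
    """Regression: each input is mapped to a scalar number."""
    ...

def tag(sentence: List[str]) -> List[Tuple[str, str]]:
    """Structured learning: the output is itself structured, here a
    sequence of (token, tag) pairs whose length depends on the input."""
    ...
</syntaxhighlight>
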
Algorithms and models for structured learning include [[inductive logic programming]], [[structured SVM]]s, [[conditional random field]]s, [[Markov logic network]]s and [[Constrained Conditional Models]].
==Example: sequence tagging==

Sequence tagging is a class of problems prevalent in [[natural language processing]], where input data are often sequences (e.g. sentences of text). The sequence tagging problem appears in several guises, e.g. [[part-of-speech tagging]] and [[named entity recognition]]. In POS tagging, each word in a sequence must receive a "tag" (class label) that expresses its "type" of word:
:This [[Determiner|DT]]
:is [[Verb|VBZ]]
:a DT
:tagged [[Adjective|JJ]]
:sentence [[Noun|NN]]
:. .
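In code, a tagged sentence such as the one above is commonly represented as a sequence of token–tag pairs. The following Python sketch (variable names are illustrative) shows this representation, which is also the shape returned by toolkits such as NLTK's <code>pos_tag</code>:

<syntaxhighlight lang="python">
# The tagged sentence above as a list of (token, tag) pairs.
tagged_sentence = [
    ("This", "DT"),      # determiner
    ("is", "VBZ"),       # verb, 3rd-person singular present
    ("a", "DT"),         # determiner
    ("tagged", "JJ"),    # adjective
    ("sentence", "NN"),  # noun, singular
    (".", "."),          # sentence-final punctuation
]

# A sequence tagger maps the token sequence to the tag sequence,
# one tag per token:
tokens = [token for token, _ in tagged_sentence]
tags = [tag for _, tag in tagged_sentence]
assert len(tokens) == len(tags)
</syntaxhighlight>
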
The main challenge in this problem is to resolve [[ambiguity]]: the word "sentence" can also be [[:wikt:sentence#Verb|a verb]] in English, and so can "tagged".

While this problem can be solved by simply performing [[statistical classification|classification]] of individual tokens, that approach does not take into account the empirical fact that tags do not occur independently; instead, each tag displays a strong [[conditional dependence]] on the tag of the previous word. This fact can be exploited in a sequence model such as a [[hidden Markov model]] or conditional random field that predicts the entire tag sequence for a sentence, rather than just individual tags, by means of the [[Viterbi algorithm]].
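A minimal, self-contained sketch of this idea: the snippet below decodes the example sentence with the Viterbi algorithm over a toy hidden Markov model. All transition and emission probabilities are hand-picked assumptions chosen so the example works, not parameters of any real tagger; a practical system would estimate them from a tagged corpus.

<syntaxhighlight lang="python">
import math

# Toy HMM for the example sentence; every probability below is a
# hand-picked illustrative assumption, not an estimate from a corpus.
STATES = ["DT", "VBZ", "JJ", "NN", "."]

# Transition probabilities P(tag | previous tag); "<s>" marks the start.
TRANS = {
    "<s>": {"DT": 0.8, "VBZ": 0.05, "JJ": 0.05, "NN": 0.1},
    "DT":  {"JJ": 0.3, "NN": 0.6, "VBZ": 0.05, "DT": 0.05},
    "VBZ": {"DT": 0.5, "JJ": 0.2, "NN": 0.2, "VBZ": 0.1},
    "JJ":  {"NN": 0.7, "JJ": 0.2, ".": 0.1},
    "NN":  {"VBZ": 0.4, ".": 0.4, "NN": 0.2},
    ".":   {},
}

# Emission probabilities P(word | tag); "tagged" and "sentence" are
# deliberately ambiguous between several tags, as discussed above.
EMIT = {
    "DT":  {"this": 0.5, "a": 0.5},
    "VBZ": {"is": 0.8, "tagged": 0.1, "sentence": 0.1},
    "JJ":  {"tagged": 1.0},
    "NN":  {"sentence": 0.9, "tagged": 0.1},
    ".":   {".": 1.0},
}

def viterbi(words):
    """Return the most probable tag sequence for `words` under the toy HMM."""
    words = [w.lower() for w in words]
    # best[tag] = (log-probability of the best path ending in tag, that path)
    best = {}
    for t in STATES:
        p = TRANS["<s>"].get(t, 0.0) * EMIT[t].get(words[0], 0.0)
        if p > 0.0:
            best[t] = (math.log(p), [t])
    for w in words[1:]:
        new_best = {}
        for t in STATES:
            e = EMIT[t].get(w, 0.0)
            if e == 0.0:
                continue
            # Choose the predecessor that maximizes path score * transition.
            candidates = [
                (lp + math.log(TRANS[prev][t] * e), path + [t])
                for prev, (lp, path) in best.items()
                if TRANS[prev].get(t, 0.0) > 0.0
            ]
            if candidates:
                new_best[t] = max(candidates)
        best = new_best
    return max(best.values())[1]

print(viterbi(["This", "is", "a", "tagged", "sentence", "."]))
# -> ['DT', 'VBZ', 'DT', 'JJ', 'NN', '.']
</syntaxhighlight>

Note how the decoder picks JJ for "tagged" and NN for "sentence" not from the words alone but because the whole path DT–VBZ–DT–JJ–NN–. scores higher than any path through their verb readings.
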
==References==
# Gökhan BakIr, Ben Taskar, Thomas Hofmann, Bernhard Schölkopf, Alex Smola and S. V. N. Vishwanathan (2007), ''Predicting Structured Data'', MIT Press.