Pivot language: Difference between revisions

Latest revision as of 16:16, 14 April 2024

A pivot language, sometimes also called a bridge language, is an artificial or natural language used as an intermediary for translation between many different languages: to translate between any pair of languages A and B, one translates A to the pivot language P, then from P to B. Using a pivot language avoids the combinatorial explosion of needing a translator for every pair of supported languages, because the number of language pairs to cover grows linearly ( $n-1$ ) rather than quadratically $\left(\textstyle {\binom {n}{2}}={\frac {n^{2}-n}{2}}\right)$. One translator need only know language A and the pivot language P (and another translator language B and the pivot P), rather than a different translator being needed for every possible combination of A and B.
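The two counts can be compared with a short sketch. It assumes that the pivot is itself one of the n supported languages and that a "translator" covers one unordered pair of languages; the function names are illustrative only.

```python
from math import comb


def direct_pairs(n: int) -> int:
    """Language pairs to cover when every pair is translated directly: C(n, 2)."""
    return comb(n, 2)


def pivot_pairs(n: int) -> int:
    """Language pairs to cover when one of the n languages acts as the pivot: n - 1."""
    return n - 1


# Compare the two counts for a few values of n.
for n in (3, 10, 24):
    print(f"n={n}: direct={direct_pairs(n)}, via pivot={pivot_pairs(n)}")
```

For ten languages this gives 45 direct pairs against 9 pairs that involve the pivot.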

The disadvantage of a pivot language is that each step of retranslation introduces possible mistakes and ambiguities – using a pivot language involves two steps, rather than one. For example, when Hernán Cortés communicated with Mesoamerican Indians, he spoke Spanish to Gerónimo de Aguilar, who spoke Mayan to Malintzin, who spoke Nahuatl to the locals.
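How a distinction can be lost in the pivot step can be illustrated with a toy sketch. The word-level lookup tables below are assumptions made up for illustration, not a real translation system: Spanish distinguishes informal "tú" from formal "usted", English merges both into "you", and the second step then has to pick a single French form.

```python
# Toy word-level lookup tables (illustrative assumptions, not a real system).
SPANISH_TO_ENGLISH = {"tú": "you", "usted": "you"}   # source -> pivot
ENGLISH_TO_FRENCH = {"you": "vous"}                   # pivot -> target
SPANISH_TO_FRENCH = {"tú": "tu", "usted": "vous"}     # direct, for comparison


def translate_via_pivot(word, source_to_pivot, pivot_to_target):
    """Translate in two steps: source -> pivot, then pivot -> target."""
    pivot_word = source_to_pivot[word]
    return pivot_to_target[pivot_word]


for word in ("tú", "usted"):
    direct = SPANISH_TO_FRENCH[word]
    pivoted = translate_via_pivot(word, SPANISH_TO_ENGLISH, ENGLISH_TO_FRENCH)
    print(f"{word}: direct={direct}, via pivot={pivoted}")
```

In this toy setup the direct table keeps the informal "tu", while the route through English yields "vous" for both inputs, because the pivot has no way to carry the distinction.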

Examples

English, French, Russian, and Arabic are often used as pivot languages. Interlingua has been used as a pivot language in international conferences and has been proposed as a pivot language for the European Union.^[1] Esperanto was proposed as a pivot language in the Distributed Language Translation project and has been used in this way in the Majstro Tradukvortaro at the Esperanto website Majstro.com. The Universal Networking Language is an artificial language specifically designed for use as a pivot language.

In computing

Pivot coding is also a common method of translating data for computer systems. For example, the Internet Protocol, XML and high-level languages are pivot codings of computer data, which are then often rendered into internal binary formats for particular computer systems.

Unicode was designed to be usable as a pivot coding between various major existing character encodings, though its widespread adoption as a coding in its own right has made this usage unimportant.
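Transcoding through Unicode might look like the following minimal sketch. It relies on Python's built-in `bytes.decode` and `str.encode`, where `str` holds Unicode text; the helper name `transcode` and the choice of Latin-1 and code page 437 as the two legacy encodings are assumptions for illustration.

```python
def transcode(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Convert bytes between two encodings using Unicode text as the pivot."""
    text = data.decode(source_encoding)   # source encoding -> Unicode (pivot)
    return text.encode(target_encoding)   # Unicode (pivot) -> target encoding


latin1_bytes = "café".encode("latin-1")   # é is the single byte 0xE9 in Latin-1
cp437_bytes = transcode(latin1_bytes, "latin-1", "cp437")
print(latin1_bytes, cp437_bytes)
```

Each encoding needs only a mapping to and from Unicode, so no dedicated Latin-1 to code page 437 table is required.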

References

^ Breinstrup, Thomas. "Linguaphobos? Non in le UE". [Linguaphobes? Not in the EU]. Panorama in Interlingua, 2006, Issue 5.

Hua Wu and Haifeng Wang. 2009. Revisiting Pivot Language Approach for Machine Translation. ACL-09.
Utiyama, M. & H. Isahara (2006) A comparison of pivot methods for phrase-based statistical machine translation. In Proceedings of NAACL/HLT, 484–491.


@@ Line 1: / Line 1: @@
+{{Short description|Intermediary language between different languages}}
-A '''pivot language''', sometimes also called a '''bridge language''', is an [[artificial language|artificial]] or [[natural language]] used as an intermediary language for translation between many different languages – to translate between any pair of languages A and B, one translates A to the pivot language P, then from P to B. Using a pivot language avoids the [[combinatorial explosion (communication)|combinatorial explosion]] of having translators across every combination of the supported languages, as the number of combinations of language is linear (<math>n-1</math>), rather than quadratic (<math>\textstyle{\binom{n}{2}}=n^2-n</math>) – one need only know the language A and the pivot language P (and someone else the language B and the pivot P), rather than needing a different translator for every possible combination of A and B.
+{{refimprove|date=July 2018}}
+A '''pivot language''', sometimes also called a '''bridge language''', is an [[artificial language|artificial]] or [[natural language]] used as an intermediary language for translation between many different languages – to translate between any pair of languages A and B, one translates A to the pivot language P, then from P to B. Using a pivot language avoids the [[combinatorial explosion (communication)|combinatorial explosion]] of having translators across every combination of the supported languages, as the number of combinations of language is linear (<math>n-1</math>), rather than quadratic <math>\left(\textstyle{\binom{n}{2}}=\frac{n^2-n}{2}\right)</math> – one need only know the language A and the pivot language P (and someone else the language B and the pivot P), rather than needing a different translator for every possible combination of A and B.
-The disadvantage of a pivot language is that each step of retranslation introduces possible mistakes and ambiguities – using a pivot language involves two steps, rather than one. For example, when [[Hernán Cortés]] communicated with [[Mesoamerican]] Indians, he would speak Spanish to [[Gerónimo de Aguilar]], who would speak [[Mayan language|Mayan]] to [[Malintzin]], who would speak [[Nahuatl language|Nahuatl]] to the locals.
+The disadvantage of a pivot language is that each step of retranslation introduces possible mistakes and ambiguities – using a pivot language involves two steps, rather than one. For example, when [[Hernán Cortés]] communicated with [[Mesoamerican]] Indians, he spoke Spanish to [[Gerónimo de Aguilar]], who spoke [[Mayan language|Mayan]] to [[Malintzin]], who spoke [[Nahuatl language|Nahuatl]] to the locals.
 == Examples ==
@@ Line 8: / Line 10: @@
 == In computing ==
-{{also|Intermediate language}}
+{{see also|Intermediate language|Data conversion#Pivotal conversion}}
-Pivot coding is also a common method of translating data for computer systems.  For example, the [[internet protocol]], [[XML]] and [[high level language]]s are pivot codings of computer data which are then often rendered into internal binary formats for particular computer systems.
+Pivot coding is also a common method of translating data for computer systems.  For example, the [[Internet Protocol]], [[XML]] and [[high level language]]s are pivot codings of computer data which are then often rendered into internal binary formats for particular computer systems.
 [[Unicode]] was designed to be usable as a pivot coding between various major existing character encodings, though its widespread adoption as a coding in its own right has made this usage unimportant.
-== In machine translation (MT) ==
-Current statistical machine translation ([[Statistical machine translation|SMT]]) systems use [[parallel corpora]] to achieve their good results, but good parallel corpora are not available for all languages. Pivot language (p) enables the bridge between two languages, to which existing parallel corpora are entirely or partially not yet at hand.
-The problematic for pivot translation concerns the fidelity of information forwarded in the use of different corpora. From the use of two bilingual corpora (s-p & p-t) to set the bridge between s-t, linguistic data is inevitably lost. Rule-based machine translation ([[RBMT]]) helps the system rescue this information, so the system does not entirely rely on statistics, but also on structural linguistic information.
-Three basic techniques are used to employ pivot language in MT: (1) ''triangulation'', which focuses on phrase paralleling of source-pivot (s-p), pivot-target (p-t); (2) ''[[Transfer-based machine translation|transfer]]'' translates the whole sentence of the source language to one pivot language and then to the target language; (3) ''synthetic'' builds a corpus of its own for system training.
-The '''triangulation''' method (also called ''phrase table multiplication'') calculates the probability of both translation correspondances and lexical weight in s-p and p-t, to try to induce a new s-t phrase table.
-The '''transfer''' method (also called ''sentence translation strategy'') simply carries a straigthforward translation of s into p and then another translation of p into t without using probabilistic tests (like in triangulation).
-The '''synthetic''' method uses an existing corpus of s and tries to build an own synthetic corpus out of it that is used by the system to train itself. Than a synthetic bilangual corpus is built between s-p, so that translation p-t may be carried out.
-It has been shown, in the direct comparison between triangulation and transfer methods, that triangulation achieves much better results for SMT systems than transfer.
-All three pivot language techniques enhance the performance of SMT systems. However the ''synthetic'' technique  doesn't work well with RBMT, and systems' performances are lower than expected. Hybrid SMT/RBMT systems achieves better translation quality than strict-SMT systems that rely on bad parallel corpora.
-The key role of RBMT systems is that they help fill the gap left in the translation process of s-p → p-t, in the sense that these parallels are included in the SMT model for s-t.
 ==References==
-<references/>
-<ref>Hua Wu and Haifeng Wang. 2009. Revisiting Pivot Language Approach for Machine Translation.  ACL-09.</ref>
-<ref>Utiyama, M. & H. Isahara (2006) A comparison of pivot methods for phrase-based statistical machine translation. In Proceedings of NAACL/HLT, 484{491.</ref>
 {{Reflist}}
+* Hua Wu and Haifeng Wang. 2009. [http://www.aclweb.org/anthology/P09-1018 Revisiting Pivot Language Approach for Machine Translation].  ACL-09.
+* Utiyama, M. & H. Isahara (2006) [http://www.aclweb.org/anthology/N07-1061 A comparison of pivot methods for phrase-based statistical machine translation]. In Proceedings of NAACL/HLT, 484{491.
+{{DEFAULTSORT:Pivot Language}}
 [[Category:Translation]]
+[[Category:Constructed languages]]
+[[Category:Language]]
-{{ling-stub}}
-[[br:Yezh paoell]]
-[[fr:Interlangue (traduction automatique)]]
-[[it:Lingua pivot]]
-[[ru:Язык-мост]]
-[[th:ภาษาแกน]]