T9 (predictive text)
T9, which stands for Text on 9 keys, is a patented[1] predictive text technology for mobile phones, originally developed by Tegic Communications, now part of Nuance Communications[2].
T9 is used on phones from LG, NEC, Nokia, Samsung Electronics, Siemens, Sony Ericsson, Sanyo, Sagem and others. It was also used on the Texas Instruments Avigo PDA in the late 1990s. Its main competitors are iTap, created by Motorola; SureType, created by RIM; and Eatoni's LetterWise and WordWise.
Design
T9's objective is to make it easier to type text messages. It allows words to be entered with a single keypress for each letter, as opposed to the multi-tap approach used in the older generation of mobile phones, in which several letters are associated with each key and selecting one letter often requires multiple keypresses.
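The difference in effort can be illustrated with a small keypress count. This is a minimal sketch, assuming the standard letter layout found on most phone keypads (2 = abc through 9 = wxyz); the function names are hypothetical and chosen for illustration only.

```python
# Assumed layout: the standard letter assignment on a 12-key phone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def multitap_presses(word: str) -> int:
    """Count the keypresses needed to enter `word` with multi-tap.

    A letter in position n on its key (1-based) costs n presses.
    """
    total = 0
    for ch in word.lower():
        for letters in KEYPAD.values():
            if ch in letters:
                total += letters.index(ch) + 1
                break
        else:
            raise ValueError(f"no key carries the character {ch!r}")
    return total


def single_press_count(word: str) -> int:
    """Count the keypresses when every letter costs exactly one press."""
    return len(word)


print(multitap_presses("hello"), single_press_count("hello"))
```

For "hello" this counts 13 presses with multi-tap against 5 with one press per letter. The sketch ignores the pause multi-tap needs between two letters on the same key, and any extra Next presses needed when the wanted word is not the first one offered.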
It combines the groups of letters on each phone key with a fast-access dictionary of words. For each sequence of keypresses, it looks up all dictionary words that correspond to that sequence and orders them by frequency of use.
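The lookup described here can be sketched as a table from key sequences to word lists. This is an illustrative sketch only, not T9's actual data structure: the keypad layout is the standard one, while the function names and the frequency counts are invented for the example.

```python
from collections import defaultdict

# Assumed layout: the standard letter assignment on a 12-key phone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {
    letter: key for key, letters in KEYPAD.items() for letter in letters
}


def key_sequence(word: str) -> str:
    """Return the digits pressed to enter `word`, one press per letter."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def build_index(frequencies: dict) -> dict:
    """Group words by key sequence, most frequent word first."""
    index = defaultdict(list)
    for word in frequencies:
        index[key_sequence(word)].append(word)
    for words in index.values():
        words.sort(key=lambda w: frequencies[w], reverse=True)
    return dict(index)


# Hypothetical frequency counts, made up for this example.
SAMPLE_FREQUENCIES = {
    "good": 500, "home": 400, "gone": 200, "hood": 50, "hello": 300,
}
INDEX = build_index(SAMPLE_FREQUENCIES)


def candidates(keys: str) -> list:
    """Return the words matching a key sequence, ordered by frequency."""
    return INDEX.get(keys, [])


print(candidates("4663"))
```

With these sample counts, the sequence 4663 yields "good", "home", "gone" and "hood" in that order. A dictionary keyed by digit strings is the simplest structure that shows the idea; a real embedded implementation would need something far more compact.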
As T9 gains familiarity with the words and phrases the user commonly enters, it speeds up the process by offering the most frequently used words first. The user can reach the other choices with one or more presses of a predefined Next key.
The dictionary can be expanded by adding missing words, enabling them to be recognized in the future. Once the user has introduced a new word, T9 adds it to the predictive dictionary, so that it can be offered the next time the user tries to produce that word.
The user can manually add words to the user database (UDB) via multi-tap. The implementation of the user database depends on the version of T9 and on how T9 is integrated on the device. Some phone manufacturers implement a permanent user database, while others implement one that lasts only for the duration of the session.
Features
Some T9 implementations feature smart punctuation. This feature allows the user to insert sentence and word punctuation using the '1' key. Depending on the context, smart punctuation inserts sentence punctuation (period), embedded punctuation (period or hyphen) or word punctuation (the apostrophe in can't, won't and isn't, as well as in the possessive -'s-). Depending on the language, T9 also supports word breaking after punctuation, to handle -l'-, -n'- etc. in French or the -'s- behavior for possessives in English.
The UDB is an optional feature which allows words that were explicitly entered by the user to be stored for future reference. The number of words stored depends on the implementation as well as the language.
In later versions of T9, the order of the words presented adapts to the usage pattern. For instance, in English, 4663 matches "good", "home", "gone", "hood", etc. Such combinations are known as isotaps; e.g., "home" is referred to as an isotap of "good". If the user selects "home" more often than "good", the two words will eventually switch position. Information about common word combinations can also be learned from the user and stored for future predictions.
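One way to picture this reordering is a candidate list that promotes a word once it has been selected sufficiently more often than the word ahead of it. The sketch below is an assumption made for illustration: the class name, the promotion rule and the margin of 2 are hypothetical, and the source does not describe how T9 actually decides when to swap.

```python
class IsotapList:
    """Candidate words for one key sequence, reordered as they are used."""

    def __init__(self, words, margin=2):
        self.words = list(words)            # current presentation order
        self.uses = {w: 0 for w in words}   # selections seen so far
        self.margin = margin                # hypothetical promotion threshold

    def select(self, word):
        """Record that the user chose `word`, promoting it if warranted."""
        self.uses[word] += 1
        i = self.words.index(word)
        # Move the word up while it leads its predecessor by the margin.
        while i > 0 and (
            self.uses[word] - self.uses[self.words[i - 1]] >= self.margin
        ):
            self.words[i - 1], self.words[i] = self.words[i], self.words[i - 1]
            i -= 1


isotaps = IsotapList(["good", "home", "gone", "hood"])
isotaps.select("home")   # one selection is not enough to overtake "good"
isotaps.select("home")   # the second selection reaches the margin
print(isotaps.words)
```

After the second selection "home" is offered before "good". Using a margin rather than swapping on the first selection keeps a single unusual choice from reshuffling a familiar list.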
Word completion can be enabled for words entered by the user. When the user enters matching keypresses, the system then provides completions in addition to words and stems.
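Completion can be sketched as a prefix match on key sequences: a stored word qualifies when the keys pressed so far are the beginning of the keys needed for the whole word. This is a minimal sketch under assumptions; the function names and the sample user words are hypothetical, and only lower-case letters a to z are handled.

```python
# Assumed layout: the standard letter assignment on a 12-key phone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {
    letter: key for key, letters in KEYPAD.items() for letter in letters
}


def key_sequence(word: str) -> str:
    """Return the digits pressed to enter `word`, one press per letter."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def completions(keys: str, user_words: list) -> list:
    """Return user words that are longer than, and begin with, `keys`."""
    return [
        word for word in user_words
        if len(word) > len(keys) and key_sequence(word).startswith(keys)
    ]


# Hypothetical contents of a user database.
USER_WORDS = ["wikipedia", "wiki", "zebra"]
print(completions("945", USER_WORDS))
```

Pressing 945 offers both "wikipedia" and "wiki" from this sample list; after a fourth press, 9454, only "wikipedia" remains a completion, because "wiki" is then an exact match rather than a longer word.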
In later versions of T9, the user can select a primary and a secondary language, and matches from both languages are presented. This enables users to write messages in their native language as well as in a foreign language.
Algorithm
To achieve a compressed dictionary size of close to 1 byte per word, T9 uses an optimized algorithm that maintains the order of words and partial words (also known as stems). Because of this compression, it over-generates words, some of which are visible to the user as 'junk words'. This is a side effect of the requirement for small database sizes on lower-end embedded devices.
References
[1] Reduced keyboard disambiguating computer. US Patent 5818437 (1998).
[2] Nuance Communications press release (2007).