Pseudolocalization

From Wikipedia, the free encyclopedia
Revision as of 17:29, 24 October 2012

Pseudo-localization (or pseudolocalization) is a software testing method that is used to test internationalization aspects of software. It is similar to the process of localization, in that the textual elements of the application (for example, text in a graphical user interface) are replaced with equivalent elements in different languages. Unlike true localization, pseudo-localization uses a fictitious language that is designed specifically to include the most problematic characteristics of text from a wide variety of other languages.

Localization Process

Traditionally, localization of software was done independently of the software development process. In a typical scenario, software would be built and tested in one base language (such as English), with any localizable elements being extracted into external resources. Those resources were then handed off to a localization team for translation into different target languages.[1] The problem with this approach is that many subtle software bugs may be found only during localization, when it is too late (or more likely, too expensive) to fix them.[1]

The types of problems that can arise during localization involve differences in how written text appears in different languages. These problems include:

  • Translated text that is significantly longer than the base language, and does not fit within the UI constraints, or which causes text breaks at awkward positions.
  • Font glyphs that are significantly larger than, or possess diacritic marks not found in, a base language, and which may be cut off vertically.
  • Languages for which the reading order is not left-to-right, which is especially problematic for user input.
  • Languages whose characters do not each fit within a single 8-bit byte, which can produce actual logic bugs if left uncaught.
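The last point can be illustrated with a minimal Python sketch. The sample word and the helper names (byte_length, truncate_bytes_naive, truncate_bytes_safe) are assumptions made for this illustration and do not come from any particular product. The sketch shows how code that treats one character as one byte can cut a string in the middle of a character.

```python
# Illustrative sketch only: the sample text and helper names are hypothetical.
SAMPLE = "\u00e4ndern"  # German "aendern" written with a-umlaut (U+00E4)


def byte_length(s: str) -> int:
    """Return the number of bytes the string occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def truncate_bytes_naive(s: str, limit: int) -> bytes:
    """Buggy approach: cut the encoded bytes at a fixed limit.

    The cut may land in the middle of a multi-byte character,
    leaving bytes that can no longer be decoded.
    """
    return s.encode("utf-8")[:limit]


def truncate_bytes_safe(s: str, limit: int) -> str:
    """Drop whole characters until the UTF-8 form fits within the limit."""
    while len(s.encode("utf-8")) > limit:
        s = s[:-1]
    return s


if __name__ == "__main__":
    print(len(SAMPLE), byte_length(SAMPLE))  # character count vs. byte count
    print(truncate_bytes_safe(SAMPLE, 3))
```

Here the character count and the byte count differ because the first character needs two bytes in UTF-8, so a buffer sized by character count would be too small.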

In addition, the localization process may uncover places where an element should be localizable, but is hard-coded in a base language. Similarly, there may be elements that were designed to be localized, but should not be (e.g. the element names in an XML or HTML document).[2]

Pseudo-localization is designed to catch these types of bugs during the development cycle, by mechanically replacing all localizable elements with a pseudo-language that is readable by native speakers of the base language, but which contains all of the troublesome elements of other languages or scripts.

Pseudo Locales

Pseudolocalization was introduced at Microsoft during the Windows Vista development cycle.[3] The type of pseudo-language invented for this purpose is called a pseudo locale in Windows parlance. These locales were designed to use character sets and script characteristics from one of the three broad classes of foreign languages used by Windows at the time: basic ("Western"), mirrored ("Near-Eastern"), and CJK ("Far-Eastern").[1] Prior to Vista, each of these three language families had its own separate build of Windows, with a potentially different code base (and thus different behaviors and bugs). The pseudo locales created for each of these language families would produce text that still "read" as English, but was made up of characters from other scripts. For example, the text string

Edit program settings

would be rendered in the "basic" pseudo-locale as

[!!! εÐiţ Þr0ģЯãm səTτıИğ§ !!!]

This process produces translated strings that are longer, include non-ASCII characters, and (in the case of the "mirrored" pseudo-locale) are written right-to-left.[3]
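A transformation of this kind can be sketched in a few lines of Python. The look-alike table, the expansion ratio, and the function name pseudolocalize below are assumptions chosen for illustration. They are not the mapping used by the Windows pseudo locales, and the sketch covers only a "basic"-style locale, not the mirrored one.

```python
# Hypothetical look-alike table: each ASCII letter maps to an accented
# character that remains readable to an English speaker.
LOOKALIKES = str.maketrans({
    "a": "ã", "e": "é", "i": "í", "o": "ö", "u": "ü",
    "E": "É", "s": "š", "t": "ţ", "n": "ñ", "g": "ğ", "r": "ř",
})


def pseudolocalize(text: str, expansion: float = 0.3) -> str:
    """Return a pseudo-localized version of an English string.

    The result is longer than the input (to mimic wordier languages),
    contains non-ASCII characters, and is wrapped in brackets so that
    truncation is easy to spot on screen.
    """
    accented = text.translate(LOOKALIKES)
    # Split the assumed extra length between the two sides of the string.
    pad = "!" * max(1, round(len(text) * expansion / 2))
    return f"[{pad} {accented} {pad}]"


if __name__ == "__main__":
    print(pseudolocalize("Edit program settings"))
```

Letters that are missing from the table pass through unchanged, which keeps the sketch short. A fuller table would cover the whole alphabet.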

Note that the brackets on either side of the text in this example help to spot the following issues:

  • text that is cut off
  • concatenated strings
  • hard-coded strings
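How the brackets expose each of these issues can be sketched as a simple check on a string observed in the running user interface. The function name classify_rendered and its return labels are hypothetical, and the sketch assumes that the original source strings never contain square brackets themselves.

```python
def classify_rendered(text: str) -> str:
    """Guess what a string seen in a pseudo-localized UI indicates.

    Assumes every pseudo-localized string is wrapped in exactly one
    pair of square brackets and that source strings contain none.
    """
    opens, closes = text.count("["), text.count("]")
    if opens == 0 and closes == 0:
        # The string never went through pseudo-localization.
        return "hard-coded"
    if opens > 1 or closes > 1:
        # Several separately localized fragments were glued together.
        return "concatenated"
    if text.startswith("[") and text.endswith("]"):
        return "ok"
    if opens != closes:
        # One bracket is missing, so the text was clipped.
        return "cut off"
    # One complete bracket pair with extra text outside it.
    return "mixed"


if __name__ == "__main__":
    for seen in ["[!!! Édíţ !!!]", "[!!! Édíţ pr", "Cancel", "[! Ñö !][! Ýéš !]"]:
        print(repr(seen), "->", classify_rendered(seen))
```

In practice a tester spots these cases by eye. The function only makes the reasoning behind each visual cue explicit.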

Pseudolocalization process

Michael Kaplan (program manager and head of the Microsoft internationalization team) compares the pseudo-localization process to

an eager and hardworking yet naive intern localizer, who is eager to prove himself [or herself] and who going to translate every single string that you don't say shouldn't get translated.[2]

One of the key features of this pseudo-localization process is that it happens mechanically, during the development cycle, as part of a routine build. The process is almost identical to the process used to produce true localized builds, but is done before a build is tested, much earlier in the development cycle. This leaves time for any bugs that are found to be fixed in the base code, which is much easier than fixing bugs that are not found until a release date is near.[1]
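The mechanical build step can be sketched as follows. The Resource type, its localizable flag, and the function names are hypothetical and are not taken from Microsoft's tooling. The stand-in transformation only wraps the text in markers. The sketch illustrates the "naive intern" behavior described above: every string is transformed unless the developer has explicitly marked it as not localizable.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Resource:
    """One extracted string resource (hypothetical structure)."""
    key: str
    text: str
    localizable: bool = True  # opt-out: translated unless marked otherwise


def pseudo(text: str) -> str:
    """Stand-in transformation: wrap the text in visible markers."""
    return f"[!!! {text} !!!]"


def build_pseudo_resources(resources: Iterable[Resource]) -> Dict[str, str]:
    """Produce the string table for a pseudo-localized build."""
    return {
        r.key: pseudo(r.text) if r.localizable else r.text
        for r in resources
    }


if __name__ == "__main__":
    table = build_pseudo_resources([
        Resource("menu.edit", "Edit program settings"),
        # An XML element name must survive unchanged, so it is opted out.
        Resource("xml.element", "settings", localizable=False),
    ])
    for key, value in table.items():
        print(key, "=", value)
```

Making the flag opt-out rather than opt-in means that a string which should not be translated but was left unmarked shows up transformed in the pseudo-localized build, where the mistake can be caught early.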

The builds that are produced by the pseudo-localization process are tested using the same QA cycle as a non-localized build. Since the pseudo-locales mimic English text, they are usable by a native English speaker. Recently, beta versions of Windows (7 and 8) have been released with some pseudo-localized strings intact.[4][5] For these recent versions of Windows, the pseudo-localized build is the primary staging build (the one created routinely for testing), and the final English language build is a "localized" version of that.[2]

Besides the tools used internally by Microsoft, other internationalization tools now include pseudo-localization options. These tools include Alchemy Catalyst from Alchemy Software Development, and SDL Passolo from SDL. Such tools can also display rendered pseudo-localized dialogs and forms within the tools themselves.

References

  1. Raymond Chen (26 July 2012). "A brief and also incomplete history of Windows localization". Retrieved 26 July 2012.
  2. Michael Kaplan (11 April 2011). "One of my colleagues is the 'Pseudo Man'". Retrieved 26 July 2012.
  3. Shawn Steele (27 June 2006). "Pseudo Locales in Windows Vista Beta 2". Retrieved 26 July 2012.
  4. Steven Sinofsky (7 July 2009). "Engineering Windows 7 for a Global Market". Retrieved 26 July 2012.
  5. Kriti Jindal (16 March 2012). "Install PowerShell Web Access on non-English machines". Retrieved 26 July 2012.