
Ligature (writing): Difference between revisions

From Wikipedia, the free encyclopedia
Revision as of 22:13, 8 September 2009


ſi ligature type, size 12pt Garamond.

In writing and typography, a ligature occurs where two or more graphemes are joined as a single glyph. Ligatures usually replace consecutive characters sharing common components and are part of a more general class of glyphs called "contextual forms" where the specific shape of a letter depends on context such as surrounding letters or proximity to the end of a line.

History

At the origin of typographical ligatures is the simple running together of letters in manuscripts. Already the earliest known script, Sumerian cuneiform, includes many cases of character combinations that over the script's history gradually evolve from a ligature into an independent character in its own right. Ligatures figure prominently in many historical scripts, notably the Brahmic abugidas, or the bind rune in Migration Period Germanic inscriptions.

Medieval scribes, writing in Latin, increased writing speed by combining characters and by introduction of scribal abbreviation. For example, in blackletter, letters with right-facing bowls (b, o, and p) and those with left-facing bowls (c, e, o, and q) were written with the facing edges of the bowls superimposed. In many script forms characters such as h, m, and n had their vertical strokes superimposed. Scribes also used scribal abbreviations to avoid having to write a whole character at a stroke. Manuscripts in the fourteenth century employed hundreds of such abbreviations.

In handwriting, a ligature is made by joining two or more characters in a way they would not usually be joined, by merging their parts or by writing one above or inside another. In printing, a ligature is a group of characters that is typeset as a unit, and the characters need not be joined; for example, in some cases the fi ligature prints the letters f and i farther apart than when they are typeset as separate letters.

When printing with movable type was invented around 1450,[1] typefaces included many ligatures. However they began to fall out of use with the advent of the wide use of sans serif machine-set body text in the 1950s and the development of inexpensive phototypesetting machines in the 1970s, which did not require journeyman knowledge or training to operate. One of the first computer typesetting programs to take advantage of computer driven typesetting (and later laser printers) was the TeX program of Donald Knuth (see below for more on this). This trend was further strengthened by the desktop publishing revolution around 1985. Early computer software in particular (except for TeX) had no way to allow for ligature substitution (the automatic use of ligatures where appropriate), and in any case most new digital fonts did not include any ligatures. As most of the early PC development was designed for and in the English language, which already saw ligatures as optional at best, a need for ligatures was not seen. Ligature use fell as the number of employed, traditionally-trained hand compositors and hot metal typesetting machine operators dropped.

With the increased support for other languages and alphabets in modern computing, and the resulting improved digital typesetting techniques such as OpenType, ligatures are slowly coming back in use.

Latin alphabet

Stylistic ligatures

Two common ligatures: fi and fl

Many ligatures combine f with an adjacent letter. The most prominent example is ﬁ (or fi, rendered with two normal letters). In many typefaces the dot above the i collides with the hood of the f when the two letters are placed beside each other in a word, so they are combined into a single glyph with the dot absorbed into the f. Other ligatures with the letter f include fj,[2] fl (ﬂ), ff (ﬀ), ffi (ﬃ), and ffl (ﬄ). Ligatures for fa, fe, fo, fr, fs, ft, fb, fh, fu, fy, and for f followed by a full stop, comma, or hyphen, as well as the equivalent set for the doubled ff and fft, are also used, though they are less common.

Sometimes, a ligature crossing the boundary of a composite word (e.g., ff in shelfful[3]) is considered undesirable, and computer programs (such as TeX) provide a means of suppressing ligatures.

Some fonts include an fff ligature (the Requiem Italic font by Jonathan Hoefler contains even an fffl ligature), intended for German compound words like Sauerstoffflasche ("oxygen tank") and Schifffahrt ("boat trip") (the latter word is written with fff only if the writer follows the spelling reform of 1996). Official German orthography as outlined in the Duden however prohibits ligatures across composition boundaries, and since the sequence fff in German only ever occurs across such boundaries (Schiff-fahrt, Sauerstoff-flasche), these ligatures cannot be correctly employed for German.[4]

Turkish distinguishes a dotted and a dotless "I", and both can follow f, as in fırın ("oven") and fikir ("idea"). The fi ligature would obscure the distinction and is therefore not used in Turkish typography. Other ligatures, such as that for fl, are not used either; they correspond to rare letter combinations anyway.

"ß" in the form of a "ſʒ" ligature on a street sign in Berlin ("Petersburger Straße"). The sign on the right ("Bersarinplatz") ends with a "tʒ"-ligature.

Remnants of the "ſʒ" ("sz") and "tʒ" ("tz") ligatures from Fraktur, a family of German blackletter typefaces, can be seen to this day on street signs for city squares whose name contains Platz or ends in -platz. These ligatures were originally mandatory in Fraktur but are now employed only stylistically.

Sometimes ligatures for st (ﬆ), ſt (ﬅ), ch, ct, and Qu are used (e.g. in the typeface Linux Libertine).

German ß

The German Eszett ligature (also called the scharfes S, "sharp s"), ß, evolved from the ligature "long s over round s" or, in Fraktur, "long s and z". Even though the long s (ſ) has otherwise disappeared from German orthography, ß is still considered a ligature, and it is replaced by 'SS' in capitalized spelling and in alphabetic ordering.

Letters and diacritics originating as ligatures

The ligatures of Adobe Caslon Pro.

As the letter W is an addition to the Latin alphabet which originated in the seventh century, the phoneme it represents was formerly written in various ways. In Old English the Runic letter Wynn (Ƿ) was used, but Norman influence forced Wynn out of use. By the 14th century, the "new" letter W, originated as two Vs or Us joined together, developed into a legitimate letter with its own position in the alphabet. Because of its relative youth compared to other letters of the alphabet, only a few European languages (English, Dutch, German, Polish, Welsh, and Maltese) use the letter in native words.

The character Æ (æ, or aesc) when used in the Danish, Norwegian, or Icelandic languages, or Old English, is not a typographic ligature. It is a distinct letter—a vowel—and when alphabetised, is given a different place in the alphabetic order. In modern English orthography Æ is not considered an independent letter but a spelling variant, for example: "encyclopædia" versus "encyclopaedia" or "encyclopedia".

Æ comes from Medieval Latin, where it was an optional ligature in some words, for example, "Æneas". It is still found as a variant in English and French, but the trend has recently been towards printing the A and E separately.[5] Similarly, Œ and œ, while normally printed as ligatures in French, can be replaced by component letters if technical restrictions require it.

In German orthography, the umlauted vowels ä, ö, and ü historically arose from ae, oe, ue ligatures (strictly, from superscript e, viz. a ͤ, o ͤ, u ͤ). It is still acceptable to replace them with the digraphs ae, oe, and ue when the diacritics are unavailable. In alphabetic order, however, they are equivalent not to ae, oe, ue, but to simple a, o, u (except in phone books). This differs from the convention in Scandinavian languages, where the umlaut vowels are treated as independent letters with positions at the end of the alphabet.

The ring diacritic used in vowels such as å likewise originated as an o-ligature.[citation needed] The uo ligature ů in particular saw use in Early New High German, but it merged in later Germanic languages with u (e.g. MHG fuosz, ENHG fuͦß, Modern German Fuß "foot"). It survives in Czech, where it is called kroužek.

The tilde diacritic as used in Spanish and Portuguese, now representing the palatal nasal sound in the letter ñ and nasalization of the affected vowel, respectively, originated as an nn ligature[6] (Espanna = España, anno = año). Similarly, the circumflex in French spelling stems from the ligature of a silent s.[7] The French, Portuguese, Catalan and old Spanish letter ç represents a "c" over a "z".

The letter hwair (ƕ), used only in transliteration of the Gothic language, resembles a hw ligature. It was introduced by philologists around 1900 to replace the digraph hv formerly used to express the phoneme in question, e.g. by Migne in the 1860s (Patrologia Latina vol. 18).

The Byzantines had a unique o-u ligature (Ȣ) that, while originally based on the Greek alphabet's ο-υ, carried over into Latin-based alphabets as well.

Gha (ƣ), a rarely used letter based on Q and G, was misconstrued by the ISO to be an O-I ligature due to its appearance, and is thus known (to the ISO and, in turn, Unicode) as "Oi."

The International Phonetic Alphabet formerly used ligatures to represent affricate consonants, of which six are encoded in Unicode: ʣ, ʤ, ʥ, ʦ, ʧ and ʨ. One fricative consonant is still represented with a ligature: ɮ, and the Extensions to the IPA contain three more: ʩ, ʪ and ʫ.

Rarer ligatures also exist, such as Ꜳꜳ, Ꜵꜵ, Ꜷꜷ, Ꜹꜹ, Ꜻꜻ, Ꜽꜽ, Ꝏꝏ, ᵫ, ᵺ, Ỻỻ, Ꜩꜩ, ᴂ and ᴔ.

Symbols originating as ligatures

Et ligature in Insular Minuscule script.

The most common ligature is the ampersand &. This was originally a ligature of E and t, forming the Latin word "et", meaning "and". It has exactly the same use (except for pronunciation) in French, and is used in the English language. The ampersand comes in many different forms. Because of its ubiquity, it is generally no longer considered a ligature, but a logogram.

Like many other ligatures, it has at times been considered a letter (e.g. in early Modern English). In English it is pronounced "and", not "et", except in the case of &c, pronounced "et cetera". In most fonts, it does not immediately resemble the two letters used to form it, although certain typefaces (such as Trebuchet MS) design & in the form of a ligature.

Similarly, the dollar sign, $, possibly originated as a ligature (for "pesos", although there are other theories as well) but is now a logogram.[8]

Digraphs

Uppercase IJ glyph appearing as the distinctive "broken-U" ligature in Helvetica rendered by Omega TeX
Comparison of ij and y in various forms

Digraphs, such as ll in Spanish or Welsh, are not ligatures in the general case as the two letters are displayed as separate glyphs: although written together, when they are joined in handwriting or italic fonts the base form of the letters is not changed and the individual glyphs remain separate. Like some ligatures discussed above, these digraphs may or may not be considered individual letters in their respective languages. Until the 1994 spelling reform, the digraphs ch and ll were considered separate letters in Spanish for collation purposes.

The difference can be illustrated with the French digraph œu, which is composed of the ligature œ and the simplex letter u.

Dutch ij, however, is somewhat more ambiguous. Depending on the standard used, it can be considered a digraph, a ligature or a letter in itself, and its uppercase and lowercase forms are often available as a single glyph with a distinctive ligature in several professional fonts (e.g. Zapfino). Sans serif uppercase IJ glyphs, popular in the Netherlands, typically use a ligature resembling a U with a broken left-hand stroke. Adding to the confusion, Dutch handwriting can render the lowercase y as an ij-glyph without the dots, and the uppercase IJ (also without dots) can look virtually identical to it, only slightly bigger. The Y is not found in natively Dutch words.

Latin-derived alphabets that use special ligatures

Non-Latin alphabets

See also Complex Text Layout.
The Devanagari ddhrya-ligature (द् + ध् + र् + य = द्ध्र्य) of JanaSanskritSans.

Ligatures are not limited to Latin script:

  • The Brahmic abugidas make frequent use of ligatures in consonant clusters. The number of ligatures employed may be language-dependent; in Devanagari, for instance, many more ligatures are conventionally used when writing Sanskrit than when writing Hindi. With 37 consonants in total, Devanagari can form 1369 (37 × 37) ligatures using only two letters, though few fonts are able to render all of them. In particular, Mangal.ttf, which is included with Microsoft Windows' Indic support, does not correctly handle ligatures with consonants attached to the right of the characters द, ट, ठ, ड, and ढ, leaving the virama attached to them and displaying the following consonant in its standard form.
  • A number of ligatures have been employed in the Greek alphabet, in particular a combination of omicron (Ο) and upsilon (Υ) which later gave rise to one of the letters of the Cyrillic alphabet — see Ou (letter).
  • Cyrillic ligatures: Љ, Њ, Ы, Ѿ. Iotified Cyrillic letters are ligatures of the early Cyrillic decimal I and another vowel: (ancestor of Я), Ѥ, Ѩ, Ѭ, Ю (descended from another ligature, Оу, an early version of У). Two letters of the Macedonian and Serbian Cyrillic alphabets, lje and nje (љ, њ), were developed in the nineteenth century as ligatures of Cyrillic El and En (л, н) with the soft sign (ь). A ligature of ya (Я) and e also exists: Ԙԙ, as do some more ligatures: Ꚅꚅ and Ꚉꚉ.
  • Some forms of the Glagolitic script, used from Middle Ages to the 19th century to write some Slavic languages, have a box-like shape that lends itself to more frequent use of ligatures.
  • In the Hebrew alphabet, the letters aleph and lamed can form a ligature in some pre-modern texts (mainly religious), or in Judeo-Arabic texts, where that combination is very frequent, since [ʔ][a]l- (written aleph plus lamed, in the Hebrew script) is the definite article in Arabic.
  • In the Arabic alphabet, historically a cursive derived from the Nabataean alphabet, most letters take a variant shape depending on whether they are followed (word-initial), preceded (word-final) or both (medial) by other letters. For example, Arabic mīm, isolated م, tripled (mmm, rendering as initial, medial and final): ممم. Notable are the shapes taken by lām + ʼalif isolated: ﻻ, and lām + ʼalif medial or final: ﻼ. Unicode has a special Allah ligature at U+FDF2: ﷲ.
  • Urdu (one of the main languages of South Asia), which uses a calligraphic version of the Arabic-based Nasta`līq script, requires a great number of ligatures in digital typography. InPage, a widely used desktop publishing tool for Urdu, uses Nasta`līq fonts with over 20,000 ligatures.
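
The Devanagari ddhrya example above can be spelled out at the code-point level. The following is a minimal Python sketch, assuming the usual Unicode encoding in which a conjunct is written as consonants joined by the virama (U+094D); the names `VIRAMA` and `make_conjunct` are illustrative and not part of any library. Whether the resulting string is displayed as a single ligature still depends on the font and the rendering engine.

```python
VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA


def make_conjunct(consonants):
    """Join consonants with viramas so a shaping engine may form a conjunct."""
    return VIRAMA.join(consonants)


# da + dha + ra + ya, the components of the ddhrya example
ddhrya = make_conjunct(["\u0926", "\u0927", "\u0930", "\u092f"])
print(ddhrya, [f"U+{ord(ch):04X}" for ch in ddhrya])

# Number of ordered two-consonant combinations quoted in the text
consonant_count = 37
two_letter_pairs = consonant_count ** 2
print(two_letter_pairs)
```

The string holds seven code points (four consonants and three viramas); the ligature itself exists only in the font's glyphs, not in the text.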

Computer typesetting

TeX is an example of a computer typesetting system that makes use of ligatures automatically. The Computer Modern Roman typeface provided with TeX includes the five common ligatures ff, fi, fl, ffi, and ffl. When TeX finds these combinations in a text it substitutes the appropriate ligature, unless overridden by the typesetter. Opinion is divided over whether it is the job of writers or typesetters to decide where to use ligatures.
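
The idea of automatic substitution with a manual override can be illustrated with a short Python sketch. This is not TeX's actual mechanism, which works on glyphs through ligature tables stored with the font; it is a simplified longest-match replacement over plain text. The `BREAK` marker and the function name `apply_ligatures` are hypothetical, and the Unicode presentation forms are used here only to make the substitution visible.

```python
import re

# The five common f-ligatures mapped to their Unicode presentation forms.
LIGATURES = {
    "ffi": "\ufb03",
    "ffl": "\ufb04",
    "ff": "\ufb00",
    "fi": "\ufb01",
    "fl": "\ufb02",
}

# Longest alternatives first, so "ffi" wins over "ff" followed by "i".
_PATTERN = re.compile("|".join(sorted(LIGATURES, key=len, reverse=True)))

BREAK = "|"  # hypothetical marker a typesetter inserts to suppress a ligature


def apply_ligatures(text):
    """Replace ligature candidates, then drop the suppression markers."""
    substituted = _PATTERN.sub(lambda m: LIGATURES[m.group(0)], text)
    return substituted.replace(BREAK, "")


print(apply_ligatures("office"))     # ffi is replaced by one character
print(apply_ligatures("shelf|ful"))  # marker keeps the two f's separate
```

Because the marker sits between the two f's, the pattern never sees them as adjacent, which mirrors the way a typesetter can override the substitution at a compound-word boundary.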

The OpenType font format includes features for associating multiple glyphs with a single glyph, used for ligature substitution. Typesetting software may or may not implement this feature, even if it is explicitly present in the font's metadata. XeTeX is a TeX typesetting engine designed to make the most of such advanced features. This type of substitution used to be needed mainly for typesetting Arabic texts, but ligature lookups and substitutions are being put into all kinds of Western Latin OpenType fonts.

Typical ligatures in Latin script

This table below shows discrete letter pairs on the left, the corresponding Unicode ligature in the middle column, and the Unicode code point on the right. Provided you are using an operating system and browser that can handle Unicode, and have the correct Unicode fonts installed, some or all of these will display correctly. See also the provided graphic.

Unicode maintains that ligaturing is a presentation issue rather than a character definition issue, and that, for example, "if a modern font is asked to display 'h' followed by 'r', and the font has an 'hr' ligature in it, it can display the ligature." Accordingly, the use of the special Unicode ligature characters is "discouraged"[9]. Note however that ligatures such as æ and œ are never used to replace arbitrary 'ae' or 'oe' sequences – 'does' can never be written 'dœs'.

Ligatures in Unicode (Latin-derived alphabets)

This list is incomplete; several medieval ligatures in the U+A732 to U+A73D range, as well as a few others in that vicinity, are not yet listed.
Non-ligature Ligature Unicode
Et & U+0026
ſs, ſz ß U+00DF
AE, ae Æ, æ U+00C6, U+00E6
OE, oe Œ, œ U+0152, U+0153
IJ, ij Ĳ, ĳ U+0132, U+0133
ue ᵫ U+1D6B
ff ﬀ U+FB00
fi ﬁ U+FB01
fl ﬂ U+FB02
ffi ﬃ U+FB03
ffl ﬄ U+FB04
ſt ﬅ U+FB05
st ﬆ U+FB06
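
The difference between presentation-form ligatures and ligature-derived letters can be observed with Unicode compatibility normalization. The following Python sketch uses only the standard `unicodedata` module; the choice of sample characters is illustrative. Under NFKC, the presentation forms such as U+FB01 fold back to their component letters, while æ, œ and ß are left unchanged because Unicode treats them as letters in their own right.

```python
import unicodedata

# Presentation-form ligatures, the ij character, and ligature-derived letters.
samples = ["\ufb01", "\ufb03", "\ufb05", "\u0133", "\u00e6", "\u0153", "\u00df"]

for ch in samples:
    folded = unicodedata.normalize("NFKC", ch)
    status = "unchanged" if folded == ch else f"folds to {folded!r}"
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {status}")
```

This is one reason text searches may fail on documents that store the ligature code points directly, unless the text is normalized first.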

Also, there are separate code points for the digraph DZ and for the Croatian digraphs DŽ, LJ, and NJ. They are not ligatures but digraphs. See Digraphs in Unicode.

Ligatures used only in phonetic transcription:

Non-ligature Ligature Unicode
db ȸ U+0238
qp (cp) ȹ U+0239
dz ʣ U+02A3
dʑ (or dz curl) ʥ U+02A5
dʒ (or dezh) ʤ U+02A4
fŋ (or feng) ʩ U+02A9
ls ʪ U+02AA
lz ʫ U+02AB
lʒ (or lezh) ɮ U+026E
tɕ (or tc curl) ʨ U+02A8
ts ʦ U+02A6
tʃ (or tesh) ʧ U+02A7

U+0238 and U+0239 are called digraphs, but are actually ligatures.[10]

Notes and references

  1. ^ Johannes Gutenberg and the Printing Press
  2. ^ The combination fj is represented in English only in "fjord" and "fjeld", but is encountered in Esperanto, Norwegian, and other languages where j represents a vocalic or semi-vocalic sound
  3. ^ Helmut Kopka (1999). A Guide to LaTeX, 3rd Ed. Addison-Wesley. p. 22. ISBN 0201398257.
  4. ^ Duden 1, Mannheim 1996, p. 69.
  5. ^ The Chicago Manual of Style, 14th Ed. Chicago: The University of Chicago Press. 1993. p. 6.61.
  6. ^ http://www.aulahispanica.com/origen-de-la-enie.html
  7. ^ Teach Yourself French. Collier's Cyclopedia, 1901.
  8. ^ Cajori, Florian (1993). A History of Mathematical Notations. New York: Dover (reprint). ISBN 0-486-67766-4. - contains section on the history of the dollar sign, with much documentary evidence supporting the theory $ began as a ligature for "pesos".
  9. ^ Ligatures, Digraphs and Presentation Forms, Unicode FAQ
  10. ^ Freytag, Asmus (2006-05-08). "Known Anomalies in Unicode Character Names". Unicode Technical Note #27. Unicode Inc. Retrieved 2009-05-29.

See also