Syntax

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 84.78.78.41 (talk) at 17:38, 13 April 2007.

In linguistics, syntax (from Ancient Greek συν- syn-, “together”, and τάξις táxis, “arrangement”) is the study of the rules that govern the way sentences are formed by combining lexical items into phrases. The term syntax can also refer to these rules themselves, as in “the syntax of a language”. Modern research in syntax attempts to describe the structure of natural languages in terms of such rules and, for many practitioners, to find general rules that apply to all languages. As a descriptive science, it passes no judgement on grammatical usage and is hence unconcerned with linguistic prescription.

Though all theories of syntax take human language as their object of focus, they differ as to whether they try to model the mental processes behind the production of language, or the external expressions of language. The former type of theory is known as formal syntax, the latter as empirical syntax. Generative grammar and its descendants are all examples of formal syntax.

Syntax has recently come to play a crucial role in biological and neurobiological perspectives on language. On the one hand, it has been argued that syntax (in that it involves recursive rules) is a specific characteristic of all and only human language; on the other, neuroimaging experiments have shown that a dedicated network in the human brain (crucially involving Broca's area, a portion of the left inferior frontal gyrus) is selectively activated by languages that meet the Universal Grammar requirements characterizing all and only human languages, as shown by generative grammar in the pioneering work of Noam Chomsky.[citation needed]

History

Works on syntax were being written long before modern theories were proposed—the Aṣṭādhyāyī of Pāṇini has been said to resemble “nothing so much as a modern computational grammar”.[1] However, the origins of modern syntax lie in the Grammaire générale et raisonnée of Antoine Arnauld (also known as the Port-Royal grammar, after its place of publication), which was the first treatise to seek universal principles behind the syntax of all languages.[citation needed]

The Port-Royal grammar modelled the study of syntax on that of logic (indeed, large parts of the Port-Royal Logic were copied or adapted from the Grammaire générale[2]). Syntactic categories were identified with logical ones, and all sentences were analysed into the form "Subject-Copula-Predicate". Initially, this view was adopted even by the early comparative linguists (e.g., Bopp).

The central role of syntax within theoretical linguistics became clear only in the last century, which could reasonably be called the "century of syntactic theory" as far as linguistics is concerned. For a detailed and critical survey of the history of syntax over the last two centuries, see the monumental work by Graffi (2001).

Formal syntax

There are two features shared by most theories of formal syntax. First, they hierarchically group subunits into constituent units (phrases). Second, they provide some system of rules to explain patterns of acceptability/grammaticality and unacceptability/ungrammaticality. Most formal theories of syntax also offer explanations of the systematic relationships between syntactic form and semantic meaning. Within semiotics, whose earliest framework was established by Charles W. Morris in his 1938 book Foundations of the Theory of Signs, syntax is defined as the first of three subfields: the study of the interrelation of signs. The second subfield, semantics, studies the relation between signs and the objects to which they apply; the third, pragmatics, studies the relationship between the sign system and its users.

In the framework of transformational-generative grammar (of which government and binding theory and minimalism are recent developments), the structure of a sentence is represented by phrase structure trees, otherwise known as phrase markers or tree diagrams. Such trees provide information about the sentences they represent by showing the hierarchical relations between their component parts.
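As an illustration (a hypothetical sketch using common textbook category labels, not tied to any particular framework), such a phrase marker can be encoded as a nested structure whose printed indentation shows the hierarchical relations between constituents:

```python
# A phrase marker for "the dog chased the cat", encoded as nested
# (label, children...) tuples. Labels (S, NP, VP, Det, N, V) follow
# common textbook usage; the example sentence is invented.
tree = ("S",
        ("NP", ("Det", "the"), ("N", "dog")),
        ("VP", ("V", "chased"),
               ("NP", ("Det", "the"), ("N", "cat"))))

def leaves(node):
    """Return the terminal words (the sentence) covered by a node."""
    if isinstance(node, str):          # a terminal (word)
        return [node]
    _label, *children = node
    out = []
    for child in children:
        out.extend(leaves(child))
    return out

def show(node, depth=0):
    """Print the tree with indentation reflecting constituency."""
    if isinstance(node, str):
        print("  " * depth + node)
    else:
        label, *children = node
        print("  " * depth + label)
        for child in children:
            show(child, depth + 1)

show(tree)
print(" ".join(leaves(tree)))
```

Reading the indented output top to bottom recovers exactly the information a tree diagram conveys: which words group into which phrases, and how those phrases nest inside one another.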

There are various theories for designing the best grammars such that, by systematic application of the rules, one can arrive at every phrase marker in a language and hence every sentence in the language. The most common are phrase structure grammars, preferred by Noam Chomsky's MIT school of linguistics, and ID/LP grammars, which some argue have an explanatory advantage (especially those in opposition to the MIT school, such as Ivan Sag and Geoffrey Pullum). Dependency grammar is a class of syntactic theories separate from generative grammar in which structure is determined by the relation between a word (a head) and its dependents. One difference from phrase structure grammar is that dependency grammar has no phrasal categories. Algebraic syntax is a type of dependency grammar.

A modern approach to combining accurate descriptions of the grammatical patterns of language with their function in context is that of systemic functional grammar, an approach originally developed by Michael A.K. Halliday in the 1960s and now pursued actively on all continents. Systemic-functional grammar is related both to feature-based approaches such as Head-driven phrase structure grammar and to the older functional traditions of European schools of linguistics such as British Contextualism and the Prague School.

Tree-adjoining grammar is a grammar formalism with interesting mathematical properties which has sometimes been used as the basis for the syntactic description of natural language. In monotonic and monostratal frameworks, variants of unification grammar are often preferred formalisms.

With the publication of Gold's theorem[3] in 1967, it was claimed that grammars for natural languages governed by deterministic rules could not be learned from positive instances alone. This became part of the argument from the poverty of the stimulus, first presented in 1980.[4] It led to the nativist view that a form of grammar (including, in certain versions, a complete conceptual lexicon) is hardwired from birth.

Empirical approaches to syntax

A grammar is a description of the syntax of a language. Theoretical models rarely consider the language in use, as revealed by corpus linguistics, but instead focus on a mental language, or i-language, as their "proper" object of study. In contrast, the "empirically responsible"[5] approach to syntax seeks to construct grammars that explain language in use. A key class of grammars in the latter tradition is that of stochastic context-free grammars.

A problem faced in any formal syntax is that often more than one production rule may apply to a structure, resulting in a conflict. The greater the coverage, the more frequent such conflicts, and grammarians (starting with Pāṇini) have spent considerable effort devising prioritizations for the rules, which usually turn out to be defeasible. Another difficulty is overgeneration, where unlicensed structures are also generated. Probabilistic grammars circumvent these problems by using the frequency of the various productions to order them, resulting in a "most likely" (winner-take-all) interpretation, which is by definition defeasible given additional data. As usage patterns shift diachronically, these probabilistic rules can be re-learned, thus updating the grammar.

One may construct a probabilistic grammar from a traditional formal syntax by assigning each production a probability drawn from some distribution, eventually to be estimated from usage data. On most samples of broad language, probabilistic grammars that tune these probabilities from data typically outperform hand-crafted grammars (although some rule-based grammars are now approaching the accuracy of PCFGs).
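As a minimal sketch of this idea (the toy grammar and its probabilities are invented for illustration, not estimated from any corpus), each production of a small context-free grammar carries a probability, with the alternatives for a given non-terminal summing to 1; the probability of a derivation is then the product of the probabilities of the rules it uses:

```python
# Toy probabilistic grammar: each non-terminal maps to its alternative
# productions, each paired with a probability. The probabilities of the
# alternatives for one non-terminal sum to 1. All numbers are invented.
grammar = {
    "S":  [(("NP", "VP"), 1.0)],
    "NP": [(("Det", "N"), 0.7), (("Pronoun",), 0.3)],
    "VP": [(("V", "NP"), 0.6), (("V",), 0.4)],
}

def derivation_probability(steps):
    """Probability of a derivation: the product of the probabilities of
    the productions used, given as (non-terminal, alternative-index) pairs."""
    p = 1.0
    for symbol, index in steps:
        p *= grammar[symbol][index][1]
    return p

# One derivation of "the dog chased the cat":
#   S -> NP VP, NP -> Det N, VP -> V NP, NP -> Det N
steps = [("S", 0), ("NP", 0), ("VP", 0), ("NP", 0)]
print(derivation_probability(steps))  # 1.0 * 0.7 * 0.6 * 0.7
```

Ordering the alternatives for each non-terminal by these probabilities yields the "most likely" (winner-take-all) interpretation described above, and re-estimating the numbers from new usage data updates the grammar.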

Recently, probabilistic grammars appear to have gained some cognitive plausibility. It is well known that there are degrees of difficulty in accessing different syntactic structures (e.g. the Accessibility Hierarchy for relative clauses). Probabilistic versions of minimalist grammars have been used to compute information-theoretic entropy values which appear to correlate well with psycholinguistic data on understandability and production difficulty.[6]

Statistical grammars are not subject to Gold's theorem, since their learning is incremental.

See also

Syntactic terms

Notes

  1. ^ Ostler, Nicholas (2005). Empires of the Word: A Language History of the World. HarperCollins.
  2. ^ Arnauld, Antoine (1683). La logique (5th ed.). Paris: G. Desprez. p. 137. "Nous avons emprunté…ce que nous avons dit…d'un petit Livre…sous le titre de Grammaire générale." ["We have borrowed…what we have said…from a little book…under the title of Grammaire générale."]
  3. ^ Gold, E. (1967). Language identification in the limit. Information and Control 10, 447-474.
  4. ^ Chomsky, N. (1980). Rules and representations. Oxford: Basil Blackwell.
  5. ^ George Lakoff and Mark Johnson (1999). Philosophy in the Flesh: The embodied mind and its challenge to Western thought. Part IV. New York: Basic Books.
  6. ^ John Hale (2006). "Uncertainty About the Rest of the Sentence". Cognitive Science. 30. Dept Linguistics, Michigan State University: 643–672. doi:10.1207/s15516709cog0000_64.

References

  • Freidin, Robert (2006). Syntax. Critical Concepts in Linguistics. New York: Routledge. ISBN 0-415-24672-5.
  • Graffi, Giorgio (2001). 200 Years of Syntax. A Critical Survey. Studies in the History of the Language Sciences 98. Amsterdam: Benjamins. ISBN 90-272-4587-8.