Diagonal lemma: Difference between revisions

Latest revision as of 01:39, 28 October 2024

In mathematical logic, the diagonal lemma (also known as diagonalization lemma, self-reference lemma^[1] or fixed point theorem) establishes the existence of self-referential sentences in certain formal theories of the natural numbers—specifically those theories that are strong enough to represent all computable functions. The sentences whose existence is secured by the diagonal lemma can then, in turn, be used to prove fundamental limitative results such as Gödel's incompleteness theorems and Tarski's undefinability theorem.^[2] It is named in reference to Cantor's diagonal argument in set and number theory.

Background

Let $\mathbb {N}$ be the set of natural numbers. A first-order theory $T$ in the language of arithmetic represents^[3] the computable function $f:\mathbb {N} \rightarrow \mathbb {N}$ if there exists a "graph" formula ${\mathcal {G}}_{f}(x,y)$ in the language of $T$ such that, for each $n\in \mathbb {N}$,

\vdash _{T}\,(\forall y)[(^{\circ }f(n)=y)\Leftrightarrow {\mathcal {G}}_{f}(^{\circ }n,\,y)].

Here ${}^{\circ }n$ is the numeral corresponding to the natural number $n$, defined as the $n$th successor of $0$, the first numeral of $T$.

The diagonal lemma also requires a systematic way of assigning to every formula ${\mathcal {A}}$ a natural number $\#({\mathcal {A}})$ (also written as $\#_{\mathcal {A}}$), called its Gödel number. Formulas can then be represented within $T$ by the numerals corresponding to their Gödel numbers. For example, ${\mathcal {A}}$ is represented by $^{\circ }\#_{\mathcal {A}}$.
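As a rough illustration of the two devices just described, numerals and Gödel numbers, the following Python sketch encodes formulas (treated as plain strings) as natural numbers and builds numerals as iterated successors of 0. The byte-based encoding, the successor symbol `S`, and the function names `godel_number`, `decode` and `numeral` are assumptions made for this example only; any injective, computable coding would serve equally well.

```python
def godel_number(formula: str) -> int:
    """Injectively encode a formula, given as a string, as a natural number."""
    data = formula.encode("utf-8")
    # Prefix a 0x01 byte so that leading zero bytes are not lost in the integer.
    return int.from_bytes(b"\x01" + data, "big")


def decode(n: int) -> str:
    """Recover the formula from its code; reject numbers that code nothing."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if not raw or raw[0] != 1:
        raise ValueError("not the code of a formula in this toy scheme")
    return raw[1:].decode("utf-8")


def numeral(n: int) -> str:
    """Write the numeral for n as n applications of a successor symbol S to 0."""
    return "S" * n + "0"


if __name__ == "__main__":
    code = godel_number("x=0")
    print(code, decode(code), numeral(3))
```

Both directions of the coding are computable here, which is the property the proof later relies on when it asserts that the diagonal function is computable.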

The diagonal lemma applies to theories capable of representing all primitive recursive functions. Such theories include first-order Peano arithmetic, the weaker Robinson arithmetic, and even a much weaker theory known as R. A common statement of the lemma (as given below) makes the stronger assumption that the theory can represent all computable functions, but all the theories mentioned have that capacity as well.

Statement of the lemma

Lemma^[4] — Let $T$ be a first-order theory in the language of arithmetic and capable of representing all computable functions, and ${\mathcal {F}}(y)$ be a formula in $T$ with one free variable. Then there exists a sentence ${\mathcal {C}}$ such that

\vdash _{T}\,{\mathcal {C}}\Leftrightarrow {\mathcal {F}}({}^{\circ }\#_{\mathcal {C}})

Intuitively, ${\mathcal {C}}$ is a self-referential sentence: ${\mathcal {C}}$ says that ${\mathcal {C}}$ has the property ${\mathcal {F}}$. The sentence ${\mathcal {C}}$ can also be viewed as a fixed point of the operation that assigns, to the equivalence class of a given sentence ${\mathcal {A}}$, the equivalence class of the sentence ${\mathcal {F}}(^{\circ }\#_{\mathcal {A}})$ (a sentence's equivalence class is the set of all sentences to which it is provably equivalent in the theory $T$). The sentence ${\mathcal {C}}$ constructed in the proof is not literally the same as ${\mathcal {F}}(^{\circ }\#_{\mathcal {C}})$, but is provably equivalent to it in the theory $T$.
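Two standard instances may make the statement more concrete. The predicate symbols $\mathrm {Prov} _{T}$ and $\mathrm {True}$ below are notation assumed for this illustration only: $\mathrm {Prov} _{T}(y)$ stands for a formula expressing that $y$ is the Gödel number of a sentence provable in $T$, and $\mathrm {True} (y)$ for a hypothetical formula defining truth for sentences of the language of $T$. The names ${\mathcal {C}}_{1}$ and ${\mathcal {C}}_{2}$ are likewise introduced here.

```latex
% Instance 1: apply the lemma to F(y) := \neg Prov_T(y)
\vdash_{T}\, \mathcal{C}_{1} \Leftrightarrow \neg\,\mathrm{Prov}_{T}({}^{\circ}\#_{\mathcal{C}_{1}})

% Instance 2: apply the lemma to F(y) := \neg True(y)
\vdash_{T}\, \mathcal{C}_{2} \Leftrightarrow \neg\,\mathrm{True}({}^{\circ}\#_{\mathcal{C}_{2}})
```

The first sentence is of the kind used in proofs of Gödel's first incompleteness theorem. The second conflicts with the requirement that $\mathrm {True}$ behave as a truth predicate, which is the core of Tarski's undefinability theorem.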

Proof

Let $f:\mathbb {N} \to \mathbb {N}$ be the function defined by:

f(\#_{\mathcal {A}})=\#[{\mathcal {A}}(^{\circ }\#_{\mathcal {A}})]

for each formula ${\mathcal {A}}(x)$ with only one free variable $x$ in theory $T$, and $f(n)=0$ otherwise. Here $\#_{\mathcal {A}}=\#({\mathcal {A}}(x))$ denotes the Gödel number of the formula ${\mathcal {A}}(x)$. The function $f$ is computable (which is ultimately an assumption about the Gödel numbering scheme), so there is a formula ${\mathcal {G}}_{f}(x,\,y)$ representing $f$ in $T$. Namely, for each such ${\mathcal {A}}$,

\vdash _{T}\,(\forall y)\{{\mathcal {G}}_{f}(^{\circ }\#_{\mathcal {A}},\,y)\Leftrightarrow [y={}^{\circ }f(\#_{\mathcal {A}})]\}

which is to say

\vdash _{T}\,(\forall y)\{{\mathcal {G}}_{f}(^{\circ }\#_{\mathcal {A}},\,y)\Leftrightarrow [y={}^{\circ }\#({\mathcal {A}}(^{\circ }\#_{\mathcal {A}}))]\}

Now, given an arbitrary formula ${\mathcal {F}}(y)$ with one free variable $y$, define the formula ${\mathcal {B}}(z)$ as:

{\mathcal {B}}(z):=(\forall y)[{\mathcal {G}}_{f}(z,\,y)\Rightarrow {\mathcal {F}}(y)]

Then, for all formulas ${\mathcal {A}}(x)$ with one free variable:

\vdash _{T}\,{\mathcal {B}}(^{\circ }\#_{\mathcal {A}})\Leftrightarrow (\forall y)\{[y={}^{\circ }\#({\mathcal {A}}(^{\circ }\#_{\mathcal {A}}))]\Rightarrow {\mathcal {F}}(y)\}

which is to say

\vdash _{T}\,{\mathcal {B}}(^{\circ }\#_{\mathcal {A}})\Leftrightarrow {\mathcal {F}}(^{\circ }\#[{\mathcal {A}}(^{\circ }\#_{\mathcal {A}})])
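The passage from the definition of ${\mathcal {B}}$ to the last equivalence can be spelled out as a chain of steps, each provable in $T$. This is a sketch of the intermediate reasoning rather than a quotation from the cited sources; the abbreviation $t$ for the numeral ${}^{\circ }\#[{\mathcal {A}}(^{\circ }\#_{\mathcal {A}})]$ is introduced here only for readability, and $\equiv$ denotes syntactic identity.

```latex
\begin{align*}
\mathcal{B}({}^{\circ}\#_{\mathcal{A}})
  &\equiv (\forall y)[\mathcal{G}_{f}({}^{\circ}\#_{\mathcal{A}},\,y) \Rightarrow \mathcal{F}(y)]
  && \text{(definition of } \mathcal{B} \text{ with } z := {}^{\circ}\#_{\mathcal{A}}\text{)} \\
  &\Leftrightarrow (\forall y)[(y = t) \Rightarrow \mathcal{F}(y)]
  && \text{(representability of } f \text{ by } \mathcal{G}_{f}\text{)} \\
  &\Leftrightarrow \mathcal{F}(t)
  && \text{(first-order logic with equality)}
\end{align*}
```

For the last step, the left-to-right direction instantiates $y$ with $t$ and uses $t=t$; the right-to-left direction replaces $t$ by any $y$ that is equal to it.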

Now take ${\mathcal {A}}$ to be ${\mathcal {B}}$ itself, and define the sentence ${\mathcal {C}}$ as:

{\mathcal {C}}:={\mathcal {B}}(^{\circ }\#_{\mathcal {B}})

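Then the preceding equivalence for ${\mathcal {B}}(^{\circ }\#_{\mathcal {A}})$, with ${\mathcal {A}}$ replaced by ${\mathcal {B}}$, can be rewritten as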

\vdash _{T}\,{\mathcal {C}}\Leftrightarrow {\mathcal {F}}(^{\circ }\#_{\mathcal {C}})

which is the desired result.
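The shape of the construction can be imitated in a small Python sketch. This is only an analogy under stated assumptions: strings stand in for Gödel numbers, Python evaluation stands in for provability in $T$, and the helper names `substitute`, `diag`, `fixed_point` and `holds` are hypothetical. Here `diag` plays the role of $f$, the text `F(diag(x))` plays the role of ${\mathcal {B}}$, and the returned string plays the role of ${\mathcal {C}}$.

```python
from typing import Callable


def substitute(formula: str, code: str) -> str:
    """Plug the quoted string `code` into the free variable x of `formula`."""
    return f"(lambda x: {formula})({code!r})"


def diag(code: str) -> str:
    """Toy analogue of f: send the code of A(x) to the code of A(quote(A))."""
    return substitute(code, code)


def fixed_point() -> str:
    """Build C := B(quote(B)), where B(x) says 'F holds of diag(x)'."""
    b = "F(diag(x))"
    return diag(b)


def holds(sentence: str, F: Callable[[str], bool]) -> bool:
    """Evaluate a toy sentence, interpreting the symbols F and diag."""
    return bool(eval(sentence, {"F": F, "diag": diag}))


if __name__ == "__main__":
    C = fixed_point()
    # For each sample property F, the sentence C holds exactly when F holds of C.
    for F in (lambda s: len(s) % 2 == 0,
              lambda s: "diag" in s,
              lambda s: s.startswith("x")):
        assert holds(C, F) == F(C)
    print(C)
```

Evaluating `C` applies `F` to `diag("F(diag(x))")`, and that string is `C` itself, which mirrors the identity $f(\#_{\mathcal {B}})=\#_{\mathcal {C}}$ used in the final step of the proof.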

(The same argument in different terms is given in [Raatikainen (2015a)].)

History

The lemma is called "diagonal" because it bears some resemblance to Cantor's diagonal argument.^[5] The terms "diagonal lemma" or "fixed point" do not appear in Kurt Gödel's 1931 article or in Alfred Tarski's 1936 article.

Rudolf Carnap (1934) was the first to prove the general self-referential lemma,^[6] which says that for any formula F in a theory T satisfying certain conditions, there exists a formula ψ such that ψ ↔ F(°#(ψ)) is provable in T. Carnap's work was phrased in alternate language, as the concept of computable functions was not yet developed in 1934. Mendelson (1997, p. 204) believes that Carnap was the first to state that something like the diagonal lemma was implicit in Gödel's reasoning. Gödel was aware of Carnap's work by 1937.^[7]

The diagonal lemma is closely related to Kleene's recursion theorem in computability theory, and their respective proofs are similar.

Notes

[1] Hájek, Petr; Pudlák, Pavel (1998) [first printing 1993]. Metamathematics of First-Order Arithmetic. Perspectives in Mathematical Logic (1st ed.). Springer. ISBN 3-540-63648-X. ISSN 0172-6641. Quote: "In modern texts these results are proved using the well-known diagonalization (or self-reference) lemma, which is already implicit in Gödel's proof."
[2] See Boolos and Jeffrey (2002, sec. 15) and Mendelson (1997, Prop. 3.37 and Cor. 3.44).
[3] For details on representability, see Hinman 2005, p. 316.
[4] Smullyan (1991, 1994) are standard specialized references. The lemma is Prop. 3.34 in Mendelson (1997), and is covered in many texts on basic mathematical logic, such as Boolos and Jeffrey (1989, sec. 15) and Hinman (2005).
[5] See, for example, Gaifman (2006).
[6] Kurt Gödel, Collected Works, Volume I: Publications 1929–1936, Oxford University Press, 1986, p. 339.
[7] See Gödel's Collected Works, Vol. 1, Oxford University Press, 1986, p. 363, fn 23.

References

George Boolos and Richard Jeffrey, 1989. Computability and Logic, 3rd ed. Cambridge University Press. ISBN 0-521-38026-X ISBN 0-521-38923-2
Rudolf Carnap, 1934. Logische Syntax der Sprache. (English translation: 2003. The Logical Syntax of Language. Open Court Publishing.)
Haim Gaifman, 2006. 'Naming and Diagonalization: From Cantor to Gödel to Kleene'. Logic Journal of the IGPL, 14: 709–728.
Peter Hinman, 2005. Fundamentals of Mathematical Logic. A K Peters. ISBN 1-56881-262-0
Elliott Mendelson, 1997. Introduction to Mathematical Logic, 4th ed. Chapman & Hall.
Panu Raatikainen, 2015a. The Diagonalization Lemma. In Stanford Encyclopedia of Philosophy, ed. Zalta. Supplement to Raatikainen (2015b).
Panu Raatikainen, 2015b. Gödel's Incompleteness Theorems. In Stanford Encyclopedia of Philosophy, ed. Zalta.
Raymond Smullyan, 1991. Gödel's Incompleteness Theorems. Oxford Univ. Press.
Raymond Smullyan, 1994. Diagonalization and Self-Reference. Oxford Univ. Press.
Alfred Tarski, 1936. "Der Wahrheitsbegriff in den formalisierten Sprachen". Studia Philosophica, 1: 261–405.
Alfred Tarski, tr. J. H. Woodger, 1983. "The Concept of Truth in Formalized Languages". English translation of Tarski's 1936 article. In A. Tarski, ed. J. Corcoran, 1983, Logic, Semantics, Metamathematics, Hackett.

@@ Line 1: / Line 1: @@
+{{Short description|Statement in mathematical logic}}
-{{about|a concept in mathematical logic|text = It is named in reference to [[Cantor's diagonal argument]] in set and number theory. See [[diagonalization (disambiguation)]] for several unrelated uses of the term in mathematics.}}
+{{Other uses|Diagonal argument (disambiguation){{!}}Diagonal argument|Diagonalization (disambiguation)}}
 In [[mathematical logic]], the '''diagonal lemma'''  (also known as '''diagonalization lemma''', '''self-reference lemma'''<ref>{{cite book
@@ Line 17: / Line 18: @@
 | quote        = In modern texts these results are proved using the well-known diagonalization (or self-reference) lemma, which is already implicit in Gödel's proof.
 }}</ref> or '''fixed point theorem''') establishes the existence of [[self-referential]] [[Sentence (mathematical logic)|sentence]]s in certain formal theories of the [[natural number]]s—specifically those theories that are strong enough to represent all [[computable function]]s. The sentences whose existence is secured by the diagonal lemma can then, in turn, be used to prove fundamental limitative results such as  [[Gödel's incompleteness theorems]] and [[Tarski's undefinability theorem]].<ref>See Boolos and Jeffrey (2002, sec. 15) and Mendelson (1997, Prop.&nbsp;3.37 and Cor.&nbsp;3.44 ).</ref>
+It is named in reference to [[Cantor's diagonal argument]] in set and number theory.
 == Background ==
-Let <math>\mathbb{N}</math> be the set of [[natural number]]s. A [[theory (mathematical logic)|theory]] ''T'' ''represents'' the computable function <math>f: \mathbb{N}\rightarrow\mathbb{N}</math> if there exists a "graph" predicate <math>\Gamma_{f}(x, y)</math> in the language of ''T'' such that for each <math>x \in \mathbb{N}</math>, ''T'' proves
+Let <math>\mathbb{N}</math> be the set of [[natural number]]s. A [[first-order logic|first-order]] [[theory (mathematical logic)|theory]] <math>T</math> in the language of arithmetic ''represents''<ref>For details on representability, see Hinman 2005, p.&nbsp;316</ref> the computable function <math>f: \mathbb{N}\rightarrow\mathbb{N}</math> if there exists a "graph" [[First-order_logic#Formulas|formula]] <math>\mathcal{G}_f(x, y)</math> in the language of <math>T</math> — that is, a formula such that for each <math>n \in \mathbb{N}</math>
-: <math>\forall y \quad \left( {}^\circ f(x)=y \Leftrightarrow \Gamma_f({}^\circ x, y) \right)</math>.<ref>For details on representability, see Hinman 2005, p.&nbsp;316</ref>
+: <math>\vdash_{T}\,(\forall y)[(^\circ f(n)=y) \Leftrightarrow \mathcal{G}_f(^\circ n,\,y)]</math>.
-Here <math>{}^\circ x</math> is the numeral corresponding to the natural number <math>x</math>, which is defined to be the closed term 1+ ··· +1 (<math>x</math> ones), and <math>{}^\circ f(x)</math> is the numeral corresponding to <math>f(x)</math>.
+Here <math>{}^\circ n</math> is the [[Peano_axioms#First-order_theory_of_arithmetic|numeral]] corresponding to the natural number <math>n</math>, which is defined to be the <math>n</math>th successor of presumed first numeral <math>0</math> in <math>T</math>.
-The diagonal lemma also requires that there be a systematic way of assigning to every formula ''θ'' a natural number #(''θ'') called its [[Gödel number]]. Formulas can then be represented within the theory by the numerals corresponding to their Gödel numbers. For example, ''θ'' is represented by °#(''θ'')
+The diagonal lemma also requires a systematic way of assigning to every formula <math>\mathcal{A}</math> a natural number <math>\#(\mathcal{A})</math> (also written as <math>\#_{\mathcal{A}}</math>) called its [[Gödel number]]. Formulas can then be represented within <math>T</math> by the numerals corresponding to their Gödel numbers. For example, <math>\mathcal{A}</math> is represented by <math>^{\circ}\#_{\mathcal{A}}</math>
-The diagonal lemma applies to theories capable of representing all [[primitive recursive functions]]. Such theories include [[Peano arithmetic]] and the weaker [[Robinson arithmetic]]. A common statement of the lemma (as given below) makes the stronger assumption that the theory can represent all [[computable function]]s.
+The diagonal lemma applies to theories capable of representing all [[primitive recursive functions]]. Such theories include [[Peano axioms#First-order theory of arithmetic|first-order Peano arithmetic]] and the weaker [[Robinson arithmetic]], and even to a much weaker theory known as R. A common statement of the lemma (as given below) makes the stronger assumption that the theory can represent all [[computable function]]s, but all the theories mentioned have that capacity, as well.
 == Statement of the lemma ==
-Let ''T'' be a [[first-order logic|first-order]] theory in the language of arithmetic and capable of representing all [[computable function]]s. Let ''F'' be a formula in the language with one free variable, then:
+{{math theorem|Let <math>T</math> be a [[first-order logic|first-order]] theory in the language of arithmetic and capable of representing all [[computable function]]s, and <math>\mathcal{F}(y)</math> be a formula in <math>T</math> with one free variable. Then there exists a [[sentence (mathematical logic)|sentence]] <math>\mathcal{C}</math> such that
+:<math>\vdash_T\,\mathcal{C}\Leftrightarrow\mathcal{F}({}^{\circ}\#_{\mathcal{C}})</math>
+| name = Lemma<ref>Smullyan (1991, 1994) are standard specialized references. The lemma is Prop. 3.34 in Mendelson (1997), and is covered in many texts on basic mathematical logic, such as Boolos and Jeffrey (1989, sec. 15) and Hinman (2005).</ref>
+}}
+Intuitively, <math>\mathcal{C}</math> is a [[self-referential]] sentence: <math>\mathcal{C}</math> says that <math>\mathcal{C}</math> has the property <math>\mathcal{F}</math>. The sentence <math>\mathcal{C}</math> can also be viewed as a [[Fixed point (mathematics)|fixed point]] of the operation that assigns, to the equivalence class of a given sentence <math>\mathcal{A}</math>, the equivalence class of the sentence <math>\mathcal{F}(^\circ \#_{\mathcal{A}})</math> (a sentence's equivalence class is the set of all sentences to which it is provably equivalent in the theory <math>T</math>). The sentence <math>\mathcal{C}</math> constructed in the proof is not literally the same as <math>\mathcal{F}(^\circ \#_{\mathcal{C}})</math>, but is provably equivalent to it in the theory <math>T</math>.
-{{math theorem
-|There is a sentence <math>\psi</math> such that <math>\psi\iff F(^\circ \#(\psi))</math> is provable in&nbsp;''T''.<ref>Smullyan (1991, 1994) are standard specialized references. The lemma is Prop. 3.34 in Mendelson (1997), and is covered in many texts on basic mathematical logic, such as Boolos and Jeffrey (1989, sec. 15) and Hinman (2005).</ref>
-|name=Lemma}}
-Intuitively, <math>\psi</math> is a [[self-referential]] sentence saying that <math>\psi</math> has the property ''F''. The sentence <math>\psi</math> can also be viewed as a [[Fixed point (mathematics)|fixed point]] of the operation assigning to each formula <math>\theta</math> the sentence <math>F(^\circ \#(\theta))</math>. The sentence <math>\psi</math> constructed in the proof is not literally the same as <math>F(^\circ \#(\psi))</math>, but is provably equivalent to it in the theory&nbsp;''T''.
 ==Proof==
 Let <math>f:\mathbb{N}\to\mathbb{N}</math> be the function defined by:
-:<math>f(\#_{\mathcal{A}}) = \#[\mathcal{A}({}^{\circ}\#_{\mathcal{A}})]</math>
+:<math>f(\#_{\mathcal{A}}) = \#[\mathcal{A}(^{\circ}\#_{\mathcal{A}})]</math>
-for each formula <math>\mathcal{A}(x)</math> with only one free variable <math>x</math> in theory <math>T</math>, and <math>f(n)=0</math> otherwise. The function <math>f</math> is '''computable''', so there's a formula <math>\mathcal{A}_f(x,\,y)</math> representing <math>f</math> in <math>T</math>. Namely
+for each formula <math>\mathcal{A}(x)</math> with only one free variable <math>x</math> in theory <math>T</math>, and <math>f(n)=0</math> otherwise. Here <math>\#_{\mathcal{A}}=\#(\mathcal{A}(x))</math> denotes the Gödel number of formula <math>\mathcal{A}(x)</math>. The function <math>f</math> is computable (which is ultimately an assumption about the Gödel numbering scheme), so there is a formula <math>\mathcal{G}_f(x,\,y)</math> representing <math>f</math> in <math>T</math>. Namely
-:<math>\vdash_T\,(\forall y)\{\mathcal{A}_f({}^{\circ}\#_{\mathcal{A}},\,y) \Leftrightarrow [y = {}^{\circ}f(\#_{\mathcal{A}})]\}</math>
+:<math>\vdash_T\,(\forall y)\{\mathcal{G}_f(^{\circ}\#_{\mathcal{A}},\,y) \Leftrightarrow [y = {}^{\circ}f(\#_{\mathcal{A}})]\}</math>
 which is to say
-:<math>\vdash_T\,(\forall y)\{[\mathcal{A}_f({}^{\circ}\#_{\mathcal{A}},\,y)] \Leftrightarrow [y = {}^{\circ}\#(\mathcal{A}({}^{\circ}\#_{\mathcal{A}}))]\}</math>
+:<math>\vdash_T\,(\forall y)\{\mathcal{G}_f(^{\circ}\#_{\mathcal{A}},\,y) \Leftrightarrow [y = {}^{\circ}\#(\mathcal{A}(^{\circ}\#_{\mathcal{A}}))]\}</math>
-Now define the formula <math>\mathcal{B}(z)</math> as:
+Now, given an arbitrary formula <math>\mathcal{F}(y)</math> with one free variable <math>y</math>, define the formula <math>\mathcal{B}(z)</math> as:
-:<math>\mathcal{B}(z) := (\forall y) [\mathcal{A}_f(z,\,y)\Rightarrow \mathcal{F}(y)]</math>
+:<math>\mathcal{B}(z) := (\forall y) [\mathcal{G}_f(z,\,y)\Rightarrow \mathcal{F}(y)]</math>
-for arbitrary formula <math>\mathcal{F}(y)</math> with one free variable <math>y</math>. Then
+Then, for all formulas <math>\mathcal{A}(x)</math> with one free variable:
-:<math>\vdash_T\,\mathcal{B}({}^{\circ}\#_{\mathcal{A}}) \Leftrightarrow (\forall y)\{[ y = {}^{\circ}\#(\mathcal{A}({}^{\circ}\#_{\mathcal{A}}))] \Rightarrow \mathcal{F}(y)\}</math>
+:<math>\vdash_T\,\mathcal{B}(^{\circ}\#_{\mathcal{A}}) \Leftrightarrow (\forall y)\{[ y = {}^{\circ}\#(\mathcal{A}(^{\circ}\#_{\mathcal{A}}))] \Rightarrow \mathcal{F}(y)\}</math>
 which is to say
-:<math>\vdash_T\,\mathcal{B}({}^{\circ}\#_{\mathcal{A}}) \Leftrightarrow \mathcal{F}\{{}^{\circ}\#[\mathcal{A}({}^{\circ}\#_{\mathcal{A}})]\}</math>
+:<math>\vdash_T\,\mathcal{B}(^{\circ}\#_{\mathcal{A}}) \Leftrightarrow \mathcal{F}(^{\circ}\#[\mathcal{A}(^{\circ}\#_{\mathcal{A}})])</math>
-Now substitute <math>\mathcal{A}</math> with <math>\mathcal{B}</math>, and define the formula <math>\mathcal{C}</math> as:
+Now substitute <math>\mathcal{A}</math> with <math>\mathcal{B}</math>, and define the sentence <math>\mathcal{C}</math> as:
-:<math>\mathcal{C}:= \mathcal{B}({}^{\circ}\#_{\mathcal{B}})</math>
+:<math>\mathcal{C}:= \mathcal{B}(^{\circ}\#_{\mathcal{B}})</math>
-Then the previous theorem can be rewrited as
+Then the previous line can be rewritten as
-:<math>\vdash_T\,\mathcal{C}\Leftrightarrow\mathcal{F}({}^{\circ}\#_{\mathcal{C}})</math>
+:<math>\vdash_T\,\mathcal{C}\Leftrightarrow\mathcal{F}(^{\circ}\#_{\mathcal{C}})</math>
 which is the desired result.