Binary tree: Difference between revisions
Changing short description from "Tree data structure in which each node has at most two children" to "Limited form of tree data structure" (Shortdesc helper) |
ObserveOwl (talk | contribs) m Reverted edit by 200.53.241.141 (talk) to last version by Tamfang |
||
(89 intermediate revisions by 57 users not shown) | |||
Line 1: | Line 1: | ||
{{Short description|Limited form of tree data structure}} |
{{Short description|Limited form of tree data structure}} |
||
{{ |
{{Distinguish|Binary search tree|B-tree|B+ tree}} |
||
[[ |
[[File:Binary tree v2.svg|thumb|alt=A labeled binary tree of size 9 and height 3, with a root node whose value is 1. The above tree is unbalanced and not sorted.|A labeled binary tree of size 9 (the number of nodes in the tree) and height 3 (the height of a tree defined as the number of edges or links from the top-most or root node to the farthest leaf node), with a root node whose value is 1. The above tree is unbalanced and not sorted.]] |
||
In [[computer science]], a '''binary tree''' is a [[Tree (data structure)|tree data structure]] in which each node has at most two [[child node|children]], |
In [[computer science]], a '''binary tree''' is a [[Tree (data structure)|tree data structure]] in which each node has at most two [[child node|children]], referred to as the ''left child'' and the ''right child''. That is, it is a [[m-ary tree|''k''-ary tree]] with {{math|''k'' {{=}} 2}}. A [[recursive definition]] using [[set theory]] is that a binary tree is a [[tuple]] (''L'', ''S'', ''R''), where ''L'' and ''R'' are binary trees or the [[empty set]] and ''S'' is a [[singleton set]] containing the root.<ref name="GarnierTaylor2009">{{cite book| author1=Rowan Garnier| author2=John Taylor| title=Discrete Mathematics:Proofs, Structures and Applications, Third Edition|url=https://books.google.com/books?id=WnkZSSc4IkoC&pg=PA620|year=2009|publisher=CRC Press|isbn=978-1-4398-1280-8|page=620}}</ref><ref name="Skiena2009">{{cite book|author=Steven S Skiena|title=The Algorithm Design Manual| url=https://books.google.com/books?id=7XUSn0IKQEgC&pg=PA77|year=2009| publisher=Springer Science & Business Media| isbn=978-1-84800-070-4|page=77}}</ref> |
||
From a [[graph theory]] perspective, binary |
From a [[graph theory]] perspective, binary trees as defined here are [[Arborescence (graph theory)|arborescences]].<ref name="Knuth1997">{{cite book| author=Knuth |title=The Art Of Computer Programming, Volume 1, 3/E| year=1997| publisher=Pearson Education| isbn=0-201-89683-4|page=363}}</ref> A binary tree may thus be also called a '''bifurcating arborescence''',<ref name="Knuth1997"/> a term which appears in some early programming books<ref name="Flores1971">{{cite book| author=Iván Flores|title=Computer programming system/360|year=1971| publisher=Prentice-Hall| page=39}}</ref> before the modern computer science terminology prevailed. It is also possible to interpret a binary tree as an [[undirected graph|undirected]], rather than [[directed graph]], in which case a binary tree is an [[ordered tree|ordered]], [[rooted tree]].<ref>{{cite book| author=Kenneth Rosen|title=Discrete Mathematics and Its Applications, 7th edition|year=2011|publisher=McGraw-Hill Science|page=749|isbn=978-0-07-338309-5}}</ref> Some authors use '''rooted binary tree''' instead of ''binary tree'' to emphasize the fact that the tree is rooted, but as defined above, a binary tree is always rooted.<ref name="Mazur2010">{{cite book|author=David R. Mazur |title=Combinatorics: A Guided Tour| url=https://books.google.com/books?id=yI4Jx5Obr08C&pg=PA246|year=2010|publisher=Mathematical Association of America |isbn=978-0-88385-762-5|page=246}}</ref> |
||
In mathematics, what is termed ''binary tree'' can vary significantly from author to author. Some use the definition commonly used in computer science,<ref name="oem"/> but others define it as every non-leaf having exactly two children and don't necessarily |
In mathematics, what is termed ''binary tree'' can vary significantly from author to author. Some use the definition common in computer science,<ref name="oem"/> but others require every non-leaf to have exactly two children and do not necessarily label the children as left and right.<ref name="Foulds1992">{{cite book|author=L.R. Foulds|title=Graph Theory Applications |url=https://books.google.com/books?id=IK7kreGl3vkC&pg=PA32| year=1992| publisher=Springer Science & Business Media|isbn=978-0-387-97599-3|page=32}}</ref> |
||
In computing, binary trees |
In computing, binary trees can be used in two very different ways: |
||
*First, as a means of accessing nodes based on some value or label associated with each node.<ref name="Makinson2009b">{{cite book|author=David Makinson| title=Sets, Logic and Maths for Computing|year=2009|publisher=Springer Science & Business Media| isbn=978-1-84628-845-6|page=199}}</ref> Binary trees labelled this way are used to implement [[binary search tree]]s and [[binary heap]]s, and are used for efficient [[search algorithm|searching]] and [[Sorting algorithm|sorting]]. The designation of non-root nodes as left or right child even when there is only one child present matters in some of these applications, in particular, it is significant in binary search trees.<ref name="Gross2007">{{cite book|author=Jonathan L. Gross| title=Combinatorial Methods with Computer Applications |url=https://books.google.com/books?id=hamtabmh0ZoC&pg=PA248| year=2007| publisher=CRC Press|isbn=978-1-58488-743-0|page=248}}</ref> However, the arrangement of particular nodes into the tree is not part of the conceptual information. For example, in a normal binary search tree the placement of nodes depends almost entirely on the order in which they were added, and can be re-arranged (for example by [[Self- |
*First, as a means of accessing nodes based on some value or label associated with each node.<ref name="Makinson2009b">{{cite book|author=David Makinson| title=Sets, Logic and Maths for Computing|year=2009|publisher=Springer Science & Business Media| isbn=978-1-84628-845-6|page=199}}</ref> Binary trees labelled this way are used to implement [[binary search tree]]s and [[binary heap]]s, and are used for efficient [[search algorithm|searching]] and [[Sorting algorithm|sorting]]. The designation of a non-root node as a left or right child, even when it is the only child, matters in some of these applications; in particular, it is significant in binary search trees.<ref name="Gross2007">{{cite book|author=Jonathan L. Gross| title=Combinatorial Methods with Computer Applications |url=https://books.google.com/books?id=hamtabmh0ZoC&pg=PA248| year=2007| publisher=CRC Press|isbn=978-1-58488-743-0|page=248}}</ref> However, the arrangement of particular nodes into the tree is not part of the conceptual information. For example, in a normal binary search tree the placement of nodes depends almost entirely on the order in which they were added, and the nodes can be re-arranged (for example by [[Self-balancing binary search tree|balancing]]) without changing the meaning. |
||
⚫ | *Second, as a representation of data with a relevant bifurcating structure. In such cases, the particular arrangement of nodes under and/or to the left or right of other nodes is part of the information (that is, changing it would change the meaning). Common examples occur with [[Huffman coding]] and [[cladograms]]. The everyday division of documents into chapters, sections, paragraphs, and so on is an analogous example with ''n''-ary rather than binary trees. |
||
⚫ | *Second, as a representation of data with a relevant bifurcating structure. In such cases, the particular arrangement of nodes under and/or to the left or right of other nodes is part of the information (that is, changing it would change the meaning). Common examples occur with [[Huffman coding]] and [[cladograms]]. The everyday division of documents into chapters, sections, paragraphs, and so on is an analogous example with n-ary rather than binary trees. |
||
== Definitions == |
== Definitions == |
||
Line 19: | Line 18: | ||
=== Recursive definition === |
=== Recursive definition === |
||
{{cleanup rewrite|section|date=July 2014}} |
{{cleanup rewrite|section|date=July 2014}} |
||
To |
To define a binary tree, the possibility that only one of the children may be empty must be acknowledged. An [[Artifact (software development)|artifact]], which in some textbooks is called an ''extended binary tree,'' is needed for that purpose. An extended binary tree is thus recursively defined as:<ref name="Rosen2011"/> |
||
* the [[empty set]] is an extended binary tree |
* the [[empty set]] is an extended binary tree |
||
* if T<sub>1</sub> and T<sub>2</sub> are extended binary trees, then denote by T<sub>1</sub> • T<sub>2</sub> the extended binary tree obtained by {{Clarify|date=May 2020|post-text=where did the 'r' go in the 'T<sub>1</sub> • T<sub>2</sub>' symbol|text=adding a root ''r'' connected to the left to T<sub>1</sub> and to the right to T<sub>2</sub>}} |
* if T<sub>1</sub> and T<sub>2</sub> are extended binary trees, then denote by T<sub>1</sub> • T<sub>2</sub> the extended binary tree obtained by {{Clarify|date=May 2020|post-text=where did the 'r' go in the 'T<sub>1</sub> • T<sub>2</sub>' symbol|text=adding a root ''r'' connected to the left to T<sub>1</sub> and to the right to T<sub>2</sub>}} by adding edges when these sub-trees are non-empty. |
||
Another way of imagining this construction (and understanding the terminology) is to consider instead of the empty set a different type of node—for instance square nodes if the regular ones are circles.<ref name="HuShing2002">{{cite book|author1=Te Chiang Hu|author2=Man-tak Shing|title=Combinatorial Algorithms |year=2002 |publisher=Courier Dover Publications| isbn=978-0-486-41962-6|page=162}}</ref> |
Another way of imagining this construction (and understanding the terminology) is to consider instead of the empty set a different type of node—for instance square nodes if the regular ones are circles.<ref name="HuShing2002">{{cite book|author1=Te Chiang Hu|author2=Man-tak Shing|title=Combinatorial Algorithms |year=2002 |publisher=Courier Dover Publications| isbn=978-0-486-41962-6|page=162}}</ref> |
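The recursive definition above can be sketched in Python. This is only an illustration under stated assumptions, not something taken from the cited sources: the empty extended binary tree is modelled as <code>None</code>, and the hypothetical helper <code>join(t1, t2)</code> plays the role of T<sub>1</sub> • T<sub>2</sub> by creating a fresh root.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """A root together with a (possibly empty) left and right subtree."""
    left: "Optional[Node]" = None
    right: "Optional[Node]" = None


def join(t1: Optional[Node], t2: Optional[Node]) -> Node:
    """Build T1 • T2: a new root whose left subtree is t1 and right subtree is t2."""
    return Node(t1, t2)


def size(t: Optional[Node]) -> int:
    """Number of nodes; the empty tree (None) has size 0."""
    return 0 if t is None else 1 + size(t.left) + size(t.right)


leaf = join(None, None)        # a single node with two empty subtrees
left_only = join(leaf, None)   # root with only a left child
right_only = join(None, leaf)  # root with only a right child
```

Because the empty subtree occupies a definite position, <code>left_only</code> and <code>right_only</code> compare as different trees even though both have two nodes.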
||
=== Using graph theory concepts === |
=== Using graph theory concepts === |
||
A binary tree is a [[rooted tree]] that is also an [[ordered tree]] (a.k.a. plane tree) in which every node has at most two children. A rooted tree naturally imparts a notion of levels (distance from the root) |
A binary tree is a [[rooted tree]] that is also an [[ordered tree]] (a.k.a. plane tree) in which every node has at most two children. A rooted tree naturally imparts a notion of levels (distance from the root); thus, for every node, a notion of children may be defined as the nodes connected to it a level below. Ordering of these children (e.g., by drawing them on a plane) makes it possible to distinguish a left child from a right child.<ref name="HsuLin2008">{{cite book|author1=Lih-Hsing Hsu| author2=Cheng-Kuan Lin|title=Graph Theory and Interconnection Networks|url=https://books.google.com/books?id=vbxdqhDKOSYC&pg=PA66|date=2008| publisher=CRC Press|isbn=978-1-4200-4482-9|page=66}}</ref> But this still does not distinguish a node with a left child but no right child from a node with a right child but no left child. |
||
The necessary distinction can be made by first partitioning the edges |
The necessary distinction can be made by first partitioning the edges; i.e., defining the binary tree as a triplet (V, E<sub>1</sub>, E<sub>2</sub>), where (V, E<sub>1</sub> ∪ E<sub>2</sub>) is a rooted tree (equivalently, an arborescence) and E<sub>1</sub> ∩ E<sub>2</sub> is empty, and also requiring that for all ''j'' ∈ { 1, 2 }, every node has at most one E<sub>''j''</sub> child.<ref name="FlumGrohe2006">{{cite book|author1=J. Flum| author2=M. Grohe| author2-link=Martin Grohe| title=Parameterized Complexity Theory| year=2006| publisher=Springer|isbn=978-3-540-29953-0|page=245}}</ref> A more informal way of making the distinction is to say, quoting the [[Encyclopedia of Mathematics]], that "every node has a left child, a right child, neither, or both" and to specify that these "are all different" binary trees.<ref name="oem">{{SpringerEOM| id=Binary_tree&oldid=31607|title=Binary tree}} also in print as {{cite book| author=Michiel Hazewinkel|title=Encyclopaedia of Mathematics. Supplement I| url=https://books.google.com/books?id=3ndQH4mTzWQC&pg=PA124| year=1997|publisher=Springer Science & Business Media|isbn=978-0-7923-4709-5|page=124}}</ref> |
||
== Types of binary trees == |
== Types of binary trees == |
||
Tree terminology is not well-standardized and so varies in |
Tree terminology is not well standardized and so varies across the literature. |
||
* A '''{{visible anchor|rooted}}''' binary [[tree data structure|tree]] has a [[root node]] and every node has at most two children. |
* A '''{{visible anchor|rooted}}''' binary [[tree data structure|tree]] has a [[root node]] and every node has at most two children. |
||
[[File:Full binary.svg|thumbnail|A full binary tree]] |
[[File:Full binary.svg|thumbnail|A full binary tree]] |
||
[[File:Waldburg Ahnentafel.jpg|thumb|An [[ancestry chart]] which |
[[File:Waldburg Ahnentafel.jpg|thumb|An [[ancestry chart]] which can be mapped to a perfect 4-level binary tree.]] |
||
* A '''{{visible anchor|full}}''' binary tree (sometimes referred to as a '''proper'''<ref>{{cite book|last1=Tamassia|first1=Michael T. Goodrich, Roberto |title=Algorithm design : foundations, analysis, and Internet examples| date=2011| publisher=Wiley-India|location=New Delhi| isbn=978-81-265-0986-7| page=76|edition=2|ref=Goodrich}}</ref> or ''' |
* A '''{{visible anchor|full}}''' binary tree (sometimes referred to as a '''proper''',<ref>{{cite book|last1=Tamassia|first1=Michael T. Goodrich, Roberto |title=Algorithm design : foundations, analysis, and Internet examples| date=2011| publisher=Wiley-India|location=New Delhi| isbn=978-81-265-0986-7| page=76|edition=2|ref=Goodrich}}</ref> '''plane''', or '''strict''' binary tree)<ref>{{cite web| url=http://xlinux.nist.gov/dads/HTML/fullBinaryTree.html | title=full binary tree | publisher = [[NIST]]}}</ref><ref>Richard Stanley, Enumerative Combinatorics, volume 2, p.36</ref> is a tree in which every node has either 0 or 2 children. Another way of defining a full binary tree is a [[recursive definition]]. A full binary tree is either:<ref name="Rosen2011">{{cite book|author=Kenneth Rosen| title=Discrete Mathematics and Its Applications 7th edition| year=2011| publisher=McGraw-Hill Science|pages=352–353|isbn=978-0-07-338309-5}}</ref> |
||
** A single vertex. |
** A single vertex (a single node as the root node). |
||
** A tree whose root node has two subtrees, both of which are full binary trees. |
** A tree whose root node has two subtrees, both of which are full binary trees. |
||
* |
* A '''{{visible anchor|perfect}}''' binary tree is a binary tree in which all interior nodes have two children ''and'' all leaves have the same ''depth'' or same ''level'' (the level of a node defined as the number of edges or links from the root node to a node).<ref>{{cite web| url=https://xlinux.nist.gov/dads/HTML/perfectBinaryTree.html|title=perfect binary tree | publisher = [[NIST]]}}</ref> A perfect binary tree is a full binary tree. |
||
* A '''{{visible anchor|complete}}''' binary tree is a binary tree in which every level, ''except possibly the last'', is completely filled, and all nodes in the last level are as far left as possible. It can have between 1 and 2<sup>''h''</sup> nodes at the last level ''h''.<ref name="complete binary tree">{{cite web| url=https://xlinux.nist.gov/dads/HTML/completeBinaryTree.html|title=complete binary tree| publisher = NIST}}</ref> A perfect tree is therefore always complete but a complete tree is not always perfect. Some authors use the term '''complete''' to refer instead to a '''perfect''' binary tree as defined above, in which case they call this type of tree (with a possibly not filled last level) an '''almost complete''' binary tree or '''nearly complete''' binary tree.<ref name="almost complete binary tree">{{cite web| url=http://faculty.cs.niu.edu/~mcmahon/CS241/Notes/bintree.html|title=almost complete binary tree|access-date=2015-12-11|archive-url=https://web.archive.org/web/20160304081430/http://faculty.cs.niu.edu/~mcmahon/CS241/Notes/bintree.html|archive-date=2016-03-04|url-status=dead}}</ref><ref name="nearly complete binary tree">{{cite web |url=http://homepages.math.uic.edu/~leon/cs-mcs401-s08/handouts/nearly_complete.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://homepages.math.uic.edu/~leon/cs-mcs401-s08/handouts/nearly_complete.pdf |archive-date=2022-10-09 |url-status=live| title=nearly complete binary tree}}</ref> A complete binary tree can be efficiently represented using an array.<ref name="complete binary tree"/> |
|||
[[File:Complete binary2.svg|alt=|thumb|A complete binary tree (that is not full)]] |
[[File:Complete binary2.svg|alt=|thumb|A complete binary tree (that is not full)]] |
||
* The '''infinite complete''' binary tree is a tree with <math>{\aleph_0}</math> levels, where for each level ''d'' the number of existing nodes at level d is equal to 2<sup>''d''</sup>. The cardinal number of the set of all levels is <math>{\aleph_0}</math> (countably infinite). The cardinal number of the set of all paths (the "leaves", so to speak) is uncountable, having the [[cardinality of the continuum]]. |
|||
* A '''{{visible anchor|perfect}}''' binary tree is a binary tree in which all interior nodes have two children ''and'' all leaves have the same ''depth'' or same ''level''.<ref>{{cite web| url=https://xlinux.nist.gov/dads/HTML/perfectBinaryTree.html|title=perfect binary tree | publisher = [[NIST]]}}</ref> An example of a perfect binary tree is the (non-incestuous) [[ancestry chart]] of a person to a given depth, as each person has exactly two biological parents (one mother and one father). Provided the ancestry chart always displays the mother and the father on the same side for a given node, their sex can be seen as an analogy of left and right children, ''children'' being understood here as an algorithmic term. A perfect tree is therefore always complete but a complete tree is not necessarily perfect. |
|||
⚫ | * A '''balanced''' binary tree is a binary tree structure in which the left and right subtrees of every node differ in height (the number of edges from the top-most node to the farthest node in a subtree) by no more than 1 (or the skew is no greater than 1).<ref>Aaron M. Tenenbaum, et al. Data Structures Using C, Prentice Hall, 1990 {{ISBN|0-13-199746-7}}</ref> One may also consider binary trees where no leaf is much farther away from the root than any other leaf. (Different balancing schemes allow different definitions of "much farther".<ref>Paul E. Black (ed.), entry for ''data structure'' in ''[[Dictionary of Algorithms and Data Structures]]''. U.S. [[National Institute of Standards and Technology]]. 15 December 2004. [http://xw2k.nist.gov/dads//HTML/balancedtree.html Online version] {{webarchive |url=https://web.archive.org/web/20101221085950/http://xw2k.nist.gov/dads/ |date=December 21, 2010 }} Accessed 2010-12-19.</ref>) |
||
* In the '''infinite complete''' binary tree, every node has two children (and so the set of levels is [[countably infinite]]). The set of all nodes is countably infinite, but the set of all infinite paths from the root is uncountable, having the [[cardinality of the continuum]]. This is because these paths correspond by an order-preserving [[bijection]] to the points of the [[Cantor set]], or (using the example of a [[Stern–Brocot tree]]) to the set of positive [[irrational number]]s. |
|||
⚫ | * A '''degenerate''' (or '''pathological''') tree is where each parent node has only one associated child node.<ref>{{Cite web|url=https://towardsdatascience.com/5-types-of-binary-tree-with-cool-illustrations-9b335c430254|title=Different Types of Binary Tree with colourful illustrations|last=Parmar|first=Anand K.|date=2020-01-22 |website=Medium|language=en|access-date=2020-01-24}}</ref> This means that the tree will behave like a [[linked list]] data structure. In this case, the advantage of using a binary tree is largely lost: searching takes O(''n'') [[time complexity|time]] (''n'' being the number of nodes), as in a linked list, rather than the O(log<sub>2</sub>''n'') normally expected of a balanced binary tree, and the tree uses more space than a linked list because each node holds two pointers. |
||
⚫ | * A '''balanced''' binary tree is a binary tree structure in which the left and right subtrees of every node differ in height by no more than 1.<ref>Aaron M. Tenenbaum, et al. Data Structures Using C, Prentice Hall, 1990 {{ISBN|0-13-199746-7}}</ref> One may also consider binary trees where no leaf is much farther away from the root than any other leaf. (Different balancing schemes allow different definitions of "much farther".<ref>Paul E. Black (ed.), entry for ''data structure'' in ''[[Dictionary of Algorithms and Data Structures]]. U.S. [[National Institute of Standards and Technology]]. 15 December 2004. [http://xw2k.nist.gov/dads//HTML/balancedtree.html Online version] {{webarchive |url=https://web.archive.org/web/20101221085950/http://xw2k.nist.gov/dads/ |date=December 21, 2010 }} Accessed 2010-12-19.</ref>) |
||
⚫ | * A '''degenerate''' (or '''pathological''') tree is where each parent node has only one associated child node.<ref>{{Cite web|url=https://towardsdatascience.com/5-types-of-binary-tree-with-cool-illustrations-9b335c430254|title=Different Types of Binary Tree with colourful illustrations|last=Parmar|first=Anand K.|date=2020-01-22 |website=Medium|language=en|access-date=2020-01-24}}</ref> This means that the tree will behave like a [[linked list]] data structure. |
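The definitions in the list above can be expressed as predicates on a linked representation. The following Python sketch is illustrative only: the <code>Node</code> class and all function names are assumptions made for this example, height is counted in edges, and the empty tree is given height −1 by convention.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    left: "Optional[Node]" = None
    right: "Optional[Node]" = None


def height(t):
    """Height in edges; the empty tree has height -1 by convention."""
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


def count(t):
    """Number of nodes in the tree."""
    return 0 if t is None else 1 + count(t.left) + count(t.right)


def is_full(t):
    """Every node has either 0 or 2 children."""
    if t is None:
        return True
    if (t.left is None) != (t.right is None):
        return False
    return is_full(t.left) and is_full(t.right)


def is_perfect(t):
    """A tree of height h is perfect exactly when it has 2^(h+1) - 1 nodes."""
    return count(t) == 2 ** (height(t) + 1) - 1


def is_complete(t):
    """In level order, no node may appear after a missing child."""
    queue = deque([t])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        elif seen_gap:
            return False
        else:
            queue.append(node.left)
            queue.append(node.right)
    return True


def is_balanced(t):
    """Subtree heights differ by at most 1 at every node."""
    def check(node):
        # Returns the height, or None if an unbalanced node was found.
        if node is None:
            return -1
        lh, rh = check(node.left), check(node.right)
        if lh is None or rh is None or abs(lh - rh) > 1:
            return None
        return 1 + max(lh, rh)
    return check(t) is not None


def is_degenerate(t):
    """Every parent has exactly one child."""
    while t is not None:
        if t.left is not None and t.right is not None:
            return False
        t = t.left if t.left is not None else t.right
    return True


perfect = Node(Node(), Node())
complete_not_full = Node(Node(Node()), Node())
chain = Node(Node(None, Node()))
```

Here <code>complete_not_full</code> is complete and balanced but neither full nor perfect, while <code>chain</code> is degenerate and unbalanced.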
||
== Properties of binary trees == |
== Properties of binary trees == |
||
* The number of nodes <math>n</math> in a full binary tree |
* The number of nodes <math>n</math> in a '''full''' binary tree is at least <math>2h+1</math> and at most <math>2^{h+1}-1 </math> (i.e., the number of nodes in a '''perfect''' binary tree), where <math>h</math> is the [[Glossary of graph theory terms#H|height]] of the tree. A tree consisting of only a root node has a height of 0. The minimum is attained when each additional level of height adds only two child nodes, giving <math>2h</math> nodes plus 1 for the root. The maximum is attained by completely filling every level, i.e., by a perfect tree, for which the number of nodes is <math>1 + 2 + 4 + \ldots + 2^h = 2^{h + 1} - 1</math>, where the last equality is the [[geometric series]] sum. |
||
* The number of leaf nodes <math>l</math> in a perfect binary tree |
* The number of leaf nodes <math>l</math> in a '''perfect''' binary tree is <math>l = (n + 1) / 2</math> (where <math>n</math> is the number of nodes in the tree) because <math>n={{2}^{h+1}}-1</math> (by using the above property) and the number of leaves is <math>2^h</math> so <math>n=2\cdot {{2}^{h}}-1=2l-1\to l=\left( n+1 \right)/2</math>. It also means that <math>n = 2l - 1</math>. In terms of the tree height <math>h </math>, <math>l = (2^{h+1}-1 + 1) / 2 = 2^h</math>. |
||
* For any non-empty binary tree with <math>l </math> leaf nodes and <math>i_2 </math> nodes of degree 2 (internal nodes with two child nodes), <math>l = i_2 + 1 </math>.<ref>{{cite book |last=Mehta |first=Dinesh |title=Handbook of Data Structures and Applications |author2=Sartaj Sahni |publisher=[[Chapman and Hall]] |year=2004 |isbn=1-58488-435-5 |author-link2=Sartaj Sahni}}</ref> The proof is as follows. For a perfect binary tree, the total number of nodes is <math>n = 2^{h+1}-1 </math> and <math>l = 2^h</math>, and every internal node has two children, so <math>i_2 = n - l = (2^{h+1}-1) - 2^h = 2^h - 1 = l - 1 \to l = i_2 + 1 </math>. Any full binary tree can be obtained from a perfect binary tree by repeatedly removing a pair of sibling leaf nodes. Each such step removes two leaf nodes and turns their parent from an internal node into a leaf node, so the net effect is one fewer leaf node and one fewer internal node, and <math>l = i_2 + 1</math> still holds. To obtain a binary tree containing a node with a single child, a single leaf node is removed instead. If its parent had two children, one leaf node is lost and the parent stops being a node of degree 2; if its parent had only that child, the parent becomes a leaf node and both counts are unchanged. In either case <math>l = i_2 + 1</math> continues to hold, so the relation covers all non-empty binary trees. |
|||
* This means that a full binary tree with <math>l</math> leaves has <math>n = 2l - 1</math> nodes. |
|||
* For a given number of nodes <math>n </math>, the tree height satisfies <math>h \geq \log _2 (n+1)-1 </math>, and the minimum <math>h_{min} = \log _2 (n+1)-1 </math> is attained exactly when the tree is perfect. This is because, for a given height <math>h</math>, the number of nodes cannot exceed <math>2^{h+1}-1 </math>, the number of nodes in a perfect tree. Thus <math>n \leq 2^{h + 1} - 1 \to h \geq \log _2 (n + 1) - 1</math>. |
|||
* In a '''balanced''' full binary tree, <math>h = \lceil \log_2(l)\rceil + 1 = \lceil \log_2((n + 1) / 2)\rceil + 1 = \lceil \log_2(n + 1)\rceil</math> (see [[Floor and ceiling functions|ceiling function]]).{{citation needed|date=September 2018}} |
|||
* A binary tree with <math>l</math> leaves has height at least <math>h_m = \log _ 2 (l)</math>. For a given height <math>h</math>, the number of leaves cannot exceed <math>2^h</math>, the number of leaves of a perfect tree of that height. Thus <math>l \leq 2^h \to h \geq \log _ 2 (l)</math>. |
|||
* In a '''perfect''' full binary tree, <math>l = 2^{h}</math> thus <math>n = 2^{h+1} - 1</math>. |
|||
* In a non-empty binary tree, if <math>n</math> is the total number of nodes and <math>e</math> is the total number of edges, then <math>e = n - 1</math>. This holds because every node except the root has exactly one edge to its parent. |
|||
* The number of null links (i.e., absent children of the nodes) in a binary tree of ''n'' nodes is (''n''+1). |
* The number of null links (i.e., absent children of the nodes) in a binary tree of ''n'' nodes is (''n'' + 1). |
||
* The number of internal nodes in a '''complete''' binary tree of ''n'' nodes is <math>\lfloor n/2\rfloor </math>. |
* The number of internal nodes in a '''complete''' binary tree of ''n'' nodes is <math>\lfloor n/2\rfloor </math>. |
||
* For any non-empty binary tree with ''n''<sub>0</sub> leaf nodes and ''n''<sub>2</sub> nodes of degree 2, ''n''<sub>0</sub> = ''n''<sub>2</sub> + 1.<ref>{{cite book | last=Mehta | first=Dinesh |author2=Sartaj Sahni | title=Handbook of Data Structures and Applications |publisher=[[Chapman and Hall]]|isbn = 1-58488-435-5|year=2004 | author-link2=Sartaj Sahni }}</ref> |
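Several of the counting identities in the list above can be spot-checked numerically. The Python sketch below is an illustration rather than a proof: <code>random_tree</code> and <code>stats</code> are hypothetical helpers written for this example, and the random seed is arbitrary.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    left: "Optional[Node]" = None
    right: "Optional[Node]" = None


def random_tree(n, rng):
    """Build a random binary tree with exactly n nodes (None if n == 0)."""
    if n == 0:
        return None
    k = rng.randrange(n)  # number of nodes placed in the left subtree
    return Node(random_tree(k, rng), random_tree(n - 1 - k, rng))


def stats(t):
    """Return (nodes, leaves, degree-2 nodes, edges, null links, height)."""
    if t is None:
        # An absent child counts as one null link; height -1 by convention.
        return (0, 0, 0, 0, 1, -1)
    ln, ll, l2, le, lz, lh = stats(t.left)
    rn, rl, r2, re, rz, rh = stats(t.right)
    children = (t.left is not None) + (t.right is not None)
    return (
        1 + ln + rn,
        ll + rl + (children == 0),
        l2 + r2 + (children == 2),
        le + re + children,
        lz + rz,
        1 + max(lh, rh),
    )


rng = random.Random(0)  # fixed seed, chosen arbitrarily
for n in range(1, 30):
    nodes, leaves, deg2, edges, nulls, h = stats(random_tree(n, rng))
    assert nodes == n
    assert leaves == deg2 + 1       # l = i_2 + 1
    assert edges == n - 1           # e = n - 1
    assert nulls == n + 1           # null links = n + 1
    assert n <= 2 ** (h + 1) - 1    # node bound for height h
```

The loop passes silently when every identity holds for the sampled trees.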
|||
== Combinatorics == |
== Combinatorics == |
||
{{unreferenced section|date=July 2014}} |
{{unreferenced section|date=July 2014}} |
||
In [[combinatorics]] one considers the problem of counting the number of full binary trees of a given size. Here the trees have no values attached to their nodes (this would just multiply the number of possible trees by an easily determined factor), and trees are distinguished only by their structure; however, the left and right child of any node are distinguished (if they are different trees, then interchanging them will produce a tree distinct from the original one). The size of the tree is taken to be the number ''n'' of internal nodes (those with two children); the other nodes are leaf nodes and there are {{ |
In [[combinatorics]], one considers the problem of counting the number of full binary trees of a given size. Here the trees have no values attached to their nodes (this would just multiply the number of possible trees by an easily determined factor), and trees are distinguished only by their structure; however, the left and right child of any node are distinguished (if they are different trees, then interchanging them will produce a tree distinct from the original one). The size of the tree is taken to be the number ''n'' of internal nodes (those with two children); the other nodes are leaf nodes and there are {{math|''n'' + 1}} of them. The number of such binary trees of size ''n'' is equal to the number of ways of fully parenthesizing a string of {{math|''n'' + 1}} symbols (representing leaves) separated by ''n'' binary operators (representing internal nodes), to determine the argument subexpressions of each operator. For instance for {{math|''n'' {{=}} 3}} one has to parenthesize a string like {{tmath|X*X*X*X}}, which is possible in five ways: |
||
: <math>((X*X)*X)*X,\qquad (X*(X*X))*X,\qquad (X*X)*(X*X),\qquad X*((X*X)*X),\qquad X*(X*(X*X)).</math> |
: <math display=block>((X*X)*X)*X,\qquad (X*(X*X))*X,\qquad (X*X)*(X*X),\qquad X*((X*X)*X),\qquad X*(X*(X*X)).</math> |
||
The correspondence to binary trees should be obvious, and the addition of redundant parentheses (around an already parenthesized expression or around the full expression) is disallowed (or at least not counted as producing a new possibility). |
The correspondence to binary trees should be obvious, and the addition of redundant parentheses (around an already parenthesized expression or around the full expression) is disallowed (or at least not counted as producing a new possibility). |
||
There is a unique binary tree of size 0 (consisting of a single leaf), and any other binary tree is characterized by the pair of its left and right children; if these have sizes ''i'' and ''j'' respectively, the full tree has size {{ |
There is a unique binary tree of size 0 (consisting of a single leaf), and any other binary tree is characterized by the pair of its left and right children; if these have sizes ''i'' and ''j'' respectively, the full tree has size {{math|''i'' + ''j'' + 1}}. Therefore, the number <math>C_n</math> of binary trees of size ''n'' has the following recursive description <math>C_0=1</math>, and <math>\textstyle C_n=\sum_{i=0}^{n-1}C_iC_{n-1-i}</math> for any positive integer ''n''. It follows that <math>C_n</math> is the [[Catalan number]] of index ''n''. |
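The recurrence above translates directly into code. The following Python sketch is a minimal illustration (the function names are assumptions made for this example): it computes <math>C_n</math> from the recursive description and enumerates the corresponding parenthesizations, omitting the redundant outermost pair of parentheses.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def catalan(n: int) -> int:
    """C_0 = 1 and C_n = sum over i of C_i * C_(n-1-i)."""
    if n == 0:
        return 1
    return sum(catalan(i) * catalan(n - 1 - i) for i in range(n))


def _terms(n: int) -> list:
    """Fully parenthesized terms with n operators, outer parentheses included."""
    if n == 0:
        return ["X"]
    return [
        f"({left}*{right})"
        for i in range(n)  # i operators on the left, n - 1 - i on the right
        for left in _terms(i)
        for right in _terms(n - 1 - i)
    ]


def parenthesizations(n: int) -> list:
    """Parenthesizations of n + 1 symbols without the outermost pair."""
    return [term[1:-1] if n > 0 else term for term in _terms(n)]
```

For ''n'' = 3 the enumeration yields the same five strings listed earlier, in a different order.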
||
The above parenthesized strings should not be confused with the set of words of length 2''n'' in the [[Dyck language]], which consist only of parentheses in such a way that they are properly balanced. The number of such strings satisfies the same recursive description (each Dyck word of length 2''n'' is determined by the Dyck subword enclosed by the initial '(' and its matching ')' together with the Dyck subword remaining after that closing parenthesis, whose lengths 2''i'' and 2''j'' satisfy {{ |
The above parenthesized strings should not be confused with the set of words of length 2''n'' in the [[Dyck language]], which consist only of parentheses in such a way that they are properly balanced. The number of such strings satisfies the same recursive description (each Dyck word of length 2''n'' is determined by the Dyck subword enclosed by the initial '(' and its matching ')' together with the Dyck subword remaining after that closing parenthesis, whose lengths 2''i'' and 2''j'' satisfy {{math|''i'' + ''j'' + 1 {{=}} ''n''}}); this number is therefore also the Catalan number <math>C_n</math>. So there are also five Dyck words of length 6: |
||
: |
: {{math|()()(),{{quad}} ()(()),{{quad}} (())(),{{quad}} (()()),{{quad}} ((()))}} |
||
These Dyck words do not correspond to binary trees in the same way. Instead, they are related by the following recursively defined bijection: the Dyck word equal to the empty string corresponds to the binary tree of size 0 with only one leaf. Any other Dyck word can be written as (<math>w_1</math>)<math>w_2</math>, where <math>w_1</math>,<math>w_2</math> are themselves (possibly empty) Dyck words and where the two written parentheses are matched. The bijection is then defined by letting the words <math>w_1</math> and <math>w_2</math> correspond to the binary trees that are the left and right children of the root. |
These Dyck words do not correspond to binary trees in the same way. Instead, they are related by the following recursively defined bijection: the Dyck word equal to the empty string corresponds to the binary tree of size 0 with only one leaf. Any other Dyck word can be written as (<math>w_1</math>)<math>w_2</math>, where <math>w_1</math>,<math>w_2</math> are themselves (possibly empty) Dyck words and where the two written parentheses are matched. The bijection is then defined by letting the words <math>w_1</math> and <math>w_2</math> correspond to the binary trees that are the left and right children of the root. |
Line 83: | Line 82: |
This method of storing binary trees wastes a fair bit of memory, as the pointers will be null (or point to the sentinel) more than half the time; a more conservative representation alternative is [[threaded binary tree]].<ref name="Samanta2004">{{cite book|author=D. Samanta|title=Classic Data Structures|year=2004|publisher=PHI Learning Pvt. Ltd.|isbn=978-81-203-1874-8|pages=264–265}}</ref> |
In languages with [[tagged union]]s such as [[ML (programming language)|ML]], a tree node is often a tagged union of two types of nodes, one of which is a 3-tuple of data, left child, and right child, and the other of which is a "leaf" node, which contains no data and functions much like the null value in a language with pointers. For example, the following line of code in [[OCaml]] (an ML dialect) defines a binary tree that stores a character in each node.<ref name="Scott2009">{{cite book|author=Michael L. Scott| title=Programming Language Pragmatics |year=2009| publisher=Morgan Kaufmann|isbn=978-0-08-092299-7|page=347| edition=3rd}}</ref> |
<!-- the source gives the example in Standard ML, which has "datatype" instead of "type", but wikipedia's source tag doesn't support Standard ML. --> |
Line 92: | Line 91: |
=== Arrays === |
Binary trees can also be stored in breadth-first order as an [[implicit data structure]] in [[array data structure|arrays]], and if the tree is a complete binary tree, this method wastes no space. In this compact arrangement, if a node has an index ''i'', its children are found at indices <math>2i + 1</math> (for the left child) and <math>2i +2</math> (for the right), while its parent (if any) is found at index ''<math>\left \lfloor \frac{i-1}{2} \right \rfloor</math>'' (assuming the root has index zero). Alternatively, with a 1-indexed array, the implementation is simplified with children found at <math>2i</math> and <math>2i+1</math>, and parent found at <math>\lfloor i/2 \rfloor</math>.<ref>{{Cite book| title=Introduction to algorithms| date=2001|publisher=MIT Press|others=Cormen, Thomas H., Cormen, Thomas H.|isbn=0-262-03293-7|edition=2nd|location=Cambridge, Mass.| pages=128| oclc=46792720}}</ref> |
This method benefits from more compact storage and better [[locality of reference]], particularly during a preorder traversal. It is often used for [[binary heap]]s.<ref>{{Cite web |last=Laakso |first=Mikko |title=Priority Queue and Binary Heap |url=http://www.cse.hut.fi/en/research/SVG/TRAKLA2/tutorials/heap_tutorial/taulukkona.html |access-date=2023-10-11 |website=University of Aalto}}</ref> |
[[Image:Binary tree in array.svg|upright=1.2|center|A small complete binary tree stored in an array]] |
== Encodings == |
Line 103: | Line 102: |
A [[succinct data structure]] is one which occupies close to minimum possible space, as established by [[information theory|information theoretical]] lower bounds. The number of different binary trees on <math>n</math> nodes is <math>\mathrm{C}_{n}</math>, the <math>n</math>th [[Catalan number]] (assuming we view trees with identical ''structure'' as identical). For large <math>n</math>, this is about <math>4^{n}</math>; thus we need at least about <math>\log_{2}4^{n} = 2n</math> bits to encode it. A succinct binary tree therefore would occupy <math>2n+o(n)</math> bits. |
One simple representation which meets this bound is to visit the nodes of the tree in preorder, outputting "1" for an internal node and "0" for a leaf.<ref>{{cite web |last1=Demaine |first1=Erik |title=6.897: Advanced Data Structures Spring 2003 Lecture 12 |url=http://theory.csail.mit.edu/classes/6.897/spring03/scribe_notes/L12/lecture12.pdf |publisher=MIT CSAIL |access-date=14 April 2022 |archive-url=https://web.archive.org/web/20051124175104/http://theory.csail.mit.edu/classes/6.897/spring03/scribe_notes/L12/lecture12.pdf |archive-date=24 November 2005 |url-status=dead}}</ref> If the tree contains data, we can simply simultaneously store it in a consecutive array in preorder. This function accomplishes this: |
'''function''' EncodeSuccinct(''node'' n, ''bitstring'' structure, ''array'' data) { |
Line 131: | Line 130: |
More sophisticated succinct representations allow not only compact storage of trees but even useful operations on those trees directly while they're still in their succinct form. |
=== Encoding ordered trees as binary trees === |
There is a natural one-to-one correspondence between ordered trees and binary trees. It allows any ordered tree to be uniquely represented as a binary tree, and vice versa. This correspondence is used in particular by [[Lisp (programming language)|Lisp]] to represent general ordered trees as binary trees.

Let ''T'' be a node of an ordered tree, and let ''B'' denote ''T's'' image in the corresponding binary tree. Then ''B's'' ''left'' child represents ''T's'' first child, while ''B's'' ''right'' child represents ''T's'' next sibling, that is, the next node in order among the children of the parent of ''T''.

One way of thinking about this is that each node's children are in a [[linked list]], chained together with their ''right'' fields, and the node only has a pointer to the beginning or head of this list, through its ''left'' field.

For example, the ordered tree on the left and the binary tree on the right correspond:

In the pictured binary tree, the black, left, edges represent ''first child'', while the blue, right, edges represent ''next sibling''.

This representation is called a [[left-child right-sibling binary tree]] (also known as an LCRS tree, doubly chained tree, or filial-heir chain).
== Common operations == |
Line 176: | Line 171: |
{{Main|Tree traversal}} |
Pre-order, in-order, and post-order traversal visit each node in a tree by recursively visiting each node in the left and right subtrees of the root. Brief descriptions of these traversals follow.
==== Pre-order ====
In pre-order, we always visit the current node first; next, we recursively traverse the current node's left subtree, and then we recursively traverse the current node's right subtree. The pre-order traversal is a [[Topological sorting|topologically sorted]] one, because a parent node is processed before any of its child nodes.
==== In-order ====
In in-order, we always recursively traverse the current node's left subtree first; next, we visit the current node, and lastly, we recursively traverse the current node's right subtree.
==== Post-order ====
In post-order, we always recursively traverse the current node's left subtree first; next, we recursively traverse the current node's right subtree, and then we visit the current node. Post-order traversal can be useful to obtain the postfix expression of a [[binary expression tree]].<ref>{{Cite web |date=2015-02-13 |first=Todd |last=Wittman |title=Lecture 18: Tree Traversals |url=http://www.math.ucla.edu/~wittman/10b.1.10w/Lectures/Lec18.pdf |access-date=2023-04-29 |archive-url=https://web.archive.org/web/20150213195803/http://www.math.ucla.edu/~wittman/10b.1.10w/Lectures/Lec18.pdf |archive-date=2015-02-13 }}</ref>
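The three traversals described above can be sketched in Python as follows. This is only an illustration: the `Node` class and the function names are assumptions made for this example, not part of the article, and each function returns the visited values as a list rather than calling a visitor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type: a value plus optional left and right children.
    data: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def pre_order(node):
    # Visit the current node, then the left subtree, then the right subtree.
    if node is None:
        return []
    return [node.data] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    # Traverse the left subtree, visit the current node, then the right subtree.
    if node is None:
        return []
    return in_order(node.left) + [node.data] + in_order(node.right)


def post_order(node):
    # Traverse the left subtree, then the right subtree, then visit the node.
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.data]


# Expression tree for (1 + 2) * 3.
expr = Node("*", Node("+", Node("1"), Node("2")), Node("3"))
print(" ".join(pre_order(expr)))   # * + 1 2 3
print(" ".join(in_order(expr)))    # 1 + 2 * 3
print(" ".join(post_order(expr)))  # 1 2 + 3 *
```

On this expression tree the post-order output is the postfix form of the expression, which matches the use mentioned above.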
==== Depth-first order ==== |
Line 184: | Line 188: |
Contrasting with depth-first order is breadth-first order, also called ''level-order traversal'', which always attempts to visit the node closest to the root that it has not already visited. See [[breadth-first search]] for more information.
In a complete binary tree, a node's breadth-index (''i'' − (2<sup>''d''</sup> − 1)) can be used as traversal instructions from the root. Here ''i'' is the node's index in breadth-first order with the root at index 0, ''d'' = ⌊log{{sub|2}}(''i''+1)⌋ is the node's distance from the root, and the node in question is not the root itself (''d'' > 0). The bits of the breadth-index are read from left to right, starting at bit ''d'' − 1: the bit values {{mono|0}} and {{mono|1}} mean to step left or right, respectively. The process continues by successively checking the next bit to the right until there are no more. The rightmost bit indicates the final step from the desired node's parent to the node itself. There is a time-space trade-off between iterating a complete binary tree this way versus each node having pointer(s) to its sibling(s).
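As a rough sketch of the bit-reading procedure, assuming 0-based breadth-first indices with the root at index 0, the following Python function derives the left/right steps from the root to node `i`. The function name `path_from_root` and the `"L"`/`"R"` labels are choices made for this example only.

```python
def path_from_root(i):
    """Return the list of "L"/"R" steps leading from the root (index 0) to node i."""
    d = (i + 1).bit_length() - 1           # d = floor(log2(i + 1)), the node's depth
    breadth_index = i - ((1 << d) - 1)     # position of the node within its level
    steps = []
    for bit in range(d - 1, -1, -1):       # read from bit d - 1 down to bit 0
        steps.append("R" if (breadth_index >> bit) & 1 else "L")
    return steps


print(path_from_root(4))  # ['L', 'R']
print(path_from_root(0))  # []
```

For node 4 the depth is 2 and the breadth-index is 1 (binary 01), so the path is left to node 1 and then right to node 4, in agreement with the child indices 2''i'' + 1 and 2''i'' + 2 of the array layout.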
== See also ==
{{Columns-list|colwidth=30em|
* [[2–3 tree]]
* [[2–3–4 tree]]
Line 219: | Line 223: |
== External links ==
{{Commons category|Binary trees}}
* [http://www.findstat.org/BinaryTrees binary trees] {{Webarchive|url=https://web.archive.org/web/20200923083018/http://www.findstat.org/BinaryTrees |date=2020-09-23 }} entry in the [http://www.findstat.org/ FindStat] database
* [http://www.brpreiss.com/books/opus4/html/page355.html Binary Tree Proof by Induction]
* [http://piergiu.wordpress.com/2010/02/21/balanced-binary-search-tree-on-array/ Balanced binary search tree on array How to create bottom-up an Ahnentafel list, or a balanced binary search tree on array]
* [https://web.archive.org/web/20190407044321/http://www.cpphub.com/search/label/Binary%20trees Binary trees and Implementation of the same with working code examples]
* [https://lufemas.github.io/binary-tree-jr/ Binary Tree JavaScript Implementation with source code]
{{CS-Trees}}
{{Authority control}}
{{DEFAULTSORT:Binary Tree}}
Latest revision as of 10:10, 12 December 2024
In computer science, a binary tree is a tree data structure in which each node has at most two children, referred to as the left child and the right child. That is, it is a k-ary tree with k = 2. A recursive definition using set theory is that a binary tree is a tuple (L, S, R), where L and R are binary trees or the empty set and S is a singleton set containing the root.[1][2]
From a graph theory perspective, binary trees as defined here are arborescences.[3] A binary tree may thus be also called a bifurcating arborescence,[3] a term which appears in some early programming books[4] before the modern computer science terminology prevailed. It is also possible to interpret a binary tree as an undirected, rather than directed graph, in which case a binary tree is an ordered, rooted tree.[5] Some authors use rooted binary tree instead of binary tree to emphasize the fact that the tree is rooted, but as defined above, a binary tree is always rooted.[6]
In mathematics, what is termed binary tree can vary significantly from author to author. Some use the definition commonly used in computer science,[7] but others define it as every non-leaf having exactly two children and don't necessarily label the children as left and right either.[8]
In computing, binary trees can be used in two very different ways:
- First, as a means of accessing nodes based on some value or label associated with each node.[9] Binary trees labelled this way are used to implement binary search trees and binary heaps, and are used for efficient searching and sorting. The designation of non-root nodes as left or right child even when there is only one child present matters in some of these applications, in particular, it is significant in binary search trees.[10] However, the arrangement of particular nodes into the tree is not part of the conceptual information. For example, in a normal binary search tree the placement of nodes depends almost entirely on the order in which they were added, and can be re-arranged (for example by balancing) without changing the meaning.
- Second, as a representation of data with a relevant bifurcating structure. In such cases, the particular arrangement of nodes under and/or to the left or right of other nodes is part of the information (that is, changing it would change the meaning). Common examples occur with Huffman coding and cladograms. The everyday division of documents into chapters, sections, paragraphs, and so on is an analogous example with n-ary rather than binary trees.
Definitions
Recursive definition
To define a binary tree, the possibility that only one of the children may be empty must be acknowledged. An artifact, which in some textbooks is called an extended binary tree, is needed for that purpose. An extended binary tree is thus recursively defined as:[11]
- the empty set is an extended binary tree
- if T1 and T2 are extended binary trees, then denote by T1 • T2 the extended binary tree obtained by adding a root r whose left subtree is T1 and whose right subtree is T2, with an edge added from r to each of these sub-trees that is non-empty.
Another way of imagining this construction (and understanding the terminology) is to consider instead of the empty set a different type of node—for instance square nodes if the regular ones are circles.[12]
Using graph theory concepts
A binary tree is a rooted tree that is also an ordered tree (a.k.a. plane tree) in which every node has at most two children. A rooted tree naturally imparts a notion of levels (distance from the root); thus, for every node, a notion of children may be defined as the nodes connected to it a level below. Ordering of these children (e.g., by drawing them on a plane) makes it possible to distinguish a left child from a right child.[13] But this still does not distinguish a node with a left but no right child from a node with a right but no left child.
The necessary distinction can be made by first partitioning the edges; i.e., defining the binary tree as triplet (V, E1, E2), where (V, E1 ∪ E2) is a rooted tree (equivalently arborescence) and E1 ∩ E2 is empty, and also requiring that for all j ∈ { 1, 2 }, every node has at most one Ej child.[14] A more informal way of making the distinction is to say, quoting the Encyclopedia of Mathematics, that "every node has a left child, a right child, neither, or both" and to specify that these "are all different" binary trees.[7]
Types of binary trees
Tree terminology is not well standardized and so varies across the literature.
- A full binary tree (sometimes referred to as a proper,[15] plane, or strict binary tree)[16][17] is a tree in which every node has either 0 or 2 children. Another way of defining a full binary tree is a recursive definition. A full binary tree is either:[11]
- A single vertex (a single node as the root node).
- A tree whose root node has two subtrees, both of which are full binary trees.
- A perfect binary tree is a binary tree in which all interior nodes have two children and all leaves have the same depth or same level (the level of a node defined as the number of edges or links from the root node to a node).[18] A perfect binary tree is a full binary tree.
- A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible. It can have between 1 and <math>2^h</math> nodes at the last level <math>h</math>.[19] A perfect tree is therefore always complete but a complete tree is not always perfect. Some authors use the term complete to refer instead to a perfect binary tree as defined above, in which case they call this type of tree (with a possibly not filled last level) an almost complete binary tree or nearly complete binary tree.[20][21] A complete binary tree can be efficiently represented using an array.[19]
- The infinite complete binary tree is a tree with <math>\aleph_0</math> levels, where for each level d the number of existing nodes at level d is equal to <math>2^d</math>. The cardinal number of the set of all levels is <math>\aleph_0</math> (countably infinite). The cardinal number of the set of all paths (the "leaves", so to speak) is uncountable, having the cardinality of the continuum.
- A balanced binary tree is a binary tree structure in which the left and right subtrees of every node differ in height (the number of edges from the top-most node to the farthest node in a subtree) by no more than 1 (or the skew is no greater than 1).[22] One may also consider binary trees where no leaf is much farther away from the root than any other leaf. (Different balancing schemes allow different definitions of "much farther".[23])
- A degenerate (or pathological) tree is one in which each parent node has only one associated child node.[24] Such a tree behaves like a linked list data structure. The advantage of using a binary tree is then largely lost: searching takes <math>O(n)</math> time (where <math>n</math> is the number of nodes) rather than the <math>O(\log_2 n)</math> normally expected of a balanced binary tree, and the tree uses more space than a linked list because each node holds two pointers.
Properties of binary trees
- The number of nodes <math>n</math> in a full binary tree is at least <math>2h+1</math> and at most <math>2^{h+1}-1</math> (i.e., the number of nodes in a perfect binary tree), where <math>h</math> is the height of the tree. A tree consisting of only a root node has a height of 0. The least number of nodes is obtained by adding only two child nodes for each unit of height, giving <math>2h+1</math> (1 for counting the root node). The maximum number of nodes is obtained by fully filling nodes at each level, i.e., it is a perfect tree. For a perfect tree, the number of nodes is <math>1+2+4+\cdots+2^h = 2^{h+1}-1</math>, where the last equality is from the geometric series sum.
- The number of leaf nodes <math>l</math> in a perfect binary tree is <math>l = (n+1)/2</math> (where <math>n</math> is the number of nodes in the tree) because <math>n = 2^{h+1}-1</math> (by using the above property) and the number of leaves is <math>l = 2^h</math>, so <math>n = 2 \cdot 2^h - 1 = 2l - 1</math>. It also means that <math>n = 2l - 1</math>. In terms of the tree height <math>h</math>, <math>l = (n+1)/2 = 2^h</math>.
- For any non-empty binary tree with <math>l</math> leaf nodes and <math>i_2</math> nodes of degree 2 (internal nodes with two child nodes), <math>l = i_2 + 1</math>.[25] The proof is the following. For a perfect binary tree, the total number of nodes is <math>n = 2^{h+1}-1</math> (a perfect binary tree is a full binary tree) and <math>n = l + i_2</math>, so <math>i_2 = n - l = (2^{h+1}-1) - 2^h = 2^h - 1 = l - 1</math>. To make a full binary tree from a perfect binary tree, pairs of sibling leaf nodes are removed one pair at a time. Each such step removes two leaf nodes and turns their parent from an internal node into a leaf node, so the net effect is that one leaf node and one internal node are removed. As a result, <math>l = i_2 + 1</math> also holds for a full binary tree. To make a binary tree with a leaf node without its sibling, a single leaf node is removed from a full binary tree; this removes one leaf node and leaves one fewer internal node with two children, so <math>l = i_2 + 1</math> also holds. This relation now covers all non-empty binary trees.
- With given <math>n</math> nodes, the minimum possible tree height is <math>h = \log_2(n+1) - 1</math>, with which the tree is a balanced full tree or perfect tree. With a given height <math>h</math>, the number of nodes can't exceed <math>2^{h+1}-1</math>, the number of nodes in a perfect tree. Thus <math>n \le 2^{h+1}-1</math> and <math>h \ge \log_2(n+1) - 1</math>.
- A binary tree with <math>l</math> leaves has at least the height <math>h = \log_2(l)</math>. With a given height <math>h</math>, the number of leaves at that height can't exceed <math>2^h</math>, the number of leaves at that height in a perfect tree. Thus <math>l \le 2^h</math> and <math>h \ge \log_2(l)</math>.
- In a non-empty binary tree, if <math>n</math> is the total number of nodes and <math>e</math> is the total number of edges, then <math>e = n - 1</math>. This is obvious because each node requires one edge except for the root node.
- The number of null links (i.e., absent children of the nodes) in a binary tree of n nodes is (n + 1).
- The number of internal nodes in a complete binary tree of n nodes is <math>\lfloor n/2 \rfloor</math>.
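Some of the counting properties above can be checked on a small example. The sketch below is illustrative only: it assumes a tree is written as nested Python tuples, with `None` for an absent child and `(left, right)` for a node, and the helper name `tally` is made up for this example.

```python
def tally(tree):
    """Return (nodes, leaves, degree-2 nodes, null links) for a tuple-encoded tree."""
    if tree is None:
        return (0, 0, 0, 1)                # an absent child counts as one null link
    left, right = tree
    ln, ll, l2, lnull = tally(left)
    rn, rl, r2, rnull = tally(right)
    is_leaf = left is None and right is None
    has_two = left is not None and right is not None
    return (ln + rn + 1,
            ll + rl + int(is_leaf),
            l2 + r2 + int(has_two),
            lnull + rnull)


# Root with a leaf on the left and, on the right, a node that has only a left leaf.
example = ((None, None), ((None, None), None))
nodes, leaves, degree_two, null_links = tally(example)
print(nodes, leaves, degree_two, null_links)  # 4 2 1 5
```

Here the number of leaves (2) is one more than the number of degree-2 nodes (1), the number of null links (5) is n + 1, and the number of edges is n − 1 = 3.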
Combinatorics
In combinatorics, one considers the problem of counting the number of full binary trees of a given size. Here the trees have no values attached to their nodes (this would just multiply the number of possible trees by an easily determined factor), and trees are distinguished only by their structure; however, the left and right child of any node are distinguished (if they are different trees, then interchanging them will produce a tree distinct from the original one). The size of the tree is taken to be the number n of internal nodes (those with two children); the other nodes are leaf nodes and there are n + 1 of them. The number of such binary trees of size n is equal to the number of ways of fully parenthesizing a string of n + 1 symbols (representing leaves) separated by n binary operators (representing internal nodes), to determine the argument subexpressions of each operator. For instance for n = 3 one has to parenthesize a string like <math>X*X*X*X</math>, which is possible in five ways:
- ((X*X)*X)*X, (X*(X*X))*X, (X*X)*(X*X), X*((X*X)*X), X*(X*(X*X))
The correspondence to binary trees should be obvious, and the addition of redundant parentheses (around an already parenthesized expression or around the full expression) is disallowed (or at least not counted as producing a new possibility).
There is a unique binary tree of size 0 (consisting of a single leaf), and any other binary tree is characterized by the pair of its left and right children; if these have sizes i and j respectively, the full tree has size i + j + 1. Therefore, the number <math>C_n</math> of binary trees of size n has the following recursive description: <math>C_0=1</math>, and <math>\textstyle C_n=\sum_{i=0}^{n-1}C_iC_{n-1-i}</math> for any positive integer n. It follows that <math>C_n</math> is the Catalan number of index n.
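The recursive description of the Catalan numbers, C_0 = 1 and C_n equal to the sum of C_i · C_{n−1−i} for i from 0 to n − 1, translates directly into a short program. The following Python sketch is only an illustration; the function name `catalan_numbers` is an assumption of this example.

```python
def catalan_numbers(count):
    """Return [C_0, ..., C_{count-1}] computed from the recurrence."""
    c = []
    for n in range(count):
        if n == 0:
            c.append(1)                                        # C_0 = 1
        else:
            # Split the n - 1 remaining internal nodes between left (i) and right (n - 1 - i).
            c.append(sum(c[i] * c[n - 1 - i] for i in range(n)))
    return c


print(catalan_numbers(6))  # [1, 1, 2, 5, 14, 42]
```

The value C_3 = 5 agrees with the five ways of parenthesizing four symbols and the five Dyck words of length 6.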
The above parenthesized strings should not be confused with the set of words of length 2n in the Dyck language, which consist only of parentheses in such a way that they are properly balanced. The number of such strings satisfies the same recursive description (each Dyck word of length 2n is determined by the Dyck subword enclosed by the initial '(' and its matching ')' together with the Dyck subword remaining after that closing parenthesis, whose lengths 2i and 2j satisfy i + j + 1 = n); this number is therefore also the Catalan number <math>C_n</math>. So there are also five Dyck words of length 6:
- ()()(), ()(()), (())(), (()()), ((()))
These Dyck words do not correspond to binary trees in the same way. Instead, they are related by the following recursively defined bijection: the Dyck word equal to the empty string corresponds to the binary tree of size 0 with only one leaf. Any other Dyck word can be written as (<math>w_1</math>)<math>w_2</math>, where <math>w_1</math>, <math>w_2</math> are themselves (possibly empty) Dyck words and where the two written parentheses are matched. The bijection is then defined by letting the words <math>w_1</math> and <math>w_2</math> correspond to the binary trees that are the left and right children of the root.
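The recursive bijection can be sketched in Python as below. This is a minimal illustration under stated assumptions: a leaf (the tree of size 0) is written as `None`, an internal node as a pair `(left, right)`, the input is assumed to be a valid Dyck word, and the function names are invented for this example.

```python
def dyck_to_tree(word):
    """Map a Dyck word to a tree: None is a leaf, (left, right) an internal node."""
    if word == "":
        return None
    # Find the ')' matching the initial '(' so that word = "(" + w1 + ")" + w2.
    depth = 0
    for k, ch in enumerate(word):
        depth += 1 if ch == "(" else -1
        if depth == 0:
            break
    return (dyck_to_tree(word[1:k]), dyck_to_tree(word[k + 1:]))


def tree_to_dyck(tree):
    """Inverse map: rebuild the Dyck word from the tree."""
    if tree is None:
        return ""
    left, right = tree
    return "(" + tree_to_dyck(left) + ")" + tree_to_dyck(right)


for w in ["()()()", "()(())", "(())()", "(()())", "((()))"]:
    assert tree_to_dyck(dyck_to_tree(w)) == w
print(dyck_to_tree("(())()"))  # ((None, None), (None, None))
```

Each of the five Dyck words of length 6 round-trips through the tree form, which is what the bijection requires.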
A bijective correspondence can also be defined as follows: enclose the Dyck word in an extra pair of parentheses, so that the result can be interpreted as a Lisp list expression (with the empty list () as only occurring atom); then the dotted-pair expression for that proper list is a fully parenthesized expression (with NIL as symbol and '.' as operator) describing the corresponding binary tree (which is, in fact, the internal representation of the proper list).
The ability to represent binary trees as strings of symbols and parentheses implies that binary trees can represent the elements of a free magma on a singleton set.
Methods for storing binary trees
Binary trees can be constructed from programming language primitives in several ways.
Nodes and references
In a language with records and references, binary trees are typically constructed by having a tree node structure which contains some data and references to its left child and its right child. Sometimes it also contains a reference to its unique parent. If a node has fewer than two children, some of the child pointers may be set to a special null value, or to a special sentinel node.
This method of storing binary trees wastes a fair bit of memory, as the pointers will be null (or point to the sentinel) more than half the time; a more conservative representation alternative is threaded binary tree.[26]
In languages with tagged unions such as ML, a tree node is often a tagged union of two types of nodes, one of which is a 3-tuple of data, left child, and right child, and the other of which is a "leaf" node, which contains no data and functions much like the null value in a language with pointers. For example, the following line of code in OCaml (an ML dialect) defines a binary tree that stores a character in each node.[27]
type chr_tree = Empty | Node of char * chr_tree * chr_tree
Arrays
Binary trees can also be stored in breadth-first order as an implicit data structure in arrays, and if the tree is a complete binary tree, this method wastes no space. In this compact arrangement, if a node has an index i, its children are found at indices <math>2i + 1</math> (for the left child) and <math>2i + 2</math> (for the right), while its parent (if any) is found at index <math>\left\lfloor \frac{i-1}{2} \right\rfloor</math> (assuming the root has index zero). Alternatively, with a 1-indexed array, the implementation is simplified with children found at <math>2i</math> and <math>2i+1</math>, and parent found at <math>\lfloor i/2 \rfloor</math>.[28]
This method benefits from more compact storage and better locality of reference, particularly during a preorder traversal. It is often used for binary heaps.[29]
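The index arithmetic for the 0-indexed layout (children at 2i + 1 and 2i + 2, parent at ⌊(i − 1)/2⌋) can be illustrated with a few helper functions. The names below, and the choice to return `None` for the root's parent, are assumptions of this sketch rather than anything prescribed by the article.

```python
def left_child(i):
    return 2 * i + 1


def right_child(i):
    return 2 * i + 2


def parent(i):
    # The root (index 0) has no parent; otherwise use floor((i - 1) / 2).
    return None if i == 0 else (i - 1) // 2


def children(tree, i):
    """Values of the children of node i that actually exist in the array."""
    return [tree[j] for j in (left_child(i), right_child(i)) if j < len(tree)]


# A complete binary tree with six nodes stored in breadth-first order.
tree = ["A", "B", "C", "D", "E", "F"]
print(children(tree, 1))  # ['D', 'E']
print(children(tree, 2))  # ['F']
print(tree[parent(5)])    # C
```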
Encodings
[edit]Succinct encodings
A succinct data structure is one which occupies close to minimum possible space, as established by information theoretical lower bounds. The number of different binary trees on <math>n</math> nodes is <math>\mathrm{C}_{n}</math>, the <math>n</math>th Catalan number (assuming we view trees with identical structure as identical). For large <math>n</math>, this is about <math>4^{n}</math>; thus we need at least about <math>\log_{2}4^{n} = 2n</math> bits to encode it. A succinct binary tree therefore would occupy <math>2n+o(n)</math> bits.
One simple representation which meets this bound is to visit the nodes of the tree in preorder, outputting "1" for an internal node and "0" for a leaf.[30] If the tree contains data, we can simply simultaneously store it in a consecutive array in preorder. This function accomplishes this:
function EncodeSuccinct(node n, bitstring structure, array data) {
    if n = nil then
        append 0 to structure;
    else
        append 1 to structure;
        append n.data to data;
        EncodeSuccinct(n.left, structure, data);
        EncodeSuccinct(n.right, structure, data);
}
The string structure has only <math>2n + 1</math> bits in the end, where <math>n</math> is the number of (internal) nodes; we don't even have to store its length. To show that no information is lost, we can convert the output back to the original tree like this:
function DecodeSuccinct(bitstring structure, array data) {
    remove first bit of structure and put it in b
    if b = 1 then
        create a new node n
        remove first element of data and put it in n.data
        n.left = DecodeSuccinct(structure, data)
        n.right = DecodeSuccinct(structure, data)
        return n
    else
        return nil
}
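The two routines above translate fairly directly into runnable Python. This is a minimal sketch: the `Node` dataclass, the use of plain lists for the bit string and data array, and the choice to pass iterators to the decoder (so that "remove first element" becomes `next`) are all assumptions of this example.

```python
from dataclasses import dataclass
from typing import Any, Iterator, List, Optional


@dataclass
class Node:
    data: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def encode_succinct(node: Optional[Node], structure: List[int], data: List[Any]) -> None:
    """Append the preorder shape bits to `structure` and the values to `data`."""
    if node is None:
        structure.append(0)
        return
    structure.append(1)
    data.append(node.data)
    encode_succinct(node.left, structure, data)
    encode_succinct(node.right, structure, data)


def decode_succinct(structure: Iterator[int], data: Iterator[Any]) -> Optional[Node]:
    """Rebuild a tree by consuming bits and values from the front of each iterator."""
    if next(structure) == 0:
        return None
    node = Node(next(data))
    node.left = decode_succinct(structure, data)
    node.right = decode_succinct(structure, data)
    return node


tree = Node("a", Node("b", None, Node("c")), Node("d"))
bits: List[int] = []
values: List[Any] = []
encode_succinct(tree, bits, values)
restored = decode_succinct(iter(bits), iter(values))
```

For the four-node sample tree the encoder emits nine bits, one per node plus one per empty child slot, and decoding reproduces a structurally equal tree.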
More sophisticated succinct representations allow not only compact storage of trees but even useful operations on those trees directly while they're still in their succinct form.
Encoding ordered trees as binary trees
There is a natural one-to-one correspondence between ordered trees and binary trees. It allows any ordered tree to be uniquely represented as a binary tree, and vice versa:
Let T be a node of an ordered tree, and let B denote T's image in the corresponding binary tree. Then B's left child represents T's first child, while B's right child represents T's next sibling.
For example, the ordered tree on the left and the binary tree on the right correspond:
In the pictured binary tree, the black (left) edges represent first child, while the blue (right) edges represent next sibling.
This representation is called a left-child right-sibling binary tree.
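The correspondence can be sketched as a pair of conversion functions. The class names `OrderedNode` and `BinaryNode` and the functions `to_binary` and `to_ordered` are hypothetical names chosen for this example; the sketch converts a single ordered tree, so the binary root never has a right child.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class OrderedNode:
    data: Any
    children: List["OrderedNode"] = field(default_factory=list)


@dataclass
class BinaryNode:
    data: Any
    left: Optional["BinaryNode"] = None   # represents the first child
    right: Optional["BinaryNode"] = None  # represents the next sibling


def to_binary(node: OrderedNode) -> BinaryNode:
    """Left-child right-sibling image of an ordered tree."""
    binary = BinaryNode(node.data)
    previous = None
    for child in node.children:
        converted = to_binary(child)
        if previous is None:
            binary.left = converted      # first child hangs off the left link
        else:
            previous.right = converted   # later children chain through right links
        previous = converted
    return binary


def to_ordered(binary: BinaryNode) -> OrderedNode:
    """Inverse mapping: follow the left link once, then the chain of right links."""
    node = OrderedNode(binary.data)
    child = binary.left
    while child is not None:
        node.children.append(to_ordered(child))
        child = child.right
    return node
```

Applying `to_ordered` to the result of `to_binary` returns a tree equal to the original, which illustrates that the mapping loses no information.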
Common operations
A variety of operations can be performed on binary trees. Some are mutator operations, while others simply return useful information about the tree.
Insertion
Nodes can be inserted into binary trees in between two other nodes or added after a leaf node. In a binary tree, an insertion must specify which node the new node becomes a child of, and whether it is the left or the right child.
Leaf nodes
To add a new node after leaf node A, A assigns the new node as one of its children and the new node assigns node A as its parent.
Internal nodes
Insertion on internal nodes is slightly more complex than on leaf nodes. Say that the internal node is node A and that node B is the child of A. (If the insertion is to insert a right child, then B is the right child of A, and similarly with a left child insertion.) A assigns the new node as its child in place of B, and the new node assigns A as its parent. Then the new node assigns B as its child, and B assigns the new node as its parent.
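Both insertion cases can be handled by one routine, sketched below. The `Node` class with a `parent` link, the function name `insert_child`, and its `side` parameter are assumptions of this example. The text does not say on which side the displaced child B is attached to the new node; the sketch assumes it keeps the same side.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.parent = None


def insert_child(a: Node, new: Node, side: str) -> None:
    """Insert `new` as the `side` ("left" or "right") child of `a`.

    If `a` already has a child B on that side, B becomes the child of `new`
    on the same side (an assumption of this sketch).
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    b = getattr(a, side)      # existing child, or None when that slot is empty
    setattr(a, side, new)     # A assigns the new node as its child
    new.parent = a            # the new node assigns A as its parent
    if b is not None:
        setattr(new, side, b)  # the new node adopts B
        b.parent = new         # B assigns the new node as its parent
```

When the chosen slot of `a` is empty, only the first two assignments take effect, which is exactly the leaf-node case.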
Deletion
Deletion is the process whereby a node is removed from the tree. Only certain nodes in a binary tree can be removed unambiguously.[31]
Node with zero or one children
Suppose that the node to delete is node A. If A has no children, deletion is accomplished by setting the link from A's parent to A to null. If A has one child, set the parent of A's child to A's parent and replace A with A's child in the corresponding child link of A's parent.
Node with two children
In a binary tree, a node with two children cannot be deleted unambiguously.[31] However, in certain binary trees (including binary search trees) these nodes can be deleted, though with a rearrangement of the tree structure.
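The unambiguous cases can be sketched as follows. The `Node` class and the `delete` function are hypothetical; the sketch rejects nodes with two children rather than choosing a rearrangement, and it returns the node that takes the deleted node's place so that a caller can update its own root reference if the root was deleted.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.parent = None
        self.left = left
        self.right = right
        for child in (left, right):
            if child is not None:
                child.parent = self


def delete(a: Node):
    """Remove node `a`, which must have at most one child.

    Returns the node now occupying a's position (its only child, or None).
    """
    if a.left is not None and a.right is not None:
        raise ValueError("a node with two children cannot be deleted unambiguously")
    child = a.left if a.left is not None else a.right
    parent = a.parent
    if child is not None:
        child.parent = parent          # the child is adopted by A's parent
    if parent is not None:
        if parent.left is a:           # replace A in whichever slot it occupied
            parent.left = child
        else:
            parent.right = child
    a.parent = a.left = a.right = None  # detach A completely
    return child
```

For a binary search tree, the two-child case is usually handled by a separate routine that swaps in the in-order predecessor or successor, which is outside the scope of this sketch.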
Traversal
Pre-order, in-order, and post-order traversal visit each node in a tree by recursively visiting each node in the left and right subtrees of the root. The three traversals are described briefly below.
Pre-order
In pre-order, we always visit the current node first; next, we recursively traverse the current node's left subtree, and then we recursively traverse the current node's right subtree. The pre-order traversal is topologically sorted, because a parent node is processed before any of its child nodes.
In-order
In in-order, we first recursively traverse the current node's left subtree; next, we visit the current node, and lastly, we recursively traverse the current node's right subtree.
Post-order
In post-order, we first recursively traverse the current node's left subtree; next, we recursively traverse the current node's right subtree, and then we visit the current node. Post-order traversal can be useful for obtaining the postfix expression of a binary expression tree.[32]
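The three recursive traversals differ only in where the current node is visited relative to the two recursive calls. The following sketch uses Python generators; the `Node` dataclass and the sample expression tree for (a + b) * c are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class Node:
    data: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder(node: Optional[Node]) -> Iterator[Any]:
    if node is not None:
        yield node.data                    # visit the node first
        yield from preorder(node.left)
        yield from preorder(node.right)


def inorder(node: Optional[Node]) -> Iterator[Any]:
    if node is not None:
        yield from inorder(node.left)
        yield node.data                    # visit between the two subtrees
        yield from inorder(node.right)


def postorder(node: Optional[Node]) -> Iterator[Any]:
    if node is not None:
        yield from postorder(node.left)
        yield from postorder(node.right)
        yield node.data                    # visit the node last


# Expression tree for (a + b) * c
expr = Node("*", Node("+", Node("a"), Node("b")), Node("c"))
print(" ".join(postorder(expr)))  # a b + c *
```

On the expression tree, post-order produces the postfix form, while pre-order produces the prefix form of the same expression.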
Depth-first order
In depth-first order, we always attempt to visit the node farthest from the root node that we can, but with the caveat that it must be a child of a node we have already visited. Unlike a depth-first search on graphs, there is no need to remember all the nodes we have visited, because a tree cannot contain cycles. Pre-order is a special case of this. See depth-first search for more information.
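A depth-first traversal can also be written without recursion by keeping the pending nodes on an explicit stack. The sketch below, with a hypothetical `Node` dataclass and `depth_first` function, produces the pre-order sequence; note that it keeps no set of visited nodes, since a tree has no cycles.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class Node:
    data: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def depth_first(root: Optional[Node]) -> Iterator[Any]:
    """Iterative depth-first (pre-order) traversal using an explicit stack."""
    stack = [root] if root is not None else []
    while stack:
        node = stack.pop()
        yield node.data
        # Push the right child first so that the left subtree is explored first.
        if node.right is not None:
            stack.append(node.right)
        if node.left is not None:
            stack.append(node.left)


tree = Node(1, Node(2, Node(4), Node(5)), Node(3, None, Node(6)))
```

The stack never holds more than one pending sibling per level of the current path, so its size is bounded by the height of the tree plus one.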
Breadth-first order
Contrasting with depth-first order is breadth-first order, also called level-order traversal, which always attempts to visit the node closest to the root that it has not already visited. See breadth-first search for more information.
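Breadth-first order is commonly implemented with a first-in, first-out queue. A minimal sketch, assuming a hypothetical `Node` dataclass and using `collections.deque` from the Python standard library:

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class Node:
    data: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def level_order(root: Optional[Node]) -> Iterator[Any]:
    """Yield node values level by level, left to right within each level."""
    if root is None:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()           # the oldest queued node is closest to the root
        yield node.data
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)


tree = Node(1, Node(2, Node(4), Node(5)), Node(3, None, Node(6)))
```

Replacing the queue with a stack turns the same loop into a depth-first traversal, which is one way to see how the two orders are related.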
In a complete binary tree, a node's breadth-index, $i - (2^d - 1)$, can be used as traversal instructions from the root. Here $i$ is the node's zero-based position in breadth-first order, $d$ is the node's distance from the root ($d = \lfloor \log_2(i+1) \rfloor$), and the node in question is not the root itself ($d > 0$). The breadth-index is read bitwise from left to right, starting at bit $d - 1$. When the breadth-index is masked at bit $d - 1$, the bit values 0 and 1 mean to step either left or right, respectively. The process continues by successively checking the next bit to the right until there are no more. The rightmost bit indicates the final traversal from the desired node's parent to the node itself. There is a time-space trade-off between iterating a complete binary tree this way versus each node having pointer(s) to its sibling(s).
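The bit-reading procedure can be sketched as below, assuming a zero-based breadth-first position `i`. The function names `path_from_root` and `follow` are hypothetical; `follow` replays the steps with the usual array arithmetic (left child at `2 * i + 1`, right child at `2 * i + 2`) as a consistency check.

```python
from typing import List


def path_from_root(i: int) -> List[str]:
    """Steps ("L" or "R") leading from the root to the node at position i."""
    if i < 0:
        raise ValueError("position must be non-negative")
    d = (i + 1).bit_length() - 1        # depth: floor(log2(i + 1))
    breadth_index = i - ((1 << d) - 1)  # offset of the node within its level
    # Read bits d-1 down to 0: 0 means step left, 1 means step right.
    return ["R" if (breadth_index >> bit) & 1 else "L"
            for bit in range(d - 1, -1, -1)]


def follow(path: List[str]) -> int:
    """Replay a path with zero-based array arithmetic and return the position reached."""
    i = 0
    for step in path:
        i = 2 * i + 1 if step == "L" else 2 * i + 2
    return i


print(path_from_root(4))  # ['L', 'R']
```

The root yields an empty path, and for any other position the number of steps equals the node's depth.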
See also
- 2–3 tree
- 2–3–4 tree
- AA tree
- Ahnentafel
- AVL tree
- B-tree
- Binary space partitioning
- Huffman tree
- K-ary tree
- Kraft's inequality
- Optimal binary search tree
- Random binary tree
- Recursion (computer science)
- Red–black tree
- Rope (computer science)
- Self-balancing binary search tree
- Splay tree
- Strahler number
- Tree of primitive Pythagorean triples § Alternative methods of generating the tree
- Unrooted binary tree
References
Citations
- ^ Rowan Garnier; John Taylor (2009). Discrete Mathematics: Proofs, Structures and Applications, Third Edition. CRC Press. p. 620. ISBN 978-1-4398-1280-8.
- ^ Steven S Skiena (2009). The Algorithm Design Manual. Springer Science & Business Media. p. 77. ISBN 978-1-84800-070-4.
- ^ a b Knuth (1997). The Art Of Computer Programming, Volume 1, 3/E. Pearson Education. p. 363. ISBN 0-201-89683-4.
- ^ Iván Flores (1971). Computer programming system/360. Prentice-Hall. p. 39.
- ^ Kenneth Rosen (2011). Discrete Mathematics and Its Applications, 7th edition. McGraw-Hill Science. p. 749. ISBN 978-0-07-338309-5.
- ^ David R. Mazur (2010). Combinatorics: A Guided Tour. Mathematical Association of America. p. 246. ISBN 978-0-88385-762-5.
- ^ a b "Binary tree", Encyclopedia of Mathematics, EMS Press, 2001 [1994] also in print as Michiel Hazewinkel (1997). Encyclopaedia of Mathematics. Supplement I. Springer Science & Business Media. p. 124. ISBN 978-0-7923-4709-5.
- ^ L.R. Foulds (1992). Graph Theory Applications. Springer Science & Business Media. p. 32. ISBN 978-0-387-97599-3.
- ^ David Makinson (2009). Sets, Logic and Maths for Computing. Springer Science & Business Media. p. 199. ISBN 978-1-84628-845-6.
- ^ Jonathan L. Gross (2007). Combinatorial Methods with Computer Applications. CRC Press. p. 248. ISBN 978-1-58488-743-0.
- ^ a b Kenneth Rosen (2011). Discrete Mathematics and Its Applications 7th edition. McGraw-Hill Science. pp. 352–353. ISBN 978-0-07-338309-5.
- ^ Te Chiang Hu; Man-tak Shing (2002). Combinatorial Algorithms. Courier Dover Publications. p. 162. ISBN 978-0-486-41962-6.
- ^ Lih-Hsing Hsu; Cheng-Kuan Lin (2008). Graph Theory and Interconnection Networks. CRC Press. p. 66. ISBN 978-1-4200-4482-9.
- ^ J. Flum; M. Grohe (2006). Parameterized Complexity Theory. Springer. p. 245. ISBN 978-3-540-29953-0.
- ^ Michael T. Goodrich; Roberto Tamassia (2011). Algorithm Design: Foundations, Analysis, and Internet Examples (2 ed.). New Delhi: Wiley-India. p. 76. ISBN 978-81-265-0986-7.
- ^ "full binary tree". NIST.
- ^ Richard Stanley, Enumerative Combinatorics, volume 2, p.36
- ^ "perfect binary tree". NIST.
- ^ a b "complete binary tree". NIST.
- ^ "almost complete binary tree". Archived from the original on 2016-03-04. Retrieved 2015-12-11.
- ^ "nearly complete binary tree" (PDF). Archived (PDF) from the original on 2022-10-09.
- ^ Aaron M. Tenenbaum, et al. Data Structures Using C, Prentice Hall, 1990 ISBN 0-13-199746-7
- ^ Paul E. Black (ed.), entry for data structure in Dictionary of Algorithms and Data Structures. U.S. National Institute of Standards and Technology. 15 December 2004. Online version Archived December 21, 2010, at the Wayback Machine Accessed 2010-12-19.
- ^ Parmar, Anand K. (2020-01-22). "Different Types of Binary Tree with colourful illustrations". Medium. Retrieved 2020-01-24.
- ^ Mehta, Dinesh; Sartaj Sahni (2004). Handbook of Data Structures and Applications. Chapman and Hall. ISBN 1-58488-435-5.
- ^ D. Samanta (2004). Classic Data Structures. PHI Learning Pvt. Ltd. pp. 264–265. ISBN 978-81-203-1874-8.
- ^ Michael L. Scott (2009). Programming Language Pragmatics (3rd ed.). Morgan Kaufmann. p. 347. ISBN 978-0-08-092299-7.
- ^ Thomas H. Cormen; Charles E. Leiserson; Ronald L. Rivest; Clifford Stein (2001). Introduction to Algorithms (2nd ed.). Cambridge, Mass.: MIT Press. p. 128. ISBN 0-262-03293-7. OCLC 46792720.
- ^ Laakso, Mikko. "Priority Queue and Binary Heap". University of Aalto. Retrieved 2023-10-11.
- ^ Demaine, Erik. "6.897: Advanced Data Structures Spring 2003 Lecture 12" (PDF). MIT CSAIL. Archived from the original (PDF) on 24 November 2005. Retrieved 14 April 2022.
- ^ a b Dung X. Nguyen (2003). "Binary Tree Structure". rice.edu. Retrieved December 28, 2010.
- ^ Wittman, Todd (2015-02-13). "Lecture 18: Tree Traversals" (PDF). Archived from the original (PDF) on 2015-02-13. Retrieved 2023-04-29.
Bibliography
- Donald Knuth. The Art of Computer Programming vol 1. Fundamental Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89683-4. Section 2.3, especially subsections 2.3.1–2.3.2 (pp. 318–348).
External links
- binary trees Archived 2020-09-23 at the Wayback Machine entry in the FindStat database
- Binary Tree Proof by Induction
- Balanced binary search tree on array How to create bottom-up an Ahnentafel list, or a balanced binary search tree on array
- Binary trees and Implementation of the same with working code examples
- Binary Tree JavaScript Implementation with source code