John Tukey: Difference between revisions

From Wikipedia, the free encyclopedia
OAbot (talk | contribs)
m Open access bot: doi added to citation with #oabot.
 
(35 intermediate revisions by 18 users not shown)
Line 1: Line 1:
{{short description|American mathematician}}
{{short description|American mathematician}}
{{Use dmy dates|date=August 2019|cs1-dates=y}}
{{Redirect|Tukey}}
{{Redirect|Tukey}}
{{Use dmy dates|date=August 2019|cs1-dates=y}}
{{Infobox scientist
{{Infobox scientist
| name = John Tukey
| name = John Tukey
| image = John Tukey.jpg
| image = John Tukey.jpg
| caption = John Wilder Tukey
| birth_date = {{Birth date|1915|06|16}}
| birth_date = {{Birth date|1915|06|16}}
| birth_place = [[New Bedford, Massachusetts]], U.S.
| birth_place = [[New Bedford, Massachusetts]], U.S.
| death_date = {{death date and age|2000|07|26|1915|06|16}}
| death_date = {{death date and age|2000|07|26|1915|06|16}}
| death_place = [[New Brunswick, New Jersey]], U.S.
| death_place = [[New Brunswick, New Jersey]], U.S.
| citizenship = US
| alma_mater = {{ubl|[[Brown University]] (BA, MSc)|[[Princeton University]] (PhD)}}
| thesis_title = On Denumerability in Topology<ref name="math">{{MathGenealogy|id=15860}}</ref>
| thesis_title = On Denumerability in Topology<ref name="math">{{MathGenealogy|id=15860}}</ref>
| doctoral_advisor = [[Solomon Lefschetz]]<ref name="math"/>
| doctoral_advisor = [[Solomon Lefschetz]]<ref name="math"/>
Line 22: Line 19:
*[[Paul Meier (statistician)|Paul Meier]]
*[[Paul Meier (statistician)|Paul Meier]]
*[[Frederick Mosteller]]
*[[Frederick Mosteller]]
*[[John A. Hartigan]]
}}
}}
| known_for = {{ubl|[[Exploratory data analysis]]|[[Multiple comparisons problem]]|[[Projection pursuit]]|[[Box plot]]|[[Blackman–Tukey transformation]]|[[Cooley–Tukey FFT algorithm]]|[[Anscombe_transform#Alternatives|Freeman–Tukey transformation]]|[[Siegel–Tukey test]]|[[Ham sandwich theorem|Stone–Tukey theorem]]|[[Tukey–Duckworth test]]|[[Tukey's range test]]|[[Tukey lambda distribution]]|[[Trimean|Tukey's trimean]]|[[Tukey's test of additivity]]|[[Teichmüller–Tukey lemma|Tukey's lemma]]|[[Bland–Altman plot|Tukey mean difference plot]]|[[Centerpoint (geometry)|Tukey median]]|[[Tukey depth]]|[[Robust_statistics#Empirical_influence_function|Tukey's biweight function]]|[[Outlier#Tukey's_fences|Tukey's fences]]|[[Window_function#Tukey_window|Tukey window]]|[[Cepstrum]]|[[Flexagon]]|[[Median polish]]|[[Midhinge]]|[[Slash distribution]]|[[Theory of conjoint measurement]]|[[Bit|Coining the term 'bit']]|[[Scagnostics]]}}
| known_for = {{ubl|[[Exploratory data analysis]]|[[Multiple comparisons problem]]|[[Projection pursuit]]|[[Box plot]]|[[Blackman–Tukey transformation]]|[[Cooley–Tukey FFT algorithm]]|[[Anscombe transform#Alternatives|Freeman–Tukey transformation]]|[[Siegel–Tukey test]]|[[Ham sandwich theorem|Stone–Tukey theorem]]|[[Tukey–Duckworth test]]|[[Tukey's range test]]|[[Tukey lambda distribution]]|[[Trimean|Tukey's trimean]]|[[Tukey's test of additivity]]|[[Teichmüller–Tukey lemma|Tukey's lemma]]|[[Bland–Altman plot|Tukey mean difference plot]]|[[Centerpoint (geometry)|Tukey median]]|[[Tukey depth]]|[[Robust statistics#Empirical influence function|Tukey's biweight function]]|[[Outlier#Tukey's fences|Tukey's fences]]|[[Window function#Tukey window|Tukey window]]|[[Cepstrum]]|[[Flexagon]]|[[Median polish]]|[[Midhinge]]|[[Slash distribution]]|[[Theory of conjoint measurement]]|[[Bit|Coining the term 'bit']]|[[Scagnostics]]}}
| footnotes =
| footnotes =
| field = [[Mathematician]]
| field = [[Topology]]
| work_institution = {{ubl|[[Bell Labs]]|[[Princeton University]]}}
| work_institution = {{ubl|[[Bell Labs]]|[[Princeton University]]}}
| prizes = {{ubl|[[Wilks Memorial Award]] (1965)|[[National Medal of Science]] (USA) in Mathematical, Statistical, and Computational Sciences (1973)|[[Shewhart Medal]] (1976)|[[IEEE Medal of Honor]] (1982)|[http://www.asq.org/about-asq/awards/deming.html Deming Medal] (1982)|[https://alumni.princeton.edu/our-community/awards/james-madison-medal James Madison Medal] (1984)|[[Foreign Member of the Royal Society]] (1991)}}
| prizes = {{ubl|[[Wilks Memorial Award]] (1965)|[[National Medal of Science]] (1973)|[[Shewhart Medal]] (1976)|[[IEEE Medal of Honor]] (1982)|[http://www.asq.org/about-asq/awards/deming.html Deming Medal] (1982)|[[Foreign Member of the Royal Society]] (1991)}}
| education = {{ubl|[[Brown University]] ([[B. A.|BA]], [[M. S.|MS]])|[[Princeton University]] ([[PhD]])}}
}}
}}
'''John Wilder Tukey''' ({{IPAc-en|ˈ|t|uː|k|i}}; June 16, 1915 – July 26, 2000) was an American [[mathematician]] and [[statistician]], best known for the development of the [[Cooley–Tukey FFT algorithm|fast Fourier Transform (FFT) algorithm]] and [[box plot]].<ref>{{cite journal |author-last=Sande |author-first=Gordon |title=Obituary: John Wilder Tukey |journal=[[Physics Today]] |date=July 2001 |volume=54 |issue=7 |pages=80–81 |doi=10.1063/1.1397408|doi-access=free }}</ref> The [[Tukey's range test|Tukey range test]], the [[Tukey lambda distribution]], the [[Tukey's test of additivity|Tukey test of additivity]], and the [[Teichmüller–Tukey lemma]] all bear his name. He is also credited with coining the term '[[bit]]' and the first published use of the word '[[software]]'.
'''John Wilder Tukey''' ({{IPAc-en|ˈ|t|uː|k|i}}; June 16, 1915 – July 26, 2000) was an American [[mathematician]] and [[statistician]], best known for the development of the [[Cooley–Tukey FFT algorithm|fast Fourier Transform (FFT) algorithm]] and [[box plot]].<ref>{{cite journal |author-last=Sande |author-first=Gordon |title=Obituary: John Wilder Tukey |journal=[[Physics Today]] |date=July 2001 |volume=54 |issue=7 |pages=80–81 |doi=10.1063/1.1397408|doi-access=free }}</ref> The [[Tukey's range test|Tukey range test]], the [[Tukey lambda distribution]], the [[Tukey's test of additivity|Tukey test of additivity]], and the [[Teichmüller–Tukey lemma]] all bear his name. He is also credited with coining the term ''[[bit]]'' and the first published use of the word ''[[software]]''.


== Biography ==
== Biography ==
Tukey was born in [[New Bedford, Massachusetts]] in 1915, to a Latin teacher father and a private tutor. He was mainly taught by his mother and attended regular classes only for certain subjects like French.<ref name="Leonhardt_2000"/> Tukey obtained a [[Bachelor of Arts|BA]] in 1936 and [[MSc]] in 1937 in chemistry, from [[Brown University]], before moving to [[Princeton University]], where in 1939 he received a [[PhD]] in [[mathematics]] after completing a doctoral dissertation titled "On [[denumerability]] in [[topology]]".<ref>{{cite web |title=John Tukey |url=https://mathgenealogy.org/id.php?id=15860 |website=Mathematics Genealogy Project |access-date=2 July 2022}}</ref><ref>{{Cite book|last=Tukey|first=John W.|url=https://catalog.princeton.edu/catalog/2700812|title=On denumerability in topology|date=1939|language=en}}</ref><ref>{{cite web |url=http://www.ieeeghn.org/wiki/index.php/John_Tukey |title=John Tukey |work=IEEE Global History Network |publisher=IEEE |access-date=2011-07-18}}</ref>
Tukey was born in [[New Bedford, Massachusetts]], in 1915, to a Latin teacher father and a private tutor. He was mainly taught by his mother and attended regular classes only for certain subjects like French.<ref name="Leonhardt_2000"/> Tukey obtained a [[Bachelor of Arts|B.A.]] in 1936 and [[M.S.]] in 1937 in chemistry, from [[Brown University]], before moving to [[Princeton University]], where in 1939 he received a [[PhD]] in [[mathematics]] after completing a doctoral dissertation titled "On [[denumerability]] in [[topology]]".<ref>{{cite web |title=John Tukey |url=https://mathgenealogy.org/id.php?id=15860 |website=Mathematics Genealogy Project |access-date=2 July 2022}}</ref><ref>{{Cite book|last=Tukey|first=John W.|url=https://catalog.princeton.edu/catalog/2700812|title=On denumerability in topology|date=1939|language=en}}</ref><ref>{{cite web |url=http://www.ieeeghn.org/wiki/index.php/John_Tukey |title=John Tukey |work=IEEE Global History Network |publisher=IEEE |access-date=2011-07-18}}</ref>


During [[World War II]], Tukey worked at the Fire Control Research Office and collaborated with [[Samuel S. Wilks|Samuel Wilks]] and [[William Gemmell Cochran|William Cochran]]. He is claimed to have helped design the U-2 spy plane. After the war, he returned to Princeton, dividing his time between the university and [[Bell Labs|AT&T Bell Laboratories]]. In 1962, Tukey was elected to the [[American Philosophical Society]].<ref>{{Cite web|title=APS Member History|url=https://search.amphilsoc.org/memhist/search?creator=John+W.+Tukey&title=&subject=&subdiv=&mem=&year=&year-max=&dead=&keyword=&smode=advanced|access-date=2021-01-28|website=search.amphilsoc.org}}</ref> He became a full professor at 35 and founding chairman of the Princeton statistics department in 1965.<ref name="Leonhardt_2000"/>
During [[World War II]], Tukey worked at the Fire Control Research Office and collaborated with [[Samuel S. Wilks|Samuel Wilks]] and [[William Gemmell Cochran|William Cochran]]. He is claimed to have helped design the U-2 spy plane. After the war, he returned to Princeton, dividing his time between the university and [[Bell Labs|AT&T Bell Laboratories]]. In 1962, Tukey was elected to the [[American Philosophical Society]].<ref>{{Cite web|title=APS Member History|url=https://search.amphilsoc.org/memhist/search?creator=John+W.+Tukey&title=&subject=&subdiv=&mem=&year=&year-max=&dead=&keyword=&smode=advanced|access-date=2021-01-28|website=search.amphilsoc.org}}</ref> He became a full professor at 35 and founding chairman of the Princeton statistics department in 1965.<ref name="Leonhardt_2000"/>


Among many contributions to [[civil society]], Tukey served on a committee of the [[American Statistical Association]] that produced a report critiquing the statistical methodology of the [[Kinsey Reports|Kinsey Report]], ''Statistical Problems of the Kinsey Report on Sexual Behavior in the Human Male'', which summarised "A random selection of three people would have been better than a group of 300 chosen by Mr. Kinsey".
Among many contributions to [[civil society]], Tukey served on a committee of the [[American Statistical Association]] that produced a report critiquing the statistical methodology of the [[Kinsey Reports|Kinsey Report]], ''Statistical Problems of the Kinsey Report on Sexual Behavior in the Human Male'', which summarized "A random selection of three people would have been better than a group of 300 chosen by Mr. Kinsey".


From 1960 to 1980, Tukey helped design the NBC television network polls used to predict and analyze elections. He was also a consultant to the Educational Testing Service, the Xerox Corporation, and Merck & Company.
From 1960 to 1980, Tukey helped design the NBC television network polls used to predict and analyze elections. He was also a consultant to the Educational Testing Service, the Xerox Corporation, and Merck & Company.

During the 1970s and early 1980s, Tukey played a key role in the design and conduct of the [[National Assessment of Educational Progress]].


He was awarded the [[National Medal of Science]] by President Nixon in 1973.<ref name="Leonhardt_2000"/> He was awarded the [[IEEE Medal of Honor]] in 1982 "For his contributions to the spectral analysis of random processes and the [[fast Fourier transform]] (FFT) [[algorithm]]".
He was awarded the [[National Medal of Science]] by President Nixon in 1973.<ref name="Leonhardt_2000"/> He was awarded the [[IEEE Medal of Honor]] in 1982 "For his contributions to the spectral analysis of random processes and the [[fast Fourier transform]] (FFT) [[algorithm]]".
Line 45: Line 46:


== Scientific contributions ==
== Scientific contributions ==
Early in his career Tukey worked on developing [[statistical]] methods for computers at [[Bell Labs]] where he invented the term "bit" in 1947.<ref name="Shannon_1948_1"/><ref name="Shannon_1948_2"/><ref name="Shannon_1949"/>
Early in his career Tukey worked on developing [[statistical]] methods for computers at [[Bell Labs]], where he invented the term ''bit'' in 1947.<ref name="Shannon_1948_1"/><ref name="Shannon_1948_2"/><ref name="Shannon_1949"/>


His statistical interests were many and varied. He is particularly remembered for his development with [[James Cooley]] of the [[Cooley–Tukey FFT algorithm]]. In 1970, he contributed significantly to what is today known as the [[Resampling (statistics)#Jackknife|jackknife estimation]]—also termed Quenouille–Tukey jackknife. He introduced the [[box plot]] in his 1977 book, "Exploratory Data Analysis".
His statistical interests were many and varied. He is particularly remembered for his development with [[James Cooley]] of the [[Cooley–Tukey FFT algorithm]]. In 1970, he contributed significantly to what is today known as the [[Jackknife resampling|jackknife]]—also termed Quenouille–Tukey jackknife. He introduced the [[box plot]] in his 1977 book, "Exploratory Data Analysis".


[[Tukey's range test]], the [[Tukey lambda distribution]], [[Tukey's test of additivity]], [[Tukey's lemma]], and the [[Tukey window]] all bear his name. He is also the creator of several little-known methods such as the [[trimean]] and [[Median#Median.E2.80.93median line|median-median line]], an easier alternative to [[linear regression]].
[[Tukey's range test]], the [[Tukey lambda distribution]], [[Tukey's test of additivity]], [[Tukey's lemma]], and the [[Tukey window]] all bear his name. He is also the creator of several little-known methods such as the [[trimean]] and [[Median#Median.E2.80.93median line|median-median line]], an easier alternative to [[linear regression]].
Line 53: Line 54:
In 1974, he developed, with [[Jerome H. Friedman]], the concept of the [[projection pursuit]].<ref>{{cite journal |title=A Projection Pursuit Algorithm for Exploratory Data Analysis |author-first1=Jerome H. |author-last1=Friedman |author-link1=Jerome H. Friedman |author-first2=John Wilder |author-last2=Tukey |author-link2=John Tukey |journal=[[IEEE Transactions on Computers]] |date=September 1974 |volume=C-23 |issue=9 |pages= 881–890 |issn=0018-9340 |doi=10.1109/T-C.1974.224051|osti=1442925 |s2cid=7997450 }}</ref>
In 1974, he developed, with [[Jerome H. Friedman]], the concept of the [[projection pursuit]].<ref>{{cite journal |title=A Projection Pursuit Algorithm for Exploratory Data Analysis |author-first1=Jerome H. |author-last1=Friedman |author-link1=Jerome H. Friedman |author-first2=John Wilder |author-last2=Tukey |author-link2=John Tukey |journal=[[IEEE Transactions on Computers]] |date=September 1974 |volume=C-23 |issue=9 |pages= 881–890 |issn=0018-9340 |doi=10.1109/T-C.1974.224051|osti=1442925 |s2cid=7997450 }}</ref>


=== Data analysis and foundations of data science ===
=== Statistical practice ===

He also contributed to statistical practice and articulated the important distinction between [[exploratory data analysis]] and [[confirmatory data analysis]], believing that much statistical methodology placed too great an emphasis on the latter.
John Tukey contributed greatly to statistical practice and data analysis in general. In fact, some regard John Tukey as the father of data science. At the very least, he pioneered many of the key foundations of what came later to be known as data science.<ref>David Donoho (2017), 50 Years of Data Science, Journal of Computational and Graphical Statistics, 2017, https://doi.org/10.1080/10618600.2017.1384734</ref>

Making sense of data has a long history and has been addressed by statisticians, mathematicians, scientists, and others for many years. During the 1960s, Tukey challenged the then-dominant approach he called "confirmatory data analysis", statistical analyses driven by rigid mathematical configurations.<ref>John W. Tukey (1962) The Future of Data Analysis. Ann. Math. Statist. 33(1): 1-67. DOI: 10.1214/aoms/1177704711.</ref> Tukey emphasized the importance of having a more flexible attitude towards data analysis and of exploring data carefully to see what structures and information might be contained therein. He called this "exploratory data analysis" (EDA). In many ways, EDA was a precursor to data science.

Tukey also realized the importance of computer science to EDA. Graphics are an integral part of EDA methodology and, while much of Tukey's work focused on static displays (such as box plots) that could be drawn by hand, he realized that computer graphics would be much more effective for studying multivariate data. PRIM-9, the first program for viewing multivariate data, was conceived by him during the early 1970s.<ref>Friedman, J. H., & Stuetzle, W. (2002). John W. Tukey’s Work on Interactive Graphics. The Annals of Statistics, 30(6), 1629-1639. http://www.jstor.org/stable/1558733</ref>

This coupling of data analysis and computer science is what is now called data science.


Though he believed in the utility of separating the two types of analysis, he pointed out that sometimes, especially in [[natural science]], this was problematic and termed such situations [[uncomfortable science]].
Tukey articulated the important distinction between [[exploratory data analysis]] and [[confirmatory data analysis]], believing that much statistical methodology placed too great an emphasis on the latter. Though he believed in the utility of separating the two types of analysis, he pointed out that sometimes, especially in [[natural science]], this was problematic and termed such situations [[uncomfortable science]].


A. D. Gordon offered the following summary of Tukey's principles for statistical practice:<ref name="mathshistory">{{cite web |title=John Tukey - Biography |url=https://mathshistory.st-andrews.ac.uk/Biographies/Tukey/ |website=Maths History |access-date=18 February 2022 |language=en}}</ref>
A. D. Gordon offered the following summary of Tukey's principles for statistical practice:<ref name="mathshistory">{{cite web |title=John Tukey - Biography |url=https://mathshistory.st-andrews.ac.uk/Biographies/Tukey/ |website=Maths History |access-date=18 February 2022 |language=en}}</ref>
{{quote|... the usefulness and limitation of mathematical statistics; the importance of having methods of statistical analysis that are robust to violations of the assumptions underlying their use; the need to amass experience of the behaviour of specific methods of analysis in order to provide guidance on their use; the importance of allowing the possibility of data's influencing the choice of method by which they are analysed; the need for statisticians to reject the role of "guardian of proven truth", and to resist attempts to provide once-for-all solutions and tidy over-unifications of the subject; the iterative nature of data analysis; implications of the increasing power, availability, and cheapness of computing facilities; the training of statisticians.}}
{{blockquote|... the usefulness and limitation of mathematical statistics; the importance of having methods of statistical analysis that are robust to violations of the assumptions underlying their use; the need to amass experience of the behaviour of specific methods of analysis in order to provide guidance on their use; the importance of allowing the possibility of data's influencing the choice of method by which they are analysed; the need for statisticians to reject the role of "guardian of proven truth", and to resist attempts to provide once-for-all solutions and tidy over-unifications of the subject; the iterative nature of data analysis; implications of the increasing power, availability, and cheapness of computing facilities; the training of statisticians.}}


Tukey's lectures were described as unusual. McCullagh described a lecture Tukey gave in London in 1977:<ref name="mathshistory"/><ref>P McCullagh, John Wilder Tukey, ''Biographical Memoirs of Fellows of the Royal Society'' 49 (2003), 538-559.</ref>
Tukey's lectures were described as unusual. McCullagh described a lecture Tukey gave in London in 1977:<ref name="mathshistory"/><ref>P McCullagh, John Wilder Tukey, ''Biographical Memoirs of Fellows of the Royal Society'' 49 (2003), 538-559.</ref>
{{quote|Tukey ambled to the podium, a great bear of a man dressed in baggy pants and a black knitted shirt. These might once have been a matching pair but the vintage was such that it was hard to tell. ... Carefully and deliberately a list of headings was chalked on the blackboard. The words came too, not many, like overweight parcels, delivered at a slow unfaltering pace. ... When it was complete, Tukey turned to face the audience and the podium ... "Comments, queries, suggestions?" he asked the audience ... As he waited for a response, he clambered onto the podium and manoeuvred until he was sitting cross-legged facing the audience. ... We in the audience sat like spectators at the zoo waiting for the great bear to move or say something. But the great bear appeared to be doing the same thing, and the feeling was not comfortable.}}
{{blockquote|Tukey ambled to the podium, a great bear of a man dressed in baggy pants and a black knitted shirt. These might once have been a matching pair but the vintage was such that it was hard to tell. ... Carefully and deliberately a list of headings was chalked on the blackboard. The words came too, not many, like overweight parcels, delivered at a slow unfaltering pace. ... When it was complete, Tukey turned to face the audience and the podium ... "Comments, queries, suggestions?" he asked the audience ... As he waited for a response, he clambered onto the podium and manoeuvred until he was sitting cross-legged facing the audience. ... We in the audience sat like spectators at the zoo waiting for the great bear to move or say something. But the great bear appeared to be doing the same thing, and the feeling was not comfortable.}}


==Coining the term ''bit''==
==Coining the term ''bit''==
While working with [[John von Neumann]] on early computer designs, Tukey introduced the word "[[bit]]" as a portmanteau of "binary digit".<ref>{{Cite web|url=http://www.linfo.org/bit.html|title=Bit definition by The Linux Information Project (LINFO)|website=www.linfo.org}}</ref> The term "bit" was first used in an article by [[Claude Shannon]] in 1948.
While working with [[John von Neumann]] on early computer designs, Tukey introduced the word ''[[bit]]'' as a portmanteau of ''binary digit''.<ref>{{Cite web|url=http://www.linfo.org/bit.html|title=Bit definition by The Linux Information Project (LINFO)|website=www.linfo.org}}</ref> The term ''bit'' was first used in [[A Mathematical Theory of Communication|an article]] by [[Claude Shannon]] in 1948.


==See also==
==See also==
Line 73: Line 81:
* {{cite book |author-last1=Andrews |author-first1=David F. |author-first2=Peter J. |author-last2=Bickel |author-first3=Frank R. |author-last3=Hampel |author-first4=Peter J. |author-last4=Huber |author-first5=W. H. |author-last5=Rogers |author-first6=John Wilder |author-last6=Tukey |author-link6=John Tukey |title=Robust estimates of location: survey and advances |date=1972 |publisher=[[Princeton University Press]] |isbn=978-0-691-08113-7 |oclc=369963 |url-access=registration |url=https://archive.org/details/robustestimateso0000unse }}
* {{cite book |author-last1=Andrews |author-first1=David F. |author-first2=Peter J. |author-last2=Bickel |author-first3=Frank R. |author-last3=Hampel |author-first4=Peter J. |author-last4=Huber |author-first5=W. H. |author-last5=Rogers |author-first6=John Wilder |author-last6=Tukey |author-link6=John Tukey |title=Robust estimates of location: survey and advances |date=1972 |publisher=[[Princeton University Press]] |isbn=978-0-691-08113-7 |oclc=369963 |url-access=registration |url=https://archive.org/details/robustestimateso0000unse }}
* {{cite book |author-last1=Basford |author-first1=Kaye E. |author-link1=Kaye Basford |author-first2=John Wilder |author-last2=Tukey |author-link2=John Tukey |title=Graphical Analysis of Multiresponse Data |url=https://archive.org/details/graphicalanalysi0000basf |url-access=registration |date=1998 |publisher=[[Chapman & Hall]]/[[CRC Press]] |isbn=978-0-8493-0384-5 |oclc=154674707}}<ref>{{cite journal |author-last=Talbot |author-first=M. |date=June 2000 |issue=2 |journal=Biometrics |jstor=2677019 |pages=649–650 |title=none |volume=56 |doi=10.1111/j.0006-341X.2000.00647.x}}</ref><ref>{{cite journal |author-last=Cooper |author-first=Mark |date=July–August 2000 |doi=10.2135/cropsci2000.0015br |issue=4 |journal=Crop Science |page=1184 |title=none |volume=40}}</ref><ref>{{cite journal |author-last=Heckler |author-first=Charles E. |date=February 2001 |doi=10.1198/tech.2001.s547 |issue=1 |journal=Technometrics |jstor=1270862 |pages=97–98 |title=none |volume=43|s2cid=26430218 }}</ref><ref>{{cite journal |author-last=Broadfoot |author-first=L. |date=June 2001 |doi=10.1017/s002185960124893x |issue=4 |journal=[[The Journal of Agricultural Science]] |title=none |volume=136|pages=471–475 |s2cid=86230606 }}</ref>
* {{cite book |author-last1=Basford |author-first1=Kaye E. |author-link1=Kaye Basford |author-first2=John Wilder |author-last2=Tukey |author-link2=John Tukey |title=Graphical Analysis of Multiresponse Data |url=https://archive.org/details/graphicalanalysi0000basf |url-access=registration |date=1998 |publisher=[[Chapman & Hall]]/[[CRC Press]] |isbn=978-0-8493-0384-5 |oclc=154674707}}<ref>{{cite journal |author-last=Talbot |author-first=M. |date=June 2000 |issue=2 |journal=Biometrics |jstor=2677019 |pages=649–650 |title=none |volume=56 |doi=10.1111/j.0006-341X.2000.00647.x}}</ref><ref>{{cite journal |author-last=Cooper |author-first=Mark |date=July–August 2000 |doi=10.2135/cropsci2000.0015br |issue=4 |journal=Crop Science |page=1184 |title=none |volume=40}}</ref><ref>{{cite journal |author-last=Heckler |author-first=Charles E. |date=February 2001 |doi=10.1198/tech.2001.s547 |issue=1 |journal=Technometrics |jstor=1270862 |pages=97–98 |title=none |volume=43|s2cid=26430218 }}</ref><ref>{{cite journal |author-last=Broadfoot |author-first=L. |date=June 2001 |doi=10.1017/s002185960124893x |issue=4 |journal=[[The Journal of Agricultural Science]] |title=none |volume=136|pages=471–475 |s2cid=86230606 }}</ref>
* {{cite book |author-last1=Blackman |author-first=R. B. |author-first2=John Wilder |author-last2=Tukey |author-link=John Tukey |title=The measurement of power spectra from the point of view of communications engineering |url=https://archive.org/details/TheMeasurementOfPowerSpectra |date=1959 |publisher=[[Dover Publications]] |isbn=978-0-486-60507-4}}
* {{cite book |author-last1=Blackman |author-first=R. B. |author-link=R. B. Blackman |author-first2=John Wilder |author-last2=Tukey |author-link2=John Tukey |title=The measurement of power spectra from the point of view of communications engineering |url=https://archive.org/details/TheMeasurementOfPowerSpectra |date=1959 |publisher=[[Dover Publications]] |isbn=978-0-486-60507-4}}
* {{cite book |author-last1=Cochran |author-first1=William Gemmell |author-link1=William Gemmell Cochran |author-first2=Charles Frederick |author-last2=Mosteller |author-link2=Charles Frederick Mosteller |author-first3=John Wilder |author-last3=Tukey |author-link3=John Tukey |title=Statistical problems of the Kinsey report on sexual behavior in the human male |date=1953 |publisher=[[Journal of the American Statistical Association]]|doi=10.1080/01621459.1953.10501194}}
* {{cite book |author-last1=Cochran |author-first1=William Gemmell |author-link1=William Gemmell Cochran |author-first2=Charles Frederick |author-last2=Mosteller |author-link2=Charles Frederick Mosteller |author-first3=John Wilder |author-last3=Tukey |author-link3=John Tukey |title=Statistical problems of the Kinsey report on sexual behavior in the human male |date=1953 |publisher=[[Journal of the American Statistical Association]]|doi=10.1080/01621459.1953.10501194}}
* {{cite book |editor-last1=Hoaglin |editor-first1=David C. |editor-first2=Charles Frederick |editor-last2=Mosteller |editor-link2=Charles Frederick Mosteller |editor-first3=John Wilder |editor-last3=Tukey |editor-link3=John Tukey |title=Understanding Robust and Exploratory Data Analysis |date=1983 |publisher=[[Wiley (publisher)|Wiley]] |isbn=978-0-471-09777-8 |oclc=8495063}}
* {{cite journal |last1=Cooley |first1=James W. |first2=John W. |last2=Tukey |title=An algorithm for the machine calculation of complex Fourier series |journal=[[Mathematics of Computation|Math. Comput.]] |volume=19 |issue= 90|pages=297–301 |year=1965 |doi=10.2307/2003354 |jstor=2003354 |doi-access=free }}
* {{cite book |editor-last1=Hoaglin |editor-first1=David C. |editor-first2=Charles Frederick |editor-last2=Mosteller |editor-link2=Charles Frederick Mosteller |editor-first3=John Wilder |editor-last3=Tukey |editor-link3=John Tukey |title=Understanding Robust and Exploratory Data Analysis |date=1983 |publisher=[[Wiley (publisher)|Wiley]] |isbn=978-0-471-09777-8 |oclc=8495063}}
* {{cite book |editor-last1=Hoaglin |editor-first1=David C. |editor-first2=Charles Frederick |editor-last2=Mosteller |editor-link2=Charles Frederick Mosteller |editor-first3=John Wilder |editor-last3=Tukey |editor-link3=John Tukey |title=Exploring Data Tables, Trends and Shapes |year=1985 |publisher=[[Wiley (publisher)|Wiley]] |isbn=978-0-471-09776-1 |oclc=11550398 |url-access=registration |url=https://archive.org/details/exploringdatatab0000unse }}
* {{cite book |editor-last1=Hoaglin |editor-first1=David C. |editor-first2=Charles Frederick |editor-last2=Mosteller |editor-link2=Charles Frederick Mosteller |editor-first3=John Wilder |editor-last3=Tukey |editor-link3=John Tukey |title=Exploring Data Tables, Trends and Shapes |year=1985 |publisher=[[Wiley (publisher)|Wiley]] |isbn=978-0-471-09776-1 |oclc=11550398 |url-access=registration |url=https://archive.org/details/exploringdatatab0000unse }}
* {{cite book |editor-last1=Hoaglin |editor-first1=David C. |editor-first2=Charles Frederick |editor-last2=Mosteller |editor-link2=Charles Frederick Mosteller |editor-first3=John Wilder |editor-last3=Tukey |editor-link3=John Tukey |title=Fundamentals of exploratory analysis of variance |date=1991 |publisher=[[Wiley (publisher)|Wiley]] |isbn=978-0-471-52735-0 |oclc=23180322}}
* {{cite book |editor-last1=Hoaglin |editor-first1=David C. |editor-first2=Charles Frederick |editor-last2=Mosteller |editor-link2=Charles Frederick Mosteller |editor-first3=John Wilder |editor-last3=Tukey |editor-link3=John Tukey |title=Fundamentals of exploratory analysis of variance |date=1991 |publisher=[[Wiley (publisher)|Wiley]] |isbn=978-0-471-52735-0 |oclc=23180322}}
Line 140: Line 148:
[[Category:Brown University alumni]]
[[Category:Brown University alumni]]
[[Category:Burials at Princeton Cemetery]]
[[Category:Burials at Princeton Cemetery]]
[[Category:Foreign Members of the Royal Society]]
[[Category:Foreign members of the Royal Society]]
[[Category:Members of the United States National Academy of Sciences]]
[[Category:Members of the United States National Academy of Sciences]]
[[Category:20th-century American mathematicians]]
[[Category:20th-century American mathematicians]]
[[Category:Computational statisticians]]

Latest revision as of 09:15, 13 September 2024

John Tukey
Born: June 16, 1915
Died: July 26, 2000 (aged 85)
Scientific career
Fields: Topology
Thesis: On Denumerability in Topology[1]
Doctoral advisor: Solomon Lefschetz[1]

John Wilder Tukey (/ˈtuːki/; June 16, 1915 – July 26, 2000) was an American mathematician and statistician, best known for the development of the fast Fourier transform (FFT) algorithm and the box plot.[2] The Tukey range test, the Tukey lambda distribution, the Tukey test of additivity, and the Teichmüller–Tukey lemma all bear his name. He is also credited with coining the term bit and the first published use of the word software.

Biography


Tukey was born in New Bedford, Massachusetts, in 1915, to a father who taught Latin and a mother who worked as a private tutor. He was taught mainly by his mother and attended regular classes only for certain subjects, such as French.[3] Tukey obtained a B.A. in 1936 and an M.S. in 1937, both in chemistry, from Brown University, before moving to Princeton University, where in 1939 he received a PhD in mathematics after completing a doctoral dissertation titled "On denumerability in topology".[4][5][6]

During World War II, Tukey worked at the Fire Control Research Office and collaborated with Samuel Wilks and William Cochran. He is said to have helped design the U-2 spy plane. After the war, he returned to Princeton, dividing his time between the university and AT&T Bell Laboratories. In 1962, Tukey was elected to the American Philosophical Society.[7] He became a full professor at age 35 and, in 1965, the founding chairman of the Princeton statistics department.[3]

Among his many contributions to civil society, Tukey served on a committee of the American Statistical Association that produced a report critiquing the statistical methodology of the Kinsey Report, Statistical Problems of the Kinsey Report on Sexual Behavior in the Human Male, which stated: "A random selection of three people would have been better than a group of 300 chosen by Mr. Kinsey".

From 1960 to 1980, Tukey helped design the NBC television network polls used to predict and analyze elections. He was also a consultant to the Educational Testing Service, the Xerox Corporation, and Merck & Company.

During the 1970s and early 1980s, Tukey played a key role in the design and conduct of the National Assessment of Educational Progress.

He was awarded the National Medal of Science by President Nixon in 1973.[3] He was awarded the IEEE Medal of Honor in 1982 "For his contributions to the spectral analysis of random processes and the fast Fourier transform (FFT) algorithm".

Tukey retired in 1985. He died in New Brunswick, New Jersey, on July 26, 2000.

Scientific contributions


Early in his career Tukey worked on developing statistical methods for computers at Bell Labs, where he invented the term bit in 1947.[8][9][10]

His statistical interests were many and varied. He is particularly remembered for developing, with James Cooley, the Cooley–Tukey FFT algorithm. In 1970, he contributed significantly to what is today known as the jackknife, also termed the Quenouille–Tukey jackknife. He introduced the box plot in his 1977 book, "Exploratory Data Analysis".

Tukey's range test, the Tukey lambda distribution, Tukey's test of additivity, Tukey's lemma, and the Tukey window all bear his name. He is also the creator of several lesser-known methods, such as the trimean and the median-median line, an easier alternative to linear regression.
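
As an illustration of one of these methods, the trimean is a weighted average of the median and the two quartiles, (Q1 + 2 × median + Q3) / 4. The sketch below is a minimal Python version, not taken from Tukey's own text. It assumes Tukey-style hinges as the quartiles (the median of each half of the sorted data, with the middle value counted in both halves when the sample size is odd); other quartile conventions give slightly different results. The function names hinges and trimean are illustrative only.

```python
from statistics import median


def hinges(values):
    """Return (lower hinge, median, upper hinge) of the data.

    Assumption: hinges are the medians of the lower and upper halves of
    the sorted data, and the middle value is included in both halves
    when the number of values is odd.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    half = (n + 1) // 2          # size of each half, middle value shared if n is odd
    lower_half = data[:half]
    upper_half = data[n - half:]
    return median(lower_half), median(data), median(upper_half)


def trimean(values):
    """Weighted average of the hinges and the median: (Q1 + 2*M + Q3) / 4."""
    q1, m, q3 = hinges(values)
    return (q1 + 2 * m + q3) / 4


if __name__ == "__main__":
    clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]
    print(trimean(clean))
    print(trimean(with_outlier))
```

Because it is built only from the median and the hinges, the trimean in this sketch does not move when a single extreme value is replaced by an even more extreme one, whereas the arithmetic mean does.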

In 1974, he developed the concept of projection pursuit with Jerome H. Friedman.[11]

Data analysis and foundations of data science


John Tukey contributed greatly to statistical practice and to data analysis in general. Some regard him as the father of data science; at the very least, he pioneered many of the key foundations of what later came to be known as data science.[12]

Making sense of data has a long history and has been addressed by statisticians, mathematicians, scientists, and others for many years. During the 1960s, Tukey challenged the then-dominant approach that he called "confirmatory data analysis": statistical analyses driven by rigid mathematical configurations.[13] Tukey emphasized the importance of having a more flexible attitude towards data analysis and of exploring data carefully to see what structures and information might be contained therein. He called this "exploratory data analysis" (EDA). In many ways, EDA was a precursor to data science.

Tukey also realized the importance of computer science to EDA. Graphics are an integral part of EDA methodology and, while much of Tukey's work focused on static displays (such as box plots) that could be drawn by hand, he realized that computer graphics would be much more effective for studying multivariate data. PRIM-9, the first program for viewing multivariate data, was conceived by him during the early 1970s.[14]

This coupling of data analysis and computer science is what is now called data science.

Tukey articulated the important distinction between exploratory data analysis and confirmatory data analysis, believing that much statistical methodology placed too great an emphasis on the latter. Though he believed in the utility of separating the two types of analysis, he pointed out that sometimes, especially in natural science, this was problematic and termed such situations uncomfortable science.

A. D. Gordon offered the following summary of Tukey's principles for statistical practice:[15]

... the usefulness and limitation of mathematical statistics; the importance of having methods of statistical analysis that are robust to violations of the assumptions underlying their use; the need to amass experience of the behaviour of specific methods of analysis in order to provide guidance on their use; the importance of allowing the possibility of data's influencing the choice of method by which they are analysed; the need for statisticians to reject the role of "guardian of proven truth", and to resist attempts to provide once-for-all solutions and tidy over-unifications of the subject; the iterative nature of data analysis; implications of the increasing power, availability, and cheapness of computing facilities; the training of statisticians.

Tukey's lectures were described as unusual. McCullagh described a lecture Tukey gave in London in 1977:[15][16]

Tukey ambled to the podium, a great bear of a man dressed in baggy pants and a black knitted shirt. These might once have been a matching pair but the vintage was such that it was hard to tell. ... Carefully and deliberately a list of headings was chalked on the blackboard. The words came too, not many, like overweight parcels, delivered at a slow unfaltering pace. ... When it was complete, Tukey turned to face the audience and the podium ... "Comments, queries, suggestions?" he asked the audience ... As he waited for a response, he clambered onto the podium and manoeuvred until he was sitting cross-legged facing the audience. ... We in the audience sat like spectators at the zoo waiting for the great bear to move or say something. But the great bear appeared to be doing the same thing, and the feeling was not comfortable.

Coining the term bit


While working with John von Neumann on early computer designs, Tukey introduced the word bit as a portmanteau of binary digit.[17] The term first appeared in print in a 1948 article by Claude Shannon.

See also


Publications

The collected works of John W Tukey, edited by William S. Cleveland
About John Tukey

References

  1. ^ a b John Tukey at the Mathematics Genealogy Project
  2. ^ Sande, Gordon (July 2001). "Obituary: John Wilder Tukey". Physics Today. 54 (7): 80–81. doi:10.1063/1.1397408.
  3. ^ a b c Leonhardt, David (2000-07-28). "John Tukey, 85, Statistician; Coined the Word 'Software'". New York Times. Retrieved 2012-09-24.
  4. ^ "John Tukey". Mathematics Genealogy Project. Retrieved 2022-07-02.
  5. ^ Tukey, John W. (1939). On denumerability in topology.
  6. ^ "John Tukey". IEEE Global History Network. IEEE. Retrieved 2011-07-18.
  7. ^ "APS Member History". search.amphilsoc.org. Retrieved 2021-01-28.
  8. ^ Shannon, Claude Elwood (July 1948). "A Mathematical Theory of Communication" (PDF). Bell System Technical Journal. 27 (3): 379–423. doi:10.1002/j.1538-7305.1948.tb01338.x. hdl:11858/00-001M-0000-002C-4314-2. Archived from the original (PDF) on 1998-07-15. The choice of a logarithmic base corresponds to the choice of a unit for measuring information. If the base 2 is used the resulting units may be called binary digits, or more briefly bits, a word suggested by J. W. Tukey.
  9. ^ Shannon, Claude Elwood (October 1948). "A Mathematical Theory of Communication". Bell System Technical Journal. 27 (4): 623–666. doi:10.1002/j.1538-7305.1948.tb00917.x. hdl:11858/00-001M-0000-002C-4314-2.
  10. ^ Shannon, Claude Elwood; Weaver, Warren (1949). The Mathematical Theory of Communication (PDF). University of Illinois Press. ISBN 0-252-72548-4. Archived from the original (PDF) on 1998-07-15.
  11. ^ Friedman, Jerome H.; Tukey, John Wilder (September 1974). "A Projection Pursuit Algorithm for Exploratory Data Analysis". IEEE Transactions on Computers. C-23 (9): 881–890. doi:10.1109/T-C.1974.224051. ISSN 0018-9340. OSTI 1442925. S2CID 7997450.
  12. ^ Donoho, David (2017). "50 Years of Data Science". Journal of Computational and Graphical Statistics. doi:10.1080/10618600.2017.1384734.
  13. ^ Tukey, John W. (1962). "The Future of Data Analysis". Ann. Math. Statist. 33 (1): 1–67. doi:10.1214/aoms/1177704711.
  14. ^ Friedman, J. H.; Stuetzle, W. (2002). "John W. Tukey’s Work on Interactive Graphics". The Annals of Statistics. 30 (6): 1629–1639. JSTOR 1558733.
  15. ^ a b "John Tukey - Biography". Maths History. Retrieved 2022-02-18.
  16. ^ McCullagh, P. (2003). "John Wilder Tukey". Biographical Memoirs of Fellows of the Royal Society. 49: 538–559.
  17. ^ "Bit definition by The Linux Information Project (LINFO)". www.linfo.org.
  18. ^ Talbot, M. (June 2000). Biometrics. 56 (2): 649–650. doi:10.1111/j.0006-341X.2000.00647.x. JSTOR 2677019.
  19. ^ Cooper, Mark (July–August 2000). Crop Science. 40 (4): 1184. doi:10.2135/cropsci2000.0015br.
  20. ^ Heckler, Charles E. (February 2001). Technometrics. 43 (1): 97–98. doi:10.1198/tech.2001.s547. JSTOR 1270862. S2CID 26430218.
  21. ^ Broadfoot, L. (June 2001). The Journal of Agricultural Science. 136 (4): 471–475. doi:10.1017/s002185960124893x. S2CID 86230606.