
Ziheng Yang

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 128.40.179.84 (talk) at 16:41, 14 November 2018 (Academic career). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Ziheng Yang
Born: 1 November 1964 (age 60), Gansu, China
Nationality: UK
Alma mater: Beijing Agricultural University
Known for: Models of DNA sequence evolution and methods of statistical inference in molecular evolution and phylogenetics
Awards: Frink Medal (2010); Royal Society Wolfson Research Merit Award (2009); Presidents' Award for Lifetime Achievement (2008); Fellow of the Royal Society (2006); Young Investigator’s Prize, American Society of Naturalists (1995)
Scientific career
Fields: molecular evolution; molecular phylogenetics; population genetics; computational biology; computational statistics; Markov chain Monte Carlo
Institutions: University College London; Beijing Agricultural University
Website: abacus.gene.ucl.ac.uk

Ziheng Yang FRS (Chinese: 杨子恒; born 1 November 1964) is a Chinese biologist. He holds the R.A. Fisher Chair of Statistical Genetics[1] at University College London,[2] and is the Director of the R.A. Fisher Centre for Computational Biology at UCL. He was elected a Fellow of the Royal Society in 2006.[2]

Academic career

Yang graduated from Gansu Agricultural University with a BSc in 1984, and from Beijing Agricultural University with an MSc in 1987 and a PhD in 1992.[3]

After the PhD, he worked as a postdoctoral researcher in the Department of Zoology, University of Cambridge (1992-3), The Natural History Museum, London (1993-4), Pennsylvania State University (1994-5), and the University of California at Berkeley (1995-7), before taking up a faculty position in the Department of Biology, University College London. He was a Lecturer (1997), Reader (2000), and then Professor (2001) in the same department. He was appointed to the R.A. Fisher Chair in Statistical Genetics at UCL in 2010.

Yang has held a number of visiting appointments. He was a Visiting Associate Professor at the Institute of Statistical Mathematics (Tokyo, 1997-8), and a Visiting Professor at the University of Tokyo (2007-8), the Institute of Zoology in Beijing (2010-1), Peking University (2010), the National Institute of Genetics, Mishima, Japan (2011), and the Swiss Federal Institute of Technology (ETH), Zurich (2011). From 2008 to 2011, he was the Changjiang Chair Professor at Sun Yat-sen University, with an award from the Ministry of Education of China. From 2016 to 2019, he was a Visiting Professor at the National Institute of Genetics, Japan. Most recently, he was awarded a Radcliffe Fellowship at Harvard University's Radcliffe Institute for Advanced Study for 2017-8.[4]

Work in molecular evolution and phylogenetics

Yang developed a number of statistical models and methods in the 1990s, which have been implemented in maximum likelihood and Bayesian software programs for phylogenetic analysis of DNA and protein sequence data. Two decades earlier, Felsenstein had described the pruning algorithm for calculating the likelihood on a phylogeny.[5][6] However, the assumed model of character change was simple and did not, for example, account for variable rates among sites in the sequence. By illustrating the power of statistical models to accommodate major features of the evolutionary process and to address important evolutionary questions using molecular sequence data, the models and methods Yang developed had a major impact on the cladistic-statistical controversy at the time and played a major role in the transformation of molecular phylogenetics.
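The pruning algorithm mentioned above can be illustrated with a minimal Python sketch. It assumes the simple JC69 substitution model with one rate for all sites, a single alignment column, and a made-up three-tip tree; the tree encoding and function names are hypothetical choices for illustration and are not taken from any of the cited programs.

```python
import math

BASES = "ACGT"

def jc69_prob(i, j, t):
    """JC69 probability of going from base i to base j along a branch of
    length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def conditional_likelihoods(node):
    """Return the conditional likelihoods [L(A), L(C), L(G), L(T)] of the
    data below `node`, given each possible state at `node`.

    A tip is a one-letter string such as "A"; an internal node is a list of
    (child, branch_length) pairs. This encoding is an assumption of the sketch.
    """
    if isinstance(node, str):
        return [1.0 if b == node else 0.0 for b in BASES]
    result = [1.0] * 4
    for child, t in node:
        child_l = conditional_likelihoods(child)
        for x, i in enumerate(BASES):
            # Sum over the unknown state j at the lower end of the branch.
            result[x] *= sum(jc69_prob(i, j, t) * child_l[y]
                             for y, j in enumerate(BASES))
    return result

def site_likelihood(tree):
    """Likelihood of one site: average root conditionals over the
    equal JC69 base frequencies."""
    return sum(0.25 * l for l in conditional_likelihoods(tree))

# Hypothetical tree ((A:0.1, A:0.1):0.05, G:0.2) for one alignment column.
tree = [([("A", 0.1), ("A", 0.1)], 0.05), ("G", 0.2)]
print(site_likelihood(tree))
```

Each node is visited once, so the cost grows linearly with the number of tips rather than with the number of possible ancestral state combinations.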

Yang developed a maximum likelihood model of gamma-distributed evolutionary rate variation among sites in the sequence in 1993-4.[7][8] The models he developed for combined analysis of heterogeneous data[9][10] later became known as partition models and mixture models.
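The effect of gamma-distributed rates can be illustrated with a small Python sketch. It is not Yang's likelihood method; it only shows, under the JC69 model, how averaging over site rates drawn from a gamma distribution with mean 1 and shape alpha changes the probability that two sequences share the same base at a site. The function names and the values of d and alpha are assumptions chosen for illustration.

```python
import math
import random

def jc69_same(d):
    """P(same base at both ends of a path of length d) under JC69, one rate."""
    return 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)

def jc69_gamma_same(d, alpha):
    """The same probability averaged over site rates r ~ Gamma(shape=alpha,
    mean=1), using E[exp(-c r)] = (1 + c/alpha) ** (-alpha)."""
    return 0.25 + 0.75 * (1.0 + 4.0 * d / (3.0 * alpha)) ** (-alpha)

def jc69_gamma_same_mc(d, alpha, n=200_000, seed=1):
    """Monte Carlo check: draw a rate per site and average jc69_same(r * d)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        r = rng.gammavariate(alpha, 1.0 / alpha)  # shape alpha, scale 1/alpha
        total += jc69_same(r * d)
    return total / n

d, alpha = 0.5, 0.5  # hypothetical distance and shape parameter
print(jc69_same(d), jc69_gamma_same(d, alpha), jc69_gamma_same_mc(d, alpha))
```

A small alpha means strong rate variation: a few sites change quickly while most change slowly, so more sites remain identical than a single-rate model would predict for the same distance.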

Together with Nick Goldman, Yang developed the codon model of nucleotide substitution in 1994.[11] This formed the basis for phylogenetic analysis of protein-coding genes to detect molecular adaptation or Darwinian evolution at the molecular level. A stream of papers followed, extending the original model to accommodate variable selection pressures (measured by the dN/dS ratio) among evolutionary lineages or among sites in the protein sequence. The branch models allow different branches on the tree to have different dN/dS ratios and can be used to test for positive selection affecting particular lineages.[12] The site models allow different selective pressures on different amino acids in the protein and can be used to test for positive selection affecting only a few amino acid sites.[13][14][15] The branch-site models attempt to detect positive selection that affects only a few amino acid sites along pre-specified lineages.[16][15] A book published in 2012 reviews developments in this area.[17]
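Tests of this kind are commonly carried out as likelihood ratio tests between nested models. The Python sketch below assumes the two models differ by exactly one free parameter (for example, a model with one dN/dS ratio for all branches against a model giving one pre-specified branch its own ratio), so the statistic is compared with a chi-square distribution with one degree of freedom. The log-likelihood values are hypothetical and the function name is illustrative.

```python
import math

def lrt_one_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value), where the p-value uses the chi-square
    distribution with 1 degree of freedom: P(X > x) = erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical maximized log-likelihoods under the null and alternative models.
stat, p = lrt_one_df(lnl_null=-1043.84, lnl_alt=-1041.70)
print(f"2*delta lnL = {stat:.2f}, p = {p:.4f}")
```

Other model comparisons have different numbers of extra parameters, and some tests use a mixture null distribution, so the one-degree-of-freedom assumption here should not be applied blindly.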

Yang developed the statistical (empirical Bayes) method for reconstructing ancestral sequences in 1995.[18] Compared with the parsimony method of ancestral sequence reconstruction (that is, the Fitch-Hartigan algorithm),[19][20] this has the advantages of using branch-length information and of providing a probabilistic assessment of the reconstruction uncertainties.
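The idea of a probabilistic reconstruction can be shown in a minimal Python sketch. It assumes a tree with only two tips joined at the ancestor, the JC69 model with equal base frequencies, and branch lengths treated as known (in an empirical Bayes analysis they would be estimates plugged in from the data). All names and values are illustrative assumptions rather than the published method's implementation.

```python
import math

BASES = "ACGT"

def jc69_prob(i, j, t):
    """JC69 probability of base i changing to base j over branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def ancestor_posterior(tip1, t1, tip2, t2):
    """Posterior probability of each base at the common ancestor of two tips.

    Prior = equal base frequencies (0.25); likelihood = product of the
    transition probabilities along the two branches.
    """
    joint = {a: 0.25 * jc69_prob(a, tip1, t1) * jc69_prob(a, tip2, t2)
             for a in BASES}
    total = sum(joint.values())
    return {a: p / total for a, p in joint.items()}

# Hypothetical site: one descendant has A on a short branch, the other has G
# on a long branch, so A is the more probable ancestral state.
print(ancestor_posterior("A", 0.05, "G", 0.5))
```

Unlike a parsimony reconstruction, which would treat A and G as equally good here, the posterior depends on the branch lengths and quantifies how uncertain the reconstruction is.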

Together with Bruce Rannala, Yang introduced Bayesian statistics into molecular phylogenetics in 1996.[21][22] The Bayesian approach is now one of the most popular statistical methodologies used in modeling and inference in molecular phylogenetics. Developments in Bayesian phylogenetics are summarized in an edited book[23] and in chapter 8 of Yang's book.[24]

Yang and Rannala also developed the multispecies coalescent model,[25] which has emerged as the natural framework for comparative analysis of genomic sequence data from multiple species, incorporating the coalescent process in both modern species and extinct ancestors. The model has been used to estimate the species tree despite gene tree heterogeneity among genomic regions,[26][27][28] and to delimit/identify species.[29] Yang champions the Bayesian full-likelihood method of inference, using Markov chain Monte Carlo to average over gene trees (gene genealogies), accommodating phylogenetic uncertainties.[28]
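Gene tree heterogeneity arises naturally under the coalescent. As a hedged illustration, the Python sketch below uses a standard textbook result that is not stated in this article: for three species with one sequence sampled from each, the probability that the gene tree topology differs from the species tree is (2/3)·exp(-T), where T is the length of the internal species-tree branch in coalescent units (generations divided by 2N for a diploid population of size N). The function name and the numbers are hypothetical.

```python
import math

def gene_tree_mismatch_probability(internal_generations, diploid_pop_size):
    """P(gene tree topology != species tree) for three species, one sequence
    each, under the multispecies coalescent with a constant ancestral
    population size (assumption of this sketch)."""
    t_coal = internal_generations / (2.0 * diploid_pop_size)
    # With probability exp(-t_coal) the two sister lineages fail to coalesce
    # on the internal branch; then 2 of the 3 equally likely first
    # coalescences in the root population give a mismatching topology.
    return (2.0 / 3.0) * math.exp(-t_coal)

# Hypothetical scenario: 100,000 generations between the two speciation
# events and an ancestral population of 50,000 diploid individuals.
print(gene_tree_mismatch_probability(100_000, 50_000))
```

Short internal branches or large ancestral populations push this probability towards 2/3, which is why many loci can disagree with the species tree even without any estimation error.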

Yang maintains the program package PAML (for Phylogenetic Analysis by Maximum Likelihood)[30] and the Bayesian Markov chain Monte Carlo program BPP (for Bayesian Phylogenetics and Phylogeography).[31]

Work in principles of statistical inference and computational statistics

Yang studied the star tree paradox, in which Bayesian model selection produces spuriously high posterior probabilities for the binary trees when the data are simulated under the star tree.[32][33] A simpler case showing similar behaviour is the fair-coin paradox.[33] The work suggests that Bayesian model selection may show an undesirable polarized behaviour, supporting one model with full force while rejecting the others, when the competing models are all misspecified and equally wrong.[34]

Yang has worked extensively on Markov chain Monte Carlo algorithms, deriving many Metropolis-Hastings algorithms in Bayesian phylogenetics.[35] A study examining the efficiency of simple MCMC proposals revealed that the well-studied Gaussian random-walk move is less efficient than the simple uniform random-walk move, which is in turn less efficient than the Bactrian moves, bimodal moves that suppress values very close to the current state.[36]
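The three proposal types can be compared in a small Python sketch of a random-walk Metropolis sampler. It assumes a standard normal target, and it assumes the Bactrian move is built as an equal mixture of two normal distributions centred at -m and +m with variance 1 - m², scaled by the step size; the value m = 0.95, the step size and all function names are illustrative assumptions, not settings taken from the cited study.

```python
import math
import random

def gaussian_step(rng, sigma):
    """Gaussian random-walk increment with standard deviation sigma."""
    return rng.gauss(0.0, sigma)

def uniform_step(rng, sigma):
    """Uniform increment on (-w, w); w is chosen so the s.d. is sigma."""
    w = sigma * math.sqrt(3.0)
    return rng.uniform(-w, w)

def bactrian_step(rng, sigma, m=0.95):
    """Bimodal increment: mixture of N(-m, 1 - m^2) and N(m, 1 - m^2),
    which has mean 0 and variance 1, scaled by sigma. Values near zero
    (close to the current state) are rarely proposed."""
    centre = m if rng.random() < 0.5 else -m
    return sigma * rng.gauss(centre, math.sqrt(1.0 - m * m))

def metropolis(step, n, sigma, seed=0):
    """Random-walk Metropolis for a N(0, 1) target. All three proposals are
    symmetric, so the acceptance ratio is just the ratio of target densities."""
    rng = random.Random(seed)
    x, samples, accepted = 0.0, [], 0
    for _ in range(n):
        y = x + step(rng, sigma)
        log_ratio = 0.5 * x * x - 0.5 * y * y
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = y
            accepted += 1
        samples.append(x)
    return samples, accepted / n

for name, step in [("gaussian", gaussian_step),
                   ("uniform", uniform_step),
                   ("bactrian", bactrian_step)]:
    samples, rate = metropolis(step, n=20_000, sigma=2.5)
    print(name, "acceptance rate:", round(rate, 3))
```

To compare efficiency in the sense of the paragraph above, one would tune sigma separately for each proposal and compare the variance of an estimate such as the sample mean; the sketch only sets up the three kernels on a common scale.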

Professional activities

Yang has taught at the Woods Hole Workshop on Molecular Evolution.

He was a co-organizer of the Royal Society Discussion Meeting on "Statistical and computational challenges in molecular phylogenetics and evolution" on 28–29 April 2008,[37] and of the Royal Society Discussion Meeting on "Dating species divergences using rocks and clocks" on 9–10 November 2015.[38]

Since 2009, he has been a co-organizer of an annual workshop on Computational Molecular Evolution (CoME), which has been running in Sanger/Hinxton in odd years and in Heraklion, Crete in even years.[1]

He has also organized and taught in a number of workshops in Beijing, China.

Awards and honours

2010, Frink Medal for British Zoologists, Zoological Society of London[39]

2009, Royal Society Wolfson Research Merit Award

2008, President's Award for Lifetime Achievement, Society for Systematic Biology[40]

2006, Fellow of the Royal Society, The Royal Society of London[2]

1995, Young Investigator’s Prize, American Society of Naturalists[3]

Books

  • Computational molecular evolution. Oxford University Press. 2006. ISBN 978-0-19-856702-8.
  • Molecular Evolution: A Statistical Approach. Oxford University Press. 2014. ISBN 978-0-19-960261-2.

References

  1. ^ "Genetics, Evolution and Environment". Ucl.ac.uk. Retrieved 2017-06-23.
  2. ^ a b ‘YANG, Prof. Ziheng’, Who's Who 2011, A & C Black, 2011; online edn, Oxford University Press, Dec 2010; online edn, Oct 2010, accessed 11 May 2011 (subscription required)
  3. ^ "Iris View Profile". Iris.ucl.ac.uk. Retrieved 2017-06-23.
  4. ^ "Ziheng Yang | Radcliffe Institute for Advanced Study at Harvard University". www.radcliffe.harvard.edu. Retrieved 2017-12-01.
  5. ^ Felsenstein, Joe (1973). "Maximum likelihood and minimum-steps methods for estimating evolutionary trees from data on discrete characters". Syst. Zool. 22 (3): 240–249. doi:10.2307/2412304. JSTOR 2412304.
  6. ^ Felsenstein, Joe (1981). "Evolutionary trees from DNA sequences: a maximum likelihood approach". J. Mol. Evol. 17 (6): 368–376. Bibcode:1981JMolE..17..368F. doi:10.1007/bf01734359.
  7. ^ Yang, Ziheng (1993). "Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites". Mol. Biol. Evol. 10: 1396–1401.
  8. ^ Yang, Z (1994). "Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods". J Mol Evol. 39 (3): 306–314. Bibcode:1994JMolE..39..306Y. CiteSeerX 10.1.1.305.951. doi:10.1007/bf00160154.
  9. ^ Yang Z, Lauder IJ, Lin HJ (1995). "Molecular evolution of the hepatitis B virus genome". J. Mol. Evol. 41 (5): 587–596. Bibcode:1995JMolE..41..587Y. doi:10.1007/bf00175817.
  10. ^ Yang Z. (1996). "Maximum-likelihood models for combined analyses of multiple sequence data". J. Mol. Evol. 42 (5): 587–596. Bibcode:1996JMolE..42..587Y. CiteSeerX 10.1.1.19.6773. doi:10.1007/bf02352289.
  11. ^ Goldman N, Yang Z. (1994). "A codon-based model of nucleotide substitution for protein-coding DNA sequences". Mol Biol Evol. 11 (5): 725–736. doi:10.1093/oxfordjournals.molbev.a040153. PMID 7968486.
  12. ^ Yang, Ziheng (1998). "Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution". Mol. Biol. Evol. 15 (5): 568–573. doi:10.1093/oxfordjournals.molbev.a025957. PMID 9580986.
  13. ^ Nielsen, R., Yang, Z. (1998). "Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene". Genetics. 148: 929–936.
  14. ^ Yang, Z., Nielsen, R., Goldman, N., Pedersen, A.-M.K. (2000). "Codon-substitution models for heterogeneous selection pressure at amino acid sites". Genetics. 155: 431–449.
  15. ^ a b Yang, Ziheng; Wong, Wendy S. W.; Nielsen, Rasmus (2005-04-01). "Bayes Empirical Bayes Inference of Amino Acid Sites Under Positive Selection". Molecular Biology and Evolution. 22 (4): 1107–1118. doi:10.1093/molbev/msi097. ISSN 0737-4038. PMID 15689528.
  16. ^ Yang, Z., Nielsen, R. (2002). "Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages". Mol. Biol. Evol. 19 (6): 908–917. doi:10.1093/oxfordjournals.molbev.a004148. PMID 12032247.
  17. ^ Codon evolution: mechanisms and models. Cannarozzi, Gina M., Schneider, Adrian. Oxford: Oxford University Press. 2012. ISBN 9780199601165. OCLC 784949340.
  18. ^ Yang Z, Kumar S, Nei M. (1995). "A new method of inference of ancestral nucleotide and amino acid sequences". Genetics. 141: 1641–1650.
  19. ^ Fitch, Walter M. (1971). "Toward defining the course of evolution: minimum change for a specific tree topology". Syst. Zool. 20 (4): 406–416. doi:10.2307/2412116. JSTOR 2412116.
  20. ^ Hartigan, J.A. (1973). "Minimum evolution fits to a given tree". Biometrics. 29: 53–65. doi:10.2307/2529676. JSTOR 2529676.
  21. ^ Rannala B, Yang Z. (1996). "Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference". J. Mol. Evol. 43 (3): 304–311. Bibcode:1996JMolE..43..304R. doi:10.1007/bf02338839.
  22. ^ Yang Z, Rannala B. (1997). "Bayesian phylogenetic inference using DNA sequences: a Markov chain Monte Carlo Method". Mol. Biol. Evol. 14 (7): 717–724. doi:10.1093/oxfordjournals.molbev.a025811. PMID 9214744.
  23. ^ Chen, Ming-Hui; Kuo, Lynn; Lewis, Paul O (2014-05-27). Bayesian phylogenetics: methods, algorithms, and applications. Boca Raton. ISBN 9781466500792. OCLC 881387408.
  24. ^ Yang, Ziheng (2014). Molecular evolution: a statistical approach (First ed.). Oxford. ISBN 9780199602605. OCLC 869346345.
  25. ^ Rannala B, Yang Z. (2003). "Bayes estimation of species divergence times and ancestral population sizes using DNA sequences from multiple loci". Genetics. 164: 1645–1656.
  26. ^ Yang, Ziheng; Rannala, Bruce (2014-12-01). "Unguided Species Delimitation Using DNA Sequence Data from Multiple Loci". Molecular Biology and Evolution. 31 (12): 3125–3135. doi:10.1093/molbev/msu279. ISSN 0737-4038. PMC 4245825. PMID 25274273.
  27. ^ Rannala, Bruce; Yang, Ziheng (2017-09-01). "Efficient Bayesian Species Tree Inference under the Multispecies Coalescent". Systematic Biology. 66 (5): 823–842. arXiv:1512.03843. doi:10.1093/sysbio/syw119. ISSN 1063-5157. PMID 28053140.
  28. ^ a b Xu, Bo; Yang, Ziheng (2016-12-01). "Challenges in Species Tree Estimation Under the Multispecies Coalescent Model". Genetics. 204 (4): 1353–1368. Bibcode:2001gpm..book.....L. doi:10.1534/genetics.116.190173. ISSN 0016-6731. PMC 5161269. PMID 27927902.
  29. ^ Yang, Ziheng; Rannala, Bruce (2010-05-18). "Bayesian species delimitation using multilocus sequence data". Proceedings of the National Academy of Sciences. 107 (20): 9264–9269. Bibcode:2010PNAS..107.9264Y. doi:10.1073/pnas.0913022107. ISSN 0027-8424. PMC 2889046. PMID 20439743.
  30. ^ Yang, Ziheng (2007). "PAML 4: Phylogenetic analysis by maximum likelihood". Mol. Biol. Evol. 24 (8): 1586–1591. doi:10.1093/molbev/msm088. PMID 17483113.
  31. ^ Yang, Ziheng (2015-10-01). "The BPP program for species tree estimation and species delimitation". Current Zoology. 61 (5): 854–865. doi:10.1093/czoolo/61.5.854. ISSN 1674-5507.
  32. ^ Yang, Ziheng; Rannala, Bruce; Lewis, Paul (2005-06-01). "Branch-Length Prior Influences Bayesian Posterior Probability of Phylogeny". Systematic Biology. 54 (3): 455–470. doi:10.1080/10635150590945313. ISSN 1063-5157. PMID 16012111.
  33. ^ a b Yang, Ziheng (2007-08-01). "Fair-Balance Paradox, Star-tree Paradox, and Bayesian Phylogenetics". Molecular Biology and Evolution. 24 (8): 1639–1655. doi:10.1093/molbev/msm081. ISSN 0737-4038. PMID 17488737.
  34. ^ Yang, Ziheng; Zhu, Tianqi (5 February 2018). "Bayesian selection of misspecified models is overconfident and may cause spurious posterior probabilities for phylogenetic trees". Proceedings of the National Academy of Sciences. 115 (8): 1854–1859. doi:10.1073/pnas.1712673115. PMC 5828583. PMID 29432193.
  35. ^ Yang, Ziheng (2014). Molecular evolution: a statistical approach (First ed.). Oxford. ISBN 9780199602612. OCLC 869346345.
  36. ^ Yang, Ziheng; Rodríguez, Carlos E. (2013-11-26). "Searching for efficient Markov chain Monte Carlo proposal kernels". Proceedings of the National Academy of Sciences. 110 (48): 19307–19312. Bibcode:2013PNAS..11019307Y. doi:10.1073/pnas.1311790110. ISSN 0027-8424. PMC 3845170. PMID 24218600.
  37. ^ "Statistical and computational challenges in molecular phylogenetics and evolution". Royal Society.
  38. ^ "Dating species divergences using rocks and clocks". Royal Society.
  39. ^ "Winners of the ZSL Frink Medal for British Zoologists" (PDF). Static.zsl.org. Retrieved 2017-06-23.
  40. ^ "Society of Systematic Biologists (SSB)". Society of Systematic Biologists.