Pitman–Yor process: Difference between revisions
Changed range of d to disallow 1 (<= became <)
Entropeneur (talk | contribs) m No edit summary
(30 intermediate revisions by 22 users not shown)
Latest revision as of 07:12, 7 July 2024
In probability theory, a Pitman–Yor process,[1][2][3][4] denoted PY(d, θ, G0), is a stochastic process whose sample path is a probability distribution. A random sample from this process is an infinite discrete probability distribution, consisting of an infinite set of atoms drawn from G0, with weights drawn from a two-parameter Poisson–Dirichlet distribution. The process is named after Jim Pitman and Marc Yor.
The Pitman–Yor process is governed by three parameters: a discount parameter 0 ≤ d < 1, a strength parameter θ > −d, and a base distribution G0 over a probability space X. When d = 0, it reduces to the Dirichlet process. The discount parameter gives the Pitman–Yor process more flexibility over tail behavior than the Dirichlet process, which has exponential tails. This makes the Pitman–Yor process useful for modeling data with power-law tails (e.g., word frequencies in natural language).
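The role of the parameters can be illustrated with the standard stick-breaking representation of the weights, in which V_k ~ Beta(1 − d, θ + k·d) for k = 1, 2, … and the k-th weight is V_k times the product of (1 − V_j) over j < k. The following is a minimal sketch under stated assumptions, not a reference implementation: the function names, the truncation at a finite number of sticks, and the standard normal stand-in for G0 are all illustrative choices that do not come from the article.

```python
import random


def py_stick_breaking_weights(d, theta, num_sticks, rng=None):
    """Return the first `num_sticks` stick-breaking weights and the leftover mass.

    Assumes the standard construction V_k ~ Beta(1 - d, theta + k * d).
    Truncating at `num_sticks` is an approximation: the true process has
    infinitely many atoms, so `leftover` is the mass not yet assigned.
    """
    if not (0 <= d < 1) or theta <= -d:
        raise ValueError("require 0 <= d < 1 and theta > -d")
    rng = rng or random.Random()
    weights = []
    leftover = 1.0  # length of the stick that has not been broken off yet
    for k in range(1, num_sticks + 1):
        v = rng.betavariate(1.0 - d, theta + k * d)
        weights.append(leftover * v)
        leftover *= 1.0 - v
    return weights, leftover


def sample_truncated_py(d, theta, num_sticks, base_sampler, rng=None):
    """Pair each weight with an atom drawn independently from the base distribution.

    `base_sampler` is a hypothetical callable taking the random generator
    and returning one draw from G0.
    """
    rng = rng or random.Random()
    weights, leftover = py_stick_breaking_weights(d, theta, num_sticks, rng)
    atoms = [base_sampler(rng) for _ in range(num_sticks)]
    return atoms, weights, leftover


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative base distribution G0: a standard normal.
    atoms, weights, leftover = sample_truncated_py(
        d=0.5, theta=1.0, num_sticks=1000,
        base_sampler=lambda r: r.gauss(0.0, 1.0), rng=rng,
    )
    print("largest weights:", sorted(weights, reverse=True)[:5])
    print("unassigned mass after truncation:", leftover)
```

Setting d = 0 makes every V_k a Beta(1, θ) draw, which is the stick-breaking form of the Dirichlet process. With d > 0 the leftover mass typically shrinks more slowly as sticks are added, which reflects the heavier tail of the weights.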
The exchangeable random partition induced by the Pitman–Yor process is an example of a Chinese restaurant process, of a Poisson–Kingman partition, and of a Gibbs-type random partition.
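The Chinese restaurant view of this partition can be simulated directly. In the usual two-parameter seating rule, when n customers are already seated at K tables, the next customer joins table k with probability (n_k − d)/(θ + n) and opens a new table with probability (θ + K·d)/(θ + n). The sketch below assumes that rule; the function name and return values are illustrative and do not come from the article.

```python
import random


def py_chinese_restaurant(d, theta, num_customers, rng=None):
    """Simulate table assignments under the two-parameter seating rule.

    Returns (assignments, table_sizes), where assignments[i] is the table
    index of customer i and table_sizes[k] is the occupancy of table k.
    """
    if not (0 <= d < 1) or theta <= -d:
        raise ValueError("require 0 <= d < 1 and theta > -d")
    rng = rng or random.Random()
    table_sizes = []
    assignments = []
    for n in range(num_customers):  # n customers are already seated
        new_index = len(table_sizes)
        chosen = new_index  # default: open a new table
        if table_sizes:
            # Existing tables carry total weight n - K*d and the new table
            # carries theta + K*d, so all weights sum to theta + n.
            u = rng.random() * (theta + n)
            cumulative = 0.0
            for k, size in enumerate(table_sizes):
                cumulative += size - d
                if u < cumulative:
                    chosen = k
                    break
        if chosen == new_index:
            table_sizes.append(1)
        else:
            table_sizes[chosen] += 1
        assignments.append(chosen)
    return assignments, table_sizes


if __name__ == "__main__":
    for d in (0.0, 0.5):
        _, sizes = py_chinese_restaurant(d, 1.0, 5000, random.Random(0))
        print(f"d={d}: {len(sizes)} tables, largest sizes "
              f"{sorted(sizes, reverse=True)[:5]}")
```

The first customer always opens a table, which also avoids dividing by a non-positive total when θ ≤ 0. Running the script with d = 0 and with d > 0 gives a rough feel for how the discount encourages many small tables, in line with the power-law behavior mentioned above.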
Naming conventions
The name "Pitman–Yor process" was coined by Ishwaran and James[5] after Pitman and Yor's review on the subject.[2] However, the process was originally studied in Perman et al.[6][7]
It is also sometimes referred to as the two-parameter Poisson–Dirichlet process, after the two-parameter generalization of the Poisson–Dirichlet distribution, which describes the joint distribution of the sizes of the atoms in the random measure, sorted in strictly decreasing order.
See also
- Chinese restaurant process
- Dirichlet distribution
- Dirichlet process
- Latent Dirichlet allocation
References
- Ishwaran, H.; James, L. F. (2003). "Generalized weighted Chinese restaurant processes for species sampling mixture models". Statistica Sinica. 13: 1211–1235.
- Pitman, Jim; Yor, Marc (1997). "The two-parameter Poisson–Dirichlet distribution derived from a stable subordinator". Annals of Probability. 25 (2): 855–900. CiteSeerX 10.1.1.69.1273. doi:10.1214/aop/1024404422. MR 1434129. Zbl 0880.60076.
- Pitman, Jim (2006). Combinatorial Stochastic Processes. Vol. 1875. Berlin: Springer-Verlag. ISBN 9783540309901.
- Teh, Yee Whye (2006). "A hierarchical Bayesian language model based on Pitman–Yor processes". Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics.
- Ishwaran, H.; James, L. (2001). "Gibbs Sampling Methods for Stick-Breaking Priors". Journal of the American Statistical Association. 96 (453): 161–173. CiteSeerX 10.1.1.36.2559. doi:10.1198/016214501750332758.
- Perman, M.; Pitman, J.; Yor, M. (1992). "Size-biased sampling of Poisson point processes and excursions". Probability Theory and Related Fields. 92: 21–39. doi:10.1007/BF01205234.
- Perman, M. (1990). Random Discrete Distributions Derived from Subordinators (Thesis). Department of Statistics, University of California at Berkeley.