Statistical benchmarking: Difference between revisions
Tooncool64: AfD: Nominated for deletion; see Wikipedia:Articles for deletion/Statistical benchmarking.
Duckmather: adding several sources to hopefully save the article
Revision as of 04:40, 7 January 2024
In statistics, benchmarking is a method of using auxiliary information to adjust the sampling weights used in an estimation process, in order to yield more accurate estimates of totals.
Suppose we have a population where each unit k has a "value" Y(k) associated with it. For example, Y(k) could be the wage of an employee k, or the cost of an item k. Suppose we want to estimate the sum Y of all the Y(k). So we take a sample of the k, get a sampling weight W(k) for all sampled k, and then sum up W(k)·Y(k) for all sampled k.
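The estimator just described can be sketched in a few lines of Python. This is only an illustration: the wages and weights are invented, the function name `estimate_total` is hypothetical, and Y(k) and W(k) stand for the value and sampling weight of sampled unit k.

```python
def estimate_total(values, weights):
    """Estimate a population total as the sum of W(k) * Y(k) over sampled units."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * y for w, y in zip(weights, values))


# Hypothetical sample of four employees: wages Y(k) and sampling weights W(k).
wages = [30000.0, 42000.0, 55000.0, 38000.0]
weights = [25.0, 25.0, 50.0, 50.0]

# Each sampled employee "stands for" W(k) employees in the population.
estimated_total_wages = estimate_total(wages, weights)
```

With these made-up numbers the estimated total wage bill is 6,450,000, since each wage is counted as many times as its weight indicates.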
One property usually common to the weights described here is that if we sum them over all sampled k, then this sum is an estimate of the total number of units in the population (for example, the total employment, or the total number of items). Because we have a sample, this estimate of the total number of units in the population will differ from the true population total. Similarly, the estimate of the total value Y (where we sum W(k)·Y(k) for all sampled k) will also differ from the true population total.
We do not know the true population total of Y (if we did, there would be no point in sampling!). Yet often we do know the true value of the quantity that the sum of the W(k) estimates, namely the total number of units in the population. For example, we may not know the total earnings of the population or the total cost of the population, but often we know the total employment or total volume of sales. And even if we don't know these exactly, there are often surveys done by other organizations or at earlier times that give very accurate estimates of these auxiliary quantities. One important function of a population census is to provide data that can be used for benchmarking smaller surveys.
The benchmarking procedure begins by first breaking the population into benchmarking cells. Cells are formed by grouping together units that share common characteristics, for example, similar Y(k), yet anything can be used that enhances the accuracy of the final estimates. For each cell C, we let W(C) be the sum of all W(k), where the sum is taken over all sampled k in the cell C. For each cell C, we let T(C) be the auxiliary value for cell C, which is commonly called the "benchmark target" for cell C. Next, we compute a benchmark factor F(C) = T(C)/W(C). Then, we adjust each weight W(k) by multiplying it by the benchmark factor F(C) of its cell C. The net result is that the estimated number of units [formed by summing F(C)·W(k)] will now equal the benchmark target, both within each cell and in total. But the more important benefit is that the estimate of the total of Y [formed by summing F(C)·W(k)·Y(k)] will tend to be more accurate.
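The procedure above can be sketched as follows, writing W(C) for the sum of the sampled weights in cell C, T(C) for that cell's benchmark target, and F(C) = T(C)/W(C) for its benchmark factor. The cell labels, weights, values and targets are invented, and the function name `benchmark_weights` is hypothetical; this is a minimal sketch rather than a reference implementation.

```python
from collections import defaultdict


def benchmark_weights(cells, weights, targets):
    """Return the adjusted weights F(C) * W(k), where F(C) = T(C) / W(C)."""
    # W(C): sum of the sampling weights of the sampled units in each cell.
    cell_weight = defaultdict(float)
    for c, w in zip(cells, weights):
        cell_weight[c] += w

    # F(C) = T(C) / W(C), computed for every cell that appears in the sample.
    factor = {c: targets[c] / cell_weight[c] for c in cell_weight}

    # Scale each unit's weight by the factor of the cell it belongs to.
    return [factor[c] * w for c, w in zip(cells, weights)]


# Hypothetical sample: cell label, weight W(k) and value Y(k) for each unit.
cells = ["young", "young", "old", "old"]
weights = [25.0, 25.0, 50.0, 50.0]
values = [30000.0, 42000.0, 55000.0, 38000.0]

# Hypothetical benchmark targets T(C), e.g. head counts taken from a census.
targets = {"young": 60.0, "old": 90.0}

adjusted = benchmark_weights(cells, weights, targets)
benchmarked_total = sum(w * y for w, y in zip(adjusted, values))
```

Here the factors are 60/50 = 1.2 for the "young" cell and 90/100 = 0.9 for the "old" cell, so the adjusted weights in each cell add up to that cell's target. The sketch does not handle a target cell with no sampled units; a sampled cell whose weights sum to zero would raise a `ZeroDivisionError`.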
Relationship to stratified sampling
Benchmarking is sometimes referred to as 'post-stratification' because of its similarities to stratified sampling. The difference between the two is that in stratified sampling, we decide in advance how many units will be sampled from each stratum (equivalent to benchmarking cells); in benchmarking, we select units from the broader population, and the number chosen from each cell is a matter of chance.
The advantage of stratified sampling is that the sample numbers in each stratum can be controlled for desired accuracy outcomes. Without this control, we may end up with too much sample in one stratum and not enough in another. Indeed, it's possible that a sample will contain no members from a certain cell, in which case benchmarking fails because the sum of the sampled weights in that cell, W(C), is 0, leading to a divide-by-zero problem. In such cases, it is necessary to 'collapse' cells together so that each remaining cell has an adequate sample size.
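Collapsing can be illustrated with a small Python sketch. It assumes the analyst supplies a mapping (here called `merge_into`, a hypothetical name) saying which cells are to be merged; the age bands and targets are invented. The sketch only guards against cells with no sample at all, which is a weaker condition than an "adequate" sample size.

```python
def collapse_cells(cells, targets, merge_into):
    """Relabel sampled units and combine targets according to merge_into.

    merge_into maps an original cell label to the label of the collapsed
    cell it should join; labels not listed are left unchanged.
    """
    def relabel(c):
        return merge_into.get(c, c)

    new_cells = [relabel(c) for c in cells]

    # Targets of merged cells are added together.
    new_targets = {}
    for c, t in targets.items():
        new_targets[relabel(c)] = new_targets.get(relabel(c), 0.0) + t

    # Every remaining target cell must contain at least one sampled unit.
    still_empty = set(new_targets) - set(new_cells)
    if still_empty:
        raise ValueError(f"cells still without sample: {sorted(still_empty)}")
    return new_cells, new_targets


# Hypothetical sample in which nobody aged 65+ happened to be selected.
cells = ["18-39", "18-39", "40-64", "40-64"]
targets = {"18-39": 60.0, "40-64": 70.0, "65+": 20.0}

# Cells that have a target but no sampled units.
empty_cells = sorted(set(targets) - set(cells))

# Merge the empty "65+" cell with its neighbour into a single "40+" cell.
new_cells, new_targets = collapse_cells(
    cells, targets, {"40-64": "40+", "65+": "40+"}
)
```

After collapsing, the "40+" cell carries the combined target of 90 and contains two sampled units, so a benchmark factor can be computed for it.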
For this reason, benchmarking is generally used in situations where stratified sampling is impractical. For instance, when selecting people from a telephone directory, we can't tell what age they are so we can't easily stratify the sample by age. However, we can collect this information from the people sampled, allowing us to benchmark against demographic information.
Further reading
- Jilovsky, Cathie (2011-01-01). "Singing in harmony: statistical benchmarking for academic libraries". Library Management. 32 (1/2): 48–61. doi:10.1108/01435121111102575. ISSN 0143-5124.
- Drummond, Chris; Japkowicz, Nathalie (2010-03). "Warning: statistical benchmarking is addictive. Kicking the habit in machine learning". Journal of Experimental & Theoretical Artificial Intelligence. 22 (1): 67–80. doi:10.1080/09528130903010295. ISSN 0952-813X.
- Tiedau, J.; Engelkemeier, M.; Brecht, B.; Sperling, J.; Silberhorn, C. (2021-01-12). "Statistical Benchmarking of Scalable Photonic Quantum Systems". Physical Review Letters. 126 (2): 023601. doi:10.1103/PhysRevLett.126.023601.
- Reisenthel, Patrick; Lesieutre, Daniel (2010-04-12). "Statistical Benchmarking of Surrogate-Based and Other Optimization Methods Constrained by Fixed Computational Budget". American Institute of Aeronautics and Astronautics. doi:10.2514/6.2010-3088. ISBN 978-1-60086-961-7.