Friedman test: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
In this compact version, the test statistics *need* to be corrected for tie values, as e.g. stated on http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/friedman.htm
 
(36 intermediate revisions by 24 users not shown)
Line 1: Line 1:
{{Short description|Non-parametric statistical test}}
{{For|US Army cryptologist [[William F. Friedman]]'s cryptanalytic test|Vigenère cipher#Friedman test}}{{For|Friedman pregnancy test|Rabbit test}}
{{About||the cryptanalytic test|Vigenère cipher#Friedman test|the Friedman pregnancy test|Rabbit test}}


The '''Friedman test''' is a [[non-parametric statistics|non-parametric]] [[statistical test]] developed by [[Milton Friedman]].<ref>{{cite journal
The '''Friedman test''' is a [[non-parametric statistics|non-parametric]] [[statistical test]] developed by [[Milton Friedman]].<ref>{{cite journal
| last = Friedman
| last = Friedman
| first = Milton
| first = Milton
| authorlink = Milton Friedman
| author-link = Milton Friedman
|date=December 1937
|date=December 1937
| title = The use of ranks to avoid the assumption of normality implicit in the analysis of variance
| title = The use of ranks to avoid the assumption of normality implicit in the analysis of variance
Line 13: Line 14:
| doi = 10.1080/01621459.1937.10503522
| doi = 10.1080/01621459.1937.10503522
| jstor = 2279372
| jstor = 2279372
| publisher = American Statistical Association
}}</ref><ref>{{cite journal
}}</ref><ref>{{cite journal
| last = Friedman
| last = Friedman
| first = Milton
| first = Milton
| authorlink = Milton Friedman
| author-link = Milton Friedman
|date=March 1939
|date=March 1939
| title = A correction: The use of ranks to avoid the assumption of normality implicit in the analysis of variance
| title = A correction: The use of ranks to avoid the assumption of normality implicit in the analysis of variance
Line 23: Line 23:
| volume = 34
| volume = 34
| issue = 205
| issue = 205
| page = 109
| pages = 109
| doi = 10.1080/01621459.1939.10502372
| doi = 10.1080/01621459.1939.10502372
| jstor = 2279169
| jstor = 2279169
| publisher = American Statistical Association
}}</ref><ref>{{cite journal
}}</ref><ref>{{cite journal
| last = Friedman
| last = Friedman
| first = Milton
| first = Milton
| authorlink = Milton Friedman
| author-link = Milton Friedman
|date=March 1940
|date=March 1940
| title = A comparison of alternative tests of significance for the problem of ''m'' rankings
| title = A comparison of alternative tests of significance for the problem of ''m'' rankings
Line 39: Line 38:
| doi = 10.1214/aoms/1177731944
| doi = 10.1214/aoms/1177731944
| jstor=2235971
| jstor=2235971
| doi-access = free
}}</ref> Similar to the [[parametric statistics|parametric]] [[repeated measures]] [[ANOVA]], it is used to detect differences in treatments across multiple test attempts. The procedure involves [[ranking]] each row (or ''block'') together, then considering the values of ranks by columns. Applicable to [[complete block design]]s, it is thus a special case of the [[Durbin test]].
}}</ref> Similar to the [[parametric statistics|parametric]] [[repeated measures]] [[ANOVA]], it is used to detect differences in treatments across multiple test attempts. The procedure involves [[ranking]] each row (or ''block'') together, then considering the values of ranks by columns. Applicable to [[complete block design]]s, it is thus a special case of the [[Durbin test]].


Classic examples of use are:
Classic examples of use are:
* ''n'' wine judges each rate ''k'' different wines. Are any of the ''k'' wines ranked consistently higher or lower than the others?
* <math display="inline">n</math> wine judges each rate <math display="inline">k</math> different wines. Are any of the <math display="inline">k</math> wines ranked consistently higher or lower than the others?
* ''n'' welders each use ''k'' welding torches, and the ensuing welds were rated on quality. Do any of the ''k'' torches produce consistently better or worse welds?
* <math display="inline">n</math> welders each use <math display="inline">k</math> welding torches, and the ensuing welds were rated on quality. Do any of the <math display="inline">k</math> torches produce consistently better or worse welds?


The Friedman test is used for one-way repeated measures analysis of variance by ranks. In its use of ranks it is similar to the [[Kruskal–Wallis one-way analysis of variance]] by ranks.
The Friedman test is used for one-way repeated measures analysis of variance by ranks. In its use of ranks it is similar to the [[Kruskal–Wallis one-way analysis of variance]] by ranks.


Friedman test is widely supported by many [[Comparison of statistical packages|statistical software packages]].
The Friedman test is widely supported by many [[Comparison of statistical packages|statistical software packages]].


== Method ==
== Method ==
# Given data <math>\{x_{ij}\}_{n\times k}</math>, that is, a [[Matrix (mathematics)|matrix]] with <math>n</math> rows (the ''blocks''), <math>k</math> columns (the ''treatments'') and a single observation at the intersection of each block and treatment, calculate the [[Rank statistics|ranks]] ''within'' each block. If there are tied values, assign to each tied value the average of the ranks that would have been assigned without ties. Replace the data with a new matrix <math>\{r_{ij}\}_{n \times k}</math> where the entry <math>r_{ij}</math> is the rank of <math>x_{ij}</math> within block <math>i</math>.
# Given data <math>\{x_{ij}\}_{n\times k}</math>, that is, a [[Matrix (mathematics)|matrix]] with <math>n</math> rows (the ''blocks''), <math>k</math> columns (the ''treatments'') and a single observation at the intersection of each block and treatment, calculate the [[Rank statistics|ranks]] ''within'' each block. If there are tied values, assign to each tied value the average of the ranks that would have been assigned without ties. Replace the data with a new matrix <math>\{r_{ij}\}_{n \times k}</math> where the entry <math>r_{ij}</math> is the rank of <math>x_{ij}</math> within block <math>i</math>.
# Find the values <math>\bar{r}_{\cdot j} = \frac{1}{n} \sum_{i=1}^n {r_{ij}}</math>
# Find the values <math>\bar{r}_{\cdot j} = \frac{1}{n} \sum_{i=1}^n {r_{ij}}</math>
# The test statistic is given by <math>Q = \frac{12n}{k(k+1)} \sum_{j=1}^k \left(\bar{r}_{\cdot j}-\frac{k+1}{2}\right)^2</math>. Note that the value of Q does need to be adjusted for tied values in the data
# The test statistic is given by <math>Q = \frac{12n}{k(k+1)} \sum_{j=1}^k \left(\bar{r}_{\cdot j}-\frac{k+1}{2}\right)^2</math>. Note that the value of <math display="inline">Q</math> does need to be adjusted for tied values in the data.<ref>{{cite web |title=FRIEDMAN TEST in NIST Dataplot |date=August 20, 2018 |url=https://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/friedman.htm}}</ref>
# Finally, when <math display="inline">n</math> or <math display="inline">k</math> is large (i.e. <math display="inline">n>15</math> or <math display="inline">k> 4</math>), the [[probability distribution]] of <math display="inline">Q</math> can be approximated by that of a [[chi-squared distribution]]. In this case the [[p-value|{{Math|''p''}}-value]] is given by <math>\mathbf{P}(\chi^2_{k-1} \ge Q)</math>. If <math display="inline">n</math> or <math display="inline">k</math> is small, the approximation to chi-square becomes poor and the {{Math|''p''}}-value should be obtained from tables of <math display="inline">Q</math> specially prepared for the Friedman test. If the {{Math|''p''}}-value is [[statistical significance|significant]], appropriate post-hoc [[multiple comparisons]] tests would be performed.
<ref>{{cite web |title=FRIEDMAN TEST in NIST Dataplot |work= |date=August 20, 2018 |url=https://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/friedman.htm/}}</ref>.
# Finally, when n or k is large (i.e. n > 15 or k > 4), the [[probability distribution]] of Q can be approximated by that of a [[chi-squared distribution]]. In this case the [[p-value]] is given by <math>\mathbf{P}(\chi^2_{k-1} \ge Q)</math>. If n or k is small, the approximation to chi-square becomes poor and the p-value should be obtained from tables of Q specially prepared for the Friedman test. If the p-value is [[statistical significance|significant]], appropriate post-hoc [[multiple comparisons]] tests would be performed.


== Related tests ==
== Related tests ==
* When using this kind of design for a binary response, one instead uses the [[Cochran's Q test]].
* When using this kind of design for a binary response, one instead uses the [[Cochran's Q test]].
* The [[Sign test]] (with a two-sided alternative) is equivalent to a Friedman test on two groups.
* [[Kendall's W]] is a normalization of the Friedman statistic between 0 and 1.
* [[Kendall's W]] is a normalization of the Friedman statistic between <math display="inline">0</math> and <math display="inline">1</math>.
* The [[Wilcoxon signed-rank test]] is a nonparametric test of nonindependent data from only two groups.
* The [[Wilcoxon signed-rank test]] is a nonparametric test of nonindependent data from only two groups.
* The [[Skillings–Mack test]] is a general Friedman-type statistic that can be used in almost any block design with an arbitrary missing-data structure.
* The [[Skillings–Mack test]] is a general Friedman-type statistic that can be used in almost any block design with an arbitrary missing-data structure.
Line 64: Line 64:
| last = Wittkowski
| last = Wittkowski
| first = Knut M.
| first = Knut M.
| authorlink = Knut M. Wittkowski
| author-link = Knut M. Wittkowski
| title = Friedman-Type statistics and consistent multiple comparisons for unbalanced designs with missing data
| title = Friedman-Type statistics and consistent multiple comparisons for unbalanced designs with missing data
| journal = Journal of the American Statistical Association
| journal = Journal of the American Statistical Association
| volume = 83
| volume = 83
| issue = 404
| issue = 404
| pages = 1163-1170
| pages = 1163–1170
| doi = 10.1080/01621459.1988.10478715
| doi = 10.1080/01621459.1988.10478715
| jstor = 2290150
| jstor = 2290150
| year = 1988
| publisher = American Statistical Association
| citeseerx = 10.1.1.533.1948
}}</ref> An implementation of the test exists in [[R (programming language)|R]].<ref>{{cite web |title=muStat package (R code) |work= |date=August 23, 2012 |url=https://cran.r-project.org/package=muStat/}}</ref>
}}</ref>


== Post hoc analysis ==
== Post hoc analysis ==


[[Post-hoc analysis|Post-hoc tests]] were proposed by Schaich and Hamerle (1984)<ref>Schaich, E. & Hamerle, A. (1984). Verteilungsfreie statistische Prüfverfahren. Berlin: Springer. {{ISBN|3-540-13776-9}}.</ref> as well as Conover (1971, 1980)<ref>Conover, W. J. (1971, 1980). Practical nonparametric statistics. New York: Wiley. {{ISBN|0-471-16851-3}}.</ref> in order to decide which groups are significantly different from each other, based upon the mean rank differences of the groups. These procedures are detailed in Bortz, Lienert and Boehnke (2000, p.&nbsp;275).<ref>Bortz, J., Lienert, G. & Boehnke, K. (2000). Verteilungsfreie Methoden in der Biostatistik. Berlin: Springer. {{ISBN|3-540-67590-6}}.</ref>
[[Post-hoc analysis|Post-hoc tests]] were proposed by Schaich and Hamerle (1984)<ref>Schaich, E. & Hamerle, A. (1984). Verteilungsfreie statistische Prüfverfahren. Berlin: Springer. {{ISBN|3-540-13776-9}}.</ref> as well as Conover (1971, 1980)<ref>Conover, W. J. (1971, 1980). Practical nonparametric statistics. New York: Wiley. {{ISBN|0-471-16851-3}}.</ref> in order to decide which groups are significantly different from each other, based upon the mean rank differences of the groups. These procedures are detailed in Bortz, Lienert and Boehnke (2000, p.&nbsp;275).<ref>Bortz, J., Lienert, G. & Boehnke, K. (2000). Verteilungsfreie Methoden in der Biostatistik. Berlin: Springer. {{ISBN|3-540-67590-6}}.</ref> Eisinga, Heskes, Pelzer and Te Grotenhuis (2017)<ref>{{cite journal | last1 = Eisinga | first1 = R. | last2 = Heskes | first2 = T. | last3 = Pelzer | first3 = B. | last4 = Te Grotenhuis | first4 = M. | year = 2017 | title = Exact ''p''-values for pairwise comparison of Friedman rank sums, with application to comparing classifiers | doi = 10.1186/s12859-017-1486-2 | journal = BMC Bioinformatics | volume = 18 | issue = 1 | pages = 68 | pmc = 5267387 | pmid=28122501 | url=http://rdcu.be/oOf9 | doi-access = free }}</ref> provide an exact test for pairwise comparison of Friedman rank sums, implemented in [[R (programming language)|R]]. The [[Eisinga c.s. exact test]] offers a substantial improvement over available approximate tests, especially if the number of groups (<math>k</math>) is large and the number of blocks (<math>n</math>) is small.


Not all statistical packages support Post-hoc analysis for Friedman's test, but user-contributed code exists that provides these facilities (for example in [[SPSS]],<ref>{{cite web |title=Post-hoc comparisons for Friedman test |url=http://timo.gnambs.at/en/scripts/friedmanposthoc }}</ref> and in [[R (programming language)|R]].<ref>{{cite web |title=Post hoc analysis for Friedman's Test (R code) |work= |date=February 22, 2010 |url=https://www.r-statistics.com/2010/02/post-hoc-analysis-for-friedmans-test-r-code/ }}</ref>). Also, there is a specialized package available in [[R (programming language)|R]] containing numerous non-parametric methods for post-hoc analysis after Friedman<ref>{{cite web |title=PMCMRplus: Calculate Pairwise Multiple Comparisons of Mean Rank Sums Extended |url=https://cran.r-project.org/web/packages/PMCMRplus/index.html }}</ref>.
Not all statistical packages support post-hoc analysis for Friedman's test, but user-contributed code exists that provides these facilities (for example in [[SPSS]],<ref>{{cite web |title=Post-hoc comparisons for Friedman test |url=http://timo.gnambs.at/en/scripts/friedmanposthoc |access-date=2010-02-22 |archive-url=https://web.archive.org/web/20121103040410/http://timo.gnambs.at/en/scripts/friedmanposthoc |archive-date=2012-11-03 |url-status=dead }}</ref> and in [[R (programming language)|R]].<ref>{{cite web |title=Post hoc analysis for Friedman's Test (R code) |date=February 22, 2010 |url=https://www.r-statistics.com/2010/02/post-hoc-analysis-for-friedmans-test-r-code/ }}</ref>). Also, there is a specialized package available in [[R (programming language)|R]] containing numerous non-parametric methods for post-hoc analysis after Friedman.<ref>{{cite web |title=PMCMRplus: Calculate Pairwise Multiple Comparisons of Mean Rank Sums Extended |date=17 August 2022 |url=https://cran.r-project.org/web/packages/PMCMRplus/index.html }}</ref>


== References ==
== References ==
Line 85: Line 86:


== Further reading ==
== Further reading ==
* {{cite book |last=Daniel |first=Wayne W. |chapter=Friedman two-way analysis of variance by ranks |title=Applied Nonparametric Statistics |location=Boston |publisher=PWS-Kent |edition=2nd |year=1990 |isbn=0-534-91976-6 |pages=262–74 |chapterurl=https://books.google.com/books?id=0hPvAAAAMAAJ&pg=PA262 }}
* {{cite book |last=Daniel |first=Wayne W. |chapter=Friedman two-way analysis of variance by ranks |title=Applied Nonparametric Statistics |location=Boston |publisher=PWS-Kent |edition=2nd |year=1990 |isbn=978-0-534-91976-4 |pages=262–74 |chapter-url=https://books.google.com/books?id=0hPvAAAAMAAJ&pg=PA262 }}
* {{cite book |last=Kendall |first=M. G. |authorlink=Maurice Kendall |title=Rank Correlation Methods |year=1970 |edition=4th |location=London |publisher=Charles Griffin |isbn=0-85264-199-0 |url= }}
* {{cite book |last=Kendall |first=M. G. |author-link=Maurice Kendall |title=Rank Correlation Methods |year=1970 |edition=4th |location=London |publisher=Charles Griffin |isbn=978-0-85264-199-6 }}
* {{cite book |last=Hollander |first=M. |last2=Wolfe |first2=D. A. |title=Nonparametric Statistics |year=1973 |location=New York |publisher=J. Wiley |isbn=0-471-40635-X |url=https://books.google.com/books?id=ajxMAAAAMAAJ }}
* {{cite book |last1=Hollander |first1=M. |last2=Wolfe |first2=D. A. |title=Nonparametric Statistics |year=1973 |location=New York |publisher=J. Wiley |isbn=978-0-471-40635-8 |url=https://archive.org/details/nonparametricsta00holl |url-access=registration }}
* {{cite book |last=Siegel |first=Sidney |authorlink=Sidney Siegel |last2=Castellan |first2=N. John Jr. |title=Nonparametric Statistics for the Behavioral Sciences |year=1988 |edition=2nd |location=New York |publisher=McGraw-Hill |isbn=0-07-100326-6 |url=https://books.google.com/books?id=ha3AQgAACAAJ }}
* {{cite book |last1=Siegel |first1=Sidney |author-link=Sidney Siegel |last2=Castellan |first2=N. John Jr. |title=Nonparametric Statistics for the Behavioral Sciences |year=1988 |edition=2nd |location=New York |publisher=McGraw-Hill |isbn=978-0-07-100326-1 |url=https://books.google.com/books?id=ha3AQgAACAAJ }}


{{statistics|inference|collapsed}}
{{statistics|inference|collapsed}}
Line 94: Line 95:


{{DEFAULTSORT:Friedman Test}}
{{DEFAULTSORT:Friedman Test}}
[[Category:Analysis of variance]]
[[Category:Statistical tests]]
[[Category:Statistical tests]]
[[Category:Milton Friedman]]
[[Category:Milton Friedman]]

Latest revision as of 19:36, 1 August 2024

The Friedman test is a non-parametric statistical test developed by Milton Friedman.[1][2][3] Similar to the parametric repeated measures ANOVA, it is used to detect differences in treatments across multiple test attempts. The procedure involves ranking each row (or block) together, then considering the values of ranks by columns. Applicable to complete block designs, it is thus a special case of the Durbin test.

Classic examples of use are:

  • $n$ wine judges each rate $k$ different wines. Are any of the $k$ wines ranked consistently higher or lower than the others?
  • $n$ welders each use $k$ welding torches, and the ensuing welds are rated on quality. Do any of the $k$ torches produce consistently better or worse welds?

The Friedman test is used for one-way repeated measures analysis of variance by ranks. In its use of ranks it is similar to the Kruskal–Wallis one-way analysis of variance by ranks.

The Friedman test is supported by many statistical software packages.

Method

  1. Given data $\{x_{ij}\}_{n\times k}$, that is, a matrix with $n$ rows (the blocks), $k$ columns (the treatments) and a single observation at the intersection of each block and treatment, calculate the ranks within each block. If there are tied values, assign to each tied value the average of the ranks that would have been assigned without ties. Replace the data with a new matrix $\{r_{ij}\}_{n \times k}$ where the entry $r_{ij}$ is the rank of $x_{ij}$ within block $i$.
  2. Find the values $\bar{r}_{\cdot j} = \frac{1}{n} \sum_{i=1}^{n} r_{ij}$.
  3. The test statistic is given by $Q = \frac{12n}{k(k+1)} \sum_{j=1}^{k} \left(\bar{r}_{\cdot j} - \frac{k+1}{2}\right)^2$. Note that the value of $Q$ does need to be adjusted for tied values in the data.[4]
  4. Finally, when $n$ or $k$ is large (i.e. $n > 15$ or $k > 4$), the probability distribution of $Q$ can be approximated by that of a chi-squared distribution. In this case the p-value is given by $\mathbf{P}(\chi^2_{k-1} \ge Q)$. If $n$ or $k$ is small, the approximation to chi-square becomes poor and the p-value should be obtained from tables of $Q$ specially prepared for the Friedman test. If the p-value is significant, appropriate post-hoc multiple comparisons tests would be performed.
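
The four steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a reference implementation. The function names are hypothetical. The tie adjustment shown, dividing Q by 1 − Σ(t³ − t) / (n·k·(k² − 1)) where t runs over the sizes of the tied groups within each block, is a commonly used form; the text above only states that an adjustment is needed, so that particular formula is an assumption here.

```python
import math


def average_ranks(row):
    """Rank one block; tied values get the average of the ranks they span.

    Returns the ranks and the sizes of the groups of equal values.
    """
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0.0] * len(row)
    tie_sizes = []
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and row[order[end + 1]] == row[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg
        tie_sizes.append(end - start + 1)
        start = end + 1
    return ranks, tie_sizes


def friedman_q(data):
    """Friedman statistic Q for an n x k list of lists (blocks x treatments)."""
    n, k = len(data), len(data[0])
    rank_sums = [0.0] * k
    tie_term = 0.0
    for row in data:
        ranks, tie_sizes = average_ranks(row)          # step 1
        for j, r in enumerate(ranks):
            rank_sums[j] += r
        tie_term += sum(t ** 3 - t for t in tie_sizes)
    mean_ranks = [s / n for s in rank_sums]            # step 2
    q = 12 * n / (k * (k + 1)) * sum(
        (m - (k + 1) / 2) ** 2 for m in mean_ranks
    )                                                  # step 3
    # Assumed tie adjustment; equals 1 when there are no ties.
    correction = 1 - tie_term / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("every block is completely tied; Q is undefined")
    return q / correction


def chi2_sf(x, df):
    """P(chi-squared with integer df >= x), via closed-form finite sums."""
    y = x / 2
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    total = math.erfc(math.sqrt(y))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (i - 0.5) / math.gamma(i + 0.5)
    return total


# Made-up data: 4 blocks, 3 treatments. With n this small the chi-squared
# approximation is poor (step 4); the numbers only illustrate the arithmetic.
data = [[10, 20, 30], [12, 25, 28], [15, 14, 40], [11, 30, 35]]
q = friedman_q(data)            # mean ranks are 1.25, 1.75, 3.0
p_approx = chi2_sf(q, df=len(data[0]) - 1)
```

For these made-up values the rank sums per treatment are 5, 7 and 12, so the sum of squared deviations of the mean ranks from (k+1)/2 = 2 is 1.625 and Q = 4 × 1.625 = 6.5.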

Related tests

  • When using this kind of design for a binary response, one instead uses Cochran's Q test.
  • The sign test (with a two-sided alternative) is equivalent to a Friedman test on two groups.
  • Kendall's W is a normalization of the Friedman statistic between $0$ and $1$.
  • The Wilcoxon signed-rank test is a nonparametric test of nonindependent data from only two groups.
  • The Skillings–Mack test is a general Friedman-type statistic that can be used in almost any block design with an arbitrary missing-data structure.
  • The Wittkowski test is a general Friedman-type statistic similar to the Skillings–Mack test. When the data contain no missing values, it gives the same result as the Friedman test. When the data do contain missing values, it is both more precise and more sensitive than the Skillings–Mack test.[5]
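
The normalization linking Kendall's W to the Friedman statistic can be illustrated with a short sketch. It assumes the standard relation for data without ties, W = Q / (n(k − 1)), which is not spelled out in the text above, and checks it against the direct definition of W from the rank sums. The function names and the example rank sums are hypothetical.

```python
def kendalls_w_from_q(q, n, k):
    """Kendall's W from the Friedman statistic Q (no ties assumed)."""
    return q / (n * (k - 1))


def kendalls_w_from_rank_sums(rank_sums, n):
    """Kendall's W from the per-treatment rank sums (no ties assumed)."""
    k = len(rank_sums)
    mean_sum = n * (k + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (n ** 2 * (k ** 3 - k))


# 4 blocks, 3 treatments, rank sums 5, 7 and 12, for which Q = 6.5.
w_a = kendalls_w_from_q(6.5, n=4, k=3)
w_b = kendalls_w_from_rank_sums([5, 7, 12], n=4)
```

Both routes give W = 0.8125 for this example, which lies between 0 and 1 as expected.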

Post hoc analysis

Post-hoc tests were proposed by Schaich and Hamerle (1984)[6] as well as Conover (1971, 1980)[7] in order to decide which groups are significantly different from each other, based upon the mean rank differences of the groups. These procedures are detailed in Bortz, Lienert and Boehnke (2000, p. 275).[8] Eisinga, Heskes, Pelzer and Te Grotenhuis (2017)[9] provide an exact test for pairwise comparison of Friedman rank sums, implemented in R. The Eisinga et al. exact test offers a substantial improvement over available approximate tests, especially if the number of groups ($k$) is large and the number of blocks ($n$) is small.
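
The quantity these post-hoc procedures start from, the difference in mean rank between each pair of groups, can be computed as in the sketch below. It only produces the differences; deciding which of them are significant requires the critical values or exact p-values from the cited procedures, which are not reproduced here. The function name and the example rank matrix are hypothetical.

```python
from itertools import combinations


def pairwise_mean_rank_differences(ranks):
    """Absolute mean-rank difference for every pair of treatments.

    `ranks` is an n x k list of within-block ranks (blocks x treatments).
    """
    n, k = len(ranks), len(ranks[0])
    mean_ranks = [sum(row[j] for row in ranks) / n for j in range(k)]
    return {
        (a, b): abs(mean_ranks[a] - mean_ranks[b])
        for a, b in combinations(range(k), 2)
    }


# Within-block ranks for 4 blocks and 3 treatments (mean ranks 1.25, 1.75, 3.0).
ranks = [[1, 2, 3], [1, 2, 3], [2, 1, 3], [1, 2, 3]]
diffs = pairwise_mean_rank_differences(ranks)
```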

Not all statistical packages support post-hoc analysis for Friedman's test, but user-contributed code exists that provides these facilities (for example in SPSS[10] and in R[11]). There is also a specialized package available in R containing numerous non-parametric methods for post-hoc analysis after the Friedman test.[12]

References

  1. ^ Friedman, Milton (December 1937). "The use of ranks to avoid the assumption of normality implicit in the analysis of variance". Journal of the American Statistical Association. 32 (200): 675–701. doi:10.1080/01621459.1937.10503522. JSTOR 2279372.
  2. ^ Friedman, Milton (March 1939). "A correction: The use of ranks to avoid the assumption of normality implicit in the analysis of variance". Journal of the American Statistical Association. 34 (205): 109. doi:10.1080/01621459.1939.10502372. JSTOR 2279169.
  3. ^ Friedman, Milton (March 1940). "A comparison of alternative tests of significance for the problem of m rankings". The Annals of Mathematical Statistics. 11 (1): 86–92. doi:10.1214/aoms/1177731944. JSTOR 2235971.
  4. ^ "FRIEDMAN TEST in NIST Dataplot". August 20, 2018.
  5. ^ Wittkowski, Knut M. (1988). "Friedman-Type statistics and consistent multiple comparisons for unbalanced designs with missing data". Journal of the American Statistical Association. 83 (404): 1163–1170. CiteSeerX 10.1.1.533.1948. doi:10.1080/01621459.1988.10478715. JSTOR 2290150.
  6. ^ Schaich, E. & Hamerle, A. (1984). Verteilungsfreie statistische Prüfverfahren. Berlin: Springer. ISBN 3-540-13776-9.
  7. ^ Conover, W. J. (1971, 1980). Practical nonparametric statistics. New York: Wiley. ISBN 0-471-16851-3.
  8. ^ Bortz, J., Lienert, G. & Boehnke, K. (2000). Verteilungsfreie Methoden in der Biostatistik. Berlin: Springer. ISBN 3-540-67590-6.
  9. ^ Eisinga, R.; Heskes, T.; Pelzer, B.; Te Grotenhuis, M. (2017). "Exact p-values for pairwise comparison of Friedman rank sums, with application to comparing classifiers". BMC Bioinformatics. 18 (1): 68. doi:10.1186/s12859-017-1486-2. PMC 5267387. PMID 28122501.
  10. ^ "Post-hoc comparisons for Friedman test". Archived from the original on 2012-11-03. Retrieved 2010-02-22.
  11. ^ "Post hoc analysis for Friedman's Test (R code)". February 22, 2010.
  12. ^ "PMCMRplus: Calculate Pairwise Multiple Comparisons of Mean Rank Sums Extended". 17 August 2022.

Further reading
