Talk:Wilcoxon signed-rank test: Difference between revisions
Revision as of 02:10, 20 April 2012
It appears that some portion of this has been copied from a textbook: "The recommended cutoff varies from textbook to textbook — here we use 20 although some put it lower (10) or higher (25)." Is there a copyright violation happening here? 96.49.201.125 (talk) 04:39, 1 December 2009 (UTC)
Ordinal data
I'm no expert, but I'm pretty sure that this test can also deal with ORDINAL data.
The test is in parts not well described; even I, as someone who teaches statistics, have difficulty fully understanding the test from the text. sigbert Wed Apr 11 08:09:57 CEST 2007
- NO! Since you are subtracting two values (e.g. pre vs. post) it has to be interval data, not just ordinal. --Statprof (talk) 17:18, 15 April 2008 (UTC)
I'm pretty sure it could work with ordinal (Ratings of women after 5 pints of beer?) —Preceding unsigned comment added by 128.243.253.114 (talk) 21:06, 21 April 2008 (UTC)
- Wrong: As Statprof mentioned, subtracting values presupposes interval data: In your example it would presuppose that someone rated 10 is equally far away from someone rated 9 as someone rated, say 3 is from someone rated 2. 128.232.231.16 (talk) —Preceding undated comment added 11:07, 2 March 2009 (UTC).
I would say that the test generally doesn't work for ordinal data. But if you have ordinal data and it is possible to assume that a change of two scale steps is always a greater change than a change of one scale step, a change of three steps is always a greater change than a change of two steps, and so forth (a kind of ordinal data that is close to interval, i.e. semi-interval, even though it is not perfectly equidistant), then the test will work. This demands careful validation of the scale and rather heavy assumptions; to be on the safe side, a sign test could be preferable. //MG Stat. —Preceding unsigned comment added by 90.231.200.20 (talk) 06:16, 22 April 2010 (UTC)
An already referenced article says, 'the measures of XA and XB have the properties of at least an ordinal scale of measurement, so that it is meaningful to speak of "greater than," "less than," and "equal to."' , and lots of other pages (just one example) agree. To people claiming that 'subtraction requires interval data': the test uses the sign and the rank (not the value!) of each difference, which make sense for ordinal data. I'm changing the article. --asqueella (talk) 11:49, 24 March 2011 (UTC)
- The above comment by 90.231.200.20 makes a lot of sense, but I couldn't find a source for that. --asqueella (talk) 12:30, 24 March 2011 (UTC)
- All known WP:Reliable sources say it works for ordinal data, so the article should reflect that, not ill-informed speculation and WP:original research. Subtracting values does not require interval data. Defining a measure of effect size might do so, but this is a hypothesis test, not a measure of effect size. Qwfp (talk) 19:21, 13 April 2011 (UTC)
Assumptions for Wilcoxon-signed rank test?
What are the assumptions for the Wilcoxon signed-rank test? Unfortunately, the information and applications I find mainly contradict each other. Some statistical books state that one assumption of the test is that the distribution of the differences should be symmetric, but wouldn't this assumption only hold under the null? Thanks. —Preceding unsigned comment added by Fanny151984 (talk • contribs) 10:11, 23 August 2008 (UTC)
One sample testing against hypothesis
Some statistics software (e.g. GraphPad Prism) claims to use the Wilcoxon signed-rank test for non-parametric one-sample testing, i.e. it compares the median of a single group to a hypothetical median. Prism distinguishes this from the more common two-sample test by calling the latter the Wilcoxon matched pairs test.
N.B. The one-sample test on the differences between matched pairs in two groups seems to be equivalent to a Wilcoxon signed-rank test on those two groups with the null hypothesis that the median difference is equal to zero, although in the one-sample test you can compare to any hypothetical median, not just zero.
If this is legitimate, a section on this should probably be included. I'm not completely sure so haven't just added it in. schroding79 (talk) 00:59, 2 September 2008 (UTC)
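A minimal sketch of the equivalence described above, in plain Python. The function name `signed_rank_sums`, the parameter `m0` (the hypothetical median), and the data are all made up for illustration; zero differences are dropped and tied absolute differences get average ranks, which is one common convention rather than the only one.

```python
def signed_rank_sums(values, m0=0.0):
    """Return (W_plus, W_minus, n_used) for a one-sample signed-rank
    test of `values` against the hypothetical median `m0`.

    Assumptions (illustrative, not taken from the article):
    zero differences are discarded; ties get average ranks.
    """
    diffs = [v - m0 for v in values if v != m0]  # drop zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # find the run of tied absolute differences starting at position i
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Made-up paired data: the paired test is the one-sample test on the
# pairwise differences with hypothetical median 0.
before = [125, 115, 130, 140, 140, 115, 140, 125, 140, 135]
after = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]
pair_diffs = [b - a for b, a in zip(before, after)]
paired_result = signed_rank_sums(pair_diffs, m0=0)

# Shifting the data and the hypothetical median together changes nothing.
shifted_result = signed_rank_sums([d + 3 for d in pair_diffs], m0=3)
```

With these numbers one pair has a zero difference, so only nine differences are ranked. The second call illustrates that testing against an arbitrary hypothetical median is the same computation after subtracting that median.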
Example wrong?
The example describes the W+ statistic as a sum of the signed ranks. This contradicts the "Test Procedure" section (and my understanding of the W+ statistic) that says W+ is the minimum of the sum of ranks for positive differences and the sum of ranks for negative differences.
- Yeah, I was trying to figure out how the test could be that the value is less than the critical value if it is this type of sum (i.e. if all were positive, then would not-reject always, but then it almost has to be a reject). Also, the previous section makes reference to the statistic converging (presumably in distribution) to the normal but then doesn't say what the mean and SD are. 018 (talk) 18:19, 3 February 2010 (UTC)
In the example, should we be looking up the critical values for n=10 as the page says, or should we use n=9 because we disregard the data point where the values are equal? —Preceding unsigned comment added by 82.25.68.121 (talk) 20:37, 14 October 2010 (UTC)
I've corrected the example using the sum of the signed ranks and the appropriate critical value, and I corrected the test procedure to match this. — Preceding unsigned comment added by Kastchei (talk • contribs) 02:06, 20 April 2012 (UTC)
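On the mean and SD question raised above, here is a sketch of the standard null moments used for the normal approximation. It assumes no ties among the absolute differences and writes n for the number of non-zero differences (so n = 9, not 10, if one of ten pairs is discarded for having equal values). The symbols are notation chosen here: W_+ and W_- are the rank sums for positive and negative differences, and W = W_+ - W_- is the sum of the signed ranks.

```latex
\begin{align}
  E[W_+] &= \frac{n(n+1)}{4}, &
  \operatorname{Var}(W_+) &= \frac{n(n+1)(2n+1)}{24}, \\
  E[W] &= 0, &
  \operatorname{Var}(W) &= \frac{n(n+1)(2n+1)}{6}.
\end{align}
```

For n = 9, for instance, the variance of W is 285, giving an SD of about 16.88. With ties present, a corrected variance would be needed, which this sketch does not cover.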
Conflict with Siegel & Castellan?
65.78.66.242 commented on the article page: "The decision rule stated here is in error. According to Siegel & Castellan 1988 pp. 88-89, the stated decision rule is if the calculated value is less than or equal to the critical value then the null hypothesis is rejected (not retained as stated in the Wikipedia text)." MichaK (talk) 16:13, 25 October 2010 (UTC)
The W statistic
According to [1], which is currently one of the external links at the end of this article, the test statistic W is computed as the sum of W+ and W-. This differs from this article, which uses the minimum of W+ and W-. Apparently this is an unresolved issue, as some descriptions use only W+, others use the minimum of W+ and W-, and yet others use the sum of W+ and W-. I emailed the author of that page (Dr. Lowry) about this. He said that the sum of W+ and W- converges to an approximately Normal distribution with fewer comparisons, and that the method of using the minimum of W+ and W- dates to the time when the properties of the relevant sampling distributions had to be worked out laboriously by hand, and is only useful for small-sample cases (where n is less than about 10). He referred me to Mosteller & Rourke, Sturdy Statistics: Nonparametrics and Order Statistics, Addison-Wesley, 1973. I believe his argument is sound, and I think that the method of using the minimum of W+ and W- will tend to lead to more false positives simply because it relies on fewer observations. I propose that we change this article to describe the simpler and more robust method of computing W as the sum of W+ and W-.--Headlessplatter (talk) 18:25, 21 June 2011 (UTC)
I've changed the test procedure to clarify all the issues you've mentioned. This section and the example now match the procedure recommended by Dr. Lowry.
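To compare the variants discussed in this thread, here is a rough sketch that converts each one to a z-score. It reads "the sum of W+ and W-" as the sum of the signed ranks (W+ minus the magnitude of W-), which is an assumption about the intended meaning. It also assumes no ties and no continuity correction, and the function name `z_scores` is hypothetical.

```python
import math


def z_scores(w_plus, w_minus, n):
    """Normal-approximation z-scores for three variants of the statistic.

    w_plus, w_minus: rank sums (both non-negative magnitudes) for the
    positive and negative differences; n: number of non-zero differences.
    Assumes no ties and no continuity correction (illustrative only).
    """
    total = n * (n + 1) / 2
    if w_plus + w_minus != total:
        raise ValueError("w_plus + w_minus must equal n(n+1)/2")
    mean_plus = total / 2                        # n(n+1)/4
    sd_plus = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    sd_signed = 2 * sd_plus                      # sqrt(n(n+1)(2n+1)/6)
    return {
        "from_w_plus": (w_plus - mean_plus) / sd_plus,
        "from_signed_sum": (w_plus - w_minus) / sd_signed,
        "from_minimum": (min(w_plus, w_minus) - mean_plus) / sd_plus,
    }


# Made-up rank sums for n = 9 non-zero differences.
example = z_scores(27, 18, 9)
```

Under these assumptions the W+ and signed-sum variants give the same z-score, and the minimum variant gives the same magnitude with a negative sign, because W+ and W- always add up to n(n+1)/2. This says nothing about small-sample critical-value tables, which differ by convention.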
Assumptions for Wilcoxon-signed rank test - needs excpention
I have just added a proper citation for the assumptions of the Wilcoxon signed-rank test:
http://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test#Assumptions
I am not sure this section is complete, since under the framework of randomization tests, the Wilcoxon test has a different H0 (one that concerns the effect of the treatment in changing the "distribution", and not necessarily a shift of the location parameter).
Tal Galili (talk) 10:17, 8 December 2011 (UTC)
Confidence Interval section is not good
The section on confidence intervals is complete garbage. This article covers the paired-sample test, but it looks like someone copied some text that was intended for a single-sample non-paired test, and even did a poor job at that. The notation doesn't make sense (e.g., D_i has a single subscript, but seems to define a 2-D matrix of differences), isn't consistent with the notation of the previous section, and the wording itself contains numerous typos, poor grammar, and awkward descriptions.
How to compute the confidence interval for the paired test is not at all obvious, and thus deserves coverage here. As is, the section that is there is doing a disservice and would be better removed. I would, however, like to have a reference for how it is done in the paired case. With the in the test, I don't think it is as easy as using the Walsh averages of Zi.
-- Lonnie Chrisman 19:33, 8 December 2011 (UTC)
I removed this section until someone can add the correct information. Kastchei (talk) 02:08, 20 April 2012 (UTC)