Talk:Null hypothesis: Difference between revisions
Revision as of 17:58, 1 May 2011
Mathematics: Start-class, Mid-priority
Statistics: Unassessed
References
Seeing as how this is based on a scientific, or at least scholarly topic, I believe this needs references. I'm (I hope appropriately) adding the "Not Verified" tag, so hopefully someone will come through and add references so the poor, confused beginning stats students (myself included) can know this is more than the ramblings of a demented mind. Having had experience using the subject matter, even reading this straight from a textbook can make one crazy. Garnet avi 10:41, 5 September 2006 (UTC)
Unsorted Comments
I think the main thing that the article misses is that the null hypothesis is always the hypothesis that there is no difference between two (or I suppose more) groups. Thus the word "null". Generally speaking, when studying something, you are trying to establish a difference between two groups (i.e. that group A, which received medication, did better than group B, which did not). It is statistically convenient (as well as philosophically convenient) to always start from the same premise. 24.45.14.133 04:44, 14 February 2007 (UTC) nullo
Sorry, this stuff was above the contents, but are not together or titled. I thought they would be more appropriate under the contents, so moved them to be so. Garnet avi 10:41, 5 September 2006 (UTC)
Is this sentence, from the article, correct?
- But if the null hypothesis is that sample A is drawn from a population whose mean is no lower than the mean of the population from which sample B is drawn, the alternative hypothesis is that sample A comes from a population with a larger mean than the population from which sample B is drawn, and we will proceed to a one-tailed test.
It seems as if the null hypothesis says that mean(A) >= mean(B). Therefore the alternative hypothesis should be the negation of this, or mean(A) < mean(B). But the text states that the alternative hypothesis is that mean(A) > mean(B). Is this right?
- I agree. Fixed. --Bernard Helmstetter 19:54, 8 Jan 2005 (UTC)
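For reference, the complementary pair that the question above arrives at can be written as follows. The symbols \mu_A and \mu_B are shorthand introduced here for the means of the populations from which samples A and B are drawn; they are not notation taken from the article.

```latex
H_0:\ \mu_A \ge \mu_B
\qquad \text{versus} \qquad
H_1:\ \mu_A < \mu_B
```

The alternative is the logical negation of the null, so the two statements together cover every possible ordering of the two means.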
The difference between H0:μ1 = μ2 and H0:μ1 - μ2 = 0 is unclear, to say the least. --Bernard Helmstetter 20:02, 8 Jan 2005 (UTC)
This entry is confusing, to say the least. The introduction is somehow split in two sections by the TOC, and paradoxically is too short. The closing sections on controversies and publication bias could be merged as well. I am not attempting a rewrite for I know little about statistics myself - but even so it is evident that the article could be clearer.--Duplode 01:20, 4 April 2006 (UTC)
The sentence "However, formulating the null hypothesis before collecting data rejects a true null hypothesis only a small percent of the time." seems to imply that the small percent of the time that a true null hypothesis is rejected is different from the small number of true null hypotheses which would be rejected if tested on the same dataset. It implies that testing many hypotheses on a single dataset will produce less accurate results, with a justification which very much resembles the false statement, "after flipping a coin 4 times and getting heads each time, the odds of getting heads on the fifth flip are very low." 98.114.27.182 (talk) 17:58, 1 May 2011 (UTC)
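A small calculation may help separate the two quantities being discussed here. It is only a sketch, and it assumes that the tests are independent and that every null hypothesis tested is true; the function name and the 0.05 level are illustrative choices, not taken from the article. Under those assumptions each individual test still rejects with probability alpha, but the chance that at least one of the tests rejects grows with the number of tests.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false rejection among n_tests
    independent tests when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


alpha = 0.05  # per-test significance level (illustrative choice)
for n_tests in (1, 5, 20):
    rate = familywise_error_rate(alpha, n_tests)
    print(f"{n_tests:2d} tests: chance of at least one false rejection = {rate:.3f}")
```

The per-test rate is unchanged by running more tests, which is the coin-flip point; what changes is the rate for the whole family of tests.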
??
I'm a sophomore in high school, here's my request:
Could someone create a "Null hypothesis for dummies" section? as it is now, this article is very hard to comprehend. -- Somebody
- I agree. This is unreadable. Behaviour of sets of data indeed! Nonsense. Can someone with scientific authority define it here please and just delete the rest of this garbage? —Preceding unsigned comment added by 90.179.192.118 (talk) 19:02, 5 October 2009 (UTC)
"Null hypothesis for dummies" would be useful. In the examples there are null hypotheses stating that "the value of this real number is the same as the value of that real number". Is there some explanation for why such a hypothesis is reasonable? It seems to me that for a very broad class of probability distributions the null hypothesis has probability of 0 and the opposite probability of 1. The article at the moment says this:
However, concerns regarding the high power of statistical tests to detect differences in large samples have led to suggestions for re-defining the null hypothesis, for example as a hypothesis that an effect falls within a range considered negligible. This is an attempt to address the confusion among non-statisticians between significant and substantial, since large enough samples are likely to be able to indicate differences however minor.
So the more data we have, the more likely it is that the null hypothesis is rejected? This is exactly what should happen if the null hypothesis is always false - the only difference is in how much data we need to prove that. Is this the case in actual use? If so, how does the theory justify drawing conclusions from a false premise? Presumably the theory is "robust enough" when there isn't "too much data", but how exactly does this work? 82.103.214.43 14:58, 11 June 2006 (UTC)
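The effect described in this question can be illustrated with a rough sketch. It assumes a two-sample z-test with a known standard deviation of 1 in both groups, and it holds the observed difference fixed at 0.01 for every sample size; both are simplifying assumptions made here, and the function name is hypothetical. The same tiny difference goes from unremarkable to overwhelmingly "significant" purely because the sample grows.

```python
import math


def two_sided_p_value(observed_diff: float, sd: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test with known, equal sd."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = observed_diff / standard_error
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


for n in (100, 10_000, 1_000_000):
    p = two_sided_p_value(observed_diff=0.01, sd=1.0, n_per_group=n)
    print(f"n per group = {n:>9,}: p = {p:.3g}")
```

This is the distinction between "significant" and "substantial" that the quoted passage refers to: the size of the difference never changes in the sketch, only the amount of data.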
Elisabeth Anscombe
Who the hell is she and why is she quoted here? Any reference?
- Misspelling. Elizabeth Anscombe. Flapdragon 22:00, 18 May 2006 (UTC)
Are you sure the author of the quote is Elizabeth Anscombe? Francis Anscombe was a statistician who, among other things, applied statistical methods to agriculture and is a much more plausible source for that quote. As stated above, a source for the quote would be nice.--jdvelasc 21:29, 9 October 2006 (UTC)
- Also, go to here for details of a significant, although inadvertent contribution to the philosophy of language by Anscombe.Lindsay658 22:07, 16 May 2007 (UTC)
- As I said above, it is very doubtful that Elizabeth Anscombe is the author of that quote. I am taking it out until someone can source it. --Jdvelasc 18:35, 4 September 2007 (UTC)
example conclusion
"For example, if we want to compare the test scores of two random samples of men and women, a null hypothesis would be that the mean score of the male population was the same as the mean score of the female population, and therefore there is no significant statistical difference between them:"
This is wrong, the two samples can have the same mean and be statistically totally different (e.g. differ in variance). 84.147.219.67 15:56, 26 June 2006 (UTC)
- I made some changes: I deleted "and therefore there is no significant statistical difference between them:", because it is redundant and arguably incorrect. I also added a few words to the part about assuming they're drawn from the same population, to say that this means they have the same variance and shape of distribution too. I deleted the equation with mu1 - mu0 = 0 because it was out of context IMO given the sentence that was just before it, and because it is practically the same as the previous equation mu1 = mu0. Sorry I forgot again to put an "edit summary". Coppertwig 00:14, 5 November 2006 (UTC)
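The objection raised in this thread is easy to check numerically. The two samples below are made-up data chosen for illustration, using only Python's standard `statistics` module: they share a mean but differ greatly in spread.

```python
import statistics

sample_a = [48, 49, 50, 51, 52]  # tightly clustered (made-up data)
sample_b = [10, 30, 50, 70, 90]  # widely spread (made-up data)

mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)

print(f"means:     {mean_a} vs {mean_b}")
print(f"variances: {var_a} vs {var_b}")
```

Equal means therefore say nothing about whether the variances or the shapes of the distributions agree, which is why "drawn from the same population" is a stronger statement than "same mean".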
"File drawer problem"?
What is it, and why does it make a sudden and unexplained appearance near the end of this article? If I hadn't gotten a C- in stats I'd go out and fix it myself. :) --User:Dablaze 13:29, 1 August 2006 (UTC)
- The "file drawer problem" is this: suppose a researcher carries out an experiment and does not find any statistically significant difference between two populations. (For example, tests whether a certain substance cures a certain illness and does not find any evidence that it does.) Then, the researcher may consider that this result (or "non-result") is not very interesting, and put all the notes about it into a file drawer and forget about it, instead of publishing it which is what the researcher would have done if the test had found the interesting result that the substance apparently cures the illness.
- Not publishing it is a problem for several reasons: one, other researchers may waste time carrying out the same test on a useless substance and also not publishing. Two, it is sometimes possible to find a statistically significant result by combining the results of several studies; this can't easily happen if it isn't published so nobody knows about it. Three, if various researchers keep repeating the same experiment and not finding statistically significant results, and then one does the same experiment and by a random fluke (luck) does get a statistically significant result, they might publish that and it would look as if the substance cures the illness, although if you combined the results of all the studies you would see that there is no statistically significant result overall.
- It really does make sense if you can guess what "file drawer problem" means. Does it need a few words in the article to explain it? Coppertwig 00:00, 5 November 2006 (UTC)
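The second reason given above, that combining several studies can reveal a result none of them shows alone, can be sketched as follows. The sketch uses Stouffer's method of combining z-scores, which is one common way to pool independent studies and is a technique chosen here for illustration rather than anything named in the discussion; the three z-scores are invented numbers.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def stouffer_z(z_scores: list[float]) -> float:
    """Combine independent, equally weighted z-scores (Stouffer's method)."""
    return sum(z_scores) / math.sqrt(len(z_scores))


study_z = [1.5, 1.5, 1.5]  # three hypothetical unpublished studies
individual_p = [two_sided_p(z) for z in study_z]
combined_z = stouffer_z(study_z)
combined_p = two_sided_p(combined_z)

print("individual p-values:", [round(p, 3) for p in individual_p])
print(f"combined z = {combined_z:.3f}, combined p = {combined_p:.4f}")
```

None of the three studies reaches the conventional 0.05 level by itself, yet the pooled result does, which cannot happen if the individual results stay in the file drawer.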
Accept, reject, do not reject Null Hypothesis
After a statistical test (say, determining p-values), one can only reject or not reject the Null Hypothesis. Accepting the alternative hypothesis is wrong because there is always a probability that you are incorrectly accepting or rejecting (alpha and beta; type I and type II error). --70.111.218.254 02:03, 22 November 2006 (UTC)
Actually, it seems that the first paragraph is entirely confusing. One can not ACCEPT null hypothesis. One can only REJECT or FAIL TO REJECT it. On the other hand, one can ACCEPT alternative hypothesis or FAIL TO ACCEPT it. See D. Gujarati: Basic Econometrics, Fourth Edition, 2004, p.134 --- Argyn
Besides type I and type II error, there's a problem which remains big even when your statistical significance is excellent: that both the Null Hypothesis and the Alternative Hypothesis can be false. I suppose they usually are; they are usually at best oversimplifications (models) of a situation in the real world. That's why the alternative hypothesis is merely "accepted", not "proven" nor "shown" nor "established". However, it can be "shown" or "established", with a certain statistical significance level, that the null hypothesis is false. --Coppertwig 10:29, 15 February 2007 (UTC)
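For readers who want numbers behind alpha and beta, here is a minimal sketch. It assumes a one-sided z-test of "mean = 0" with a known standard deviation, and the effect size, standard deviation and sample size are arbitrary illustrative values; the function name is hypothetical.

```python
from statistics import NormalDist


def type_ii_error_rate(alpha: float, effect: float, sd: float, n: int) -> float:
    """Beta for a one-sided z-test of H0: mean = 0 against a true mean of `effect`."""
    standard_normal = NormalDist()
    critical_z = standard_normal.inv_cdf(1.0 - alpha)  # reject H0 when z > critical_z
    shift = effect * n ** 0.5 / sd  # mean of the z statistic when the effect is real
    return standard_normal.cdf(critical_z - shift)


alpha = 0.05
beta = type_ii_error_rate(alpha, effect=0.5, sd=1.0, n=25)
print(f"alpha (type I error rate)  = {alpha}")
print(f"beta  (type II error rate) = {beta:.3f}")
print(f"power = 1 - beta           = {1 - beta:.3f}")
```

Alpha is chosen by the experimenter, while beta depends on the true effect and the sample size, so neither a rejection nor a failure to reject is ever free of the chance of error.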
I object to the statement "a null hypothesis (H0) is a hypothesis (scenario) set up to be nullified, refuted, or rejected ('disproved' statistically) in order to support an alternative hypothesis." The point of a null hypothesis is to represent what we currently expect. If the data in an experiment is not sufficient to reject that hypothesis, there is no point in considering an alternative hypothesis. Testing a null hypothesis is a form of Occam's razor--why consider a new, alternative hypothesis when the data is plausibly explained by an existing one (the null hypothesis)? So the null hypothesis is NOT "set up to be rejected"; it is set up to be _tested_. A null hypothesis represents an existing theory that may explain a given set of data. —Preceding unsigned comment added by Maxbox51 (talk • contribs) 17:44, 9 October 2008 (UTC)
Formulation of null hypotheses
This article appears to be a little confused at the moment — I would appreciate a little discussion before I make some changes. In particular...
"if the null hypothesis is that sample A is drawn from a population whose mean is lower than the mean of the population from which sample B is drawn, the alternative hypothesis is that sample A comes from a population with a higher mean than the population from which sample B is drawn, which can be tested with a one-tailed test."
I believe this to be misleading. A null hypothesis is a statement of no effect — by definition it has no directionality. There is a very good reason for this: null hypothesis testing works by first assuming the null hypothesis to be true, and then calculating how often we would expect to see results as extreme as those observed even when the null hypothesis is true. That is, we are trying to find out how often the observed results would be obtained by chance.
It is only possible to do this when we have a well-defined null hypothesis — e.g. when it states that one mean is equal to another mean, or when a mean is equal to a defined value. It would not be possible to calculate our test statistic if our null hypothesis merely said, "Mean one is less than mean two", and indeed this would not be a null hypothesis.
I think the confusion arises in the case of a one-tailed test. Take, for example, an experiment investigating the height of men and women in a class. We might wish to test the hypothesis "that men are taller than women". In this case our hypotheses are as follows:
- Null: That men and women are of equal height.
- Experimental: That men are taller (have greater height) than women.
In this case, we have defined our experimental hypothesis in a one-tailed form. The question many people ask is, "But what if women are taller than men? Surely neither of our hypotheses addresses this?". The confusion then lies in whether or not the null hypothesis should incorporate this possibility. To the very best of my knowledge, it should not: the null hypothesis remains a statement of no effect.
The reason for this is that we are looking to see whether there is evidence to support the specific experimental hypothesis that we have postulated. If we find our results to be non-significant, this tells us that we do not have sufficient evidence to accept our specific experimental hypothesis. If it turns out that we're interested in a difference that we find in the other direction, well that suggests that we should have proposed a two-tailed hypothesis in the first place. Indeed, I would argue that it is very rare indeed that a one-tailed hypothesis is appropriate: we are almost always interested in results in the other direction from that predicted.
Does this sound sensible? If so, then I will modify the article accordingly, and will add some relevant citations! -- Sjb90 16:34, 15 May 2007 (UTC)
- OK, I will start making some changes to this later. This has been quite a complicated issue to resolve, and has involved going back to the paper that first defined the term 'null hypothesis'. Full discussion can be found on Lindsay658's talk page. -- Sjb90 11:06, 18 May 2007 (UTC)
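The one-tailed versus two-tailed point in this section can be made concrete with a short sketch. It assumes the test statistic is a standard normal z under the null hypothesis of equal means, and the value 1.8 is an arbitrary example; positive z stands for "men taller", the predicted direction.

```python
from statistics import NormalDist


def one_sided_p(z: float) -> float:
    """P(Z >= z): evidence for an effect in the predicted (positive) direction."""
    return 1.0 - NormalDist().cdf(z)


def two_sided_p(z: float) -> float:
    """P(|Z| >= |z|): evidence for an effect in either direction."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


for z in (1.8, -1.8):
    print(f"z = {z:+.1f}: one-sided p = {one_sided_p(z):.3f}, two-sided p = {two_sided_p(z):.3f}")
```

In both cases the null hypothesis is the same exact statement of equality; only the alternative changes. A result in the unpredicted direction gives a large one-sided p-value, which is the situation where a two-tailed hypothesis should have been proposed in the first place.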
Earlier Null hypothesis Discussion
I thought that the following should appear here (originally at [1]) for the ease of others. Lindsay658 (talk) 21:21, 22 February 2008 (UTC)
Hi there,
Over on the null hypothesis talk page, I've been canvassing for opinions on a change that I plan to make regarding the formulation of a null hypothesis. However I've just noticed your excellent edits on Type I and type II errors. In particular, in the null hypothesis section you say:
The consistent application by statisticians of Neyman and Pearson's convention of representing "the hypothesis to be tested" (or "the hypothesis to be nullified") with the expression Ho -- associated with an increasing tendency to incorrectly read the expression's subscript as a zero, rather than an "O" (for "original") -- has led to circumstances where many understand the term "the null hypothesis" as meaning "the nil hypothesis". That is, they incorrectly understand it to mean "there is no phenomenon", and that the results in question have arisen through chance.
Now I know the trouble with stats in empirical science is that everyone is always feeling their way to some extent -- it's an inexact science that tries to bring sharp definition to the real world! But I'm really intrigued to know what you're basing this statement on -- I'm one of those people who has always understood the null hypothesis to be a statement of null effect. I've just dug out my old undergrad notes on this, and that's certainly what I was taught at Cambridge; and it's also what my stats reference (Statistical Methods for Psychology, by David C. Howell) seems to suggest. In addition, whenever I've been an examiner for public exams, the markscheme has tended to state the definition of a null as being a statement of null effect.
I'm a cognitive psychologist rather than a statistician, so I'm entirely prepared to accept that this may be a common misconception, but was wondering whether you could point me towards some decent reference sources that try to clear this up, if so! —The preceding unsigned comment was added by Sjb90 (talk • contribs) 11:07, 16 May 2007 (UTC).
- Sjb90 . . . There are three papers by Neyman and Pearson:
- Neyman, J. & Pearson, E.S., "On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference, Part I", reprinted at pp.1-66 in Neyman, J. & Pearson, E.S., Joint Statistical Papers, Cambridge University Press, (Cambridge), 1967 (originally published in 1928).
- Neyman, J. & Pearson, E.S., "The testing of statistical hypotheses in relation to probabilities a priori", reprinted at pp.186-202 in Neyman, J. & Pearson, E.S., Joint Statistical Papers, Cambridge University Press, (Cambridge), 1967 (originally published in 1933).
- Pearson, E.S. & Neyman, J., "On the Problem of Two Samples", reprinted at pp.99-115 in Neyman, J. & Pearson, E.S., Joint Statistical Papers, Cambridge University Press, (Cambridge), 1967 (originally published in 1930).
- Unfortunately, I do not have these papers at hand and, so, I can not tell you precisely which of these papers was the source of this statement; but I can assure you that the statement was made on the basis of reading all three papers. From memory, I recall that they were quite specific in their written text and in their choice of mathematical symbols to stress that it was O for original (and not 0 for zero). Also, from memory, I am certain that the first use of the notion of a "null" hypothesis comes from:
- Fisher, R.A., The Design of Experiments, Oliver & Boyd (Edinburgh), 1935.
- And, as I recall, Fisher was adamant that whatever it was to be examined was the NULL hypothesis, because it was the hypothesis that was to be NULLIFIED.
- I hope that is of some assistance to you.
- It seems that it is yet one more case of people citing citations that are also citing a citation in someone else's work, rather than reading the originals.
- The second point to make is that the passage you cite from my contribution was 100% based on the literature (and, in fact, the original articles).
- Finally, and this comment is not meant to be a criticism of anyone in particular, simply an observation, I came across something in social science literature that mentioned a "type 2 error" about two years ago. It took me nearly 12 months to track down the source to Neyman and Pearson's papers. I had many conversations with professional mathematicians and statisticians and none of them had any idea where the notion of Type I and type II errors came from and, as a consequence, I would not be at all surprised to find that the majority of mathematicians and statisticians had no idea of the origins and meaning of "null" hypothesis.
- I'm not entirely certain, but I have a feeling that Fisher's work -- which I cited as "Fisher (1935, p.19)", and that reference would be accurate -- was an elaboration and extension of the work of Neyman and Pearson (and, as I recall, Fisher completely understood that it was an oh, rather than a zero, in the subscript). Sorry I can't be of any more help. The collection that contains the reprints of Neyman and Pearson's papers and the book by Fisher should be fairly easy for you to find in most university libraries. Lindsay658 22:37, 16 May 2007 (UTC)
- Thanks for the references, Lindsay658 -- I'll dig them out, and have a bit of a chat with my more statsy colleagues here, and will let you know what we reckon. I do agree that it's somewhat non-ideal that such a tenet of experimental design is described rather differently in a range of texts!
- As a general comment, I think it entirely acceptable for people working in a subject, or writing a subject-specific text book / course to read texts more geared towards their own flavour of science, rather than the originals. After all, science is built upon the principle that we trust much of the work created by our predecessors, until we have evidence to do otherwise, and most of these derived texts tend to be more accessible to the non-statistician. However I agree that, when writing for e.g. Wikipedia, it is certainly useful to differentiate between 'correct' and 'common' usage, particularly when the latter is rather misleading. This is why your contribution intrigued me so -- I look forward to reading around this and getting back to you soon -- many thanks for your swift reply! -- Sjb90 07:39, 17 May 2007 (UTC)
- OK, I've now had a read of the references that you mentioned, as well as some others that seemed relevant. Thanks again for giving me these citations -- they were really helpful. This is what I found:
- First of all, you are quite right to talk of the null hypothesis as the 'original hypothesis' -- that is, the hypothesis that we are trying to nullify. However Neyman & Pearson do in fact use a zero (rather than a letter 'O') as the subscript to denote a null hypothesis. In this way, they show that the null hypothesis is merely the original in a range of possible hypotheses: H0, H1, H2 ... Hi.
- As you mentioned, Fisher introduced the term null hypothesis, and defines this a number of times in The Design of Experiments. When talking of an experiment to determine whether a taster can successfully discriminate whether milk or tea was added first to a cup, Fisher defines his null hypothesis as "that the judgements given are in no way influenced by the order in which the ingredients have been added ... Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis."
- Later, Fisher talks about fair testing, namely in ensuring that other possible causes of differentiation (between the cups of tea, in this case) are held fixed or are randomised, to ensure that they are not confounds. By doing this, Fisher explains that every possible cause of differentiation is thus now i) randomised; ii) a consequence of the treatment itself (order of pouring milk & tea), "of which on the null hypothesis there will be none, by definition"; or iii) an effect "supervening by chance".
- Furthermore, Fisher explains that a null hypothesis may contain "arbitrary elements" -- e.g. in the case where H0 is "that the death-rates of two groups of animal are equal, without specifying what those death-rates actually are. In such cases it is evidently the equality rather than any particular values of the death-rates that the experiment is designed to test, and possibly to disprove."
- Finally, Fisher emphasises that "the null hypothesis must be exact, that is free from vagueness and ambiguity, because it must supply the basis of the 'problem of distribution,' of which the test of significance is the solution". He gives an example of a hypothesis that can never be a null hypothesis: that a subject can make some discrimination between two different sorts of object. This cannot be a null hypothesis, as it is inexact, and could relate to an infinity of possible exact scenarios.
- So, where does that leave us? I propose to make the following slight changes to the Type I and type II errors page and the null hypothesis page.
- I will tone down the paragraph about original vs. nil hypotheses: the subscript is actually a zero, but it is entirely correct that the hypothesis should not be read as a "nil hypothesis" -- I agree that it is important to emphasise that the null hypothesis is that one that we are trying to nullify.
- In the null hypothesis article, I will more drastically change the paragraph that suggests that, for a one-tailed test, it is possible to have a null hypothesis "that sample A is drawn from a population whose mean is lower than the mean of the population from which sample B is drawn". As I had previously suspected, this is actively incorrect: such a hypothesis is numerically inexact. The null hypothesis, in the case described, remains "that sample A is drawn from a population with the same mean as sample B".
- I will tone down my original suggestion slightly: A null hypothesis isn't a "statement of no effect" per se, but in an experiment (where we are manipulating an independent variable), it logically follows that the null hypothesis states that the treatment has no effect. However null hypotheses are equally useful in an observation (where we may be looking to see whether the value of a particular measured variable significantly differs from that of a prediction), and in this case the concept of "no effect" has no meaning.
- I'll add in the relevant citations, as these really do help to resolve this issue once and for all!
- Thanks again for your comments on this. I will hold back on my edits for a little longer, in case you have any further comments that you would like to add!
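Fisher's tea-tasting example quoted above shows why the null hypothesis has to be exact: only an exact hypothesis fixes the distribution against which the result is judged. The sketch below assumes the classic design of eight cups with four prepared milk-first, a detail not stated in the discussion above, and counts how likely each outcome is if the taster's judgements are "in no way influenced" by the order of pouring.

```python
from math import comb

CUPS = 8        # assumed total number of cups
MILK_FIRST = 4  # assumed number of cups prepared milk-first

# Under the null hypothesis every choice of 4 cups is equally likely.
total_selections = comb(CUPS, MILK_FIRST)


def prob_exactly(k_correct: int) -> float:
    """Chance of picking exactly k_correct of the milk-first cups by guessing."""
    ways = comb(MILK_FIRST, k_correct) * comb(CUPS - MILK_FIRST, MILK_FIRST - k_correct)
    return ways / total_selections


p_all_correct = prob_exactly(4)
p_three_or_more = prob_exactly(3) + prob_exactly(4)

print(f"possible selections: {total_selections}")
print(f"P(all four correct by chance)   = {p_all_correct:.4f}")
print(f"P(three or more correct)        = {p_three_or_more:.4f}")
```

An inexact hypothesis such as "the taster can make some discrimination" gives no such count, because it corresponds to infinitely many possible distributions.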
- I agree with your changes. As you can see from [[2]], [[3]], [[4]], and [[5]] I really didn't have a lot to work with.
- I believe that it might be helpful to make some sort of comment to the effect that when statisticians work -- rather than scientists, that is -- they set up a question that is couched in very particular terms and then try to disprove it (and, if it can not be disproved, the proposition stands, more or less by default).
- Just how "statisticians" frame a "null hypothesis" is counter-intuitive to ordinary people: they essentially couch the research question as the polar opposite of what they actually believe to be the case, whereas "scientists" generally couch their research question in terms of what they actually believe to be the case. This contrast is something that someone like you could describe far better than I could, and I believe that it would be extremely informative to the more general reader. All the best in your editing. If you have any queries, please contact me again. Lindsay658 21:49, 17 May 2007 (UTC)
- Just a note to say that I have finally had the chance to sit down and word some changes to the Null hypothesis article and the section on Type_I_and_type_II_errors#The_null_hypothesis. Do shout and/or make changes if you think my changes are misleading/confusing! -- Sjb90 11:33, 14 June 2007 (UTC)
Analogy to proof by contradiction
I noticed the request to simplify (above), and had the idea of inserting something into the lede that would relate this idea to a proof by contradiction. For example, "This is similar to the idea of a proof by contradiction, but instead of a definite proof, experimental data is used to show that the null hypothesis is very unlikely to be true.". I'm not entirely sure if that wording is clear enough, or perhaps there's some imprecision; suggestions? —AySz88\^-^ 22:48, 13 March 2008 (UTC)
Rewrite this page please
The language used in this article is extremely unclear and vague if not outright confusing. Will someone familiar with the subject please edit it, making sure that the text and subject matter have flow? This is a very badly compiled text so far. The exact definition for the null hypothesis seems to have been forgotten initially, and vague examples are presented to illustrate an undefined term. Thanks
Hypothesis Testing Definition
I came to this page trying to understand the concept. Maybe somebody more learned than I can make the following statement more readily accessible to the layman... "Hypothesis testing works by collecting data and measuring how probable the data are, assuming the null hypothesis is true." —Preceding unsigned comment added by Jdownie (talk • contribs) 12:19, 14 October 2010 (UTC)
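One way to make the quoted sentence concrete is a coin-flip example. This is only a sketch: the coin, the ten flips and the nine heads are invented for illustration, and the function name is hypothetical. The null hypothesis is that the coin is fair, and the calculation asks how probable a result at least as extreme as the observed one would be if that were true.

```python
from math import comb


def binomial_upper_tail(n_flips: int, n_heads: int, p_heads: float = 0.5) -> float:
    """P(X >= n_heads) when X ~ Binomial(n_flips, p_heads), i.e. under the null."""
    return sum(
        comb(n_flips, k) * p_heads**k * (1.0 - p_heads) ** (n_flips - k)
        for k in range(n_heads, n_flips + 1)
    )


p_value = binomial_upper_tail(n_flips=10, n_heads=9)
print(f"P(9 or more heads in 10 fair flips) = {p_value:.4f}")
```

A small probability means the data would be surprising if the null hypothesis were true, which is the grounds for rejecting it; it is not the probability that the null hypothesis itself is true.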
The initial paragraph
I think that the last but one sentence is unclear and the last one is wrong. Therefore, I suggest that the end of the initial paragraph:
- It is possible for an experiment to fail to reject the null hypothesis. The null hypothesis is never accepted as suspicion always remains over its validity. Failing to reject H0 allows for alternative hypotheses to be developed and tested.
to be changed into this:
- It is possible for an experiment to fail to reject the null hypothesis. This does not prove H0 in any way, it only means that it can not be rejected. The best one can do to substantiate H0 is to consider all reasonable alternative hypotheses and reject them.
--Pot (talk) 17:26, 27 January 2009 (UTC)
- I just now rewrote the sentence in question, before seeing your similar comments here... AnonMoos (talk) 10:22, 18 February 2009 (UTC)
- I think the version above is more clear. What do you think about putting the above one instead of the new one you wrote? --Pot (talk) 12:04, 18 February 2009 (UTC)
- The version above is not good. It almost says to try as many alternative models as possible ...and if enough alternatives are tried one can then always reject the null hypothesis, unless of course some effort is put into taking proper account of having done the multiple tests. In any case, some of these ideas should more properly be stated in the article on hypothesis testing. Melcombe (talk) 13:10, 18 February 2009 (UTC)
- There was a problem with the wording "the null hypothesis is never accepted"; this may be correct according to some definitions of technical terminology, but it would fail to convey much useful meaning to those who aren't already knowledgeable. Your revisions are OK on this point... AnonMoos (talk) 15:27, 18 February 2009 (UTC)
H0 (italic) or H0 (upright)?
I suggest that upright H0 be used in place of italic H0. Same for H1 and similar ones in the article. This is not a variable, so probably its name should not be slanted. --Pot (talk) 16:15, 28 January 2009 (UTC)
Merging
Someone added this in the main article: {{Mergeto|Statistical hypothesis testing|Talk:Null hypothesis|date=August 2008}}
Why? I cannot see any discussion on this topic. Maybe the template should be deleted? --Pot (talk) 16:31, 28 January 2009 (UTC)
I have adjusted the template to point here rather than to Talk:Statistical hypothesis testing as there was no discussion there at all. Melcombe (talk) 10:06, 29 January 2009 (UTC) And I have added a corresponding template to Statistical hypothesis testing. Melcombe (talk) 10:17, 29 January 2009 (UTC)
- Reasons for merging would be
- The section uses the term "null hypothesis testing", and refers to criticisms of this, when it is no different from "Statistical hypothesis testing" and this already has its own section on criticism ... so all the criticism should be together.
- The article is supposedly about the idea of a "Null hypothesis" and its role in testing, not about the more general idea of "null hypothesis testing" which is covered in "Statistical hypothesis testing" ...so the whole section has no relevance here.
- Melcombe (talk) 10:24, 29 January 2009 (UTC)
- Do you mean that only the Null_hypothesis#Controversy section should be merged? How to do that? --Pot (talk) 12:05, 29 January 2009 (UTC)
- Well I think the publication-bias stuff should be moved elsewhere also, and that it would be better off in Statistical hypothesis testing. However there may be better ways of restructuring all that needs to be included, possibly by splitting it off into a separate article for the sophisticates. Melcombe (talk) 13:42, 29 January 2009 (UTC)
I have moved the blocks of text on criticism and publication bias to Statistical hypothesis testing. Melcombe (talk) 15:31, 10 March 2009 (UTC)
Introductory definition.
Could I suggest that the concept of the null hypothesis be introduced in the article as this:
"something not proven, yet not contradicted by the data"
It can then be elaborated upon. If people agree that this definition is not a misrepresentation, then it may be less cryptic and more understandable to a general readership.
Feedback anyone? 121.73.7.84 (talk) 11:03, 28 May 2009 (UTC)
- No, this is not accurate. You have some data. Based on your experience, you formulate a null hypothesis (e.g. that the data are Gaussian). At this stage, the hypothesis may well be contradicted by the data, but you have not yet checked. At the next stage, you do statistical hypothesis testing and analyse the results: if the data do not contradict the hypothesis at a given significance level, then you do not reject the null hypothesis. Only at this stage is your statement true. --Pot (talk) 11:46, 29 May 2009 (UTC)
OK, that makes sense. I've looked the definition up in different dictionaries and found varying definitions, so the clarification is useful. My Oxford dictionary states: a hypothesis suggesting that the difference between statistical samples does not imply a difference between populations. Your definition is clearer.
121.73.7.84 (talk) 02:16, 31 May 2009 (UTC)
The current introductory definition may seem right to people who already know what a null hypothesis is. It is certainly of no use to anybody else. In reading this talk page I noticed some remarks that might help produce a definition for the general reader - remarks regarding "no effect" and "no" difference from what was already supposed. Something like that might bring into relief the reason for the word "null" in "null hypothesis." The Tetrast (talk) 15:41, 9 July 2009 (UTC).
- I think that the text introduced by an anonymous user and deleted by user:Melcombe was headed in the right direction, even if incorrect. The text changed the first line from "formally describes some aspect of the statistical behaviour of a set of data" into "formally describes the behaviour of a set of data as unbiased or normal". What about something like "formally describes a statistical hypothesis made on a set of data"?
- I think we need something that is more clear than the current wording to people not very accustomed to statistics. Something that, in the first line, may not be formally exhaustive, but that explains the main concept as concretely as possible. --Pot (talk) 11:32, 9 September 2009 (UTC)
- I think the problem here (and in the section below) is in trying to do too much in a lead-in, when there need to be two things: a simple-to-understand lead-in and a more formal/complete introduction section containing a good definition. An aside on Pot's comment: the immediately preceding edit (19:45, 20 August 2009) had created the present first two sentences out of a longer, more meaningful single sentence which might be worth going back to. Before moving towards expanded definitions, it would be good to first consider whether this article should be merged with alternative hypothesis, as null and alternative are almost always considered together and there is no necessity to have separate articles for every single item of statistical terminology. Such a joining might even make it easier to construct good definitions. Of course it would be important not to duplicate what is/should be in significance testing and/or Statistical hypothesis testing. Actually the last of these contains a lot of definition for "null hypothesis". A reason for not relying on "statistical hypothesis" to provide meaning in the first sentence is that it relies on readers knowing what this is. Melcombe (talk) 12:53, 9 September 2009 (UTC)
Why "null"?
I think the article should explain in the definition section why the null hypothesis is called "null". There is a virtually unlimited number of "hypotheses" that will correspond to the definition currently provided by the article, while, generally in the context of a specific experiment, the null hypothesis will correspond quite naturally to a well defined situation. For example, if the experiment is to test the efficacy of some clinical intervention, the null hypothesis will normally correspond to absence of efficacy (no difference between treatment group and control group). If the experiment is to see the effect of exposure to a risk factor, the null hypothesis will usually correspond to the "no effect" situation (disease occurrence is equally likely in exposed and non-exposed populations), etc. In general, the null hypothesis tends to represent, basically, the status quo situation. This should be said up-front in the definition. --Dessources (talk) 20:02, 16 July 2009 (UTC)
- But null hypotheses are not always used "in the context of a specific experiment". In fact in real-world statistical analyses they are almost never "in the context of a specific experiment". If you look at Cox's list of reasons why a hypothesis test might be done (in the hypothesis testing article I think), only 1 out of 6 reasons is directly equivalent to this sort of experiment to test a difference ... but all of them have a null hypothesis of some sort. Melcombe (talk)
- I mentioned experiment as an example of an instance of the use of the statistical concept. Experiments are an important application of statistical concepts, in particular in the biomedical field. The current definition is defective in the sense that it does not provide the full meaning of the term "null hypothesis" - it lacks specificity. There are clear situations where no one would hesitate in choosing which situation corresponds to the null hypothesis, and which corresponds to the alternative hypothesis. When assessing the efficacy of a new treatment, clearly the null hypothesis states that the treatment makes no difference, and rejecting it supports the alternative hypothesis that the new treatment is efficacious (or deleterious). Opting for a null hypothesis stating that the new treatment is efficacious or deleterious and an alternative hypothesis stating that the treatment has no effect would surely be confusing and would be an inappropriate use of the concept of null hypothesis. And yet, the definition, as it is currently stated, would not rule out such a possibility. There is a semantic element missing in the definition. Furthermore, this definition is a bit convoluted, and does not correspond to standard usage. Most definitions available on the Web seem to do a better job. Perhaps going back to an authoritative and reliable source might help in finding a wording that is both simpler and more specific than the current definition.
- --Dessources (talk) 18:56, 17 July 2009 (UTC)
- I think the "semantic reason" for the term you are trying to impute here is not connected to the reason Fisher used the term null hypothesis, so some care would be needed. Melcombe (talk) 13:01, 9 September 2009 (UTC)
Coin example hypothesis selection
This example is basically saying that with the 5 heads test data it is possible to conclude that "coin is biased (towards heads)", as presumably shown with the first selection of null hypothesis, but you cannot conclude that "coin is fair", as presumably shown with the second null hypothesis!? But if you are able to conclude that "coin is biased (in any direction)" then you surely can conclude that "coin is not fair". So the selection of the first hypothesis is actually the correct idea, as it can conclude what we want to conclude, and the second cannot. Or am I missing something? Yebbey (talk) 18:29, 1 March 2010 (UTC)
The point, I suppose, is that you are not allowed to choose the null hypothesis after seeing the data. This type of hypothesis testing is designed to reject a true null hypothesis at most 5% of the time. You were asked to check if the coin was fair, with no indication of which way it might be biased. 5 heads and 5 tails are equally unlikely outcomes. If you were to conclude that the coin was biased with both these outcomes, you would be concluding that the coin was biased more than 5% of the time, even with an unbiased coin. This is not allowed. On the other hand, if you had prior information that the coin was biased towards heads, and made this your alternative hypothesis before looking at the data, you can make a conclusion. Of course, 5 flips is far too few. In practice, you would do many more. Perhaps the example should be changed, but that requires working out the correct probabilities for (say) 100 flips. -- Preceding unsigned comment added by PeterWT (talk • contribs) 16:36, 2 March 2010 (UTC)
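The arithmetic behind the 5-flip argument above can be checked directly. This is a minimal sketch, assuming a fair coin, independent flips, and a 5% significance level; the helper `binom_pmf` and the variable names are made up for illustration and use only the Python standard library.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n independent flips."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 5
# One-sided rule, alternative "biased towards heads" fixed in advance:
# reject only on 5 heads.
one_sided_rate = binom_pmf(5, n)
# Two-sided rule, no direction given in advance:
# reject on 5 heads or on 5 tails.
two_sided_rate = binom_pmf(5, n) + binom_pmf(0, n)

if __name__ == "__main__":
    print(f"reject on 5 heads only:       {one_sided_rate:.5f}")
    print(f"reject on 5 heads or 5 tails: {two_sided_rate:.5f}")
```

With a fair coin the one-sided rule rejects with probability 1/32, which is below 5%, while rejecting on either extreme outcome happens with probability 2/32, which is above 5%. That is the sense in which the direction has to be fixed before the data are seen.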
I'm admittedly not an expert on the null hypothesis, but the logic used in the introduction paragraph is certainly wrong. Say your null hypothesis is, "The coin is not biased," and you decide to flip the coin 1000 times. The probability of the coin coming up heads exactly 500 times is 2.52% -- and this is the most probable outcome, so the probability of the coin coming up heads any given number of times is even less. Therefore, no matter how many times your coin comes up heads, you can conclude by the logic of the current introduction, that the null hypothesis is false. You'll find that your coin is biased no matter what, even if it comes up heads exactly 500 times! Msuperdock (talk) 19:01, 23 June 2010 (UTC)
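The 2.52% figure above, and the difference between the probability of one exact outcome and a tail probability, can be sketched as follows. This assumes a fair coin and uses one common convention for a two-sided p-value (summing the probabilities of all outcomes at least as far from n/2 as the observed count); the function names are made up for illustration.

```python
from math import comb


def prob_exactly(k: int, n: int) -> float:
    """Probability of exactly k heads in n flips of a fair coin."""
    return comb(n, k) / 2**n


def two_sided_p_value(k: int, n: int) -> float:
    """Probability, under a fair coin, of a head count at least as far
    from n/2 as the observed count k (one common two-sided convention)."""
    deviation = abs(2 * k - n)  # twice the distance from n/2, kept integer
    extreme = sum(comb(n, j) for j in range(n + 1) if abs(2 * j - n) >= deviation)
    return extreme / 2**n


if __name__ == "__main__":
    n = 1000
    print(f"P(exactly 500 heads) = {prob_exactly(500, n):.4f}")
    for k in (500, 520, 540):
        print(f"two-sided p-value for {k} heads = {two_sided_p_value(k, n):.4f}")
```

Under this convention, 500 heads out of 1000 gives a p-value of 1 even though that exact count is itself improbable, so a test based on tail probabilities would not reject a fair coin on that outcome.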
Copy-edits
Made a bunch of changes. Hope you like them. Lfstevens (talk) 21:21, 7 June 2010 (UTC)
Jump in readership
Traffic went from a steady 1k to 3k to 143k today, thanks to this XKCD cartoon. Mercurywoodrose (talk) 17:20, 30 April 2011 (UTC)