Talk:Randomized controlled trial

This is the talk page for discussing improvements to the Randomized controlled trial article.
This is not a forum for general discussion of the article's subject.

This article has not yet been rated on Wikipedia's content assessment scale.
It is of interest to the following WikiProjects:

Please add the quality rating to the {{WikiProject banner shell}} template instead of this project banner. See WP:PIQA for details.

Pharmacology B‑class Mid‑importance

	This article is within the scope of WikiProject Pharmacology, a collaborative effort to improve the coverage of Pharmacology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
B	This article has been rated as B-class on Wikipedia's content assessment scale.
Mid	This article has been rated as Mid-importance on the project's importance scale.

Medicine B‑class Mid‑importance

	This article is within the scope of WikiProject Medicine, which recommends that medicine-related articles follow the Manual of Style for medicine-related articles and that biomedical information in any article use high-quality medical sources. Please visit the project page for details or ask questions at Wikipedia talk:WikiProject Medicine.
B	This article has been rated as B-class on Wikipedia's content assessment scale.
Mid	This article has been rated as Mid-importance on the project's importance scale.

Nursing B‑class Mid‑importance

	This article is within the scope of WikiProject Nursing, a collaborative effort to improve the coverage of Nursing on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
B	This article has been rated as B-class on Wikipedia's content assessment scale.
Mid	This article has been rated as Mid-importance on the project's importance scale.

Citizendium Porting (inactive)

This article is within the scope of WikiProject Citizendium Porting, a project which is currently considered to be inactive.

Statistics B‑class Top‑importance

	This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
B	This article has been rated as B-class on Wikipedia's content assessment scale.
Top	This article has been rated as Top-importance on the importance scale.

What do we do about the inherent limitations? Recommendations?

Kudos on the limitations section, it is an accurate and concise enumeration of the consensus issues. Unfortunately, the sections on Randomization and Limitations are disjointed.

It is probably easier to focus on scientific limitations and not conflict of interest biases. To some extent, overcoming the limitations is about effective study design, which is probably too much to summarize here. Nonetheless, the authors of this page have hinted at suggestions with a very well written section on "Randomization".

But again, "Randomization" and "limitations" doesn't lead the reader to consider "ok now what do I do about it?" — Preceding unsigned comment added by 24.5.84.124 (talk) 03:29, 29 December 2012 (UTC)[reply]

Given our understanding of its limitations, isn't it time that we discourage references to RCT as a "gold standard?" It is a very fine tool, among many. It provides the best answer to a very specific question. But to get the right answer, we must ask the right question, and for many significant clinical (as well as social science) questions, RCT may not be the best tool at all. — Preceding unsigned comment added by 75.39.140.235 (talk) 14:59, 13 August 2013 (UTC)[reply]

Why

Why are there empty sub-headings? Guardian 00:15, 14 July 2006 (UTC)[reply]

control(led)

Which to use in RCT, controlled vs control, is debatable. The number of Google hits is 10x larger for the former, so we should name it first as it is more widely accepted and understood. --The preceding unsigned comment was added by Mebden (talk • contribs) 10:16, 10 January 2007 (UTC).[reply]

The adjective controlled is grammatically correct. Tayste (edits) 22:44, 21 February 2010 (UTC)[reply]

Not so fast!

First, there are a number of inherent biases in what each particular search engine spits out (or claims to spit out). Not only that, but as Wikipedia is (perhaps unduly) influential in search engine hits, it can create a circular argument.

Second, a "control test" is readily recognised as a phrase in which the word "control" acts as a noun adjunct (what used to be known as an 'adjectival noun'). A "randomised control test" is also a grammatically correct rendition of a "control test" involving randomisation.

—DIV (137.111.13.4 (talk) 05:37, 4 March 2014 (UTC))[reply]

Duplication

Just noticed that the whole section under Urn randomisation is an exact copy of the section before.DanHertogs 15:22, 19 April 2007 (UTC)[reply]

It looks like it happened at this edit. I'll undo it. Good catch. Burlywood 18:42, 19 April 2007 (UTC)[reply]

Content inaccurate?

Challenging content: in the intro paragraphs, I don't believe randomisation ensures equal allocation; you can still have unequal allocation of confounding factors if you are unlucky. Do others agree? (I have made no edits.)

For any fixed covariate, the law of large numbers suggests that the method of random assignment has balance on that covariate, and large-deviation arguments suggest that the probability of great imbalance is very small. With lots of covariates (with complicated dependencies also), it is hard to say, but it is certainly plausible that randomization will fail to balance some covariates (say of 40 or more covariates, each of which some researcher may suspect of being confounding). In fact, there is substantial concern about covariate imbalance: See recent issues of Statistical Science, e.g. by Rosenberger or Paul Rosenbaum, etc.

I changed the introduction, to avoid the problem indicated by the previous editor (while still keeping the main message, that randomization is a good thing)! Kiefer.Wolfowitz (talk) 22:41, 27 June 2009 (UTC)[reply]
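For anyone who wants to see the covariate-imbalance point numerically, here is a rough simulation sketch. It is only an illustration under loud assumptions that are not from the discussion above: covariates are independent standard-normal values with known unit variance, arms are of equal size, and "imbalance" means a two-sided z-test on the difference in arm means exceeding 1.96. The function name `any_imbalance_rate` and all parameter values are hypothetical.

```python
import random


def any_imbalance_rate(n_per_arm, n_covariates, n_trials, seed=0, z_crit=1.96):
    """Estimate how often at least one covariate looks imbalanced.

    Assumptions (illustrative only): independent N(0, 1) covariates with
    known unit variance, equal arm sizes, simple random assignment.
    """
    rng = random.Random(seed)
    n_total = 2 * n_per_arm
    # Standard error of the difference in arm means for unit-variance data.
    se = (2.0 / n_per_arm) ** 0.5
    flagged = 0
    for _ in range(n_trials):
        # Randomly assign half of the subjects to the treatment arm.
        treated = set(rng.sample(range(n_total), n_per_arm))
        for _ in range(n_covariates):
            values = [rng.gauss(0.0, 1.0) for _ in range(n_total)]
            sum_t = sum(v for i, v in enumerate(values) if i in treated)
            sum_c = sum(values) - sum_t
            z = (sum_t - sum_c) / n_per_arm / se
            if abs(z) > z_crit:
                flagged += 1
                break  # one imbalanced covariate is enough for this trial
    return flagged / n_trials


if __name__ == "__main__":
    for k in (1, 10, 40):
        rate = any_imbalance_rate(n_per_arm=50, n_covariates=k, n_trials=200)
        print(f"{k:2d} covariates: estimated P(at least one imbalance) = {rate:.2f}")
```

Under these assumptions each single covariate is flagged about 5 percent of the time, so with 40 independent covariates the chance of at least one flag is roughly 1 - 0.95^40, or about 0.87. Real covariates are correlated, so this is a sketch of the idea rather than a statement about any particular trial.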

How random is random, and does it matter?

Should there be a section discussing HOW random the random allocations are? There are big statistical differences between sampling a stochastic process, using a pseudorandom number generator with adequate cycle length, and calling RAND() in Excel ... And what impact on Phase I, II or III trials might these differences have? Is it standard practice for study designers to describe randomization techniques? Daen (talk) 14:17, 19 November 2008 (UTC)[reply]

Clusters and correlation as a difficulty

This article should mention the difficulties associated with having to randomise clusters of individuals. An example is when a single intervention must be used for all subjects at a particular location for some reason of practicality, e.g. interventions are methods of care which are randomised to clinics or hospitals, therefore patients are clustered. I've just created cluster randomised controlled trial to delve into that topic more deeply, as time permits. Tayste (edits) 22:56, 21 February 2010 (UTC)[reply]

This is actually a challenging topic. I suggest referencing the "status quo" methods with a real-world example for these complex topics (clusters and correlation). ANOVA is often used to ask such questions: "what is the measured variance within a group versus the measured variance between groups?" Even with these groupwise statistical tests, the problem is at least as hard as figuring out a way to "measure" the similarity between a single patient sample observation and the "expected" observation. The more criteria we add, the more we run into multiple-hypothesis problems (Type I errors). In a nutshell: even the simple problem is hard and becomes much harder when you consider the vast number of ways to "measure" the "distance" between two patient "samples".

Correlation is a very broad topic. Are you trying to correlate (and thereby cluster) patient samples, study features, or whole populations? I have had to review these issues as part of an informatics doctoral thesis. The more complex the method, the less likely it is to be adopted in an RCT (or clinical practice).

http://web.psych.unimelb.edu.au/jkanglim/ANOVAandMANOVA.pdf http://en.wikipedia.org/wiki/Anova http://en.wikipedia.org/wiki/F-test http://en.wikipedia.org/wiki/Crossover_study

Sample size calculation

Should the article have a section on how to estimate the sample size that would be necessary in order to detect an effect of a given size? Or is that covered somewhere else in WP? Tayste (edits) 23:14, 21 February 2010 (UTC)[reply]

Opening paragraph

The brackets in the opening paragraph make the introduction to this article read really, really badly. I'm going to try to make it a bit better. Tkenna (talk) 00:22, 6 May 2012 (UTC)[reply]

I've made a few further revisions to conserve some key epidemiological concepts, such as the experimental character of this study design. On a separate point, I feel many of the distinctions made in the last paragraph of the lead (i.e. "The terms "RCT" and randomized trial are often used synonymously, but some authors distinguish between "RCTs" which compare treatment groups with control groups not receiving treatment (as in a placebo-controlled study), and "randomized trials" which can compare multiple treatment groups with each other.[3] RCTs are sometimes known as randomized control trials.[4] RCTs are also called randomized clinical trials or randomized controlled clinical trials when they concern clinical research;[5][6][7]") are largely dated or WP:UNDUE (e.g. ref. 3). Consensus to reframe this paragraph and move some of the referenced material out of the lead? --MistyMorn (talk) 15:55, 6 May 2012 (UTC)[reply]

As I said on your talk page, I appreciate your edits on this article. I think it reads much better now while retaining the key points. As far as reframing the paragraph goes, you'll have no opposition from me; you're clearly competent with these matters. Regards, Tkenna (talk) 16:24, 6 May 2012 (UTC)[reply]

This is one of the most important medical articles in Wikipedia. People can't understand medicine without understanding randomized, controlled trials. If people can't understand the introduction, they'll never get through the rest of the entry.

And yet this reads like an academic paper, written for people who already know what a randomized controlled trial is, written to show off how many polysyllabic medical terms the writer knows. It defines "randomized controlled trial" with the term "clinical trial." If a reader doesn't know what a randomized controlled trial is, they're not likely to know what a clinical trial is either. Before you use a term like "clinical trial," you have to explain what it is. In fact, most readers who don't know what a randomized controlled trial is won't know what a "scientific experiment" is either. This introduction has to be completely rewritten in plain English, preferably with one or more WP:RSs to support it. I would suggest seeing how professional writers, like New York Times reporters, have done it, rather than trying to create a definition out of your own head.

WP:NOTJOURNAL "Scientific journals and research papers. A Wikipedia article should not be presented on the assumption that the reader is well versed in the topic's field. Introductory language in the lead and initial sections of the article should be written in plain terms and concepts that can be understood by any literate reader of Wikipedia without any knowledge in the given field before advancing to more detailed explanations of the topic. While wikilinks should be provided for advanced terms and concepts in that field, articles should be written on the assumption that the reader will not or cannot follow these links, instead attempting to infer their meaning from the text." --Nbauman (talk) 06:31, 24 March 2014 (UTC)[reply]

Randomized clinical trials

RCT may also refer to randomized clinical trials. How can we incorporate this information? And which article would be the better place for it? Is this the right article to put this information in? --Abhijeet Safai (talk) 06:12, 25 November 2012 (UTC)[reply]

An RCT or a RCT?

In the article a RCT and an RCT are used concurrently. Which one should be used? --130.229.5.232 (talk) 14:57, 30 April 2015 (UTC)[reply]

I vote for "an RCT" as when reading, I spell out the initials rather than the words they stand for. Maybe there's some guidance in the WP:MOS somewhere? Tayste (edits) 23:29, 30 April 2015 (UTC)[reply]

This APA style page suggests "to use your ears (how the acronym is pronounced), not your eyes (how it's spelled)". Tayste (edits) 23:35, 30 April 2015 (UTC)[reply]

Placebo versus Wait List - Serious Error on Page

Currently this excellent article contains a serious error. It states that "groups receiving the experimental treatment are compared with control groups receiving no treatment (a placebo-controlled study) or a previously tested treatment (a positive-control study)."

A no treatment or wait list group is not a placebo group. A wait list group is a group that is treated (for ethical reasons) after the wait period. By way of contrast, a placebo group is a group that receives an inert substance or treatment, not no treatment. This engages the placebo response, the treatment effect of receiving something that elicits the expectation effect. The placebo effect is substantial, eliciting improvement of over 30% in most studies.

One other minor quibble; in my field (behavioral psychology) what the authors refer to as "a positive-control study" is usually called an "active treatment." Grenheldas (talk) 04:08, 6 May 2016 (UTC)Grenheldas[reply]

Dr. Peters's comment on this article

Dr. Peters has reviewed this Wikipedia page, and provided us with the following comments to improve its quality:

General comment:
The article mostly deals with medical RCTs, which is of course fine since most RCTs have taken place in the medical sector. Yet in some parts, for example external validity, arguments are made that do not apply to medical studies but to social science studies. It would thus be better to structure the article in a way that makes this more obvious. Two solutions are possible: either address the differences between medical and social science RCTs at the beginning and then discuss social and medical RCTs side by side in each section, or split the article into two parts, with medical RCTs at the beginning and a section on social science RCTs at the end (as is done to some extent at the moment, but more clearly).
Update to section 10 (Randomized controlled trials in social science): RCTs have recently gained attention in the social sciences. In the field of economics, for example, a shift from theoretical studies to empirical work, particularly experiments, can be noted over the last decades (Hamermesh 2013). While the method is the same as in medical research, conducting RCTs in order to evaluate policy measures differs from medical RCTs when it comes to implementation. Several researchers have discussed these issues, which include, for example, choosing the right level of randomization, data collection, or alternative randomization techniques (see, for example, Glennerster and Takavarasha 2013 or Duflo et al. 2008). Although RCTs have improved the internal validity of studies in the social science disciplines by minimizing selection bias in the last decade, they struggle with external validity, also in comparison to medical RCTs, since issues like general equilibrium effects do not occur in medical RCTs. A recent systematic review by Peters, Langbein and Roberts (2016) analyzed 94 articles published in top economics journals between 2009 and 2014 and found that a majority of studies do not take external validity issues into account properly.
Update to section 10.2 (International development): RCTs have been applied to a number of topics throughout the world. A prominent example is the PROGRESA evaluation in Mexico, where conditional cash transfers were found to be beneficial on a number of levels for rural families and, based on the results of the RCT, the government introduced this as a policy (studies using PROGRESA are, among others, Attanasio et al. (2012) or Gertler (2004)). Other domains with evidence from a large array of interventions in developing countries include, among others, health (for example Miguel and Kremer 2003 or Dupas 2014), the (micro-)finance sector (for example Tarozzi et al. (2014) or Karlan et al. (2014)) and education (for example Das et al. (2013) or Duflo et al. (2011, 2012)).
Update to section 10.4 (Education): One of the first RCTs in social science worldwide was the STAR experiment, which was started in 1985 and designed to determine the effect of small classes on short- and long-term pupil performance (for example Chetty et al. 2011).
Literature:
Attanasio, O., Meghir, C. and Santiago, A. (2012). "Education Choices in Mexico: Using a Structural Model and a Randomized Experiment to Evaluate PROGRESA", Review of Economic Studies, 79(1): 37-66.
Chetty, R., Friedman, J. N., Hilger, N., Saez, E., Whitmore Schanzenbach, D. and Yagan, D. (2011). "How Does Your Kindergarten Classroom Affect Your Earnings? Evidence from Project Star", Quarterly Journal of Economics, 126(4): 1593-1660.
Das, J., Dercon, S., Habyarimana, J., Krishnan, P., Muralidharan, K. and Sundararaman, V. (2013). "School Inputs, Household Substitution, and Test Scores", American Economic Journal: Applied Economics, 5(2): 29-57.
Duflo, E., Dupas, P. and Kremer, M. (2011). "Peer Effects, Teacher Incentives, and the Impact of Tracking: Evidence from a Randomized Evaluation in Kenya", American Economic Review, 101(5): 1739-1774.
Duflo, E., Glennerster, R. and Kremer, M. (2008). "Using randomization in development economics research: a toolkit", in (P. Schultz and J. Strauss, eds.), Handbook of Development Economics: 3895-3962, Amsterdam: North Holland.
Duflo, E., Hanna, R. and Ryan, S. P. (2012). "Incentives Work: Getting Teachers to Come to School", American Economic Review, 102(4): 1241-1278.
Dupas, P. (2014). "Short-Run Subsidies and Long-Run Adoption of New Health Products: Evidence from a Field Experiment", Econometrica, 82(1): 197-228.
Gertler, P. (2004). "Do Conditional Cash Transfers Improve Child Health? Evidence from PROGRESA’s Control Randomized Experiment", American Economic Review, 94(2): 336-341.
Glennerster, R. and Takavarasha, K. (2013). "Running randomized evaluations – a practical guide", Princeton University Press: Princeton and Oxford.
Hamermesh, D.S. (2013). "Six Decades of Top Economics Publishing: Who and How?", Journal of Economic Literature, 51(1): 162-172.
Karlan, D., Osei, R., Osei-Akoto, I. and Udry, C. (2014). "Agricultural Decisions after Relaxing Credit and Risk Constraints", Quarterly Journal of Economics, 129(2): 597-652.
Miguel, E. and Kremer, M. (2003). "Worms: Identifying Impacts on Education and Health in the Presence of Treatment Externalities", Econometrica, 72(1): 159-217.
Peters, J., Langbein, J. and Roberts, G. (2016). "Policy Evaluation, Randomized Controlled Trials, and External Validity – A Systematic Review", Economics Letters, forthcoming. Discussion paper version published as: Ruhr Economic Papers 589: RWI.
Tarozzi, A., Mahajan, A., Blackburn, B., Kopf, D., Krishnan, L. and Yoong, J. (2014). "Micro-loans, Insecticide-Treated Bednets, and Malaria: Evidence from a Randomized Controlled Trial in Orissa, India", American Economic Review, 104(7): 1909-1941.

We hope Wikipedians on this talk page can take advantage of these comments and improve the quality of the article accordingly.

We believe Dr. Peters has expertise on the topic of this article, since he has published relevant scholarly research:

Reference : Gunther Bensch & Jorg Peters, 2012. "A Recipe for Success? Randomized Free Distribution of Improved Cooking Stoves in Senegal," Ruhr Economic Papers 0325, Rheinisch-Westfalisches Institut fur Wirtschaftsforschung, Ruhr-Universitat Bochum, Universitat Dortmund, Universitat Duisburg-Essen.

ExpertIdeasBot (talk) 16:05, 24 August 2016 (UTC)[reply]

Krauss article referenced under "Disadvantages"

The article entitled "Why all randomised controlled trials produce biased results" - which has been repeatedly added despite being taken down several times - is riddled with errors and not a wise addition to this page. It has been widely panned by experts in trial methodology for several deeply inaccurate and misleading statements. Specifics include:

- complete misunderstanding of the purpose and necessity of achieving "balance" in trials, ignoring extensive historical literature on the subject from world leaders in this area:

Altman DG. Comparability of randomised groups. The Statistician 1985; 34, 125-136.

Senn SJ. Testing for baseline balance in clinical trials. Statistics in Medicine 1994; 13:1715–1726

Senn SJ. Baseline balance and valid statistical analyses: common misunderstandings. Applied Clinical Trials 2005; 14:24–27.

Senn SJ. Seven myths of randomization in clinical trials. Statistics in Medicine 2013; 32: 1439-1450.

- suggestion of "re-randomisation" in pursuit of greater balance that (putting aside the above statement about misunderstanding the need for balance) would be impossible to implement for the majority of RCTs (most major trials take several years to enroll their full study cohort; it is neither feasible nor desirable to wait until the entire cohort is enrolled to begin treating patients) and also ignores the purpose of "randomisation" as well as better strategies to achieve balance such as covariate-adaptive randomisation and minimisation

- discussion of "simple-treatment-at-the-individual-level" limitation ignores evolutions in trial design, such as I-SPY 2, that allow multiple comparisons of complex treatment combinations across and within specific patient subgroups. Appears to be entirely unaware of advancing literature in this area.

- discussion in "small sample bias" section makes a hilariously wrong statement about probabilities:

"An example is that the stroke trial with 624 participants reports that at 3 months after the stroke, 54 treated patients died compared to 64 placebo patients. This main outcome is the same likelihood as getting 10 more heads than tails by flipping a coin 624 times. "

This is not even close to being correct.

The coin-flip scenario refers to the probability of getting 317 heads in 624 tosses of a fair coin, which is a relatively simple problem to compute based on a series of Bernoulli trials with p=0.5 for a head on a single toss; the probability of getting exactly 317 heads in 624 tosses is about 2.9 percent and the probability of getting 317 or more heads is about 35.9 percent.

The probability of the observed results in the stroke trial is a more complex calculation with several additional parameters to estimate: the probability that we would observe the event rate in one treatment arm (54 deaths in 312 patients) versus the event rate in the other treatment arm (64 deaths in 312 patients) under the null hypothesis that the two arms share an unknown success probability (and that’s before we account for the timing of events as well).

If one must have a simplified analogy, it is somewhat closer to the probability of getting 54 sixes on 312 rolls of the treatment die versus 64 sixes on 312 rolls of the placebo die; although still not quite correct, that would have been a much closer description.

It is highly distressing to see such a misunderstanding of the probabilities used to assess trial results on display in an article by someone concerned with improving trial methodology.

- discussion of the unique time period assessment ignores statistical techniques specifically designed to analyze unequal follow-up time

- discussion of the background-traits-remain-constant assumption ignores substantive literature on mediation analysis in clinical trials to determine whether changes in participant behaviors explain the presence or absence of a treatment effect

- discussion of "average treatment effects limitation" ignores the existence of adaptive-enrichment designs and other innovations in trial design that derive personalized estimates for efficacy from a larger trial's results.

- misleading statement about a trial's results only "generalizing to 3 percent of the US population" - the trial in question was targeted at a specific population of people at high risk for diabetes; there is no concern whether the findings apply to newborn infants, kindergartners or 99-year-olds (all of whom are also in the "US population" but have no need for this trial program). This is like criticizing a breast cancer trial's results for not being generalizable to men. — Preceding unsigned comment added by 128.147.197.37 (talk) 20:30, 14 May 2018 (UTC)[reply]
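As a rough check on the "small sample bias" point above, the two probabilities can be contrasted with a short sketch. This is an illustration only, not the trial's actual analysis: the pooled two-proportion z-test is one simple stand-in chosen here as an assumption, it ignores the timing of events, and the function names are hypothetical. Only the counts (624 tosses, 317 heads, 54 and 64 deaths in arms of 312) come from the comment above.

```python
from math import comb, sqrt
from statistics import NormalDist


def coin_flip_probabilities(n_tosses, n_heads):
    """Return (P(exactly n_heads), P(at least n_heads)) for a fair coin."""
    total = 2 ** n_tosses
    exact = comb(n_tosses, n_heads) / total
    at_least = sum(comb(n_tosses, k) for k in range(n_heads, n_tosses + 1)) / total
    return exact, at_least


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Assumption: a normal approximation under the null hypothesis that both
    arms share one unknown event probability. Event timing is ignored.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Coin analogy: 10 more heads than tails in 624 tosses means 317 heads.
    exact, at_least = coin_flip_probabilities(624, 317)
    print(f"P(exactly 317 heads) = {exact:.4f}")
    print(f"P(317 or more heads) = {at_least:.4f}")

    # Trial comparison: 54/312 deaths (treated) versus 64/312 (placebo).
    z, p_value = two_proportion_z_test(54, 312, 64, 312)
    print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
```

The first function reproduces the coin-flip figures quoted above (about 2.9 percent and 35.9 percent). The second answers a different question with a different null model, which is the point being made: the coin-flip number is not the probability relevant to the trial comparison.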

Edit warring warning issued: User talk:128.147.197.37#May 2018

Please provide citations backing up your claim above that "It has been widely panned by experts in trial methodology for several deeply inaccurate and misleading statements". You also may wish to review our pages on WP:OR and WP:EW. --Guy Macon (talk) 22:45, 14 May 2018 (UTC)[reply]

It's been criticized in a letter, here. JuanTamad (talk) 02:52, 15 May 2018 (UTC)[reply]

I have looked at the letter that JuanTamad linked, and at the original piece. The issues raised in the original piece seem to have been discussed more carefully by earlier and more prominent scholars, and without arriving at the conclusions articulated in this piece and in the accompanying text currently in the Wikipedia article. The inclusion of this piece would appear to violate Wikipedia guidelines on false balance and due weight, WP:DUE and WP:FALSEBALANCE. Thus I don't think the original piece merits inclusion in this Wikipedia article. I suggest that the paragraph be removed. Fanyavizuri (talk) 15:19, 16 May 2018 (UTC)[reply]

I think the Krauss article might be worth a mention, along with a statement that it has been heavily criticized (in the letter so far and in tweet threads) as exaggerating the disadvantages and being incorrect or misleading by people considered more expert. JuanTamad (talk) 03:56, 17 May 2018 (UTC)[reply]

I think I agree that the correct points in the article are worth mentioning, but I differ in that I think those ones should be attributed to Senn, Fisher, Imbens, and other original scholars in this area, not to this article that misstates and confuses those correct points. The incorrect ones are not worth mentioning, and the article isn't newsworthy enough itself (no major media coverage by NY Times or something like that) to make it deserve mention for purely news reasons. Is there a reason why you think this article is especially worth mentioning, other than that someone tried to put it in this wikipedia page? Many thanks for the constructive conversation. Fanyavizuri (talk) 12:51, 17 May 2018 (UTC)[reply]

Notably, in relation to the paragraph that is currently in the wikipedia page, perhaps the most concrete point (about average effects) was made by Cox in the 1950s. I would be happy to suggest alternative text making that point. The substance of the underlying journal article here also concerns the CONSORT guidelines, which (as they should be) are already prominently mentioned at the top of the wikipedia page. So I suggest that we make sure the Cox point on average effects is made and attributed appropriately, and that the rest of this be removed. Does that make sense? Fanyavizuri (talk) 14:08, 17 May 2018 (UTC)[reply]

Hopefully I edit this page correctly, as I am still new to this: the additional points here are well-made and more calmly reasoned than I was initially :) As noted, the Krauss article makes a handful of correct points, but those are significantly better described in prior work. The piece also makes a number of confused or outright incorrect points (Stephen Senn and Douglas Altman, two giants of the field, have written multiple articles discussing balance and randomisation; the author cited none of those and claimed that the issue was "not extensively discussed"). That, along with a catchy-clickbait-style title, has led many uncritical readers - who may not be expert enough to spot & understand the article's flaws - to the highly misleading conclusion that "all trials are biased" from the title alone. I echo one of the earlier comments - the article appears to be here only because someone tried to add it, presumably thinking it would get this article additional visibility and enhance someone's reputation. In support of that point, at least 2 well-known physician-researchers have posted on Twitter, revealing that the author (who knows neither of them personally) sent unsolicited emails asking them to promote his article to their Twitter followings:

https://twitter.com/hswapnil/status/996000139008356353

https://twitter.com/VinayPrasadMD/status/996048086492327936

In summary, the Krauss article ignores and fails to cite significant prior works in the field; it makes several incorrect statements that are easily contradicted by the prior work; and it seems to me a blemish on what is otherwise a very nice page explaining many of the complexities and nuances of RCTs. For one more example, the author of this piece discussed the need for re-randomization in pursuit of "balance" but entirely ignored the existence of covariate-adaptive randomization, which is a superior and more practical method if one wants to enforce balance on covariates (again, this is not a necessary condition for valid inference, as extensively discussed by both Altman and Senn in the references above). There are many prior references on this; the Wikipedia page contains a description of it; and the author of this piece ignored or was unaware of this. 128.147.197.37 (talk) 14:12, 18 May 2018 (UTC)[reply]

I believe I agree with the substance of the comment above by 128.147.197.37. (One suggestion to 128.147.197.37: type four tilde characters at the conclusion of your comment, and it will automatically sign the comment with your IP address or username and date in a way that is easy for the rest of us to see). So I think I am in agreement with 128.147.197.37 on substance, and I may be in agreement with JuanTamad on the need to make the substantive and correct points (one of which may have been absent from the wikipedia page before), but I don't know whether JuanTamad has come to what I believe to be the consensus that 128.147.197.37 and I share: that the new "catchy-clickbait-style title" article itself (as 128.147.197.37 pretty accurately describes its title) should not be cited in the discussion in this wikipedia page. I could make such an edit, were we all to agree on those points. Fanyavizuri (talk) 13:54, 18 May 2018 (UTC)[reply]

The Krauss article on which text in the Wikipedia article is based is ill-informed and completely invalid. It will not only confuse the readers of the Wikipedia article but will cause harm because readers will be left with the idea that rigorous randomized experiments have shortcomings that they simply do not have. Krauss has no training that qualified him to write the Annals of Medicine article in the first place. For blatant misunderstandings to be published when virtually all of us who have been doing research on clinical trials methodology for decades fundamentally disagree with Krauss is hard to understand. It is wrong for Wikipedia to perpetuate ideas that should never have been accepted in a peer-reviewed journal in the first place. Harrelfe (talk) 13:34, 19 May 2018 (UTC)

It isn't our place to decide that what a peer-reviewed journal publishes is wrong. We are to report what the sources say. Find a source that supports your claim that the Krauss article is "ill-informed and completely invalid".

"If Wikipedia had been available around the fourth century B.C., it would have reported the view that the Earth is flat as a fact and without qualification. And it would have reported the views of Eratosthenes (who correctly determined the earth's circumference in 240BC) either as controversial, or a fringe view. Similarly if available in Galileo's time, it would have reported the view that the sun goes round the earth as a fact, and Galileo's view would have been rejected as 'original research'. Of course, if there is a popularly held or notable view that the earth is flat, Wikipedia reports this view. But it does not report it as true. It reports only on what its adherents believe, the history of the view, and its notable or prominent adherents. Wikipedia is inherently a non-innovative reference work: it stifles creativity and free-thought. Which is a Good Thing." --WP:FLAT

--Guy Macon (talk) 15:27, 19 May 2018 (UTC)

You apparently did not see the reference above (https://www.bmj.com/content/361/bmj.k1561/rr) to the letter to the editor that has been published about this article. The letter is written by individuals who have studied randomized clinical trials in detail. Though one might legitimately disagree about claims against published literature in general, the article in question has been refuted by two of the premier medical statisticians in the world, Altman and Senn, whose credentials are impeccable. The quality of Krauss' article is the same as an article on economics that I as a biostatistician would write. Harrelfe (talk) 19:34, 19 May 2018 (UTC)

Summarizing. Harrelfe and 128.147.197.37 have clearly articulated, based on reading both the Krauss article and the followup by Altman and Senn, and on their own expert knowledge applied in that reading, that the Krauss article should not be cited here. I have looked at both pieces, and agree with their assessments. Guy Macon asked whether the quality of the Krauss article had been assessed in a peer-reviewed context, and that answer (the Altman and Senn piece) was supplied, so that a stance on the Krauss article's quality is not Original Research. JuanTamad raises the issue that if something meaningful is in the article, it could be reflected in this Wikipedia page. I think the only thing to be sure about is the Cox (1950s) average effect point, for which it would be inappropriate to make Krauss the citation. Wikipedia, via its guidelines on WP:NPOV, does have a stance on giving undue weight to arguments, and on representing perspectives on a topic in proportion to those perspectives' traction outside Wikipedia. The scholarly exchange we witness between Krauss, Altman, and Senn makes clear that the field of statistics has not taken Krauss' views on board. Thus, neither should Wikipedia. I move that the paragraph referencing the Krauss piece be removed, and that, if people like, the relevant Cox citation be added to the article, either in the same place or in a more appropriate place. I am happy to do both of these things. It is healthy that we have had this discussion, so that it is not a matter of who clicked last in the edit war (and that this isn't entirely decided by the apparent sockpuppetry by the initial contributor of the Krauss content). But having said that: have all key points been addressed, so that this may proceed, or is there uncertainty about (a) the existence of the Altman and Senn article; (b) the stature of Altman (or Senn) in the field; (c) anything else? Fanyavizuri (talk) 20:32, 19 May 2018 (UTC)

(Note: the following was posted on my talk page by someone unfamiliar with Wikipedia and with where we discuss things. I am moving it here. --Guy Macon (talk) 14:20, 20 May 2018 (UTC))

A comment for the wiki talk:

I am not familiar with how wiki talks work - I just read wiki articles. A colleague told me about this wiki talk that you have contributed to: https://en.wikipedia.org/wiki/Talk%3ARandomized_controlled_trial#Krauss_article_referenced_under_"Disadvantages". It is worth mentioning that Krauss (the author of the article) responded to the response by Senn and Altman as seen here https://www.bmj.com/content/361/bmj.k1561/rr-0 Also, the Krauss article was peer-reviewed and published in the journal Annals of Medicine. The responses on that piece from Senn and Altman (https://www.bmj.com/content/361/bmj.k1561/rr) and in turn from Krauss (https://www.bmj.com/content/361/bmj.k1561/rr-0) and other comments were however not peer-reviewed and are not 'articles' but just replies or comments that anyone can submit and that do not go through any peer review process. I think the reference to Krauss's article should remain on that wiki page, because the sentence from the article included on the wiki page is factual, and because the Krauss article has gone through the peer-review process and been published in the journal Annals of Medicine, while none of the other comments or responses have gone through the peer-review process or been published - they are replies that anybody can submit and have uploaded on BMJ's website. I hope you agree. — Preceding unsigned comment added by 2A02:908:1A7:6D40:E999:52BC:6989:33A9 (talk) 13:33, 20 May 2018 (UTC)

The fact that Krauss answered the letter to the editor is of no consequence. He's still very much mistaken even though he raised a couple of points that are correct. Are you saying that you will trust the research of someone not trained in clinical trials over Senn and Altman who have a combined 80 years of direct experience in researching and writing about clinical trials and have been involved in the conduct of dozens of clinical trials? And the peer-review process at Annals of Medicine was highly defective. Annals of Medicine made a major blunder in accepting a highly inaccurate article in which the author had major misunderstandings about how clinical trials work. That paper would never have been accepted in Annals of Internal Medicine or other major medical journals. I've been directly involved in clinical trials for 40 years myself so that makes 120.

Put another way, articles written about randomized clinical trials by knowledgeable writers in respected journals number more than 10,000. Why should the Krauss article be cited instead of these? And if you want to know about one of the major mistakes Krauss makes, read my blog article explaining why randomized clinical trials are more generalizable than even optimists believe. My qualifications are listed here. Harrelfe (talk) 00:29, 21 May 2018 (UTC)

I guess what gives the Krauss article some credibility is that it's from the London School of Economics, so it probably deserves a mention just for the record, since Wikipedia is sort of the official record, you might say. I think at the end would be good, after everything else. Maybe: "In a provocatively entitled article, ....." and in the same sentence link to the letter condemning the article and anything else that might appear in the future. JuanTamad (talk) 02:27, 21 May 2018 (UTC)

Thanks to Guy Macon and JuanTamad for the careful thought given to this. I still agree with Harrelfe, but for the additional reasons given by Wikipedia guidelines. As guided by WP:RECENT, Wikipedia can sometimes mistakenly give undue emphasis to a debate as it plays out, rather than to its resolution once it has done so. I agree that the text proposed by JuanTamad is intended to be even-handed, but Wikipedia guidelines on WP:NPOV would still lead me to believe that this would give relatively undue weight to the arguments involved. WP:NPOV guides us that we should represent perspectives on a topic in proportion to those perspectives' traction outside Wikipedia. The scholarly exchange we witness between Krauss, Altman, and Senn makes clear that the field of statistics has not taken Krauss' views on board. Thus, neither should Wikipedia. If this eventually becomes a bigger scholarly methods debate in this journal or elsewhere, Wikipedia should reflect that. It clearly hasn't become this yet. The article under discussion has been cited exactly twice, and I think only by the critique and its reply. These represent the sum total of the citations of that author recorded by Google Scholar. Senn's other work has been cited more than ten thousand times. Altman, 300,000. Our continued conversation on the even-handedness that is necessary here is inherently misrepresenting the state of scholarship in the field of statistics. I think that we should remove any text about this piece now, and wait a few months to see whether this gains traction. I have seen this waiting approach used in the context of other active debates. There is no reason to cover it in the encyclopedia entry until the merits and importance of the debate have become clear, which takes a bit more time.
So, on the grounds of WP:NPOV and WP:RECENT, I think we should strike the current paragraph and wait to see what unfolds rather than report in a potentially unbalanced way on a discussion of as-yet-unknown importance in scholarly journals. (I think the importance is known, and is low, but I don't see Guy Macon and JuanTamad concurring with that as a consensus view, unless I am mistaken. So I am proposing removal of the text on procedural rather than substantive grounds, even though I think substantive grounds are justification enough, as I think do Harrelfe and 128.147.197.37.) Fanyavizuri (talk) 13:40, 21 May 2018 (UTC)

I agree with removing the text for now, as it gives undue weight to Krauss, and with waiting as proposed by Fanyavizuri. JuanTamad (talk) 10:45, 22 May 2018 (UTC)

It seems that consensus has been building in my time away this weekend; just a few more things to add, broadly related to appeals-to-authority and how that could/should inform the discussion (and thanks to those that have linked Wiki's policies on some of these things, as by my own admission I am new to Wiki's editing process). First, re: the above comment that the Krauss article was peer-reviewed while Senn and Altman's subsequent letter was not, it is reasonable to point out that peer-reviewed manuscripts ought to hold more weight than one person's opinion; my counter would be that Senn and Altman have a lengthy history of peer-reviewed publications in this area, much of which was (inexcusably) ignored by Krauss in writing his article, which directly explain many of the points that he states incorrectly (again, these are factual inaccuracies, not differences of opinion). I cited a few of these publications in my first comment above. This is likely why Harrelfe (one of the world's leading experts on clinical trials) expressed incredulity at the Annals of Medicine's peer-review process which allowed this into print; Guy Macon, you've pointed out (fairly) that it is not Wikipedia's job to determine what should pass peer-review, but it is worth noting that now three of the leading experts in the history of this field (Senn, Altman, and now Harrell) have publicly expressed a disbelief that this article made it past peer-review.
Also - if the original author is given some credibility by association with the London School of Economics (as a post-doctoral researcher), surely that is counterbalanced by the high credentials of Senn (professor for many years at University College London and the Luxembourg Institute of Health, author of three textbooks on clinical trials & statistical issues in drug development), Altman (professor of statistics in medicine at the University of Oxford, founder of the Equator Network for health research reliability, chief statistical advisor to the British Medical Journal, has statistical methods named after him), and Harrell (founding chair of the department of biostatistics at Vanderbilt, statistical editor of multiple journals, consultant to the FDA on clinical trial design and methodology, also has statistical methods named after him). In truth, I would prefer to avoid appeals-to-authority, but focus on whether the content is meritorious (regardless of the source) - if one agrees with me on that, please also note that the content of the Krauss article has essentially been pre-emptively "responded" to in dozens of prior articles by Senn, Altman, Harrell, and others. It is not their fault that a young scholar, perhaps eager to make a name for himself, simply missed or ignored their works in the area and plunged ahead with a provocatively titled article, then shoved it onto Wikipedia in the hopes he could garner more clicks. 128.147.197.37 (talk) 12:48, 22 May 2018 (UTC)

Looks like we're ready to strike the relevant paragraph on at least the basis of undue weight and waiting, if not also substance. I think this agreement is common to JuanTamad, Harrelfe, 128.147.197.37, and myself. (We haven't heard from Guy Macon over the weekend, but does this seem right?) I'll go ahead (or anyone else could go ahead) and do this in the next day or so if there are no objections. I again thank everyone involved for the calm and constructive tone of the discussion. Fanyavizuri (talk) 14:07, 22 May 2018 (UTC)

I trust the judgement of my fellow editors on this. Whatever the consensus is, I am good with it. --Guy Macon (talk) 20:21, 22 May 2018 (UTC)

Done. Fanyavizuri (talk) 20:51, 22 May 2018 (UTC)

Thank you -Frank Harrell — Preceding unsigned comment added by Harrelfe (talk • contribs) 00:34, 23 May 2018 (UTC)

A separate issue is that, if you look at the original posting user's contribution history, you'll find that all of his edits involve inserting citations to articles by Alexander Krauss, including this one. Seems like a pretty obvious case of WP:CITESPAM. The user hasn't engaged in any edit warring, but is inserting what are pretty clearly self-references willy-nilly, so I'm not quite sure how to handle this situation. WeakTrain (talk) 17:02, 15 July 2018 (UTC)