Why Most Published Research Findings Are False
This paper by John Ioannidis has become popular amongst a growing ‘truth movement’ and is being used as a general-purpose tool to dismiss mainstream scientific papers regardless of their content.

However, the paper uses a fallacious argument based upon a common misconception of probability theory and is therefore highly misleading.
Ioannidis tries to claim that the probability of a research claim being true is dependent upon the papers referenced, which in turn have a certain probability of being true. Now if any of these references is false then the conclusions drawn from them are likely to be false. This is fair enough but the formalisation in terms of probability theory is highly flawed and attempts to quantify the likelihood of a paper being ‘true’ are doomed to failure.
From the very start of the paper we have:
“As has been shown previously, the probability that a research finding is indeed true depends on the prior probability of it being true (before doing the study), the statistical power of the study, and the level of statistical significance (Wacholder et al)”.
So there is a ‘prior probability’ of a study being true even before the study has been performed, even before any data has been produced! It seems that the probability of it being true is independent of the actual outcome of the study! How to interpret this?
“The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field.”
This sentence has literally no sensible interpretation and hence the rest of the paper is meaningless.
What is meant for example by the “probability that a research claim is true”?
Nothing. ‘Probability’ applies to the outcomes of random events but a “research claim” is not a random event, it is either true or false and that is that.
If I toss an unbiased coin repeatedly, individual outcomes are unpredictable but in the long term we will see heads about half of the time, so that we can say that the “probability of heads is 0.5”. This is not true of a research claim though, as there is no repeated experiment and no random event here: the paper claims the same thing every time you look at it.
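The long-run behaviour of the coin described above can be illustrated with a small simulation. This is only a sketch: the function name `heads_frequency`, the fixed seed and the toss counts are choices made for this illustration, and a pseudo-random generator stands in for a physical coin.

```python
import random


def heads_frequency(n_tosses, seed=0):
    """Fraction of heads in n_tosses simulated tosses of an unbiased coin."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


# Individual runs are unpredictable, but the frequency settles near 0.5
# as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, heads_frequency(n))
```

Re-reading a research claim has no equivalent of the loop above: there is nothing to repeat and nothing that varies between repetitions.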
Ioannidis has a point to make but the mathematical formulation is incorrect and leads to a nonsense result.
For example:
One way to formalise this is to say “I have 100 papers and I know that 10 of them are false and I pick one at random from a pile to use as a citation. Now I know that the probability of picking a false paper is 10%”.
This is correct but nobody picks papers at random like this.
In addition, if I were to add another 100 correct papers to the pile then the probability of me picking a false paper is now only 5% (10 out of 200) – the probability has changed! Ioannidis admits this: “The probability that a research claim is true may depend on ... the number of other studies on the same question”.
This should be a red flag. The probability of your own paper being correct will change over time depending upon how many other studies have been published by other people in the meantime, whether you cite them or not!
By the time you publish your paper it may be less likely to be true than when you just finished writing it, even though nothing in your paper has changed!
This is too confusing for words and all stems from the incorrect formulation in the first place. Similar language indicating similar misconceptions is used throughout the paper.
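The pile-of-papers arithmetic above can be written out as a short sketch. The function name `prob_false_pick` is made up for this illustration; the numbers are the ones used in the example (10 false papers in a pile of 100, then another 100 correct papers added).

```python
from fractions import Fraction


def prob_false_pick(n_false, n_total):
    """Probability of drawing a false paper when picking uniformly at random."""
    return Fraction(n_false, n_total)


original_pile = prob_false_pick(10, 100)        # 10 false papers out of 100
enlarged_pile = prob_false_pick(10, 100 + 100)  # same 10 false, 100 correct added

print(float(original_pile))  # 0.1
print(float(enlarged_pile))  # 0.05
```

The figure moves from 10% to 5% purely because the pile grew; none of the papers already in it has changed.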
The Wacholder paper
The Ioannidis paper references a paper by Wacholder et al and relies upon it heavily for the basic (flawed) idea. It is from here then that the root misconception originates. The paper is heavily focused on genetic association studies but Ioannidis in his paper tries to imply that the techniques are applicable to all areas of science.
“It can be proven that most claimed research findings are false” – No!
“Most Research Findings Are False for Most Research Designs and for Most Fields” – You may not say this until you have actually read most research findings in most fields.
Assessing the probability that a positive report is false: an approach for molecular epidemiology studies – Wacholder et al
“Classical frequentist statistical theory, which is most commonly taught in applied biostatistics courses, does not specifically address these probabilities. In classical theory, the truth of H0 and HA is considered unknown, not random.” – Wacholder et al
Classical theory is correct. A hypothesis is not a random event – it is a hypothesis and its truth or otherwise is unknown. You can toss a coin and you will get a different result each time but on average you will get heads 50% of the time – this is randomness. A coin toss can be thought of as a random variable and the outcome a random event. On the other hand, you can read a hypothesis as many times as you like and it is always the same; it is not a random variable and reading it does not have a random outcome.
“Therefore, we must go outside classical theory to consider H0 and HA probabilistically.”
The H’s are the hypotheses under consideration but they are not probabilistic entities. There is a reason that classical probability theory is the way that it is. It is correct, rigorous, useful, logical, structured and comprehensible. Good luck with ‘going outside’ this theory.
“We define the prior probability (π) as π = Pr(HA is true).”
This is not a definition; simply writing that one thing is equal to another is not a definition. The problem here is that the right hand side is meaningless as there is no such thing as the ‘probability of a hypothesis being true’.
A coin tossing example
I have a coin and I suspect that it might be biased:
- Null hypothesis: the coin is not biased, Pr(Heads) = Pr(Tails) = 0.5
- Hypothesis of bias: the coin is biased, Pr(Heads) = 0.6, Pr(Tails) = 0.4
- Experiment: Toss the coin a few times and assess the data against the hypotheses
The prior probability then is the probability that my hypothesis is true – but what is this? What is the probability that the coin is biased? We can’t say. We know nothing about the coin as yet and so we can’t even say that it is a half. The truth or otherwise of the hypothesis is clearly unknown. Classical theory is correct.
Experiment 1: Toss the coin twice
Result: HH – Two heads
Calculation from model: The probability of seeing this result given the null hypothesis is 0.5 × 0.5 = 0.25; the probability of seeing it given the hypothesis of bias is 0.6 × 0.6 = 0.36, i.e. greater than under the null hypothesis
Q: What is the probability then that my hypothesis is true?
A: This question does not make sense and there is no answer to it
All we can ever do in general is to calculate the probability of seeing the data we have with respect to one ‘model’ or another. This result makes it seem that the hypothesis of bias is more likely but that is a function of the data seen.
Experiment 2: Toss the coin twice
Result: HT – One head followed by a tail
Calculation: The probability of seeing this result given the null hypothesis is 0.5 × 0.5 = 0.25; the probability of seeing it given the hypothesis of bias is 0.6 × 0.4 = 0.24, i.e. less than under the null hypothesis
The probability of seeing the data we are seeing depends upon the data and the hypothesis. There are an infinite number of hypotheses and each will give a different probability of seeing the same data.
Here we have lost a bit of confidence in our hypothesis but we cannot quantify this. We are always restricted by the fact that we only have a finite amount of data.
To emphasise: we calculate the probability of seeing the data given the hypothesis; not the probability that the hypothesis is ‘true’ as this is simply meaningless; no figure can be attributed to it. In an attempt to validate the hypothesis all we can do is to generate more and more meaningful data to test it against. This will in turn generate more and more confidence as to whether or not this data can be produced by the model but it will never be possible to assign an actual probability of the model being ‘true’.
Again, there may be an infinite number of other hypotheses that will fit the data slightly better than the one we are considering.
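The calculations for Experiments 1 and 2 can be reproduced with a few lines of code. This is a sketch only: the function name `sequence_likelihood` and the dictionary of hypotheses are introduced here for illustration, and the tosses are assumed to be independent.

```python
def sequence_likelihood(sequence, p_heads):
    """Probability of one exact sequence of independent tosses, e.g. "HT"."""
    likelihood = 1.0
    for toss in sequence:
        likelihood *= p_heads if toss == "H" else 1.0 - p_heads
    return likelihood


# The two hypotheses from the coin example, expressed as Pr(Heads).
hypotheses = {"null (unbiased)": 0.5, "bias": 0.6}

for result in ("HH", "HT"):
    for name, p_heads in hypotheses.items():
        print(result, name, round(sequence_likelihood(result, p_heads), 2))
```

Every number this prints is a probability of the data given a hypothesis. Nothing in the calculation produces a probability that either hypothesis is ‘true’.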
Maximum likelihood estimation
Although there are always going to be an infinite number of models (hypotheses) that can be used, it is possible, under certain circumstances, to calculate which of these produces the biggest probability of seeing the data we are seeing. This is the maximum likelihood estimate for the observed data.
“In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed statistical model, the observed data is most probable.” – Wikipedia
Note again that this is not ‘the probability distribution that is most likely to be true’.
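For the coin model, a maximum likelihood estimate can be found by brute force. The sketch below is an illustration under stated assumptions: the ten-toss data string is hypothetical (it is not taken from the experiments above), the names `sequence_likelihood` and `mle_grid` are made up here, and a simple grid search over Pr(Heads) is used instead of calculus.

```python
def sequence_likelihood(sequence, p_heads):
    """Probability of one exact sequence of independent tosses."""
    likelihood = 1.0
    for toss in sequence:
        likelihood *= p_heads if toss == "H" else 1.0 - p_heads
    return likelihood


def mle_grid(sequence, steps=1000):
    """Value of Pr(Heads) on a grid over [0, 1] that maximises the likelihood."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda p: sequence_likelihood(sequence, p))


data = "HHTHHTHHHT"  # hypothetical data: 7 heads, 3 tails

print(mle_grid(data))               # grid-search estimate of Pr(Heads)
print(data.count("H") / len(data))  # closed form for this model: heads / tosses
```

The estimate is the parameter value under which the observed data is most probable. It is still a statement about the data given a model, not a probability attached to the model itself.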
Corollary 5: The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true. – Ioannidis
Well, we know what he means, but this is layman’s language, has nothing to do with probability theory and is impossible to formalise in mathematical terms. Ioannidis is playing mix ‘n’ match by using the word ‘likely’ to suggest a connection with probability theory but without supplying a formal definition.
Consider the coin tossing experiment: are the results any less valid if they are funded by big pharma? The conclusions of an experiment should depend upon the results alone and if scientists are doing anything else then they are at fault in this respect.
The lack of a mechanism
From the language used in the paper it is obvious that Ioannidis has in mind the techniques used by those in the field of genetics who are searching for connections between areas of the genome and either the physical features of organisms or the specifics of cellular activity.
To my knowledge there is no accurate description of a physical mechanism connecting a genome to a physical trait and therefore no statistical model can reasonably be formulated. In the coin tossing example we know how the overall result is achieved as the sum of independent and identically distributed individual coin tosses, but in the field of genetics nobody knows how things work and so such a formal theory is not possible.
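To make concrete what a known mechanism buys us in the coin case, here is a sketch of the model for the number of heads in n independent, identically distributed tosses (the binomial distribution). The function name `binomial_pmf` is chosen for this illustration; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def binomial_pmf(k, n, p_heads):
    """Probability of exactly k heads in n independent tosses with Pr(Heads) = p_heads."""
    return comb(n, k) * p_heads**k * (1 - p_heads) ** (n - k)


# Distribution of the number of heads in two tosses of an unbiased coin.
for k in range(3):
    print(k, binomial_pmf(k, 2, 0.5))
```

This model can only be written down because we know how the total arises from the individual tosses.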
Imagine trying to analyse the results of the coin tossing whilst not knowing anything about coins!
It seems that this is what geneticists are doing though: they are looking at genome sequences, looking at experimental results and then attempting to deduce some sort of causal link between the two, but without mentioning the word ‘causal’. Instead, words like ‘significance’ and ‘p-value’ are used, but without a formal theoretical model this just amounts to confusion. Without a plausible mechanism, it is as scientifically valid to look at connections to astronomical events as it is to enumerate the base pairs of DNA. See: The DNA delusion
References:
Why Most Published Research Findings Are False – Wikipedia article
https://en.wikipedia.org/wiki/Why_Most_Published_Research_Findings_Are_False
Why Most Published Research Findings Are False – PDF
https://upload.wikimedia.org/wikipedia/commons/8/8e/Ioannidis_%282005%29_Why_Most_Published_Research_Findings_Are_False.pdf
What is a z-score? What is a p-value? – ArcMap
https://desktop.arcgis.com/en/arcmap/10.7/tools/spatial-statistics-toolbox/what-is-a-z-score-what-is-a-p-value.htm
Assessing the probability that a positive report is false: an approach for molecular epidemiology studies – Wacholder et al
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7713993/
Maximum likelihood estimation – Wikipedia
https://en.wikipedia.org/wiki/Maximum_likelihood_estimation