Science doesn’t “prove” anything – it simply provides data upon which a decision can be made. (I cringe when I see advertisements stating “clinically proven.”) It is still up to the researcher/reader to decide the value of said research. One might provide proof in a court of law, or derive a proof as a mathematician. I’ll defer to physicists on this one, but one might even say that physics has proven the existence of a certain particle along the space-time continuum, i.e., that such a thing has at least at one time existed.
But in the biological sciences, the convention is to use inferential statistics, based on mathematical probabilities, to make decisions. At the root of many of these methods is a conventionally accepted probability value (or p-value), typically 0.05 or sometimes 0.01, which means that the likelihood of the difference or statistical effect in question happening by random chance (not due to the intervention under scrutiny in the experiment) is <5% or <1%, respectively. Whether an effect is statistically significant is based on probability, not on some binary “yes” or “no” mathematical derivation.
My intent is not to make this a statistics lesson. (Kudos on making it through the above, by the way!) The important point is that determining whether or not substance X, treatment Y, training program Z or intervention ABC (the independent variable) improved performance, muscle size, strength, endurance or what have you (the dependent variable) in a given study may not tell the whole story. In fact, when the effect of an intervention is the same, simply having more subjects will reduce the p-value. A 2% difference may be meaningless in a practical sense and fail to reach statistical “significance” (p > 0.05) when 10 subjects are studied; include 100 subjects and the p-value will be lower, meaning that one could claim statistical significance simply because more subjects were studied.
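If you’d like to see that sample-size effect for yourself, here’s a minimal sketch in Python – with numbers I’ve made up purely for illustration, not data from any real study – that runs the same hypothetical ~2% group difference through a t-test with 10 subjects per group and then with 100:

```python
# Minimal sketch with made-up numbers: the same ~2% average difference between
# groups, tested with 10 subjects per group and then with 100 subjects per group.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def simulated_p_value(n_per_group):
    # Hypothetical outcome: control mean 100, treatment mean 102 (~2% higher),
    # between-subject standard deviation of 5 -- all assumed values.
    control = rng.normal(loc=100, scale=5, size=n_per_group)
    treatment = rng.normal(loc=102, scale=5, size=n_per_group)
    _, p = stats.ttest_ind(treatment, control)
    return p

print(f"n = 10 per group:  p = {simulated_p_value(10):.3f}")
print(f"n = 100 per group: p = {simulated_p_value(100):.4f}")
```

With the larger sample the p-value shrinks, even though the underlying ~2% difference – and whatever practical meaning it has – hasn’t changed at all.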
What does this mean? It means that, in a certain sense, using just p-values to gauge the value and significance (see what I did there) of scientific findings does not tell the whole story. If you’re not a scientist, simply looking at the data – the averages and maybe even peeking at the variation among subjects by looking at the standard error bars – may tell you more than p-values. A 0.2-second improvement in the 100 m dash may have practical significance for a high-level sprinter, but fail to reach statistical significance when few subjects were tested. For you as a coach or trainer, it may be more convincing to find a p-value of 0.10 in connection with a 15% strength gain in a small study than a p-value < 0.001 when group differences in strength are 5% in a large cohort-based analysis.
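To make that comparison concrete, here’s another rough sketch (again with invented numbers, not real study data) that pits a small study with a ~15% strength gain against a large cohort with a ~5% gain, reporting both the p-value and a simple effect size (Cohen’s d):

```python
# Minimal sketch with invented numbers: a small study with a large strength gain
# versus a large cohort with a modest gain. The larger study "wins" on p-value,
# but the effect size tells a different practical story.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def summarize(label, n, mean_gain_pct, sd_pct):
    # Hypothetical per-subject percent strength gains -- all assumed values.
    gains = rng.normal(loc=mean_gain_pct, scale=sd_pct, size=n)
    _, p = stats.ttest_1samp(gains, popmean=0.0)   # test the gain against "no change"
    d = gains.mean() / gains.std(ddof=1)           # simple one-sample effect size
    print(f"{label}: mean gain = {gains.mean():.1f}%, p = {p:.3g}, d = {d:.2f}")

summarize("Small study  (n = 8,   ~15% gain)", n=8, mean_gain_pct=15, sd_pct=20)
summarize("Large cohort (n = 400, ~5% gain)", n=400, mean_gain_pct=5, sd_pct=20)
```

The large cohort will typically come out with a far smaller p-value, even though a 5% gain matters much less in the weight room than a 15% one – which is exactly the trap of reading p-values in isolation.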
To get to this level of understanding, however, you may have to read the study, or at least look at the pictures (figures). The abstract probably won’t cut it. (Yes, I did it again…) The knowledge you derive from science seems to have something to do with the effort you put into understanding it. (And now, I’m tempted to see if a study has examined this… :) )