Doing "science" can be an arduous business. Always limited by financial and ethical constraints, researchers have a seriously hard time producing practically relevant results. Some studies are conducted with cells, others with worms, rodents, guinea pigs, real pigs and apes; and it goes without saying that the applicability of results from these model experiments to a real-world human scenario varies considerably. Aside from the model (cell lines, rodents, pigs, ...) and its oftentimes limited validity, there are three other things that matter:
- The type and design of the study / experiment the paper you are looking at is based on
- The experimental and statistical procedures the scientists used
- The often-overlooked difference between statistical and real-world significance
In the following paragraphs we are going to take a brief look at these three things every physique enthusiast, and in fact every educated layman, should know about "scientific studies".
#1 - Type of the paper and general study design
The first thing you can and actually have to do, whenever you look at a reference to or abstract of a scientific paper is to identify what exactly it is you are looking at. It's a scientific paper, of course, but what kind of study is it based on?
Is it based on an analysis of several other studies? In that case it would be a review. Does it describe the methods and results of a lab study or a clinical trial? Well, then it would qualify as an experimental study. If it's neither about cell lines in a petri dish, rodents in a cage, nor subjects who participate in a controlled trial, it's probably dealing with data from several hundred people who haven't been part of a controlled experiment, but were subjects in an epidemiological study.
All of these formats have their strengths and weaknesses. Reviews can give you a good idea of the contemporary scientific evidence, experimental studies can answer very specific questions, and epidemiological studies are ideal means to form new hypotheses about the potential roots of problems such as the obesity epidemic.
There are, however, also weaknesses to each of these formats. Important weaknesses you should be aware of whenever you leaf through such a study or see someone reference reviews (and meta-analyses), experimental studies, or epidemiological studies.
TABLE 1: OVERVIEW OF THE DIFFERENT TYPES OF STUDIES AND THEIR MOST IMPORTANT WEAKNESSES (IN PARTS INSPIRED BY AN ARTICLE BY SEAN CASEY)
The overview in Table 1 can provide you with a limited, albeit sufficient, idea of the specific weaknesses of the different study formats and of the notion that none of them will ever be able to show the whole picture at once. It's the synopsis of the evidence, not the result of an individual study or study type, that scientists working in the field of health, exercise and nutrition science must take into account. And it's this primacy of the synopsis that tells you, as an educated layman, not to overestimate the weight and relevance of a single paper, no matter whether it's a review, meta-analysis, randomized clinical trial or large-scale epidemiological study with more than 100,000 participants.
#2 – Experimental & Statistical Procedures
It would appear as if the influence the experimental and statistical procedures have on the validity and significance of scientific research was something that's way beyond the grasp of the average layman (and laywoman). Fortunately, this is not necessarily the case. There is in fact a whole host of practically relevant "design flaws", each of which you can easily identify. Mentioning all of them would be beyond the scope of this article, but with the following four examples you should be able to spot the most commonly cited offenders in the nutrition and supplement industry pretty easily:
- No control group – If a study assesses the effects of Supplement A on skeletal muscle gain and says: "After 8 weeks of training the subjects gained X pounds of lean muscle", but the study doesn't have a control group, it's entirely possible that a control group would have gained just as much, or even more, muscle without the supplement.
- Irrelevant control – In view of the fact that the average strength trainee consumes at least a protein supplement to speed up his or her muscle gains, a study that compares Supplement B to a water or carbohydrate control and says: "Increases size gains by 20% over placebo" is practically irrelevant for the aforementioned average trainee, for whom an adequate placebo would have had to contain the same X grams of protein he / she is consuming after every workout.
- Unbalanced groups – If you randomize subjects to groups and don't conduct an analysis of significant differences in the baseline levels, you may end up having a group of very fit guys in the placebo arm of the study and a bunch of rookies in the active arm. If you have them perform the same workout with / without Supplement C, the rookies would see larger relative increases in fitness, strength and muscle size than the veterans anyway. A sentence like "there were no significant differences in any of the measured parameters at baseline" is thus a must for any study that claims to provide reliable evidence.
- No or questionable adjustment of results – In epidemiological studies scientists are forced to "adjust" the data by statistical means. The intention is to "subtract" influences such as smoking, excess weight, existing health issues etc. which have an independent effect on a given research interest, such as the relationship of vitamin D to the risk of heart disease. Now, you will find only few studies that don't adjust by one means or another. The statistical machinery behind the various models that are used here, however, is rarely disclosed in full and could distort the results, especially in areas where our understanding of the interactions between the various parameters is still as limited as it is in the case of vitamin D. Practically speaking, it may thus make sense to take a look at both the raw and the adjusted data and estimate for yourself whether the difference between them is realistic.
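To see why adjustment matters in the first place, consider a deliberately simplified sketch in Python. All the cohort counts below are invented for illustration: smoking is assumed to both lower vitamin D status and raise heart-disease risk, so the crude (unadjusted) numbers suggest a vitamin D effect that isn't really there.

```python
# Made-up cohort counts. Within each smoking stratum, disease risk does NOT
# depend on vitamin D at all -- smoking drives both low vitamin D and disease.
#            (smoker, low_vit_d): (n, cases)
cohort = {
    (True,  True):  (800, 160),   # smokers, low vit D:   20% disease
    (True,  False): (200,  40),   # smokers, high vit D:  20% disease
    (False, True):  (200,  10),   # non-smokers, low D:    5% disease
    (False, False): (800,  40),   # non-smokers, high D:   5% disease
}

def risk(rows):
    """Pooled disease risk (cases / n) over a list of (n, cases) cells."""
    n = sum(r[0] for r in rows)
    cases = sum(r[1] for r in rows)
    return cases / n

# Crude comparison, ignoring smoking: low vitamin D looks twice as risky.
raw_low  = risk([cohort[(True, True)],  cohort[(False, True)]])
raw_high = risk([cohort[(True, False)], cohort[(False, False)]])
print(f"crude risk ratio: {raw_low / raw_high:.2f}")     # 2.12 -- alarming

# Stratified (adjusted) comparison within each smoking group: no effect left.
for smoker in (True, False):
    rr = risk([cohort[(smoker, True)]]) / risk([cohort[(smoker, False)]])
    print(f"smoker={smoker}: risk ratio {rr:.2f}")       # 1.00 in both strata
```

Here the crude comparison suggests low vitamin D doubles the risk, while stratifying by smoking, the simplest form of adjustment, makes the association vanish entirely. Real adjustment models are far more complex and opaque than this, which is exactly why it pays to eyeball both the raw and the adjusted figures.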
It goes without saying that the previous examples only scratch the surface of the intricacies of experimental and epidemiological study design and evaluation. And still, even if the only thing you take away from thing #2 every layman should know about scientific studies is a certain degree of healthy skepticism, that alone will spare you unnecessary expenses and, in a few unfortunate cases, even exposure to potential health hazards.
#3 – Statistical vs. Real-World Significance
Thing #3 even laymen should know about scientific studies is that no single study will provide indisputable, fact-like results. In the end, every study is designed to verify or falsify a single scientific hypothesis or a set of them. Unfortunately, these answers rarely come in a convenient "yes" or "no" format. Instead, they are open to interpretation – scientific interpretation, of course, a practice that must follow a relatively strict set of rules, rules which involve, among other things, the determination of the so-called, all-determining "p-value".
In the field of exercise and nutrition science the unintended consequences of the "P < 0.05" principle sometimes border on the absurd. Because statistical significance says nothing about the absolute effect size, a 500mg difference in lean mass gains that was measured over a 12-week period will appear as a "significantly larger gain in lean muscle tissue" in the abstract of a paper that compares Supplement D against a placebo supplement. For the average customer, who spends $90 for a 12-week supply of the given product, on the other hand, those 500mg of extra mass are anything but significant, or let's rather say "relevant" - real-world relevance and statistical significance are thus two very different animals.
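The gap between the two can be sketched in a few lines of Python. The numbers below are invented for illustration, and a hand-rolled Welch t-statistic stands in for the fuller statistical tests real papers use: with enough subjects, even a trivially small difference sails past the p < 0.05 bar.

```python
import math
import random

def welch_t(a, b):
    """Welch's t-statistic for two independent samples."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))

random.seed(42)
n = 10_000  # an unrealistically large trial, to make the point
# Invented numbers: placebo gains ~2.0 kg lean mass over 12 weeks; the
# supplement group gains a mere 0.1 kg more, with lots of individual variation.
placebo    = [random.gauss(2.0, 1.0) for _ in range(n)]
supplement = [random.gauss(2.1, 1.0) for _ in range(n)]

t = welch_t(supplement, placebo)
diff = sum(supplement) / n - sum(placebo) / n
print(f"t-statistic: {t:.1f}")           # well above 1.96, so p < 0.05
print(f"mean difference: {diff:.2f} kg") # "significant", yet practically trivial
```

The larger the sample, the smaller the difference that registers as "statistically significant"; the p-value alone tells you nothing about whether the effect is worth your money.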
"Chocolate Helps You Lose Weight" and "Firemen Are Arsonists"
You don't think there is a benefit from this allegedly simple set of things every layman should know about scientific papers? Well, just make sure you don't forget any of them and you will soon realize that your perspective on headlines like "Chocolate helps you lose weight" will change. Now that you are aware that they are based on the epidemiological observation that regular chocolate eaters are leaner than people who eat chocolate only occasionally, you will realize that the author of the press release built his headline (willingly or not) on the false premise that epidemiological studies are able to reveal causal relationships between two variables.
Aside from not necessarily being causally related, the correlation of regular chocolate consumption and lower BMIs could after all also be one of the famous cases of "reverse causation". This means it's not the regular chocolate consumption that makes the study participants lean, but rather their leanness that has them gravitate towards the regular consumption of chocolate, simply because they can afford it... it's, after all, not the firefighters who set the fires, despite the fact that they appear to be involved in every blaze.
I see you're smiling. That's good, because it shows that the chocolate and the firefighters, and maybe even some of the previous examples, got you thinking, right? Congratulations! You've just learned the most important lesson in reading scientific papers: don't rely on others to do all the thinking for you.