By: CJ Gotcher, SSC
Scientific methods have done more to advance human health, wealth, and happiness than anything else in the last 200 years. The research process is not perfect, however, and there is a trend in strength and conditioning to accept outrageous claims as fact because they passed peer review and to dismiss effective methods as nonsense because they didn't. Though well-meaning, this is misguided.
To understand why, let’s try to answer a basic question: comparing two programs, which one will make you stronger faster?
Simple enough, right? Not quite. A definitive answer would require a randomized comparison trial:
- Get enough subjects to detect a meaningful difference.
- Effectively randomize two groups with similar baseline characteristics.
- Have qualified, motivated coaches lead the programs.
- Take measurements along the way.
- Compare the change in strength.
This will never happen, not because of ill intent but because we’re dealing with a complex system.
Start with the sample. Researchers use statistical models to ask: "If I want to say with confidence that a 10-pound difference in squat increase between two groups isn't a fluke, how many subjects do I need?" When budgets allow, they recruit that number plus a few extras, just in case.
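To make that concrete, here's a minimal sketch of the kind of power calculation behind that question, using the standard two-sample normal approximation. The 10-pound difference comes from the question above; the 20-pound standard deviation is a made-up number for illustration, not a figure from any study:

```python
import math
from statistics import NormalDist

def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Subjects needed per group to detect a mean difference `delta`
    (two-sided test, normal approximation) when individual results
    vary with standard deviation `sd` within each group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    n = 2 * ((z_alpha + z_beta) ** 2) * (sd / delta) ** 2
    return math.ceil(n)

# Detect a 10 lb squat difference with an assumed 20 lb spread:
print(sample_size_per_group(10, 20))  # 63 subjects per group
```

Notice how fast the requirement grows: halve the effect you want to detect, or double the spread in individual responses, and the required sample quadruples.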
The problem is that these models are designed for dice rolls, not people. Many factors can affect subjects’ execution of and response to the program as well as their performance on the testing days. When you only have 12 subjects in each group, one or two outliers can skew the best-collected data badly.
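A quick illustration of how fragile a 12-subject group is. The gains below are invented numbers, not real data; the point is only the arithmetic:

```python
import statistics

# Twelve invented squat gains (lb) for one group -- illustration only.
gains = [22, 25, 18, 30, 27, 24, 20, 26, 23, 28, 21, 25]
print(round(statistics.mean(gains), 1))  # 24.1 lb average gain

# Swap one typical responder for a single outlier who gained 90 lb:
gains_with_outlier = gains[:-1] + [90]
print(statistics.mean(gains_with_outlier))  # 29.5 lb average gain
```

One subject moves the group average by more than 5 pounds, half of the 10-pound effect the study was powered to detect.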
Then we get to randomization. Most subjects are recruited by convenience, usually college students or hospital patients, because similar subjects are easier to find. This means the findings only apply to that narrow group. To get around that, you'd need such a large sample, across such a broad range of people, that you could compare subgroups, each large enough to support meaningful conclusions. Right…
Why not use online coaching for this research? It’s scalable and the data is already digital. First, it’s still not truly random. The subjects are self-selecting and paying to participate. The biggest problem, though? Compliance. Even if you required subjects to video every set, everything else is self-reported. Self-report is inaccurate (at best), and it’s not random. Trying to impress a researcher (or coach), we report being ‘better’ at everything than we actually are.
Let's say you have an infinite budget and thousands of honest, carefully monitored lifters. There are still confounding variables. The researchers and funders are usually testing their own program. Are they equally motivated to coach the comparison program? A coach's apparent enthusiasm for a program and belief in its efficacy have a big impact on subject compliance.
Let’s say we solve that problem: how do you measure strength? When we ask how strong someone is, we’re referring to a person’s ability to produce force (and therefore lift more weight). What if the two compared programs use different exercises? Do you do a 1 rep max of every trained lift? Compare increases in your working 5-rep weights? Use clinically validated exercises that can be easily measured but are functionally useless, like the single-leg extension? Do you measure changes in muscle size? Neural drive? All of it? If there are differences in bodyweight between the two groups, what matters more, improvement in the absolute weight on the bar or in relation to bodyweight?
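Even the absolute-versus-relative question alone can flip the "winner." The two lifters below are entirely hypothetical:

```python
# Hypothetical lifters over the same training block -- illustration only.
lifter_a = {"bw_start": 180, "bw_end": 195, "squat_start": 300, "squat_end": 340}
lifter_b = {"bw_start": 180, "bw_end": 180, "squat_start": 300, "squat_end": 330}

def absolute_gain(lifter):
    """Change in weight on the bar."""
    return lifter["squat_end"] - lifter["squat_start"]

def relative_gain(lifter):
    """Change in squat-to-bodyweight ratio."""
    return (lifter["squat_end"] / lifter["bw_end"]
            - lifter["squat_start"] / lifter["bw_start"])

print(absolute_gain(lifter_a), round(relative_gain(lifter_a), 3))  # 40 0.077
print(absolute_gain(lifter_b), round(relative_gain(lifter_b), 3))  # 30 0.167
```

Lifter A "wins" on pounds added to the bar; lifter B "wins" pound-for-pound. Which metric the study reports decides which program looks better.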
Here's where the futility kicks in. Let's say we solve all of these problems. Do we have a definitive and useful answer about what's best? Hardly. We've only compared two programs out of hundreds. First, if there is no clear 'winner,' both sides can point to different data as proof of victory ("Sure, they lifted more, but my subjects gained slightly more muscle mass, and if the study had been longer…"). Second, unless the 'losers' accepted the design before the study was published, they can always find an error in the conclusion or study design, modify their offering (new and improved!), or retreat to a narrower claim ("our program is better for X population, which wasn't studied").
And this all assumes good faith, competence, and rigor on the part of the researchers. The National Strength and Conditioning Association (NSCA) was recently found to have knowingly published false and misleading information about CrossFit in its most prestigious peer-reviewed journal. Systematic reviews of supplements and training methods find sharp differences between manufacturer-funded and independent studies. The Starting Strength Science Committee has demonstrated the vast array of routine errors in study design, statistical methods, and rigor that make it difficult to draw meaningful conclusions from exercise science. The systematic reviews and meta-analyses that should combine these small and limited studies into larger, useful datasets are hamstrung by the fact that the sample sizes are still small, and combining lots of garbage still leaves you with garbage. Essentially, exercise science is a JV player facing a pro-level challenge: understanding human adaptation to stress.
Whew. Rant over.
Before I get written off as a witch-burning, science-hating troglodyte, let me be clear: I'm still convinced that science is the best tool we have. The question is how best to use the research, knowing its limitations.
- Stay up to date on relevant published research in the field.
- Critically evaluate those studies and help educate your lifters as questions arise.
- Apply known-working programs and principles as starting guidelines and adjust to a lifter’s needs.
- Take a scientific approach to programming. Experiment with changes one variable at a time and note when a lifter doesn’t respond as expected.
- When research offers a plausible better method, apply it, one variable at a time, to multiple lifters, and carefully evaluate the results.
- Be humble about any claim to ‘optimal’ or ‘perfect’ methods, especially if their first principles depend on the exercise science literature.
- Pay attention to your training. Although you’re a human-shaped meat sack like everyone else, you are not the average of some study sample.
- Communicate with your coach. They’re human, so they can make mistakes, and all those factors mentioned above could be relevant to your programming.
- Be patient. No ‘perfect program’ will give you ‘20 times the results in just 4 minutes a day!’