Posted on 09 May 2014 22:17

This is not the first post where I talked about how numbers can easily impress and mislead us. I mentioned numbers and "proofiness" in The Data Dump in Fitness Information: Time to Get Back on Track. Another closely related post is Quantitative Measurements and Quality Evaluations.

Our [western] culture is a bit obsessed with measurement. In all sorts of fitness realms we see measurements - numbers - being assigned to things that cannot readily be measured, and sometimes things that cannot be measured at all. Numbers have, perhaps, too much power to impress. Science guys will tell you all about statistical significance, and maybe statistical correlation (a little on that below), but we can be misled by much more mundane and easily understood numbers. It often starts with what we can and cannot measure. Before we begin, note that since this is about numbers, which I am not very good with, I could have screwed up some of the example figures. So, don't hesitate to let me know.

Look at educational programs and you'd think we could measure learning. It is not so easy as these programs would have you think. But, if you can't readily measure the results, you can't sell the program very easily. Also, there are hundreds of "tests" on the internet supposed to measure something about your personality. Are you an introvert or an extrovert? Not all concepts are easily measured.

The same things exist in fitness information. Especially in strength training. I used to complain about the highly precise formulas given to people to calculate the exact bands to use on their banded bench press or other exercise. But that is nothing compared to the kinds of things fitness pros put numbers on when such numbers just cannot be had.

When we measure something, we have to be measuring what we claim to be measuring. This is called "validity." So, we claim to measure things that are very difficult to measure, and we are often not measuring what we claim to be measuring. Therefore, be aware that when you see numbers, they may not really make any sense at all. Don't be overly impressed by them. There is no way to quantify how much of our results comes from nutrition, or training, yet people assign numbers to these all the time.

Also, of course, more precise numbers are more impressive. Yet, the truth is often that the more precise a number is, the more it is bullshit. If I tell you that someone lifted 335.5 lbs, that number can be easily verified by looking at the weight. If I tell you there are exactly 6,568 females who do deadlifts in the state of New York, you'd be a fool to believe me. That doesn't mean that precise numbers are bad, just don't be automatically more impressed by them than by less precise numbers. Like 6,500 female deadlifters in New York, give or take. But I'm still lying.

# The Average Male Deadlifts How Much?

We need to also be aware of the word "average." That is because it is used in different ways. Lots of times, we mean it as the "mean."

This has stymied many a strength coach, and it has stymied me, not a strength coach, per se. Looking at an average adult male and an average deadlift could give you impressive numbers, unless you're a statistician, who would throw up their hands in disgust at your discussion of "average."

So, say we get a whole bunch of people who lift to give us their deadlift number. We want to be representative, so we get the guys with huge lifts. Heck, we already know the ones that are famous for big deadlifts. So, we write that down. Then we get a whole bunch of other guys to give us their numbers. We get actual competitive powerlifters and just regular gym rats. We need an *average*, man! We get one thousand. We've got a few very very strong guys with deadlifts over 800 and we've got a whole lot of guys with deadlifts in the range of 250 to 400. Lots of lifters in the 300 range. And then there are those scattered 500's, 600's, etc. Never mind that a lot of them are giving untrue numbers. We use them for better or worse. We add all the numbers up and divide by the number of deadlifters in our sample.

After we're done, we use the mean we came up with and tell everybody the average adult male deadlifts this much… So, there are lots of variables. We included any and all comers. But let's ignore that and focus on something that, right to this day, is screwing up the heads of strength trainers everywhere: Those few lifters with huge deadlifts. Over 700. Over 800. Some even way over. They screwed up our mean. A statistician would call them **outliers**. We wanted to be representative. Well, these lifters are NOT representative! To understand this, pull a list out of the list. Take a guy with a huge deadlift and put him with lifters in the 250 range, realizing there are plenty of those guys. Suddenly, we realize how "out of place" the big lifters are. So like this:

850

250

250

250

250

Total: 1850

1850 divided by 5 gives us 370. Look at those numbers and look at our result. That is looking pretty weird, isn't it? Now imagine that some of the guys in our sample hardly ever deadlift at all. They can pull, say, 200 or 150. You getting it yet? You will. You can easily agree that you can't "lump in" the 850 guy with the 250 guys.
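If you want to check my arithmetic, here is a minimal Python sketch using the five numbers above. The variable names are just my own labels, and the "without the outlier" line is my addition to show how much one big lift moves the mean.

```python
# The five example deadlifts from the list above (lbs).
lifts = [850, 250, 250, 250, 250]

total = sum(lifts)           # add all the numbers up
mean = total / len(lifts)    # divide by the number of lifters

# For comparison: the mean of the same list with the 850 guy left out.
rest = lifts[1:]
mean_without_outlier = sum(rest) / len(rest)

print(f"total={total}, mean={mean}, mean without the 850 lift={mean_without_outlier}")
```

One lifter out of five is enough to drag the "average" 120 lbs above what the other four actually pull.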

So, let's take more numbers:

900

850

800

525

500

435

400

375

350

350

350

325

325

325

325

300

300

300

300

300

285

285

250

250

200

Just random numbers, but do you see something going on? There are some numbers that repeat a lot. And the more numbers we include the more we see that some repeat. This means that instead of the mean, it may have been better to look at the mode. The mode is the most common value in a number list. In the first 5 numbers, the mode is obviously 250. We just completely threw out that outlier. In the second one the mode is 300. That is still another way of saying *average*. Which would be a better answer, the mean or the mode?

And then there is the median. In the big list the median is 325. It is the point where half the numbers are above it and half are below it. 325 might be a much better number for our 'average' deadlift. Those big deadlifts at the top can't pull it out of whack. Our mean, after all, is 396.2. That would be a more precise number but if I gave that as an average deadlift, I would hope you'd call bullshit on my overly precise number! Then, I'd hope you'd look at the actual numbers and wonder how in the world we could support an 'average' deadlift of almost 400 given these specific numbers. 325 is seeming much more reasonable, if not nearly as precise. Given all that, it is true that these three different values, mean, mode, and median, will often end up being nearly the same when the numbers are spread fairly evenly around the middle.
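The three kinds of "average" for the big list can be checked with Python's standard `statistics` module. This is only a sketch for verifying the figures above; dropping the top three lifts is my own extra step to show how little the median cares about them.

```python
from statistics import mean, median, mode

# The 25 example deadlifts from the big list above (lbs).
lifts = [900, 850, 800, 525, 500, 435, 400, 375, 350, 350, 350,
         325, 325, 325, 325, 300, 300, 300, 300, 300,
         285, 285, 250, 250, 200]

avg_mean = mean(lifts)      # sum divided by count
avg_median = median(lifts)  # middle value of the sorted list
avg_mode = mode(lifts)      # most common value

# Same three "averages" with the three biggest lifts removed.
without_top3 = sorted(lifts, reverse=True)[3:]

print("all lifts:   ", avg_mean, avg_median, avg_mode)
print("without top 3:", round(mean(without_top3), 1),
      median(without_top3), mode(without_top3))
```

Taking out the three biggest pulls knocks the mean down by about 60 lbs, while the median and the mode stay right where they were.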

Yet, if we really looked at deadlifters, those outliers would throw us off even worse! Take a look at the numbers of some of the world's strongest deadlifters:

Benedikt Magnusson: 1015

Andy Bolton: 1008

Konstantin Konstantinovs: 939

Gary Frank: 929.5

Mikhail Koklyaev: 907

Vince Urbank: 906

George Leeman: 906

Eric Lilliebridge: 900

Chris Duffin: 900

Andrey Malanichev: 892

Brad Gillingham: 881

Mark Felix: 881

Chris Hickson: 800

Given all this, perhaps you can see why I made a big thing about the fact that a deadlift over 500 is indeed a very good deadlift!

# Percentages Often Lie!

A big mode of bullshitting (see what I did?) is to talk about relative change instead of absolute change, and compare percentages. Percentages are another number that should always give you pause, especially when one percentage is being compared to another. It is almost always inappropriate to compare percentages. Why? Because the base numbers from which the percentages are derived are important. Would you want 75% of my deadlift or 50% of Andy Bolton's? If you're smart, you'll take the lower-seeming number. Andy Bolton lifts over 1000 lbs. I've eked out a little more than half of that on my very best days, which may be long past. You shouldn't accept 100% of my income either, if offered an alternative of 0.5% of Bill Gates's income. Pretty easy to understand, I think.

But, pretty obvious too. The big way of bullshitting with percentages is with relative percentages. If I told you that a lifter increased their reps, in ONE workout, by 67%, you'd be suitably impressed. You'd then be sore at me if I revealed that I meant they went from 3 to 5 reps.

Let's say I told you I increased a lifter's max by 33% in three years. You might think, well, that's not too bad, but not really that impressive, either. But, what if the lifter started at 300 in year one, and:

**Year 1**: 300

**Year 2**: 375

**Year 3**: 400

Whoa. Hold up a minute. Year one to year two, we had a 25% increase in the lift. Then, in year two to year three, we had a 6.7% increase. Those kinds of results may be reality. You know that. You get the reality of strength training. But, the people I'm selling to, they don't. I'd rather tell them about a 33% increase, overall, than tell them about a 6.7% increase over an entire year. So, where you start matters, and you start wherever will get you a more impressive number.
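Here is a small Python sketch of the same calculation, so you can see how the starting point changes the percentage. The `percent_change` helper is my own made-up name, not something from a library.

```python
def percent_change(start, end):
    """Relative change from start to end, as a percentage of start."""
    return (end - start) / start * 100

# Max lift by year (lbs), from the example above.
maxes = {1: 300, 2: 375, 3: 400}

overall = percent_change(maxes[1], maxes[3])     # the number I'd advertise
year1_to_2 = percent_change(maxes[1], maxes[2])  # the big early jump
year2_to_3 = percent_change(maxes[2], maxes[3])  # the number I'd keep quiet about

# The "67% more reps in ONE workout" claim: 3 reps to 5 reps.
reps = percent_change(3, 5)

for label, value in [("overall", overall), ("year 1 to 2", year1_to_2),
                     ("year 2 to 3", year2_to_3), ("reps 3 to 5", reps)]:
    print(f"{label}: {value:.1f}%")
```

Same lifter, same lifts, and three different percentages depending on which two numbers you pick.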

I can use percentages to make almost nothing sound way impressive. Hey, I increased my client list by 100% last year! I went from one to two clients. Relative percentage makes that a 100% difference. But it is actually only an absolute increase of 1 client. If you go from 100 lbs on your bench press to 200 lbs it's an increase of 100%. But if we compare your results to another lifter on another program, who went from 125 to 215, we get an increase of 72%. We then use those numbers to say you got around 39% better results, because 100% is about 39% bigger than 72%. In reality, given the starting points and the ending points, the two results are actually very similar: 100 lbs gained versus 90 lbs. And the second lifter, in absolute terms, is still stronger. We cannot say anything beyond that.
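To see where that "39% better" comes from, here is a rough Python sketch of the bench press comparison. The helper and variable names are mine, purely for illustration.

```python
def percent_change(start, end):
    """Relative change from start to end, as a percentage of start."""
    return (end - start) / start * 100

you = (100, 200)    # start and end bench press (lbs)
other = (125, 215)  # the other lifter on the other program

you_pct = percent_change(*you)
other_pct = percent_change(*other)

# The sales trick: divide one percentage by the other.
times_better = you_pct / other_pct
claimed_better_pct = (times_better - 1) * 100

# What actually happened, in pounds.
you_gain = you[1] - you[0]
other_gain = other[1] - other[0]

print(f"relative: {you_pct:.0f}% vs {other_pct:.0f}% "
      f"('{claimed_better_pct:.0f}% better results')")
print(f"absolute: +{you_gain} lbs vs +{other_gain} lbs; "
      f"ending lifts {you[1]} vs {other[1]}")
```

The percentage of a percentage says 39%. The pounds say a 10 lb difference in gains, with the other lifter still ending up 15 lbs ahead.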

We see many more of these misleading percentages in messages about our health. We may hear that eating cured meats daily increases our risk of pancreatic cancer by 20%. In fact, every day, you will probably come across a statistic that tells you how your risk of some scary illness, especially cancer, can be increased. When we see numbers like 20%, we easily start thinking of those numbers in concrete terms, like "if I eat bacon every day I will have a 20% increase of pancreatic cancer." But, in reality, you do not even know your risk of pancreatic cancer. If you were to find out, as is likely, that the actual risk for your segment of the population is 1% or, say, 1.4%, you might feel a bit different. That means eating cured meats would increase your risk of pancreatic cancer from 1.4% to about 1.7%. If it is a true statistic, which it probably is not.
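The step from a relative risk to an absolute one is simple enough to sketch in Python. The 1.4% baseline and the 20% increase are just the example figures from the paragraph above, not real medical data, and the variable names are my own.

```python
# Example figures from the text (NOT verified statistics).
baseline_risk_pct = 1.4       # assumed lifetime risk, in percent
relative_increase_pct = 20    # the scary headline number

# A 20% relative increase multiplies the baseline by 1.20.
new_risk_pct = baseline_risk_pct * (1 + relative_increase_pct / 100)

# The absolute change, in percentage points.
absolute_increase_points = new_risk_pct - baseline_risk_pct

print(f"risk goes from {baseline_risk_pct}% to {new_risk_pct:.2f}%")
print(f"that is {absolute_increase_points:.2f} percentage points, "
      f"not {relative_increase_pct}")
```

The headline says 20%. The change you would actually live with is well under one percentage point.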

Say you found out that your risk of colorectal cancer actually increases dramatically as you age, regardless of bacon. You'd suddenly be more worried about your next birthday than bacon or anything else deemed to increase your risk. And, aging does increase your risk. It's still not huge, but aging continues to be the key. Say you're a male at age 60. In the next ten years, you have a 1.32% chance. In the next 20 years, you have a 3.08% chance. In the next 30 years, you have a 4.39% chance. In other words, there's not that much of a chance. Still worried about bacon? Don't hold me to those statistics. They may have changed but they ain't far off.

# Statistical Correlation

You know good and well that correlation does not equal causation. But when a number gets attached to it, which is called "statistical correlation," it suddenly becomes so much more important seeming. Correlation just means that two things vary together, in a predictable way. If one goes up the other goes up, or if one goes up the other goes down. What if I say there is a 0.7 correlation between physical fitness and self-esteem? You, concerned about your low self-esteem, decide to get in shape. Well, that's not a bad thing. However, although that correlation is high, since a perfect correlation would be 1.0, it still does not mean that physical fitness causes higher self-esteem. It could be that people with higher self-esteem are more likely to become, and remain, physically fit. We cannot show that being in better shape *caused* their higher self-esteem. It could just be a characteristic of someone with higher self-esteem.
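One way to see why the number alone can't tell you which thing causes which: the correlation comes out the same no matter which variable you put first. Here is a minimal Python sketch of the usual Pearson correlation formula. The scores are completely made up by me for illustration; they are not from any study, and `pearson_r` is my own helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# HYPOTHETICAL scores for six imaginary people (invented for this example).
fitness = [2, 4, 5, 7, 8, 9]
self_esteem = [3, 5, 4, 8, 6, 9]

r_fitness_first = pearson_r(fitness, self_esteem)
r_esteem_first = pearson_r(self_esteem, fitness)

print(f"fitness vs self-esteem: {r_fitness_first:.2f}")
print(f"self-esteem vs fitness: {r_esteem_first:.2f}")
```

Both lines print the same value. The formula has no idea which one came first, so neither do we.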

When it comes to statistics, we cannot pass through without mentioning that old line, usually attributed to Mark Twain: "There are lies, damned lies, and statistics!" In fact, Twain himself credited it to Benjamin Disraeli in his autobiography, according to Joel Best, sociology professor and author of Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists. With no small amount of humor, the book urges us to think critically about statistics, and numbers in general.

# Graphs

Of course you know that graphs can be used to mislead you. Do a Google search for "misleading graphs in media" and you'll get a lot of examples of this in action.


This page created 09 May 2014 22:17

Last updated 20 Feb 2017 20:33