[Headline graphic: Billy Beane (left) and Paul DePodesta (right). The General Manager and assistant General Manager, respectively, for the 2002 Oakland A’s and who inspired Michael Lewis’ book, “Moneyball.” (Photo by GabboT; used under the CCA-Share Alike 2.0 Generic license.)]
By Kent R. Kroeger (Source: NuQum.com; November 25, 2020)
As he announced his resignation from the Chicago Cubs as that organization’s president of baseball operations earlier this month, Theo Epstein, considered by many the High Priest of modern baseball analytics, made this shocking admission about the current state of baseball:
“It is the greatest game in the world but there are some threats to it because of the way the game is evolving. And I take some responsibility for that because the executives like me who have spent a lot of time using analytics and other measures to try to optimize individual and team performance have unwittingly had a negative impact on the aesthetic value of the game and the entertainment value of the game. I mean, clearly, you know the strikeout rates are a bit out of control and we need to find a way to get more action in the game, get the ball in play more often, allow players to show their athleticism some more and give the fans more of what they want.”
Epstein’s comments were painful for me on two fronts. First, he was leaving the only baseball team I’ve ever loved, having helped the Cubs win the only World Series championship of my lifetime. Second, he put a dagger in the heart of every Bill James and sabermetrics devotee who, like me, has spent countless hours poring over the statistical abstracts for Major League Baseball (MLB) and the National Football League on a quest to build the perfect Rotisserie league baseball team and fantasy football roster.
There is no better feeling than the long search and discovery for those two or three “value” players who nobody else thinks about and who can turn your Rotisserie or fantasy team into league champs.
In a direct way, sports analytics are the intellectual steroids for a generation of sports fans (slash) data geeks who love games they never played beyond high school, if even then.
Epstein’s departure was not entirely a surprise. The Cubs have not come close to their glorious World Series triumph in 2016, though it is hard to pin that on Epstein. The Cubs still have (when healthy) one of the most talented rosters in baseball. Instead, the surprise was Epstein’s targeting of ‘analytics’ as one of the causes of baseball’s arguable decline.
Like many baseball fans, I’ve assumed that baseball analytics helped make MLB, from top to bottom, more competitive. The approach was immortalized in Michael Lewis’ book “Moneyball” about the 2002 Oakland A’s and their general manager Billy Beane, who hired a Harvard economics grad, Paul DePodesta, to help him build a successful small-market (i.e., low-payroll) baseball team.
In the movie based on Lewis’ book, starring Brad Pitt and Jonah Hill, this scene perfectly summarizes the value of analytics in baseball (and, frankly, could apply to almost every major industry):
Peter Brand (aka. Paul DePodesta, as played by Jonah Hill):
“There is an epidemic failure within the game to understand what is really happening and this leads people who run major league baseball teams to misjudge their players and mismanage their teams…
…People who run ball clubs think in terms of buying players. Your goal shouldn’t be to buy players. Your goal should be to buy wins, and in order to buy wins you need to buy runs.
The Boston Red Sox see Johnny Damon and they see a star who’s worth seven-and-a-half million dollars. When I see Johnny Damon what I see is an imperfect understanding of where runs come from. The guy’s got a great glove and he’s a decent lead-off hitter. He can steal bases. But is he worth seven-and-a-half million a year?
Baseball thinking is medieval. They are asking all the wrong questions.”
While Beane and DePodesta may have lacked world championships after they introduced analytics into the process, the A’s did have nine winning seasons from 2002 to 2016 during their tenure, which is phenomenal for a small-market, low-payroll team.
At the team-level, the 21st-century A’s are the embodiment of how analytics can help an organization.
But is Epstein still right? Has analytics hurt baseball at the aggregate level?
Let us look at the facts…
Major League Baseball has a Problem
Regardless of the veracity of Epstein’s indictment of analytics for its net role in hurting the game of baseball, does professional baseball have a problem?
The answer is a qualified ‘Yes.’
These two metrics describe the bulk of the problem: (1) Average per game attendance and (2) World Series TV viewership. Since the mid-1990s, baseball game attendance relative to the total U.S. population has been in a near constant decline, going from a high of 118 game attendees (per 1 million people) in the mid-1990s to 98 game attendees (per 1M) in the late 2010s (see Figure 1). At the same time, the long-term trend is still positive. That cannot be discounted.
Figure 1: MLB per game attendance (per 1 million people) (Source: baseball-reference.com)
While the relative decline is significant, the real story of MLB attendance since the league’s inception in the late-19th century is the surge in attendance after World War II, a strong decline after that until the late-1960s, and a resurgence during the 1970s and 80s. In comparison, the attendance decline per capita since the mid-1990s has been relatively small.
Consider also that despite a per capita decline in game attendance since the 1990s, total season attendance has still grown. In 1991, 56.8 million MLB tickets were sold; by 2017, 72.7 million tickets were sold. This increase in gross ticket sales has been matched by a steady rise in MLB ticket prices as well. The average cost of attending an MLB game in 1991 was $142, but by 2017 that figure had increased to $219 (a 54 percent increase). In that context, the 15 to 20 percent decline in game attendance (per capita) seems more tolerable and far from catastrophic. In fact, if it weren’t for this next metric, baseball might be in great shape, even if its relative popularity is in decline.
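As a quick check on the arithmetic, the percent changes implied by the figures quoted above can be recomputed directly. This is only a sketch that uses the numbers as they appear in this article; the function name pct_change is my own, and no additional data is assumed. On those figures, ticket sales rose about 28 percent, the $142 and $219 cost figures work out to roughly a 54 percent rise, and per capita attendance (118 down to 98 per 1 million people) fell about 17 percent.

```python
def pct_change(old: float, new: float) -> float:
    """Return the percent change going from `old` to `new`."""
    return (new - old) / old * 100.0


# Figures as quoted in the article (not independently sourced here).
tickets_change = pct_change(56.8, 72.7)    # millions of tickets, 1991 vs. 2017
cost_change = pct_change(142, 219)         # dollars per game, 1991 vs. 2017
per_capita_change = pct_change(118, 98)    # attendees per 1M people, mid-1990s vs. late 2010s

print(f"Tickets sold:          {tickets_change:+.1f}%")
print(f"Cost of a game:        {cost_change:+.1f}%")
print(f"Per capita attendance: {per_capita_change:+.1f}%")
```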
The TV ratings and viewership for MLB’s crown jewel event, the World Series, have been in a near straight-line decline since the mid-1970s, when Billy Martin’s New York Yankees and the Tommy Lasorda-led Los Angeles Dodgers were the sport’s dominant franchises and happened to be in the nation’s two largest cities. Big-market teams in the World Series are always good for TV ratings.
As seen in Figure 2, average TV viewership for the World Series (the orange line) has declined from a high of 44.3 million in 1978 (Yankees vs. Dodgers) to just under 9.8 million in the last World Series (Dodgers vs. Rays).
Figure 2: The TV Ratings and Viewership (average per game) for the World Series since 1972 (Source: Nielsen Research)
Even with the addition of mobile and online streaming viewers, which lifts the 2020 World Series viewership number to 13.2 million, the decline in the number of eyeballs watching the World Series since the 1970s has been dramatic.
In combination with the trends in game attendance, the precipitous decline in live viewership offers one clear conclusion: Relatively fewer people are going to baseball games or watching them on TV or the internet. That’s a formula for an impending financial disaster among major league baseball franchises.
While stories of baseball’s imminent death are exaggerated, baseball does have serious problems. But what are they exactly? And how has analytics impacted those probable causes?
Are baseball’s problems bigger than the game itself?
Before looking within the game of baseball itself (and the role of analytics) to explain its relative popularity decline, we must consider the broader context.
Sports fans today demand something different from what MLB offers
Living with a teenage son who loves the NBA and routinely mocks my love of baseball, I see a generational divide that will challenge any attempt to update a sport once considered, without debate, to be America’s pastime. Kids (and, frankly, many of their parents) don’t have the patience or temperament to appreciate the deep-rooted intricacies of a game where players spend more time waiting than actually playing. Only 10 percent of a baseball game involves actual action, according to one study. For kids raised on Red Bull and Call of Duty, baseball is more like a horse and buggy than a Bugatti race car.
And the in-game data supports that assertion. In 1970, a nine-inning major league baseball game took, on average, two-and-a-half hours to complete. In 2020, it takes three hours and six minutes. By comparison, a World Cup soccer match takes one hour and 50 minutes from the moment the first whistle blows. An NBA game takes about two-and-a-half hours.
Baseball is too slow…and getting slower.
[For a well-constructed counterargument to the ‘too slow’ conclusion, I invite you to read this essay.]
In contrast, the NBA and World Cup soccer possess near constant action. Throw in e-games (if you consider those contests a sport) and it is reasonable to conjecture that baseball is simply a bad fit for the times. Even NFL football, whose average game takes over three hours, has challenges in that regard.
Did analytics lead to longer baseball games? Let us examine the evidence.
Figure 3 shows the long-term trend in the length of 9-inning MLB games divided into baseball ‘eras’ as defined by Mitchell T. Woltring, Jim K. Rost, and Colby B. Jubenville in their 2018 research paper published by Sports Studies and Sports Psychology. I did, however, re-label their ‘post-Steroids era’ (from 2006 to the present) as the ‘Analytics era.’
Figure 3: Average length of a 9-inning MLB game since 1946.
Though I will share upon request the detailed statistical analysis of the intervention effects of the baseball eras on the average length of MLB games, the basic findings are straightforward:
(1) The average length of 9-inning MLB games significantly increased during the ‘Integration,’ ‘Free Agency,’ and ‘Analytics’ eras, but did not increase during the ‘Expansion’ and ‘Steroids’ eras.
(2) The long-term trend was already pointing up before the ‘Analytics era’ (+50 seconds per year), though analytics may have had a larger marginal effect on game length (+78 seconds per year).
As to why the ‘Analytics era’ saw an increase in game times, one suggested explanation is that the ‘Steroids era’ disproportionately rewarded juiced-up long-ball hitters who tended to spend less time at the plate. In contrast, though the ‘Analytics era’ also has emphasized home run hitting, the players hitting home runs are now more patient. According to baseball writer Fred Hofstetter, pitchers have also changed:
“This (increase in game times) won’t surprise anyone who follows the game closely. The general demographic change trending into 2020:
- Patient hitters are replacing free swingers
- Hard-throwing strikeout-getters are replacing pitch-to-contact types
Pitchers who throw harder tend to take more time between pitches. Smart hitters take more pitches. There are more pitches with more time between them. The result is a rising average of time between pitches.”
Are these changes in the game related to analytics? It is hard to know given the concurrent (and assumed) decline in steroid use in MLB during the 2000s, but the apparent consensus is that the pitcher-batter dynamics since 2000 have been more sophisticated and time-consuming than during the ‘Steroids era.’
My conclusion on the impact of analytics on the length of MLB baseball games: Unclear.
Are there other aspects of baseball affected by analytics?
Investigating the role of analytics in 21st century baseball is complicated by the confounding effects of other changes going on in the game around the same time — the most obvious being MLB’s increased enforcement of its performance enhancing drug policies. But sports writer Jeff Rivers notes another ongoing trend: this country’s best athletes are increasingly choosing football and basketball over baseball, though this trend may have been going on for some time.
“Major League Baseball used to offer its athletes the most prestige, money and fame among our nation’s pro team sports, but that hasn’t been true for decades,” writes Rivers. “Consequently, Major League Baseball continues to lose in the competition for talent to other major pro team sports.”
It is also possible analytics have exacerbated this supposed decline in athlete quality by discouraging some of baseball’s most exciting plays.
“The focus on analytics in pro sports has led to more scoring in the NBA…but fewer stolen bases and triples, two of the game’s most exciting plays, in pro baseball,” asserts Rivers.
Is there really a distinct ‘Analytics era’ in baseball?
Another problem in assessing the role of baseball analytics is that the ‘Analytics era’ (what I’ve defined as 2006 to the present) may not be that distinct.
Henry Chadwick invented the baseball box score in 1858 and, by 1871, statistics were consistently recorded for every game and player in professional baseball. In 1964, Earnshaw Cook published his statistical analysis of baseball games and players and seven years later the Society for American Baseball Research (SABR) was founded.
In the early 1970s, as statistics advanced as a topic among fans, Baltimore Orioles player Davey Johnson was writing FORTRAN computer code on an IBM System/360 to generate statistical evidence supporting his belief that he should bat second in the Orioles lineup (his manager Earl Weaver was not convinced, however).
In 1977, Bill James published his first annual Baseball Abstract which, through the use of complex statistical analyses, argued that many of the popular performance metrics, such as batting average, were poor predictors of how many runs a team would score. Instead, James and other SABRmetricians (as they would be called) argued that a better measure of a player’s worth is his ability to help his team score more runs than the opposition. Accordingly, the SABRmetricians initially preferred metrics such as On-Base Percentage (OBP) and Slugging Percentage (SLG) to judge player values and would later combine those metrics to create the On-base Plus Slugging (OPS) performance metric.
[Note: OBP is the ratio of the batter’s times-on-base (TOB) (which is the sum of hits, walks, and number of times hit by pitch) to their number of plate appearances. SLG measures a batter’s productivity and is calculated as total bases divided by at bats. OPS is simply the sum of OBP and SLG.]
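To make the definitions in the note concrete, here is a minimal sketch that computes OBP, SLG and OPS exactly as described above. The BattingLine class, its field names, and the sample stat line are hypothetical and purely illustrative; they do not describe any real player. Note also that the official OBP formula uses a slightly different denominator (at bats plus walks plus hit-by-pitch plus sacrifice flies) rather than all plate appearances, so the values here follow this article’s simplified definition.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """A season batting line (hypothetical structure for illustration)."""
    plate_appearances: int
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs

    @property
    def total_bases(self) -> int:
        # One base per single, two per double, three per triple, four per home run.
        return (self.singles + 2 * self.doubles
                + 3 * self.triples + 4 * self.home_runs)

    @property
    def obp(self) -> float:
        # Article's definition: times on base divided by plate appearances.
        if self.plate_appearances == 0:
            return 0.0
        times_on_base = self.hits + self.walks + self.hit_by_pitch
        return times_on_base / self.plate_appearances

    @property
    def slg(self) -> float:
        # Total bases divided by at bats.
        if self.at_bats == 0:
            return 0.0
        return self.total_bases / self.at_bats

    @property
    def ops(self) -> float:
        # OPS is simply the sum of OBP and SLG.
        return self.obp + self.slg


# Made-up stat line, not a real player.
sample = BattingLine(plate_appearances=600, at_bats=520, singles=100,
                     doubles=30, triples=5, home_runs=25,
                     walks=70, hit_by_pitch=5)

print(f"OBP {sample.obp:.3f}  SLG {sample.slg:.3f}  OPS {sample.ops:.3f}")
```

A batter like this one reaches base in about 39 percent of his plate appearances, which is the kind of number the ‘get on base’ school of thought prizes over batting average alone.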
Batting averages and pitchers’ Earned-Run-Averages (ERA) have been a systematic part of player evaluations since baseball’s earlier days. Modern analytics didn’t invent most of the statistics used today to assess player value, but merely refined and advanced them.
Nonetheless, there is something fundamentally different in how MLB player values are assessed today than in the days before Billy Beane, Paul DePodesta and Moneyball.
But when did analytics truly take over the talent acquisition process in major league baseball? There is no single, well-defined date. However, many baseball analysts point to the 2004 Boston Red Sox, whose general manager was Theo Epstein, as the first World Series winner to be significantly driven by analytics.
Something unique and profound was going on in major league baseball’s front offices between Billy Beane’s 2002 A’s and the Boston Red Sox’ 2007 World Series win, their second championship in four years.
By 2009, most major league baseball teams had a full-time analytics staff working in tandem with their traditional scouting departments, according to Business Administration Professor Rocco P. Porreca.
So, why did I pick 2006 as the start of the ‘Analytics era’? No definitive reason except that it is roughly the halfway point between the release of Lewis’ book Moneyball and 2009, the point at which most major league baseball teams had stood up a formal analytics department. It would have been equally defensible to set 2011 or 2012 as the starting point for the ‘Analytics era’ as many of the aggregate baseball game measures we are about to look at changed direction at around that time.
The Central Mantra of Baseball Analytics: “He gets on base”
Lewis’ book Moneyball outlined the player attribute that 2002 A’s assistant general manager Paul DePodesta sought most when evaluating talent: the ability to get on base.
This scene from the movie Moneyball drives home that point:
As the 2002 A’s scouting team identifies acquisition prospects, the team’s general manager, Billy Beane, singles out New York Yankees outfielder David Justice:
A’s head scout Grady Fuson: Not a good idea, Billy.
Another A’s scout: Steinbrenner’s so pissed at his decline that he’s willing to eat a big chunk of his contract just to get rid of him.
Billy Beane: Exactly.
Fuson: Ten years ago, David Justice—big name. He’s been in a lot of big games. He’s gonna really help our season tickets early in the year, but when we get in the dog days in July and August, he’s lucky if he’s gonna hit his weight…we’ll be lucky if we get 60 games out of him. Why do you like him?
[Beane points at assistant general manager Peter Brand (aka. Paul DePodesta)]
Peter Brand: Because he gets on base.
This was the fundamental conclusion analytic models would start driving home to a growing number of baseball general managers after 2002. Find players that can get on base.
Theo Epstein was among the first general managers to drink the analytics Kool-Aid and he did it while leading one of baseball’s richest franchises — the Boston Red Sox.
Analytics is not the cause of baseball’s systemic problems. For those who assume major league baseball is a sinking ship, analytics did little more than re-arrange the deck chairs on the Titanic.
Send comments to: email@example.com
“Major League Baseball and its fans would be more than happy to throw you and Google boy under the bus if you keep doing what you’re doing here. You don’t put a team together using a computer.
Baseball isn’t just numbers. It’s not science. If it was, anybody could do what we’re doing, but they can’t because they don’t know what we know. They don’t have our experience and they don’t have our intuition.
You’ve got a kid in there that’s got a degree in economics from Yale and you’ve got a scout here with 29 years of baseball experience.
You’re listening to the wrong one now. There are intangibles that only baseball people understand. You’re discounting what scouts have done for a hundred and fifty years.”