Beyond the Field

As I stepped out of the car, the intense 100°F heat and high humidity washed over me. On June 4th, 2011, the Cardinals were scheduled to play the Cubs at Busch Stadium in St. Louis. It was the first baseball game I had ever been to, and although I was never much of a sports fan, I couldn’t wait to watch it. As I walked into the giant brick stadium, I marveled at all of the World Series pennants that hung on the wall. Thousands of people poured in as we pushed our way through the crowd. The freshly manicured field, the gorgeous downtown St. Louis skyline, and a setting sun met our eyes as we took our seats.

It was a fun and exciting game to watch; the Cardinals would go on to win 5-4 (ESPN 2011). What I didn’t realize as I sat and watched was how much math was being done as the game unfolded. A staggering amount of data was being recorded every minute. Even I was among the data: one of the 43,195 people who attended that evening (ESPN 2011). And it wasn’t just that game, but every game played up to that point over the last 100 years.

The statistics behind baseball are astonishing. A vast amount of data has been recorded, with entire societies and classes dedicated to researching the science behind it. Teams now use these analyses to decide how much value to assign players, where to place them in a lineup, and how to spot weaknesses in other teams. It got me thinking: if I were to start a team today, what would be the best way to decide whom to put on it? Statistics would lead the way.

What Makes a Great Hitter?

A number of factors go into choosing the members of an excellent team. For a hitter, knowing when and, more importantly, when not to swing is key (Dummies 2015). A batter’s likelihood of getting a hit can be measured by dividing the number of hits, \(H\), by the number of official at bats, \(AB\) (Basic Mathematics 2015). This is known as the batting average, \(AVG\), and can be expressed as

\[AVG = \frac{H}{AB}\]

Visiting various universities, I would look for candidates with incredible averages. Let’s say we find a group with a small standard deviation between them, say a batting average of \(.375 \pm .1\). How do I know which hitters to choose? To win in baseball, at least on the offensive side, we need to score as many runs as possible, so the player who earns the most bases in the fewest at bats is more valuable. This statistic is known as the slugging percentage, \(SLG\) (Basic Mathematics 2015). It is total bases, \(TB\), divided by the number of official at bats, \(AB\), or more concisely:

\[SLG = \frac{TB}{AB}\]

The problem with \(SLG\) is that it only counts official at bats. \(AB\) does not record events such as walks, hit by pitches, or sacrifice flies, which at the right time can be the difference between winning and losing a game. That leaves out many potential runs the batter is helping to earn. Adding walks, hit by pitches, and sacrifice flies to the formula gives a better representation of how well a batter will perform. This is known as the on-base percentage, \(OBP\), expressed as

\[OBP =\frac{H + BB + HBP}{AB + BB + HBP + SF}\]

where \(H\) is the number of hits, \(BB\) is the number of walks, \(HBP\) is the number of times hit by a pitch, and \(SF\) is the number of sacrifice flies (Fangraphs 2015). Given this information, I now know what to look for in my offense and how to make the tough decision of whom to choose from a pool of excellent batters. Next up, I’ll need some pitchers who can keep the other team from scoring.
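The three hitting statistics above can be sketched in a few lines of Python. This is only an illustration: the `BattingLine` class, the two prospects, and every number in their stat lines are made up, and total bases are counted in the standard way (one for a single, two for a double, three for a triple, four for a home run).

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season counting stats for one hitter (hypothetical structure)."""
    name: str
    ab: int        # official at bats
    singles: int
    doubles: int
    triples: int
    hr: int        # home runs
    bb: int        # walks
    hbp: int       # times hit by a pitch
    sf: int        # sacrifice flies

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.hr

    @property
    def total_bases(self) -> int:
        # Standard weighting: 1 base per single, 2 per double, and so on.
        return self.singles + 2 * self.doubles + 3 * self.triples + 4 * self.hr

    def avg(self) -> float:
        return self.hits / self.ab

    def slg(self) -> float:
        return self.total_bases / self.ab

    def obp(self) -> float:
        reached = self.hits + self.bb + self.hbp
        chances = self.ab + self.bb + self.hbp + self.sf
        return reached / chances


# Two invented prospects with the same batting average.
a = BattingLine("Prospect A", ab=200, singles=60, doubles=10, triples=0,
                hr=5, bb=10, hbp=2, sf=3)
b = BattingLine("Prospect B", ab=200, singles=45, doubles=15, triples=5,
                hr=10, bb=30, hbp=5, sf=5)

for p in (a, b):
    print(f"{p.name}: AVG={p.avg():.3f} SLG={p.slg():.3f} OBP={p.obp():.3f}")
```

Both invented prospects hit .375, so batting average alone cannot separate them. Prospect B's extra-base hits and walks show up only in \(SLG\) and \(OBP\), which is the reason for looking past \(AVG\) when averages are close.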

What Makes a Great Pitcher?

A great pitcher can change an entire game single-handedly. Pitchers control the strikeouts and walks that occur, but once the batter makes contact with the ball, the rest of the defense is up to the players in the field. We’re going to focus primarily on the pitcher, however. Assuming the rest of the defense plays at league average, we can use a metric known as Fielding Independent Pitching, \(FIP\) (Fangraphs 2015).

\[FIP = \frac{13HR + 3BB - 2K}{IP} + C\]

Here \(HR\) is home runs allowed, \(BB\) is walks, \(K\) is strikeouts, and \(IP\) is innings pitched. The constant \(C\) puts the figure on the same scale as the Earned Run Average, \(ERA\). The \(ERA\) is another formula, showing how many earned runs, \(ER\), a pitcher allows per nine innings (Basic Mathematics 2015).

\[ERA = 9\times\frac{ER}{IP}\]

The lower these two metrics are, the less likely the other team is to score. So now to go and find some pitchers with excellent \(FIP\) and \(ERA\) scores.
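As a rough sketch, the two pitching formulas translate directly into Python. The function names and the season totals below are invented for illustration, and the default value of the constant `c` is only a placeholder assumption: in practice the constant is set so that league-average \(FIP\) matches league-average \(ERA\), so it changes from season to season.

```python
def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings."""
    return 9 * earned_runs / innings_pitched


def fip(hr: int, bb: int, k: int, innings_pitched: float, c: float = 3.10) -> float:
    """Fielding Independent Pitching, using the formula given in the text.

    c is an assumed placeholder for the league constant, not an official value.
    """
    return (13 * hr + 3 * bb - 2 * k) / innings_pitched + c


# Invented season totals for one pitcher.
pitcher_fip = fip(hr=20, bb=50, k=200, innings_pitched=200)
pitcher_era = era(earned_runs=70, innings_pitched=200)

print(f"FIP={pitcher_fip:.2f} ERA={pitcher_era:.2f}")
```

One practical caution: box scores often write partial innings as 6.1 or 6.2, meaning one or two outs into the inning. Those would need to be converted to true thirds before being passed in as `innings_pitched`.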

Lining Up For Wins

The way the hitters are arranged is key to getting the few extra runs a team will need to win games. Often, the first hitter, or lead-off, will be a top hitter and a quick runner (Beyond 2009). Since there will be good hitters coming after him, he’ll need to use his speed to steal bases (Beyond 2009). The second spot typically goes to someone with a high \(OBP\) so that the team can start to get some runs in (Beyond 2009). Ideally, at this point the first hitter is ready to score, so the third hitter should have a high average to help bring that run in (Beyond 2009). From here on, the lineup is mostly sorted by talent, from highest to lowest. This is because, on average, those batting earlier will have more plate appearances, and thus more chances to produce runs (Beyond 2009).
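The ordering rules above can be turned into a small sketch, though it takes a few assumptions the text does not spell out. Here "top hitter and quick runner" is read as the player with the most stolen bases among the three best averages, and "talent" is approximated by \(OBP\) plus \(SLG\). The `Hitter` type, the `build_lineup` function, and the roster are all hypothetical.

```python
from typing import NamedTuple


class Hitter(NamedTuple):
    name: str
    avg: float
    obp: float
    slg: float
    steals: int


def build_lineup(hitters: list[Hitter]) -> list[Hitter]:
    """Order hitters by the heuristic in the text (needs at least 3 players)."""
    remaining = list(hitters)
    lineup: list[Hitter] = []

    def take(candidates, key):
        # Move the best candidate under `key` from the pool into the lineup.
        pick = max(candidates, key=key)
        remaining.remove(pick)
        lineup.append(pick)

    # 1st: assumed reading of "top hitter and quick runner".
    top_by_avg = sorted(remaining, key=lambda h: h.avg, reverse=True)[:3]
    take(top_by_avg, key=lambda h: h.steals)
    # 2nd: highest on-base percentage still available.
    take(remaining, key=lambda h: h.obp)
    # 3rd: highest batting average still available.
    take(remaining, key=lambda h: h.avg)
    # Rest: sorted by an assumed talent proxy, OBP + SLG, best first.
    lineup.extend(sorted(remaining, key=lambda h: h.obp + h.slg, reverse=True))
    return lineup


# Invented roster, deliberately out of order.
roster = [
    Hitter("E", avg=0.250, obp=0.300, slg=0.380, steals=10),
    Hitter("C", avg=0.330, obp=0.380, slg=0.500, steals=3),
    Hitter("A", avg=0.310, obp=0.360, slg=0.420, steals=40),
    Hitter("D", avg=0.270, obp=0.350, slg=0.520, steals=2),
    Hitter("B", avg=0.290, obp=0.410, slg=0.450, steals=5),
]

order = [h.name for h in build_lineup(roster)]
print(order)
```

A real roster has nine spots and far more to weigh, so this is best read as a way to make the heuristic concrete rather than as a lineup tool.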

Sabermetrics is an incredible science and a fantastic way of studying statistics and baseball. These are only a few of the tools that people who study sabermetrics use, but they show the incredible amount of knowledge one can gain from studying patterns. Teams that incorporate sabermetrics into their decisions will be able to build very powerful rosters. If I had to choose the players for a team I created, sabermetrics would lead the way.


ESPN. 2011.
Dummies. 2015.
Basic Mathematics. 2015.
Fangraphs. 2015.
Beyond the Box Score. 2009.
