Analytics and Anecdotes is a four-part series in which I examine the growing influence of statistics and analytics in some sports (popularised by the Brad Pitt film Moneyball) and their apparent inability to permeate existing structures for assessing players in others. In this first part, we seek to establish the importance of statistics, their uses, advantages and disadvantages, and their relationship with scouting. In the next three parts, I will analyse three sports (baseball, basketball and soccer) that have all used statistics with varying degrees of success, and try to reason out why that is.
With the advent of technology (more specifically, computers) and the inevitable link between computing and mathematics, it was only natural that statistics were going to become part of everyday life. The sports world is no exception. I don’t know exactly where it started, but it received a major boost from Major League Baseball and the story of the Oakland Athletics. Subsequently, other sports adopted it, the NBA being one of the more receptive leagues, followed by the NFL. However, some sports, like association football (soccer, or just football), are still heavily anti-analytics. Why is that? Can advanced statistics in team sports be of any use?
In this blog, we discuss the basic problems and questions that advanced statistics both pose and deal with, before examining a couple of sports: one with a growing proportion of teams using statistics and the other with a reluctance to use advanced statistics.
Statistics versus scouting
The main conflict that arises from the advent of stats and analytics is that their objectivity stands in sharp contrast to scouting. For those of you who are unfamiliar, scouting is the older method of assessing players by watching them in action, whether in practice or in a game.
Scouting relies on human observation, a method that is subject to human error. There are many examples of this in football (soccer), where this method is used. One of the more prominent ones that comes to mind is Bebe, a Portuguese player who was purchased by Manchester United, one of the world’s most reputed clubs, on the basis of a scouting report. As it later turned out, he was a big flop at United and soon left the club.
Statistics, on the other hand, eliminate the human element. They deal with absolute numbers. If you have points, you have points. If you don’t, you don’t. There are clearly defined black and white areas, which makes it easy for teams and players to assess a whole variety of things. This is the one advantage that baseball has exploited. The batting average is one of the most widely used and accurate representations of how good a player is when he steps up to bat for his team during an inning.
The extent to which one takes precedence over the other depends in large part on the sport and on the factors that govern it, as we explore further below.
Individuality in Team Sports
The extent to which statistics can predict the strength of a team, the strength of its opponent and, subsequently, the outcome of its game depends on the individuality in the game. What I mean by this is that if a game in any sport can be broken down into specific matchups, whether by the design of the game itself or by tactics, it has a higher chance of matching the outcome that analysts predict. It makes sense logically too. The more matchups you can produce in your favour, the more likely you are to win. For example, the Denver Broncos of the NFL took advantage of the New England Patriots’ young offensive line last year, and of their own speed in the individual matchups on defence (especially those involving the linebackers), to force opposing quarterback Tom Brady into having a bad game. The Broncos won the AFC Championship and went on to do the same to Cam Newton and the Carolina Panthers to win Super Bowl 50.
Take baseball, the poster boy for analytics. In most cases in baseball, it is a direct contest between two people: the pitcher and the batter. Other things, like the number of errors a fielder commits, also affect a game, but not as much as people think. In the end, it boils down to the pitcher against the batter for the most part.
Games, and even areas of games, that can be broken down into these specific matchups (where it is either one-on-one or one small group of people against another) are more strongly influenced by statistics.
Some sports, like basketball, represent a transition state. Basketball, at least in America, is and for the foreseeable future will be in a state of flux. With the growing influx of Europeans and South Americans into the NBA, it is undergoing a sea change on both the offensive and the defensive side of the ball. We will deal with more on that when we talk about basketball and the potential for analytics to change the game. Suffice it to say that basketball now finds itself in a position where, even though it is a team game, there is still enough individuality in it for statistics to exploit, which is why statistics find their relevance in the game.
Soccer, on the other hand, is a fluid sport. There are far too many players on the pitch (twenty-two) and far too many variables for analytics to assess the efficiency of an individual in the game. Individuality in soccer is hard to come by and, more often than not, is present in only a few situations, which teams are looking at these days in order to exploit those matchups. This is especially true of teams who play through the wings. But such opportunities are scarce, and countermeasures are easy to come by: with so many players on the pitch, one can cover for another without compromising the team’s style of play in attack or defence. That makes it hard to analyse a game, a team or a player with the aim of exploiting a pressure point.
Relevance of Statistics
Soccer has an interesting statistic called possession. To the layman, it is simply the amount of time that a team has control of the ball in a game expressed as a percentage of the total amount of time the ball has been in the field of play. Interestingly, this is a statistic that has always divided pundits. Is possession important?
The Barcelona of the glory years of Xavi, Iniesta, Messi, Puyol and Pique made it the focal point of their game, and it led to success repeatedly. But Leicester City won the Premier League last season with the lowest average possession of any Premier League winner.
The relevance of statistics (which statistics to use and which not to use) is a question that has plagued sports science for a long time. What is going to help a general manager decide whether to draft this kid or that kid from college into pro sports? The quest for relevance has led to the development of what some like to call advanced statistics.
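To make the possession figure concrete, here is a minimal Python sketch of the definition given above. The function name and the numbers are hypothetical, and it assumes the time the ball is in play is simply the sum of the two teams' time in control. It is not how any particular data provider calculates possession.

```python
def possession_pct(team_seconds, opponent_seconds):
    """Share of in-play time that a team had control of the ball, as a percentage.

    Assumption: in-play time is the sum of both teams' control time.
    """
    in_play = team_seconds + opponent_seconds
    if in_play == 0:
        raise ValueError("the ball was never in play")
    return 100 * team_seconds / in_play


# Made-up example: 30 minutes of control against the opponent's 20 minutes.
print(possession_pct(30 * 60, 20 * 60))
```

A team can post a high figure here and still lose, which is exactly why the number divides pundits.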
I’m going to take the example of basketball and explain this.
A basic statistic would be something like points per game (total points divided by the total number of games) or field goal percentage (the number of shots that go in as a percentage of the total number of shots taken). An advanced statistic is one which, to put it simply, involves a significant amount of mathematical calculation in order to give a truer picture of how and why a player gets the numbers that he does, while still boiling all this data down to a few numbers. An example is the True Shooting Percentage, which takes into account the fact that three-pointers are worth an extra point, as well as the free throws made and attempted, to give a single number that tells you how good a shooter a player is, rather than relying on a simple field goal percentage. We’ll discuss this more when I talk about statistics in basketball later in the series.
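For readers who like to see the arithmetic, the difference between the two kinds of statistic can be sketched in a few lines of Python. The True Shooting formula below is the commonly cited one, points divided by twice the sum of field goal attempts and 0.44 times free throw attempts. The function names and the two players are made up for illustration.

```python
def points_per_game(total_points, games_played):
    """Basic statistic: total points divided by games played."""
    return total_points / games_played


def field_goal_pct(shots_made, shots_attempted):
    """Basic statistic: fraction of field goal attempts that go in."""
    return shots_made / shots_attempted


def true_shooting_pct(points, fg_attempted, ft_attempted):
    """Advanced statistic, using the commonly cited formula:
    points / (2 * (field goal attempts + 0.44 * free throw attempts)).
    """
    return points / (2 * (fg_attempted + 0.44 * ft_attempted))


# Two hypothetical shooters, ten attempts each and no free throws.
# Player A makes five two-pointers; Player B makes four three-pointers.
print(field_goal_pct(5, 10), true_shooting_pct(10, 10, 0))  # Player A
print(field_goal_pct(4, 10), true_shooting_pct(12, 10, 0))  # Player B
```

Player A has the better field goal percentage, but Player B scores more points from the same number of shots, and True Shooting Percentage is the number that shows it.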
Positional Importance and Interdependence
Which statistics matter and which don’t is also dictated by two things in team sports: the importance of position and the interdependence of the various positions in the team (in other words, the system the team uses).
We’ll start with the first because it’s easier to explain. Let’s take the example of soccer. A goalkeeper’s ability is measured, simply put, by the ratio of the shots he saves to the total number of shots he faces. It’s a simple statistic and one I like to call a “vital statistic”, because it represents the primary job of a player in that position.
The second dictates which other statistics take on importance. For example (I’m sticking with soccer), the number of passes made by goalkeeper Victor Valdes and centre-backs Gerard Pique and Carles Puyol is important because Barcelona liked to play the ball out from the back and keep it on the ground. This is not a statistic that is important to every team, but it was important to Barca because it represented the foundation of what they were trying to do: build a patient possession game on the ground to counter their lack of aerial prowess.
A similar analogy can be made in basketball. A point guard is traditionally defined by his assists. But scoring becomes important as a statistical category when he is also the player around whom the system is based, like Russell Westbrook with the Oklahoma City Thunder or James Harden with the Houston Rockets this season.
Choosing the relevant statistics, especially advanced statistics, provides a numerical basis for many things. These include assessing a player or a group of players and, more importantly, assessing them in a given situation or a given system. It allows general managers and coaches to decide whether they need a player or can get rid of him.
However, it also has its own pitfall. Just as it can be used to evaluate your team, it can also be used to evaluate another team. That means that if you’re looking to recruit a player, you have to take into account the system he plays in and the type of players he plays with. The biggest example that strikes me at this point is Andy Carroll.
Transferred to Liverpool FC from Newcastle United in January 2011, he was expected to be a good buy with a keen knack for goalscoring, having had a good first half of the season with the Magpies. But he failed to get going at Anfield and was shunted out two years later. Analysis of his performances at Newcastle and Liverpool showed a key difference. At Newcastle, Carroll had a strong central midfield, including Kevin Nolan, who would play long balls to him up front, right down the middle. Carroll would then play as a true target man and fashion chances for himself and his teammates. At Liverpool, he was expected to do the same with crosses instead of long balls down the middle. In theory the strategy should have worked, especially with accomplished crossers like Stewart Downing. But Carroll found it hard to work in this new style and struggled.
Maybe I wouldn’t be talking about this if he had succeeded, but this is something that is seen consistently: ignoring the system a player comes from when you’re assessing him. The system is always greater than the sum of its parts and, sometimes, it makes certain players look better than they actually are. The reverse is also true. Failing systems can make certain players look worse than they actually are. What I’m saying is, assess the player using the statistics he is putting up, but also assess the environment he plays in. A guy putting up twenty-goal seasons for Sheffield United is not on the same plane as a guy who scores only twelve or thirteen goals for Tottenham Hotspur in the same number of games.
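As a rough illustration of that goalkeeper “vital statistic”, here is a small Python sketch. The function name and the numbers are made up, and it assumes “shots faced” means shots that would have gone in had the goalkeeper not saved them.

```python
def save_pct(saves, shots_faced):
    """Saves as a percentage of the shots a goalkeeper faced."""
    if shots_faced == 0:
        raise ValueError("the goalkeeper faced no shots")
    return 100 * saves / shots_faced


# Hypothetical match: the goalkeeper faces four shots and saves three.
print(save_pct(3, 4))
```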
There’s something in sports called the intangibles that, some argue, cannot be assessed by statistics. The term “intangibles” describes, to put it rather simplistically, a natural ability and aptitude for the game. And believe it or not, this is actually very important. A natural affinity for the game is the foundation on which coaches build the player that a person becomes. Without that, there is no point, even if the best coaches in the world put in maximal effort.
The mental aspect of a game also comes into account when we talk of intangibles. Is a guy determined enough? Does he want to work hard enough? Is he capable of leading a team? Is he capable of motivating a team? Can he be a part of the team and do whatever we ask of him for the greater good of the team?
These are questions that can only be answered by personal evaluation of a player: in practice, in the game, off the playing arena, in how he handles the media, the pressures of being on a winning team or the disappointment of being on a losing one, and so on.
But recently, the Football Manager series of games has been trying to use statistics to predict patterns in these areas too. How often is a player brusque with the media? How much does a manager reveal to the media about his team? Does a goalkeeper underperform in big games, or does he enjoy playing in them and give his best? These are aspects that the FM games have attempted to deal with, with mixed success. Is it possible that one day we may be looking at an algorithm that can explain a player’s tendencies both on and off the field? Maybe. We’ll look more into that when we talk about football later on.
So far, statistics and sports have been a marriage of convenience. Some sports have benefitted immensely from the advent of statistics, while others are still trying to find a way to work together with analytics. In the next few parts of the series, we examine baseball, basketball and soccer to see what the pros and cons are for each of them as far as analytics goes. Till then, goodbye!