Analytics: ExpG, the Holy Grail of soccer stats

Spend five minutes looking through any soccer analytics articles and you are bound to come across the term Expected Goals (ExpG), the advanced statistic that has quickly become the most popular—and at times controversial—among soccer analysts.

Given its widespread use in analytics, it is worth taking the time to unpack what it is and why it’s used, and to examine its strengths and weaknesses.

What are Expected Goals?

Expected Goals estimate the probability that any given shot will be converted into a goal, taking into account factors such as distance to the goal, angle to the goal, speed of attack and the part of the body used. For example, if a shot has a 30 percent chance of being scored, it is worth 0.3 ExpG.
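To make the arithmetic concrete, here is a minimal sketch of how per-shot values add up to a team total. The shot probabilities are made up for illustration and the function name `match_expg` is hypothetical; real models derive each probability from shot data such as distance and angle.

```python
def match_expg(shot_probabilities):
    """Add up per-shot scoring probabilities to get a team's ExpG total."""
    for p in shot_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return sum(shot_probabilities)


# Hypothetical shots from one team in one match (illustrative values only).
team_shots = [0.30, 0.05, 0.10, 0.45]
total = match_expg(team_shots)
print(f"ExpG: {total:.2f}")
```

Four shots here produce a total just under one goal, even though no single shot was more likely than not to go in.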

Expected Goals may sound like a complicated concept, but at its core the statistic is just attaching a number to an idea that every soccer fan is already familiar with: the quality of chances. After watching a game you usually have a pretty good idea of which team created the better chances and Expected Goals are an objective way of comparing these chances.

The team that creates the better chances isn’t always the team that creates more chances, which is one of the weaknesses of purely shot-based statistics. Last season’s Tottenham-Hull City game is a good example of this phenomenon. Hull outshot Spurs 14-10, but Tottenham created the better quality chances. The shot totals may not have reflected that Tottenham was the better team, but the ExpG totals did, as Tottenham had 1.6 ExpG to Hull’s 1.1.

Why are ExpG numbers useful?

The simple reason that Expected Goals have become as prevalent as they are is that they have incredible predictive power.

There are plenty of games where a team creates better opportunities than its opponent and still loses. However, if a team consistently creates better chances than its opponents over the course of an entire season, it almost always finds itself near the top of the table. The same is true of Expected Goals.

A team’s Expected Goal Difference is very closely correlated to their final table position and the statistic does a good job of predicting total points as the season progresses.
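One common way to put a number on "closely correlated" is the Pearson correlation coefficient, where values near 1 indicate a strong positive linear relationship. The sketch below assumes that measure; the article does not say which one was used, and the five teams' figures are invented for illustration, not Opta data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical season figures for five teams (not real data).
expg_diff = [30.0, 12.5, 1.0, -8.0, -20.5]
points = [85, 70, 52, 44, 33]
r = pearson_r(expg_diff, points)
print(f"correlation: {r:.2f}")
```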

The following graph looks at the twenty Premier League teams’ point totals from the 2014-15 season and their Expected Goal Differences. The relationship between the two is very close.

How should we interpret ExpG totals?

Even after understanding the concept and buying into the idea that the statistic is useful, there is still the question of what the ExpG numbers actually mean and how to read them. The answer is quite simple: ExpG numbers should be treated the same way we treat goal numbers.

The idea is that over a large enough sample a team’s ExpG numbers will converge to (or at least come close to) the team’s actual goal totals. So if a team’s aim is to concede only 30 goals in a season, then an ExpG-against total of around 30 or lower should be seen as a good result and anything higher would be worrying.

Often ExpG is actually a better indicator of underlying talent levels than the goal numbers themselves. If a team concedes significantly more ExpG than goals, it may simply have been lucky to face opponents who finished poorly, and its defending wasn’t actually as good as the goal numbers alone would suggest. The reverse can also be true.

ExpG can also be used the same way on an individual level. A player’s goal scoring ability can be extrapolated from their ExpG numbers. If a player is scoring a lot but their ExpG numbers are low then the scoring run may not be sustainable. If a player is going through a dry patch but has high ExpG numbers we should expect their form to turn around. Mario Balotelli last season with Liverpool is a good example of a player who had higher ExpG numbers than goals.
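The comparison described above comes down to subtracting ExpG from actual goals. Here is a rough sketch; the function name `describe_scoring`, the one-goal tolerance and the player figures are all assumptions chosen for illustration, not an established threshold or real data.

```python
def describe_scoring(goals, expg, tolerance=1.0):
    """Compare actual goals with ExpG; `tolerance` is an arbitrary cutoff."""
    gap = goals - expg
    if gap > tolerance:
        return "scoring above ExpG (run may not be sustainable)"
    if gap < -tolerance:
        return "scoring below ExpG (form may turn around)"
    return "roughly in line with ExpG"


# Hypothetical players: (goals, ExpG), illustrative values only.
players = {"Player A": (12, 7.5), "Player B": (3, 6.8), "Player C": (9, 9.4)}
for name, (goals, expg) in players.items():
    print(name, "->", describe_scoring(goals, expg))
```

As with team totals, the gap means more over a large sample of shots than over a handful of games.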

What are the weaknesses of ExpG?

There are two major problems with ExpG statistics that require us to look beyond the numbers themselves.

The first is that not all players are created equal. If Lionel Messi and Andros Townsend both take a shot from the exact same location with the exact same ExpG value, one is more likely to score than the other (and it doesn’t take too much analysis to figure out which one). This is a flaw, but certainly not a fatal one.

Many different analysts have looked at ExpG and chances in general and come to the conclusion that what separates the best attackers from the pack is their ability to create chances, not necessarily their ability to finish them. This means that while Messi may be a better finisher than Townsend, what really makes him an elite player is that he creates a higher number of better quality chances than Townsend, which is something that will show up in ExpG numbers.

The second issue with ExpG models is that they don’t take into account the position of defenders, which obviously factors into how likely a shot is to be converted. This is a problem that can only be solved with more data and one that clubs are almost certainly working on at this very moment.

Expected Goals don’t tell us everything, but they are an objective way of evaluating chance quality and an incredibly useful tool that makes sense in the broader context of the sport.

Data courtesy of Opta


Sam Gregory is a soccer analytics writer based in Montreal. Follow him on Twitter.
