All Shook Up: Understanding Liverpool’s defensive fragility using clustering

By Freddie Wilson @thewonderofmu

Liverpool’s defence is often subjected to a fair amount of scrutiny; whilst Jurgen Klopp has created a brilliant attacking force, matters at the other end of the pitch still have room for improvement. 

Up to and including their fixture against Burnley on the 16th September, they have conceded 9 goals, albeit 5 of those came in the mauling by Man City. I have assessed Liverpool’s defence and come up with some reasons why they are prone to leaking goals.

Even though they have conceded 39 chances, fewer than both Manchester United (41) and Chelsea (46), it is the quality of the chances Liverpool have conceded that is pivotal. 23% of their chances conceded are rated “great” or “superb” (the two highest chance ratings provided by Stratagem), and these account for 63% of their expected goals conceded. By contrast, only 15% of Manchester United’s and 9% of Chelsea’s chances conceded are rated in these top categories, accounting for 52% and 45% of total expected goals respectively.

This is where the concept of expected goals comes into its own. According to a basic model, Liverpool have allowed 7.38 expected goals at a rate of 0.19 xG per chance conceded. This is markedly higher than Man United (4.94 xG, at an average rate of 0.12 xG per chance conceded) and Chelsea (4.72 xG, at an average rate of 0.10 xG per chance conceded). Liverpool are conceding high-quality chances, which is one of the reasons they have a fragile defence.
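For anyone who wants to check the arithmetic, here is a minimal Python sketch of the xG per chance figures. It only uses the totals quoted above; the dictionary layout and the function name `xg_per_chance` are my own illustration and not part of any Stratagem tooling.

```python
# Totals quoted in the article: (chances conceded, total xG conceded).
conceded = {
    "Liverpool": (39, 7.38),
    "Manchester United": (41, 4.94),
    "Chelsea": (46, 4.72),
}


def xg_per_chance(chances, total_xg):
    """Average xG of a conceded chance: total xG divided by the chance count."""
    return total_xg / chances


# Round to two decimal places, matching the precision used in the text.
rates = {
    team: round(xg_per_chance(chances, total_xg), 2)
    for team, (chances, total_xg) in conceded.items()
}

for team, rate in rates.items():
    print(f"{team}: {rate:.2f} xG per chance conceded")
```

The same division works for any team once the chance count and total xG are known.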

Where on the pitch are they going wrong?

Let’s have a look at where Liverpool are conceding chances and analyse where these chances originate from.

(Note that “Primary type” is the method of creation for the chance)


Diagram 1: Liverpool’s chances conceded – Dots


Diagram 2: Liverpool’s chances conceded – Arrows

Note that these arrows represent the final pass/cross/dribble in the build-up to a chance. So the base of the arrow can be thought of as where the “assister” plays the ball from, and the head of the arrow is where the chance is taken from. 

These visualisations contain plenty of information and given sufficient time one could study them in order to determine the fundamental aspects of Liverpool’s chances conceded. However, using clustering techniques (k-means clustering to be specific), we can synthesise the data to create an easily understandable and comparable plot for the chances conceded.

In short, we take the methods of creation that have resulted in at least four chances conceded and cluster those chances into two to four groups. Each group is represented by an arrow, the head being where the chance is taken from and the base being where the chance was created from. Chances conceded via open play passes get four cluster groups, as they are the most common, and every other category gets two. Requiring at least four chances conceded via the same method of creation ensures that the clustering is worthwhile and functional. The number of clusters per method of creation is chosen to maintain both proportionality and symmetry; as the season progresses and more chances are conceded, these figures will be given more consideration. More on the process of k-means clustering can be found in my article on Clustered Shot Zones By Player.
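To make the process concrete, here is a rough Python sketch of the idea rather than the exact code behind the maps. It assumes each chance is stored as a 4-tuple (origin x, origin y, shot x, shot y) on a hypothetical 100 by 68 pitch, and it uses a plain hand-written k-means so that nothing beyond the standard library is needed. The coordinates, the method labels and the function names `clusters_for` and `kmeans` are all invented for illustration.

```python
import random


def clusters_for(method, n_chances):
    """Cluster count per method of creation, following the rule in the text."""
    if n_chances < 4:
        return 0  # too few chances for clustering to be worthwhile
    return 4 if method == "Open Play Pass" else 2


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means (Lloyd's algorithm) on equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct chances
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each arrow to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


# Invented low crosses: three from one flank, three from the other.
low_crosses = [
    (90, 10, 95, 30), (92, 12, 96, 32), (88, 8, 94, 28),
    (90, 58, 95, 38), (91, 60, 96, 36), (89, 56, 94, 40),
]

k = clusters_for("Low Cross", len(low_crosses))
centroids, labels = kmeans(low_crosses, k)
for origin_x, origin_y, shot_x, shot_y in centroids:
    print(f"arrow from ({origin_x:.1f}, {origin_y:.1f}) to ({shot_x:.1f}, {shot_y:.1f})")
```

Each centroid is itself an arrow: its first two values give the base and its last two give the head, which is why clustering on all four coordinates at once yields one summary arrow per group.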

This method creates the following map.


If this graphic is to be effective, it has to be understood easily. 

  • The arrows represent the primary chances conceded; each arrow is a summary of the chances conceded within one cluster group.
  • The base of the arrow is the origin of the chance, i.e. the location from which the “assister” plays the ball.
  • The head of the arrow is the location from which the chance is conceded, i.e. where the shot is taken.
  • The colour of the arrows represents the method of creation, with the key on the side. 
  • The thickness of the arrow is proportional to the average xG value of chances conceded within that cluster group. Hence the thickest arrow will represent the cluster group of chances conceded with the highest xG value per chance conceded.
  • The shading of the arrow is proportional to the number of entries represented by that cluster group. Hence the darkest arrow will represent the cluster that contains the most entries and so this can be thought of as frequency; the darkest arrow represents the most frequent avenue of creation of chance conceded.
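As a sketch of how the thickness and shading rules above could be computed, the Python snippet below scales each arrow's width by its cluster's average xG and its opacity by its number of entries. The xG values are invented, using opacity to stand in for shading is my assumption, and the function name `arrow_styles` and the `max_width` parameter are hypothetical.

```python
def arrow_styles(cluster_xg, max_width=6.0):
    """Map each cluster's per-chance xG values to an arrow width and opacity.

    Width is proportional to the cluster's average xG per chance and opacity
    is proportional to the number of chances in the cluster, so the thickest
    arrow has width max_width and the darkest arrow has opacity 1.0.
    """
    mean_xg = {name: sum(vals) / len(vals) for name, vals in cluster_xg.items()}
    counts = {name: len(vals) for name, vals in cluster_xg.items()}
    top_xg = max(mean_xg.values())
    top_count = max(counts.values())
    return {
        name: {
            "mean_xg": mean_xg[name],
            "count": counts[name],
            "width": max_width * mean_xg[name] / top_xg,
            "alpha": counts[name] / top_count,
        }
        for name in cluster_xg
    }


# Invented xG values for two cluster groups.
example = {
    "A": [0.05, 0.10, 0.09],
    "D": [0.30, 0.25, 0.35, 0.30, 0.20],
}

styles = arrow_styles(example)
for name, style in styles.items():
    print(f"Cluster {name}: width {style['width']:.2f}, opacity {style['alpha']:.2f}")
```

In this made-up example, cluster D would be drawn as both the thickest and the darkest arrow, because it has the highest average xG and the most entries.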

Cluster D, the short red arrow above the penalty spot, immediately stood out to me as it is the most shaded and is relatively thick, meaning that it’s the most frequent cluster and carries a relatively high average xG value per chance.

We can then check this finding against the previous two diagrams. In Diagram 1, we see a few blue ‘open play’ dots to the right of the penalty spot, some of which are very sizeable to represent the high xG value. And in Diagram 2, we see quite a few blue ‘open play’ arrows in that area too, leading to the high frequency. 

Cluster E also looks potent. In general, the defensive right-hand side looks to be conceding more chances than the left; the clustered arrows are skewed more to that side, and if you go back to Diagram 2 you can see a similar tendency towards the defensive right.

This is a great way to get to grips with the visualisation. If you fancy having a go yourself, try checking the “Low Crosses” clusters against the first two diagrams in order to understand the reasoning behind the thickness and shading.

How comparable are clustered visual representations amongst the top teams?

Let’s compare Liverpool to the clustered chances conceded map for Chelsea and then Arsenal.

Chelsea’s clustered chances conceded

The first difference to note is that Chelsea’s chances conceded have primarily come via open play passes and high crosses, in contrast to Liverpool’s concessions via open play passes, free kicks and low crosses. Determining whether this reflects differing defensive strategies or simply a difference in defensive quality would need further investigation. However, one could argue that since low crosses are likely to be more easily blocked or cleared than high crosses, this difference could be a sign of weaker defending from the full backs at Liverpool.

Secondly, Chelsea’s most vulnerable avenue of chances conceded appears to be Cluster A on the edge of the box. This is further out than Liverpool’s most vulnerable avenue (Cluster D), which explains the slight difference in thickness; the average xG value is likely to be lower the further away from goal you get. This really highlights the opening point about the disproportionate number of high-quality chances conceded by Liverpool. Forcing the opposition to shoot from outside the box has great advantages, so the fact that Liverpool permit so many chances wholly within the box shows why their defence concedes so often.


Looking at Arsenal, their clusters H, F and D look very potent. H could be explained by Arsenal playing a high line, or perhaps by playing long crossing teams like Leicester or Liverpool, which has quite a large bearing as the sample size is relatively small at the start of the season. As with Liverpool, the fact that F is wholly inside the box is not a good sign for Arsenal, especially considering how close it is to the Arsenal goal. However, it’s not as central as Liverpool’s Cluster D on their map.

The faults in Liverpool’s defence could be explained by a number of things; whether it’s lapses in concentration, individual errors or a lack of communication, there can be varying theories as to why.

There is also room for improvement in these graphics, since there are so many facets that could be incorporated. Finding the right balance between the amount of information and readability can be a tough act, but variables such as defensive pressure or the outcome of the chance could definitely prove useful here. Hopefully this graphic can offer more insight into the particular aspects that may be causing Liverpool’s problems, instead of simple advice along the lines of “Be more like Chelsea”.

I hope that these visual representations will prove to be useful in the understanding of defensive fragilities and provide insights into comparisons between defences.

Thank you for reading. I’d also like to thank Chance Analytics for the opportunity and a very big thank you to Stratagem (@Stratabet) for the data! More of my analyses and insights can be found at: and   

If you like the content from Chance Analytics, then please consider supporting us via Patreon.

This article was written with the aid of StrataData, which is property of Stratagem Technologies. StrataData powers the StrataBet Sports Trading Platform, in addition to StrataBet Premium Recommendations.


