Sunday, 25 February 2018

Passive & Aggressive Defensive Teams

One of the major drawbacks in quoting counting statistics in football is the varied time of possession enjoyed by teams.

I first wrote about this nearly seven years ago here, when describing Stoke's incredibly disciplined approach to defending, once you factored in the inordinate amount of time they spent doing it under Tony Pulis in the early days of their soon-to-be-ending Premier League jaunt.

Defensive statistics have always been blighted by failing to account for opportunity.

It is impossible for a Manchester City defender to accumulate the volume of defensive actions made by, say, a WBA defender, simply because the new champions are only out of possession for around 30% of a typical game, and that game only has around 58 minutes when the ball is in play.

WBA, by contrast, are averaging just 40% of the total possession and ceding ~60% to the opposition.

Before we can make any meaningful descriptive attempt at a side's defensive set up, we need to make some kind of attempt to account for the unequal range of possession for each team and the amount of time that the ball spends on the pitch rather than in the stands.
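As a rough illustration of the scale of that imbalance, the figures quoted above can be turned into minutes spent defending per game. This is only a sketch: the function name is made up for the example, and it assumes the 58 minutes of ball-in-play time is shared out strictly in proportion to possession.

```python
# Approximate ball-in-play time per game, as quoted in the post.
BALL_IN_PLAY_MINUTES = 58.0


def minutes_out_of_possession(possession_share, ball_in_play=BALL_IN_PLAY_MINUTES):
    """Minutes per game a side spends without the ball while it is in play.

    Assumes live time splits in direct proportion to possession share.
    """
    if not 0.0 <= possession_share <= 1.0:
        raise ValueError("possession_share must be between 0 and 1")
    return (1.0 - possession_share) * ball_in_play


city = minutes_out_of_possession(0.70)  # City are without the ball ~30% of the time
wba = minutes_out_of_possession(0.40)   # WBA are without the ball ~60% of the time
print(f"Man City: {city:.1f} min, WBA: {wba:.1f} min, ratio {wba / city:.1f}x")
```

On these numbers a WBA defender has roughly twice as many minutes per game in which to make a tackle or interception as his City counterpart.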

We can also attempt to define where on the field a side is trying to dispossess their opponents.

Some teams are noted for the desire to press opponents higher up the pitch to create a turnover or slow down a developing attack, whereas others are more content to lie deep and only actively engage an opponent once they venture into their final third. 

Vertical distance from your own goal can be slightly misleading. If you challenge an opponent on the centre spot, you are slightly closer to your own goal than if the challenge also takes place on the halfway line, but out on the touchline.

All calculations have been made from the point of the challenge to the centre of the defending side's own goal line.
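The centre spot versus touchline comparison can be sketched with a straight-line distance calculation. The pitch dimensions and the coordinate convention below are assumptions for the example (pitches vary and the post does not state the size used), and the function name is hypothetical.

```python
import math

# Assumed pitch size in yards; not taken from the post.
PITCH_LENGTH = 115.0
PITCH_WIDTH = 74.0


def distance_to_own_goal(x, y, width=PITCH_WIDTH):
    """Straight-line distance from (x, y) to the centre of the defending goal line.

    x: yards up the pitch from the defending side's goal line.
    y: yards across the pitch from one touchline.
    """
    return math.hypot(x, y - width / 2.0)


# Two challenges at the same "vertical" distance, both on the halfway line.
centre_spot = distance_to_own_goal(PITCH_LENGTH / 2, PITCH_WIDTH / 2)
touchline = distance_to_own_goal(PITCH_LENGTH / 2, 0.0)
print(f"Centre spot: {centre_spot:.1f} yards, touchline: {touchline:.1f} yards")
```

With these assumed dimensions, the touchline challenge is around 11 yards further from goal than the one on the centre spot, enough to move it into a different distance band.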

The table above, using Infogol data, counts the number of defensive actions, such as tackles, interceptions and clearances, made by each team after 27 games of the current Premier League campaign.

These have been grouped by distance from the event to the centre of that side's own goal. Finally, these event numbers have been standardised to account for the actual time each side has been without the ball and a figure for defensive actions per 10 minutes of opposition possession has been calculated.
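The grouping and standardising step might look something like the sketch below. The 10 yard bands, the open-ended "50+" band, the function names and the sample distances are all assumptions made for illustration; none of it is Infogol data or the exact method behind the table.

```python
from collections import Counter


def band_label(distance_yards, band_width=10, top_band_start=50):
    """Map a distance from own goal to a label such as '20-30' or '50+'."""
    if distance_yards >= top_band_start:
        return f"{top_band_start}+"
    lower = int(distance_yards // band_width) * band_width
    return f"{lower}-{lower + band_width}"


def actions_per_10_minutes(distances, opposition_possession_minutes):
    """Defensive actions per 10 minutes of opposition possession, by distance band."""
    if opposition_possession_minutes <= 0:
        raise ValueError("opposition possession time must be positive")
    counts = Counter(band_label(d) for d in distances)
    scale = 10.0 / opposition_possession_minutes
    return {band: n * scale for band, n in counts.items()}


# Made-up distances (yards from own goal) for ten defensive actions.
sample_distances = [8, 12, 15, 18, 22, 27, 33, 35, 41, 55]
rates = actions_per_10_minutes(sample_distances, opposition_possession_minutes=20.0)
for band, rate in rates.items():
    print(f"{band:>6} yards: {rate:.1f} per 10 min")
```

Dividing by time without the ball, rather than by games played, is what allows a high possession side and a low possession side to be compared on the same scale.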

For example, Manchester City appear to have made by far the fewest active attempts to disrupt or dispossess an opponent in 2017/18, only making around 16 such attempts per 10 minutes of opponent possession.

So they appear happy to allow teams to circulate the ball, but they do make their most concerted efforts to intercede between 20 and 40 yards from the City goal.

In contrast, Liverpool are much more aggressive at trying to regain the ball, making over twice as many defensive actions per 10 minutes as City, as well as engaging opponents almost once a minute at distances of 50 or more yards from Liverpool's own goal.

The final sparkline plot shows, not only the total volume of defensive actions per 10 minutes of opponent possession, but also where a side is most active in engaging their opponent.

A side's own goal is on the left of the plot, and the actions shown take place further away from that goal as you move towards the extreme right of the sparkline.

The majority of the top six teams peak their defensive actions between 30 and 40 yards from goal, whereas most of the remainder of the league either choose or are forced to defend between 10 and 20 yards from goal.

The most prominent example of a top six team residing in a relegation threatened defensive mindset is Manchester United.

Thursday, 1 February 2018

Manchester City and WBA. The Best in Top Tier History.

The importance of league tables is only absolute after the final game has been played and your side has secured that all important Europa League spot or finished in 17th spot or higher.

For the remainder of the time, but particularly just after mid season, it is your side's position relative to their nearest challengers that is most important.

Watford's current 11th place may give the illusion of relative safety, but on closer inspection they are only three points above Huddersfield, who are teetering on the brink of the relegation spots in 17th position.

One way to try to quantify your side's current position is to see how far above or below it sits from the relative mediocrity of the average points won by all sides in the season to date, whilst also accounting for the distribution of points, both currently after 25 games and in the past.

Manchester City can rightfully claim to be in the running to become the most dominant title winners in the history of the 20 team top tier.

They are currently 2.56 standard deviations above the current points average per team. Their nearest historical rivals were the Manchester United team of Beckham, Giggs, Keane, Sheringham and the Neville brothers from 2000/01, who were 2.51 SDs above par after 25 games, and Chelsea's 2005/06 team (2.50 SDs).
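The standard deviation figures are essentially z-scores, and a minimal sketch of the calculation is below. It assumes the population standard deviation across all sides in the division (the post does not say whether the population or sample version was used), and the function name and points totals are invented for the example rather than taken from any real table.

```python
from statistics import mean, pstdev


def points_z_scores(points_by_team):
    """Standard deviations above (+) or below (-) the average points total."""
    totals = list(points_by_team.values())
    avg = mean(totals)
    spread = pstdev(totals)  # population SD: an assumption, see lead-in
    if spread == 0:
        raise ValueError("all teams have identical points; z-scores are undefined")
    return {team: (pts - avg) / spread for team, pts in points_by_team.items()}


# Invented five team table, purely to show the mechanics.
toy_table = {"Leaders": 60, "Chasers": 40, "Mid A": 30, "Mid B": 30, "Bottom": 20}
for team, z in points_z_scores(toy_table).items():
    print(f"{team:>8}: {z:+.2f} SDs")
```

Because the score is measured against the spread of points in that particular season, a leader in a tightly bunched division rates more highly than one with the same lead in a widely spread one, which is what makes comparisons across seasons possible.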

At the bottom, WBA are the "best" 20th placed team ever, being only 1.06 SDs below the average points won by teams so far.

Likewise, Swansea and Southampton are the most impressive 19th and 18th placed teams, respectively, after 25 games.

The unusually distributed nature of the points won by sides in 2017/18 then begins to catch up with those sides whose position implies relative safety, but the proximity of their rivals suggests otherwise.

Newcastle are the second worst 14th placed side in top tier history by this measure, as are Watford in 11th and Burnley in 7th.

Here are the rest of the teams. We've got the strongest bottom four ever in relation to the average points won by a side after 25 games, along with the weakest and most vulnerable mid table teams, again in top tier history.

Monday, 22 January 2018

After the Shot xG2

Expected goals has variously been defined by advocates and opponents respectively as a more accurate summary of what "should" have happened on the pitch or a useless appendage to the final scoreline, that is neither useful nor enlightening.

The first description is perhaps too overtly optimistic for a "work in progress" that is evolving into a useful tool for player projection and team prediction.

Whereas the second, less flattering description may also stand up to some scrutiny, particularly if the supporters of the stat ignore the uncertainty intrinsic in its calculation, while the detractors may be blithely ignorant of such limitations.

Both camps are genuinely attempting to quantify the true talent levels of players and teams in a format that allows for more insightful debate and, in the case of the nerds, one that is less prone to cognitive bias.

The strength of model based opinion is that it can examine processes that are necessary for success (or failure), drawing from a huge array of similar scenarios from past competitions.

And it does so without straying too far down the route from chance creation to chance conversion (or not), so that the model avoids becoming too anchored in the specifics of the past, rendering any projections about the future flawed.

Overfitting past events is a model's version of eye test biases, but that shouldn't mean we throw out everything that happens post chance creation, for fear of producing an over confident model that sticks immutably to past events and fails to flexibly project the future.

It's no great stretch to model the various stages from final pass to the ball crossing the goal line (or not).

Invariably, the process of chance creation alone has been prioritised as a better predictor of future output and post shot modeling has remained either a neglected sidetrack or merely the niche basis for xG2 keeper shot stopping.

But if used in a less dogmatic way, mindful of the dangers of over fitting, the "full set" of hurdles that a decisive pass must overcome to create a goal (or not) may become a useful component in an integrated approach that utilises both numeric and visual clues to deciphering the beautiful game.

Let's look at chances and goals created from set pieces and corners.

Here's the output from two expected goals models for chances and on target attempts conceded by the current Premier League teams in the top flight since early 2014.

The xG column is a pre shot model, typically used to project a side's attacking or defensive process, that uses accumulated information, but is ignorant of what happened once contact with the ball was made.

The xG2 column is based entirely upon shots or headers that require a save and uses a variety of post shot information, such as placement, power, trajectory and deflections. Typically this model would be the basis for measuring a keeper's shot stopping abilities.

A superficial overview of the difference between the xG allowed from set pieces and actual goals allowed leads to the by now familiar "over or under performing" tag.

Stoke had been transformed into a spineless travesty of their former defensive core at set plays, both conceding chunks of xG and under performing wantonly by allowing 42 actual goals against 37 expected.

There's little disconnect between the Potters' xG2, that examines those attempts that needed a save, but the case of Spurs & Manchester United perhaps shows that deeper descriptive digging may provide more insight or at least add nuance.

Tottenham allowed a cumulative 29.6 xG, conceding just 23.

We know from keeper models that Lloris is generally an excellent shot stopper and the xG2 model confirms that, along with the ever present randomness, the keeper's reactions are likely to have played a significant role in defending set play chances.

In allowing 23 goals, Lloris faced on target attempts that were worth just over 31 goals to an average keeper.

Spurs conceded chances worth 29.6 xG, but looked at in terms of xG2 this value rises to 31.3. So, still mindful of randomness, Spurs' defenders might have been a little below par in suppressing the xG2 attempts that came about from the xG chances they allowed, but Lloris performed outstandingly to reduce the level of actual goals to just 23.

Superficially, Manchester United appears identical.

As a side they allowed 37.6 xG, but just 32 actual goals. We know that De Gea is an excellent shot stopper, therefore in the absence of xG2 figures we might assume he performed a similar service for his defence as Lloris did for his.

However, United's xG2 is just 33.1 and the difference between this and the actual 32 goals allowed is positive, but relatively small compared to Lloris at Spurs.
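The Spurs and United figures quoted above can be laid side by side by splitting the familiar xG minus goals gap into two steps: what changes between the pre shot and post shot models, and what changes between the post shot model and the scoreline. This is rough bookkeeping rather than a causal attribution, since xG2 only covers on target attempts and randomness runs through every step. The function and key names are made up for the example.

```python
# Set piece figures conceded, as quoted in the post.
SET_PIECE_DEFENCE = {
    "Tottenham": {"xg": 29.6, "xg2": 31.3, "goals": 23},
    "Manchester United": {"xg": 37.6, "xg2": 33.1, "goals": 32},
}


def split_xg_gap(xg, xg2, goals):
    """Split the xG-minus-goals gap into two stages.

    before_keeper: xG minus xG2 (positive means on target attempts were
                   worth less than the chances that produced them).
    at_keeper:     xG2 minus goals (positive means fewer goals went in
                   than an average keeper would be expected to concede).
    """
    return {
        "total_gap": xg - goals,
        "before_keeper": xg - xg2,
        "at_keeper": xg2 - goals,
    }


for team, row in SET_PIECE_DEFENCE.items():
    parts = split_xg_gap(**row)
    print(
        f"{team}: total {parts['total_gap']:+.1f}, "
        f"before keeper {parts['before_keeper']:+.1f}, "
        f"at keeper {parts['at_keeper']:+.1f}"
    )
```

Both sides conceded around six goals fewer than their xG, but for Spurs almost all of that appears at the keeper stage (about +8.3, offsetting -1.7 beforehand), whereas for United most of it (about +4.5) has already gone before the shot reaches De Gea.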

By extending the range of modeling away from a simple over/under xG performance we can begin to examine credible explanations for the outputs we've arrived at.

Are United's defenders exerting so much pressure, even when allowing attempts consistent with an xG of 37.6, that the power, placement etc. of those on target efforts are diluted by the time they reach De Gea?

Are the attackers themselves under performing despite decent xG locations? (Every xG model is always a two way interaction between attackers and defenders).

Is it just randomness or is it a combination of all three?

Using under and over performing shorthand is fine. But we do have the data to delve more into the why, and taking this xG and xG2 data driven reasoning over to the video analysis side is the logical, integrated next step.