It appears that the Sportdec blog, where I initially posted this, has vanished into the ether. So I figured I'd repost this here.
Once in a Blue Mean
A blog about statistics, the EPL, and Manchester City. Follow me on twitter at @HawkesTeeter
Sunday, October 28, 2018
Saturday, August 4, 2018
In Defense of Per Possession Statistics
A while back now, Ashwin Raman wrote about per possession statistics and how they can be applied in football. One of the takeaways from the article was that per possession statistics were not a suitable choice to replace per 90 statistics as a way to judge a player's value, as the most efficient players with the ball are not necessarily who we would consider the stars of the game. He is certainly correct that a player's efficiency in possession is an inadequate measure of value by itself, but that doesn't mean it tells you nothing useful. On the contrary, when paired with a way to measure the offensive load a player carries, by how many possessions they are involved in, you can get a more rounded picture of a player's attacking contribution than by per 90 stats alone.
To see why, let's take a little trip down memory lane. Way back in the Dark Ages, "analysis" of football players was focused mostly on counting statistics. A player who scored 30 goals was reckoned to be better than someone who scored 20 goals. This functioned well enough to settle some dumb arguments at the pub, but it was pretty basic and suffered from the obvious flaw that it didn't account for the time played by the respective players. The enlightened decided that counting statistics were inferior to rate statistics and moved to goals per 90 minutes, allowing us to see how frequently a player scored given a certain amount of time on the pitch. This was certainly an improvement on the old system. However, as others have shown, the 90 minutes in a game have little actual relation to the action. If West Brom constantly have fewer minutes of on-pitch action because the ball is always on the sidelines, aren't their attackers' per 90 stats artificially deflated? If one team has possession of the ball 90% of the time and the other team can never even get the ball to its forwards, do the forwards of the respective teams really have the same level of opportunity to score? A better rate statistic would have the unit of analysis be more directly related to what is happening on the pitch.
Enter per possession numbers. Let me first state that what I'm talking about here has nothing to do with possession percentage as typically calculated. Instead, possessions (or possession chains) are defined as the number of times a team has the ball in a game, making them the fundamental unit of opportunity in football (Opta does actually define this now, but as far as I know they do not publish it anywhere; I proxy it by adding up a team's shots, unsuccessful passes, unsuccessful dribbles, and dispossessions). A team will only have the ball a given number of times in each game, so what they do with it each time is critically important. A team that is efficient in creating chances with their possessions is a good attacking team; a team that turns the ball over frequently and manufactures few shots is a poor one. As a result, teams need to make sure that on each possession the ball is getting to the players who can do the most with it. This is why evaluating players by per possession numbers requires looking at two things: efficiency and usage.
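To make the proxy concrete, here is a minimal Python sketch of the possession estimate described above, plus a simple shots-per-possession efficiency figure. The class and function names are my own, and the match totals in the example are made-up placeholder numbers, not real data.

```python
from dataclasses import dataclass


@dataclass
class TeamTotals:
    """Team totals for one match (or one season); field names are hypothetical."""
    shots: int
    unsuccessful_passes: int
    unsuccessful_dribbles: int
    dispossessions: int


def estimated_possessions(team: TeamTotals) -> int:
    # Proxy from the text: every possession ends in a shot or in one of
    # three kinds of turnover, so the possession count is their sum.
    return (team.shots + team.unsuccessful_passes
            + team.unsuccessful_dribbles + team.dispossessions)


def shots_per_possession(team: TeamTotals) -> float:
    # A crude team-level efficiency measure: share of possessions ending in a shot.
    possessions = estimated_possessions(team)
    return team.shots / possessions if possessions else 0.0


# Placeholder numbers purely for illustration.
example = TeamTotals(shots=15, unsuccessful_passes=90,
                     unsuccessful_dribbles=10, dispossessions=10)
print(estimated_possessions(example), shots_per_possession(example))
```

With these placeholder totals the team has an estimated 125 possessions and turns 12% of them into shots.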
Let's view efficiency first. Ashwin was looking at how efficient players were each time they touch the ball, and I agree with him that it is probably a better way to evaluate tendencies of players rather than a means to evaluate overall talent level. But as far as value for the team is concerned, we really should be focused on the possessions a player "uses", meaning those where he has the final action of the possession. This is because there is no scarcity in the number of times a player can touch the ball in a given possession (as the Portugal-Mexico game in The Simpsons brilliantly illustrates). There is however only one person who can have the final touch on the ball, so how a team allocates those final touches is critically important. Good teams should direct the ball to the players most effective at creating chances with the possessions they use; poor teams will generally have more possessions used by their defenders (or keepers) since they struggle more in build-up.
However, Ashwin's post raises a good question: if it is true teams are best served directing the ball to their most efficient players, why are the most efficient players not the stars of the game? The answer is that creation of good opportunities matters along with converting them. Being able to score tap-ins from crosses put on a plate is a good skill to have certainly, but if you can only convert chances and don't create them for yourself or others your value is limited (I'm looking at you, Jermain Defoe). In order to account for this, we have the concept of Usage Rate. This is a concept borrowed from basketball that looks at what percentage of a team's possessions a given player "uses", or has the final action, weighted by minutes played (you can find the calculation and numbers from the 2017-18 Premier League season here). Usage Rate is a way to measure the burden of creativity on a given player in a team's setup. This is not to say having a high Usage Rate is inherently good or a low Usage Rate bad. Using a ton of possessions if you can't do anything with them is a problem; conversely, for positions where turnovers are more costly you may prioritize safety over creation and a low Usage Rate is preferable. If we are looking for the best attacking players, we'd really want to find the players who create a high volume of chances and still maintain good efficiency with the ball.
For example, if we look at who had a Usage Rate above 15% and had more than 25% of their possessions result in a shot, key pass, or assist (which I'm calling Success Rate) with at least 1,000 minutes in the 2017-18 Premier League season, we get a list of 8 players: Cesc Fabregas, Eden Hazard, Alexis Sanchez, Philippe Coutinho, Kevin De Bruyne, Paul Pogba, Anthony Martial, and Christian Eriksen. To me, that seems like a pretty good list of some of the best attackers in the PL (I'll admit the inclusion of Martial surprised me a little). Use a better efficiency measure (say xG + xA per possession) and the accuracy could be improved further. If you don't account for Usage Rate, Kostas Mitroglou's 55% Success Rate looks very good. But factoring in that Mitroglou has only the 10th highest Usage Rate on his own team, at 9%, provides some much-needed context and makes Florian Thauvin (19% Usage Rate and 30% Success Rate) and Dimitri Payet (18% Usage Rate and 37% Success Rate) appear the clear stars.
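The two-threshold screen above can be sketched in a few lines of Python. The type and function names here are hypothetical; the three player lines use only the Usage Rate and Success Rate figures quoted in the paragraph, and the 1,000-minute floor is left out because no minutes are given for these players.

```python
from typing import List, NamedTuple


class PlayerLine(NamedTuple):
    name: str
    usage_rate: float    # share of team possessions the player "uses"
    success_rate: float  # share of used possessions ending in a shot, key pass, or assist


def high_volume_and_efficient(players: List[PlayerLine],
                              min_usage: float = 0.15,
                              min_success: float = 0.25) -> List[str]:
    # Keep only players who clear BOTH thresholds: efficiency alone
    # (or volume alone) is not enough to make the list.
    return [p.name for p in players
            if p.usage_rate > min_usage and p.success_rate > min_success]


# Figures as quoted in the text above.
teammates = [
    PlayerLine("Kostas Mitroglou", 0.09, 0.55),
    PlayerLine("Florian Thauvin", 0.19, 0.30),
    PlayerLine("Dimitri Payet", 0.18, 0.37),
]
print(high_volume_and_efficient(teammates))
```

Mitroglou has the best Success Rate of the three but drops out on usage, leaving Thauvin and Payet.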
All well and good you might say, but since the players above generally have very good per 90 stats as well, what's the difference? First, it enables us to make better comparisons between players than using per 90 stats since we can tell whether they accumulate those via efficiency or volume. Ronaldo and Messi are (still) the best players in the world and each averaged 8.5 shots + key passes per 90 last season. However, they are quite different in how they achieve that, with Messi using 21% of Barcelona's possessions (more than any PL player last year) and Ronaldo shooting it basically every time he touches it. Second, it provides a way to see how teams structure their attacks. Do they go long to a target man (like Peter Crouch or Andy Carroll, both of whom consistently have high usage rates)? Are they focused around a single playmaker, like West Ham in the Payet era? Are their attacks built on wing-play, like Leicester with Albrighton and Mahrez? Usage Rate is a useful tool for identifying such patterns. Finally, it can help in analyzing how a player, or team for that matter, might do in a different context. A player with a high Usage Rate may see their per 90 stats drop in a new team if there are other high Usage players already present, as there is only one ball to go around. This was one of the reasons I was against the Alexis to City transfer (from both sides) and on City signing Nolito. Likewise, if a team loses a player to injury or transfer, how the possessions that player used will be distributed is a key question that Usage Rates can help answer.
All this is definitely not to say that per possession stats are great for everything or that per 90 stats have no purpose. But once you account for creativity by way of Usage Rate, I do think they provide a valuable framework for thinking about the game, at least from an attacking perspective. Because the unit of analysis is more firmly rooted in the action on the pitch, they let you account for things that per 90 statistics can't. Per possession statistics will never, and should never, replace per 90 statistics but they remain under-utilized in analysis of the game.
Tuesday, June 26, 2018
17-18 Usage Rates
Finally got around to updating Usage Rates for the 2017-18 Premier League season. A couple notes:
1) Since Squawka's website has been pretty useless for a while, I used data from Whoscored. This means I went ahead and added dispossessions to the calculation.
2) The two sites appear to count goalkeeper passes quite differently, with Whoscored seeming to include a lot more of them (I assume this is because Squawka only included short passes, but I'm unsure about that).
While the above changes mean the calculation is more accurate, they do make historical comparisons more difficult.
For those unfamiliar with the general concept, I've written about the metric here, here, and here. Usage Rate is an attempt to measure the influence each player has in a team's attack by looking at the percentage of a team's possessions he "uses", or has the final action of a possession, weighted by minutes played:
(Shots+Key Passes+Assists+Unsuccessful Take-Ons+Unsuccessful Passes+Dispossessions)/
((Team Shots+Team UTO+Team UP+Team Dispossessions)/(Minutes Played/Total Minutes))
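For anyone who wants to reproduce the number, here is a rough Python sketch of the formula above. It rests on one assumption about the minutes term: I read it as scaling the team's possessions down to the player's share of available minutes (minutes played over total minutes), in the spirit of the basketball stat. If you read the printed formula differently the result will change, so treat this as an illustration and not the definitive calculation. The function name, parameter names, and all figures in the example are placeholders.

```python
def usage_rate(shots: int, key_passes: int, assists: int,
               unsuccessful_take_ons: int, unsuccessful_passes: int,
               dispossessions: int,
               team_shots: int, team_uto: int, team_up: int,
               team_dispossessions: int,
               minutes_played: float, total_minutes: float) -> float:
    # Possessions the player "used": the final actions listed in the numerator.
    player_used = (shots + key_passes + assists + unsuccessful_take_ons
                   + unsuccessful_passes + dispossessions)
    # Team possessions, using the same proxy as the denominator.
    team_possessions = team_shots + team_uto + team_up + team_dispossessions
    # ASSUMPTION: weight by the player's share of available minutes, so the
    # denominator approximates team possessions while he was on the pitch.
    minutes_share = minutes_played / total_minutes
    return player_used / (team_possessions * minutes_share)


# Placeholder season line: 450 possessions used, 5,000 team possessions,
# 2,565 of 3,420 available minutes played (a 0.75 share).
rate = usage_rate(shots=60, key_passes=50, assists=10,
                  unsuccessful_take_ons=40, unsuccessful_passes=250,
                  dispossessions=40,
                  team_shots=500, team_uto=300, team_up=3800,
                  team_dispossessions=400,
                  minutes_played=2565, total_minutes=3420)
print(f"{rate:.1%}")
```

In this made-up case the player uses 450 of an estimated 3,750 on-pitch possessions, a Usage Rate of 12%.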
Thoughts/suggestions/comments are all welcome.
Link is here.
Friday, October 13, 2017
American Exceptionalism
The US Men's National Team will not be going to the 2018 World Cup. This fact has caused considerable consternation among American soccer fans, most of whom can't remember a World Cup without the US present. Of course, that very fact demonstrates just how young soccer fandom is in the US, as the 40 years prior to 1990 spent without a World Cup appearance don't register much with Millennials. Still, the US has the largest population of the CONCACAF countries by far and one of the largest economies in the world. Surely it should be able to make use of these advantages to dominate the conference, especially now given soccer's increased popularity? Indeed, one of the main premises of the book Soccernomics was that countries like the US would come to dominate soccer, since economic development and population size are key indicators of success in international soccer. While the latter half of that statement is certainly true, I think that analysis missed a key factor of why some countries are better at certain sports than others, a factor that explains why people should temper their expectations of the USMNT.
As an example, let's consider a lesser-known sport with which I have some personal experience: orienteering. For those of you who are totally lost (pun intended), a brief explanation: orienteering is a sport where you use a map and compass to find a number of specified locations in the wilderness as quickly as you possibly can (for more information you can click here). I was actually pretty good at this sport once, so much so that I was on the US Junior National Team for four straight years. Of course, making the US team in orienteering wasn't the achievement it sounds given the US was terrible at orienteering at the youth level. Most years, it was a struggle to find a full complement to bring to the Junior World Championships, and when we did compete we were usually close to the bottom along with Hong Kong, Israel, and Ireland. Who did well? The Scandinavian countries, particularly Norway, Sweden, and Finland, and Switzerland were consistently top.
There's a reason I mentioned the countries above. Both those countries at the bottom and the top of the orienteering ladder have similar population sizes and have highly developed economies. So why did some do so well while the others didn't? Geographic factors played a part certainly (though there's a surprising amount of undeveloped land in which to run in Hong Kong, there's a lot less than in Norway). However, I think the main thing is simply the level of interest in the sport. Sweden is the birthplace of competitive orienteering, has events that regularly get over 10,000 contestants (a US meet that has 200 is big), and even had news coverage of orienteering results. Perhaps most impressively, if you say the word orienteering, you aren't looked at like you are from Mars. Norway and Finland are similarly enamored of the sport, and as a result these countries dominate year after year. Though they do not have the resources of the US, or dramatically more resources than Ireland, Hong Kong, or Israel, the higher interest in the sport means a greater share of their resources are invested in it.
This is something you see around the world: there are always countries that are much better than they "should be" at a particular sport based on their economy and demographics if they have a high level of interest in it. The Netherlands has more Olympic medals in speed skating than any other country. Bulgaria has the 4th most weightlifting medals despite its size and level of economic development. No country is a match for Canada at hockey. Jamaica, the home of my fiancee, dominates in athletics, with Usain Bolt just the latest example of a storied tradition. They literally teach every child there how to pass a baton in a relay: say the word "reach" and Jamaicans instinctively move. The high school championships, known as Champs, are the biggest annual sporting event in the country. That stuff matters. Interest affects institutional resources, the pool of athletes available, coaching: everything really.
This is true of soccer as well, and that's a problem for the US, where interest in the sport, while growing, is still well behind team sports like basketball, football, and baseball. Look at any of the institutional factors blamed for poor youth development in soccer (travel teams/AAU, high costs, youth coaching not integrated with professional teams, etc.) and you'll find them just as much in baseball, basketball, and the like. Yet America continues to produce world-class talent at those sports but not at soccer. Similarly, America still has the most Olympic medals of any country in a wide range of sports, so it's not an institutional problem. The difference is that other countries put way more emphasis on soccer compared to America and that's always going to make a difference in how good the US can be. Soccer is the number one sport in most countries around the world, and even in countries where it is not (like Jamaica), it's still usually above the interest level in the US. Honestly, it's more surprising to me that the US has done as well as it has in my lifetime given soccer is a higher priority literally everywhere else.
The idea that the US should always be making the World Cup smacks of the old, jingoistic doctrine of American exceptionalism. In fact it's America's actual exceptionalism, in this case its focus on team sports other than soccer, that shows why that thinking is seriously flawed here. Can the USMNT get better with improved coaching, better scouting, or a new "golden generation"? Of course, countries like Iceland show it can be done. But as with Iceland, those are all attempts to improve performance relative to an unchanged baseline expectation given population, economic development, and soccer's place in the country. Without a fundamental change to soccer's place in America's sports culture, America's place in soccer culture won't change either.
Junior World Orienteering Championships Opening Ceremony 2007. I am one of these people.
O-Ringen, the world's largest orienteering event. Swedish people really like orienteering.
US sporting interests diverge from the rest of the world.
Friday, August 11, 2017
City, Sanchez, and Next Season
Every year there appears to be one large transfer pursuit that takes up the entire summer and unfortunately this time it's City's pursuit of Alexis Sanchez (to wit, I started writing this a month ago and nothing has been resolved). City's transfer business so far has been pretty reasonable, adding goalkeeper Ederson Moraes, attacking wing/mid Bernardo Silva, and fullbacks Kyle Walker, Benjamin Mendy, and Danilo. That said, a move for Alexis would be the proverbial "statement of intent", as it is rare to poach key players from a rival Premier League club (Walker being an obvious exception). However, I'm concerned the club may not be getting what they think they are. Alexis is a good player, but his fit in the City side is less than ideal, and his value to Arsenal is greater than it is to City. Paying more than he is worth for someone who'll be giving the team less (and potentially a lot less soon) doesn't strike me as good business, particularly when there are more important areas to upgrade.
There's no doubt Alexis is a special player. There are very, very few players who can generate such a large goal threat from a wide position: he ranked 6th in Shots per 90 in 2015-16 and scored 13 goals, a mark bettered only by Mahrez in terms of non-strikers. However, that excellent record is at least slightly tempered by his Usage Rate, which shows Sanchez used more possessions than any other player in the league that year. In terms of generating shots, assists and key passes per possession used, Sanchez ranked just 6th on Arsenal. That high of a Usage Rate makes it clear how much he was the focal point of Arsenal's attack in 15-16, and as a result that puts his efficiency numbers in a somewhat better light (after all, if you're the primary focus of defensive effort you're less likely to perform well). There's little doubt he would provide more goals from out wide than Sterling or Sane could (the latter had fewer shots per 90 than Kolarov last season), if not from simple shot volume then from the fact he can actually hit the ball with his laces. If City were getting that vintage of Sanchez, I could see an argument for it.
The problem is, of course, that I don't think we are. For one thing, Alexis played primarily as a striker last season. While this led to his highest goal total ever, his Usage Rate and % of possessions with a positive outcome were very similar to his 2015-16 figures, and Sanchez greatly outperformed his xG. Also, as I wrote in an earlier piece, playing him as a striker reduced the effectiveness of Ozil and the Arsenal attack as a whole. City have one and possibly two strikers who are better than him, so it would make more sense to put him on the wing, where he excelled previously. That of course would stunt the development of Sterling and Sane and fill the one position where we have starters below peak age with a 29-year-old. There's also the fact that Sanchez played over 3000 Premier League minutes last season and is well known for never wanting to rest. Combine that with a playing style that is highly dependent on his legs, and I think it unlikely he will age gracefully. You can argue that he will have reduced minutes in this City team, and I think he would were he to join, but that's not what his price tag is going to be based on, and his track record of wanting to play literally all the time suggests he will have trouble accepting that.
So why do City want Sanchez so badly? As far as I can tell, the answer seems to be balance. Indulge me a dive into the subjective here, mainly because I don't have good statistics for this, and let's break up City's attacking personnel into three broad categories based on how they primarily generate shots: passers, runners, and dribblers. Right now, I would say City have passers (Silva, KDB, Gundo, Toure) and runners (Kun, Sterling, Sane) only. Sanchez is a dribbler, something City didn't really have until they picked up Bernardo (yes, Sterling and Sane both dribble a lot, Sterling more than Sanchez on a per 90 basis actually; still, their ability to latch on to through-balls from KDB and Silva is a bigger part of their value). I think Pep wants to have more people who can create their own shot off the dribble on both wings as an option, particularly against teams that set up in a deep block that is hard to break down. Sanchez could definitely help accomplish this goal, and Pep may believe getting him is necessary as a result.
The price of that would be a drastic restructuring of the attack, though. City's Usage Rates last season look like this (minimum of 1000 minutes):
Player | Unsuccessful Passes | Key Passes | Assists | Goals | Shots | Unsuccessful Take-Ons | Minutes | Possessions Used | Usage Rate | % Positive |
De Bruyne | 323 | 83 | 18 | 6 | 86 | 25 | 2877 | 535 | 15.61% | 34.95% |
Silva | 264 | 75 | 7 | 4 | 48 | 23 | 2760 | 417 | 12.68% | 31.18% |
Sterling | 203 | 40 | 6 | 7 | 64 | 63 | 2513 | 376 | 12.56% | 29.26% |
Aguero | 136 | 28 | 3 | 20 | 139 | 47 | 2409 | 353 | 12.30% | 48.16% |
Sane | 130 | 32 | 3 | 5 | 33 | 42 | 1788 | 240 | 11.27% | 28.33% |
Fernandinho | 289 | 32 | 1 | 2 | 36 | 9 | 2755 | 367 | 11.18% | 18.80% |
Kolarov | 279 | 14 | 1 | 1 | 32 | 8 | 2535 | 334 | 11.06% | 14.07% |
Toure | 181 | 19 | 0 | 5 | 32 | 4 | 1941 | 236 | 10.21% | 21.61% |
Navas | 97 | 20 | 0 | 0 | 8 | 1 | 1086 | 126 | 9.74% | 22.22% |
Clichy | 218 | 16 | 0 | 1 | 2 | 2 | 2123 | 238 | 9.41% | 7.56% |
Zabaleta | 98 | 8 | 1 | 1 | 7 | 5 | 1083 | 119 | 9.22% | 13.45% |
Otamendi | 235 | 5 | 1 | 1 | 15 | 3 | 2592 | 259 | 8.39% | 8.11% |
Sagna | 99 | 7 | 1 | 0 | 4 | 1 | 1346 | 112 | 6.99% | 10.71% |
Stones | 122 | 4 | 0 | 0 | 13 | 1 | 2025 | 140 | 5.80% | 12.14% |
Bravo | 109 | 0 | 0 | 0 | 0 | 0 | 1968 | 109 | 4.65% | 0.00% |
Caballero | 57 | 0 | 0 | 0 | 0 | 0 | 1452 | 57 | 3.30% | 0.00% |
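For anyone who wants to check the arithmetic, here is a minimal sketch of how the Possessions Used and % Positive columns appear to be built. The formulas are inferred from the numbers in the table rather than quoted from a definition: Possessions Used looks like the sum of every column except Goals (presumably because a goal is already counted as a shot), and % Positive looks like key passes, assists and shots divided by that total. The class and method names are mine, purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlayerLine:
    """One row of the table above (Goals omitted, see below)."""
    name: str
    unsuccessful_passes: int
    key_passes: int
    assists: int
    shots: int
    unsuccessful_take_ons: int

    def possessions_used(self) -> int:
        # Assumption: goals are not added separately, since every goal
        # is already included in the shot count.
        return (self.unsuccessful_passes + self.key_passes + self.assists
                + self.shots + self.unsuccessful_take_ons)

    def pct_positive(self) -> float:
        # Assumption: a "positive" possession ends in a key pass,
        # an assist or a shot.
        positive = self.key_passes + self.assists + self.shots
        return positive / self.possessions_used()


rows = [
    PlayerLine("De Bruyne", 323, 83, 18, 86, 25),
    PlayerLine("Silva", 264, 75, 7, 48, 23),
    PlayerLine("Aguero", 136, 28, 3, 139, 47),
]

for p in rows:
    print(f"{p.name}: {p.possessions_used()} possessions used, "
          f"{p.pct_positive():.2%} positive")
```

Run on those three rows, this matches the Possessions Used and % Positive values in the table. Usage Rate itself also needs the team's possessions while the player was on the pitch, which isn't in the table, so I've left it out.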
Right now, Silva and KDB, as passers in the advanced double-pivot below the striker, take up the largest share of possessions. Surrounding them with runners like Sterling, Kun, and Sane gives them good options to pass the ball to and helpfully allows those players to be involved without having the ball at their feet as much. Sanchez would throw a wrench into that. His Usage Rate is higher than that of anyone on the City team, at north of 18%, and he had a high percentage of Arsenal's unassisted shots. As Ryan O'Hanlon pointed out, he is not one to take shots directly from a teammate's pass; rather, he gets the ball outside the box and then moves in to shoot. That doesn't fit with a team based around incorporating two high-volume passers from just such an area, particularly when he was unable to click with similar players (Ozil in particular) at Arsenal.
There's an argument to be made that it doesn't matter if Sanchez can replicate his previous form, so long as he helps City win the title. I would agree that the marginal value of a point is greater to City than it is to a midtable team like West Ham (another reason their transfer business was so weird). However, I think that argument fails to address the reasons why City didn't meet expectations last season. In terms of xG, City had the best figure in the league. While we didn't overperform by G-xG, we weren't vastly undershooting. You could argue that City should have created more with the possession we had (City were just 8th in SOT/Possession), but still the overall results were quite good. Where we underperformed was in defense: Bravo obviously was a factor (after a decent enough start, in one six-week period he conceded 64% of the shots on target he faced), but there were also some systemic issues. Benjamin Pugsley has shown that City tended to have few defenders in the box when opponents moved the ball there, which is a pretty good indication there was less defensive pressure applied. The additions in goal and at fullback were aimed at addressing these concerns, as the pace supplied by Walker, Danilo, and Mendy ought to allow them to get into position more easily when the ball is lost and still contribute to the attack. However, there has been no indication of City signing a central midfielder to play at the base of the formation, despite Gundogan's injury issues and the age of Fernandinho & Toure, not to mention the defensive limitations of the latter. That to me would be the area City need to address most urgently. The surplus value of Sanchez over the players potentially replaced by him is nowhere near the value a top-class CM could provide, and that's before considering how Sanchez would necessitate a restructuring of the attack.
The old football cliche of "Don't change a winning team" is pretty obvious nonsense as it ignores needed context, but the underlying notion of "If it ain't broke, don't fix it" applies in this case.
With or without Sanchez, City are favorites for the Premier League. With or without Sanchez, they are not going to be among the favorites for the Champions League. This would be a win-now move that wouldn't really move the needle on the winning part. It doesn't fix the weaknesses City have at the base of midfield or the squad age problem, and it's questionable how much it improves the attack. As an analyst, it would be fascinating to watch Sanchez and City adapt to each other; as a fan, I would prefer not to have to do so.
Sunday, June 25, 2017
Thoughts on Stats and How to Press
The Rangers Report ran an excellent article a few weeks back that focused on how simple stats could be used by teams to press more effectively. One of the main ideas was that by looking at how many times a player loses possession divided by how many times he had possession overall (particularly in the defensive third), we can see how likely he is to lose the ball, and therefore whether it makes sense to press him. It's an intuitive concept, one that I especially like since it fits in very neatly with my own work on Usage Rates. However, I think it is an idea that needs to be more thoroughly explored, and I'd like to offer my own slightly differing viewpoint on the subject.
First of all, the article in question focused on a sample of just four games. While I don't have the data split into thirds of the pitch, I do have all of the relevant stats for the full field for all Premier League teams from the 16-17 season (the raw data is available in my Usage Rate info here). I would also point out that Statszone uses the location of the end of the pass to determine which third it is in, not the beginning. Since from a pressing perspective the priority is where the ball is passed from, the classification Statszone provides is not as helpful as it could be for this purpose, particularly for a team that hits as many long balls as West Brom. The full field data for West Brom in 16-17 is below:
Player | Minutes | Possessions p90 | Loss of Possession % | Last 4 Minutes | Last 4 Possessions p90 | Last 4 Loss of Possession % |
Dawson | 3276 | 28.3 | 37% | 360 | 24.0 | 34% |
McAuley | 3140 | 22.2 | 35% | 153 | 12.9 | 32% |
Rondon | 2892 | 26.1 | 32% | 340 | 28.3 | 25% |
Wilson | 255 | 27.2 | 31% | 255 | 27.2 | 31% |
Nyom | 2637 | 25.5 | 30% | 310 | 26.4 | 22% |
McClean | 1480 | 23.3 | 29% | 199 | 27.1 | 35% |
Brunt | 2477 | 33.1 | 25% | 360 | 27.3 | 25% |
Robson-Kanu | 751 | 23.2 | 25% | 31 | 17.4 | 17% |
Fletcher | 3231 | 34.7 | 23% | 360 | 28.5 | 27% |
Livermore | 1301 | 36.9 | 23% | 322 | 32.1 | 24% |
Chadli | 2135 | 30.2 | 20% | 122 | 31.7 | 19% |
Evans | 2637 | 31.2 | 19% | 320 | 26.7 | 15% |
Morrison | 1751 | 40.3 | 19% | 146 | 37.0 | 13% |
Yacob | 2422 | 34.8 | 18% | 227 | 32.5 | 12% |
Field | 291 | 32.2 | 18% | 103 | 27.1 | 32% |
When the sample is broadened, it becomes clear it's a bit unfair to pick on Marc Wilson too much. Those four games were the only PL games he played in all year and he appears to have been played at LB (or left CB in a three) rather than his natural position of CB, which is compounded by the fact he is primarily right-footed (he's listed as both-footed via Transfermarkt, but according to James Yorke he shoots more often with his right so I'm calling him right-footed). While that in itself may have been a good reason to press him in those games, it doesn't help much in finding a long-term strategy. Overall though, we can see the pattern generally holds: the correlation between a player's possessions per 90 in the sample versus the full season has an R-squared of .61, while the correlation of the Loss of Possession % has an R-squared of .43 (this also improves when you add a minutes requirement to the sample, as Leko and Field screw things up).
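If you want to run that sort of check yourself, here is a rough sketch of the R-squared calculation using the rows of the table above. Two caveats: the table doesn't include every player in my sample (Leko isn't listed), so this need not reproduce the .61 and .43 figures exactly, and the minutes cutoff used at the end is an arbitrary value picked for illustration, not the one I used. The function and variable names are just placeholders.

```python
# Each row: (player, full-season possessions p90, full-season loss %,
#            last-4 minutes, last-4 possessions p90, last-4 loss %),
# copied from the table above.
ROWS = [
    ("Dawson", 28.3, 37, 360, 24.0, 34),
    ("McAuley", 22.2, 35, 153, 12.9, 32),
    ("Rondon", 26.1, 32, 340, 28.3, 25),
    ("Wilson", 27.2, 31, 255, 27.2, 31),
    ("Nyom", 25.5, 30, 310, 26.4, 22),
    ("McClean", 23.3, 29, 199, 27.1, 35),
    ("Brunt", 33.1, 25, 360, 27.3, 25),
    ("Robson-Kanu", 23.2, 25, 31, 17.4, 17),
    ("Fletcher", 34.7, 23, 360, 28.5, 27),
    ("Livermore", 36.9, 23, 322, 32.1, 24),
    ("Chadli", 30.2, 20, 122, 31.7, 19),
    ("Evans", 31.2, 19, 320, 26.7, 15),
    ("Morrison", 40.3, 19, 146, 37.0, 13),
    ("Yacob", 34.8, 18, 227, 32.5, 12),
    ("Field", 32.2, 18, 103, 27.1, 32),
]


def r_squared(pairs):
    """Squared Pearson correlation for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov ** 2 / (var_x * var_y)


def sample_vs_season(rows, min_sample_minutes=0):
    """R-squared of sample vs full season for possessions p90 and loss %."""
    kept = [r for r in rows if r[3] >= min_sample_minutes]
    possessions = r_squared([(r[1], r[4]) for r in kept])
    loss_pct = r_squared([(r[2], r[5]) for r in kept])
    return possessions, loss_pct


print(sample_vs_season(ROWS))
# Hypothetical cutoff: drop anyone with under 120 minutes in the sample.
print(sample_vs_season(ROWS, min_sample_minutes=120))
```

The second call shows how a minutes requirement changes things: with a 120-minute cutoff, Robson-Kanu and Field fall out of the comparison.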
More fundamentally though, there is an important question here: what are the statistical attributes of players an opposing coach should attempt to press directly? Part of this answer depends on the aim of pressing obviously: if you are attempting to intercept the ball in your attacking third your answer may differ from someone who is simply aiming to coax the other team into giving back possession. Still, I think this passage from Marti Perarnau's piece on pressing demonstrates the general things we should be looking for:
The ball is much easier to take from an opponent who controls it poorly. A team can collectively press the ball at the moment it’s miscontrolled because it would take time to re-establish control of the ball. This plays a part in the “opponent’s ability” as well. If the player is very poor at making decisions and controlling the ball it would be logical to put that player under immense pressure as soon as he’s about to receive it. Most players are taught to press the opponent “as the ball is traveling” because the scene cannot change dramatically within the time the presser leaves his position as the ball is moving between players.
The ball cannot dynamically change directions in the middle of its route between players (unless there is some crazy spin on the ball, which would be visible and anticipated by the players) so it is an optimal time to press the destination point of the ball. If the presser decided to leave his position while it is under the control of the opponent player (without the following layers of the press to protect the vacated space and cover him) then the ball could change direction quite easily as the opponent can simply dribble and exploit the movement of the presser.
The Rangers Report article makes the case that it is the rate at which a player loses possession (particularly in their own defensive third) that should indicate who to press. But the Possessions Lost in the article's calculation are primarily misplaced passes, by an order of magnitude over everything else. In other words, the Loss of Possession % is not necessarily showing a lack of ability on the ball (which Perarnau would suggest is the key to a good target of pressure), but rather is showing a lack of passing ability. To me, that means the Loss of Possession % as used there does not indicate who to press, it indicates when to press: i.e. the passes from a player with a high Loss of Possession % should serve as a trigger to initiate the press since they are more likely to be lost. As for who to press, we would need to find players who are "poor at making decisions and controlling the ball", ideally those who are targets of poor passers. This would suggest looking at players with more times dispossessed, more unsuccessful touches (both of which I should point out are unavailable at either Statszone where the author pulled his data or Squawka where I got mine, but are shown on Whoscored), and poor dribble/take-on success.
So what does looking at those stats tell us? I added Unsuccessful Touches and Dispossessed to the Possessions calculation, and looked at what percentage of Possessions result in an on-ball turnover (TO) for all defenders and central midfielders. The results are below:
Player | Minutes | Unsuccessful Touches p90 | Dispossessed p90 | Unsuccessful Take-Ons p90 | TOs p90 | Possessions p90 | TO% |
Galloway | 248 | 2.2 | 0.7 | 0.0 | 2.9 | 9.1 | 32% |
Morrison | 1740 | 1.8 | 1.8 | 0.4 | 4.0 | 13.8 | 29% |
Fletcher | 3235 | 1.3 | 1.2 | 0.2 | 2.7 | 11.8 | 23% |
Nyom | 2639 | 1.0 | 0.6 | 0.4 | 2.0 | 9.8 | 21% |
Gardner | 213 | 1.3 | 1.3 | 0.8 | 3.4 | 18.7 | 18% |
Livermore | 1303 | 1.2 | 0.6 | 0.3 | 2.1 | 12.0 | 17% |
Yacob | 2418 | 0.7 | 0.5 | 0.1 | 1.3 | 8.2 | 16% |
Brunt | 2479 | 0.8 | 0.6 | 0.1 | 1.5 | 12.7 | 12% |
Dawson | 3278 | 0.8 | 0.4 | 0.2 | 1.4 | 12.9 | 11% |
Evans | 2638 | 0.4 | 0.3 | 0.1 | 0.8 | 7.3 | 11% |
Wilson | 256 | 0.4 | 0.4 | 0.0 | 0.8 | 10.3 | 8% |
McAuley | 3143 | 0.2 | 0.1 | 0.1 | 0.4 | 8.6 | 5% |
Olsson | 591 | 0.2 | 0.0 | 0.0 | 0.2 | 9.6 | 2% |
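As a quick illustration of how the TO% column is put together, here is a small sketch using a few rows from the table: on-ball turnovers per 90 are the sum of unsuccessful touches, times dispossessed and unsuccessful take-ons, and TO% is that sum divided by possessions per 90. The function name and the choice of rows are just for the example.

```python
# (unsuccessful touches p90, dispossessed p90,
#  unsuccessful take-ons p90, possessions p90), copied from the table above
west_brom = {
    "Galloway": (2.2, 0.7, 0.0, 9.1),
    "Morrison": (1.8, 1.8, 0.4, 13.8),
    "Evans": (0.4, 0.3, 0.1, 7.3),
    "McAuley": (0.2, 0.1, 0.1, 8.6),
}


def turnover_pct(unsuccessful_touches, dispossessed, failed_take_ons, possessions):
    """Share of a player's possessions that end in an on-ball turnover."""
    turnovers = unsuccessful_touches + dispossessed + failed_take_ons
    return turnovers / possessions


# Rank players from most to least turnover-prone on the ball.
ranked = sorted(west_brom.items(),
                key=lambda item: turnover_pct(*item[1]),
                reverse=True)

for name, stats in ranked:
    print(f"{name}: {turnover_pct(*stats):.0%}")
```

Because the table's per 90 figures are already rounded, recomputing from them can land a point away from the listed TO% (Nyom's rounded inputs give roughly 20% rather than 21%).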
As you can see, it seems to be ruled primarily by position. The central midfielders and fullbacks are higher, then the centerbacks, which is basically what you'd expect. As such, I would probably give the CBs space, make sure my defense is in good position, and start my press upon passes into midfield or to the fullback. It should be noted that a lot of pressing systems attempt to force passes to the sideline, as the sideline reduces options for the man in possession (sidenote: I had a FIFA flashback typing that sentence), so this would seem to confirm that's an effective strategy against West Brom. Passes from Wilson or McAuley (who have a lot of passing errors) to Nyom or Galloway (who struggle on the ball) seem like an ideal time to press to me.
It's important to note that I have not done any league-wide comparisons on this, so I don't know if West Brom's statistics are notably different from the rest of the league. It may be that their CBs actually are more prone to on-ball turnovers than those of other teams, and that it would make sense to press them as a result. We also don't have great metrics to judge the effectiveness of pressing, so it would be difficult to judge whether following these recommendations would help, and how that would affect a team's overall performance. Still, I think this is definitely the right area to be looking, as finding the players likely to lose the ball can only help you win it back.
(Shoutout to Nico Morales for his feedback on the tactics side of things.)