I poked my nose into this thread on Twitter, about finding the next Virgil van Dijk. While my measure of similarity in passing style doesn’t capture much from a defensive point of view, it should capture passing style with more nuance than the measure the author used in his post (% of passes forward / % of passes to final third). The author also included other factors in his analysis, which this post does nothing to address.
I got in touch with him and, with his blessing, here is my partial take on finding the next van Dijk.
I’ve linked the logic that I employ earlier.
The focus was on identifying players from lower leagues, so I tried my method on the second tiers of German and English football. The charts below compare the passing styles of various players from these leagues with Virgil van Dijk’s passing style. I’ve included Virgil van Dijk’s own statistics in both plots as a reference. For van Dijk, I’ve excluded comparisons of a match with itself, since those distances would all be 0 instead of telling us about the match-to-match variation in his game.
One thing to note is that this isn’t an overall similarity between two players but a similarity between the way the two players played in two particular matches. Depending on how we choose to go from a many-matches-to-many-matches comparison to a one-player-to-one-player comparison, we could end up with different conclusions. For the many-to-many comparison, I take the closest match that each player had to each of van Dijk’s performances and look at the distribution of those distances. Players who have played in the various roles that van Dijk himself has played in should have at least one match very similar to each of van Dijk’s performances. Players who have played in only a subset of those roles will have some very similar points, from the matches where the roles line up, but also some points farther away, from the van Dijk matches where they never played a similar role. Getting from this many-to-many comparison to the one-to-one comparison is something I’ll talk about later, after we’ve seen the numbers.
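To make the closest-match idea concrete, here is a minimal sketch. It assumes some match-to-match distance function already exists; `toy_distance`, the one-number "performances" and the two made-up players are purely illustrative and are not the data or the distance measure behind the plots.

```python
def closest_match_distances(vvd_matches, candidate_matches, distance):
    """For each van Dijk performance, distance to the candidate's closest match."""
    return [
        min(distance(v, c) for c in candidate_matches)
        for v in vvd_matches
    ]


def toy_distance(a, b):
    # Hypothetical stand-in for the real match-to-match distance.
    return abs(a - b)


# Made-up one-number "performances", one per match.
vvd = [10.0, 20.0, 30.0]          # three different roles
versatile = [11.0, 19.0, 31.0]    # has played all three roles
specialist = [10.5, 11.0, 12.0]   # has only played the first role

versatile_d = closest_match_distances(vvd, versatile, toy_distance)
specialist_d = closest_match_distances(vvd, specialist, toy_distance)

print(versatile_d)   # every van Dijk match has a close counterpart
print(specialist_d)  # one close point, then points farther away
```

The versatile player ends up with uniformly small distances, while the specialist gets one very close point and a tail of distant ones, which is the pattern described above.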
The distance measure uses four variables to calculate distance, which are all combined for each match and captured in the red elements of the plot. Since the author of the aforementioned article was only interested in the passing style, I calculated another distance measure based on just the coordinates of where the respective player passed from and passed to, which is captured in the blue elements.
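As a rough illustration of what a distance built only on pass coordinates could look like, here is a sketch. It is an assumption on my part and not necessarily the measure used for the blue elements: each pass is a hypothetical `(start_x, start_y, end_x, end_y)` tuple, and two matches are compared by averaging how far each pass in one match is from its nearest pass in the other.

```python
from math import dist


def mean_nearest_pass_distance(passes_a, passes_b):
    """Average, over passes in A, of the distance to the closest pass in B."""
    nearest = [min(dist(p, q) for q in passes_b) for p in passes_a]
    return sum(nearest) / len(nearest)


def pass_set_distance(passes_a, passes_b):
    """Symmetric version, so the order of the two matches doesn't matter."""
    return 0.5 * (
        mean_nearest_pass_distance(passes_a, passes_b)
        + mean_nearest_pass_distance(passes_b, passes_a)
    )


# Made-up passes as (start_x, start_y, end_x, end_y) on a 100 x 100 pitch.
match_a = [(20, 30, 40, 30), (25, 60, 50, 70)]
match_b = [(20, 30, 40, 30), (25, 60, 50, 74)]

d_same = pass_set_distance(match_a, match_a)  # identical matches
d_ab = pass_set_distance(match_a, match_b)    # one pass ends 4 units away

print(d_same, d_ab)
```

Comparing a match with itself gives 0, which is also why those self-comparisons are worth excluding for van Dijk.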
Bundesliga2 has maybe 6 to 10 players of interest, namely Tim Hoogland, Hauke Wahl, Rafael Czichos, Florian Hübner, Christian Strohdiek, Mergim Mavraj, Benedikt Gimber, Christopher Avevor, Timo Beermann, and Uwe Hünemeier.
The Championship has maybe 5 to 12 players of interest, namely Fikayo Tomori, Ashley Williams, Tyrone Mings, Ben Davies, Cameron Carter-Vickers, Jordy de Wijs, Ahmed Hegazi, Liam Cooper, James Chester, Jordan Thorniley, Liam Ridgewell, and Joe Rodon.
After this list, there seems to be a slight jump up in distances so I’m going to conveniently assume that point as a cut-off and ignore the names from farther out.
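I picked that cut-off by eye, but if you wanted to do it mechanically, one option is to rank players by their typical distance and cut at the largest gap between neighbours. A minimal sketch, with hypothetical player labels and made-up medians:

```python
def cutoff_at_largest_gap(median_by_player):
    """Keep the players ranked before the biggest jump in median distance.

    Expects at least two players.
    """
    ranked = sorted(median_by_player.items(), key=lambda kv: kv[1])
    gaps = [ranked[i + 1][1] - ranked[i][1] for i in range(len(ranked) - 1)]
    split = gaps.index(max(gaps)) + 1
    return [name for name, _ in ranked[:split]]


# Made-up median distances, not taken from the plots.
toy_medians = {
    "Player A": 16.0,
    "Player B": 16.4,
    "Player C": 16.9,
    "Player D": 19.5,
    "Player E": 19.8,
}

shortlist = cutoff_at_largest_gap(toy_medians)
print(shortlist)
```

This is just as arbitrary as eyeballing it, since the largest gap depends on which players happen to be in the sample, but it at least makes the choice repeatable.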
Cameron Carter-Vickers’s plot suggests a player who makes very similar passes to van Dijk, which is why his distance in passes made is much lower than that of other players. His overall distance isn’t correspondingly low, though, which implies that he doesn’t receive passes from and / or in areas as similar to van Dijk’s. Liam Ridgewell is another player whose statistics indicate the same. If we define passing similarity as only similarity in the passes made, we might pick these players ahead of some of the others on the list. If we instead define it as similarity in passes made and received, then the order of names in this plot would be a better ranking of similarity.
Side note - extreme outliers to the trend of overall distance being proportional to distance of passes made, like Jake Cooper, Yoann Barbet, and Stefan Thesker may be interesting cases to study more.
You could slice this in other ways too. For example, instead of taking the typical similarity, you could check whether enough comparisons have a low enough distance, the idea being that the player has played very similarly to van Dijk in at least some matches, and maybe his role in other matches didn’t allow for close similarities. Liam Ridgewell, for instance, has a bunch of points at around 15 overall distance but also has points farther away, which pull his median distance higher. Tyrone Mings seems to hover around the 17 overall distance mark consistently, which keeps his median lower, but he may not have as many instances of a distance of around 15 as Ridgewell. This might imply that, compared to Mings, Ridgewell is more capable of passing like van Dijk for some of the games that van Dijk played, but that in some other matches van Dijk’s passing was very different from any of Ridgewell’s games. Mings had some similarity to both those sets of van Dijk’s performances but didn’t have as much similarity to either set as Ridgewell managed with the first. If it is the first set of van Dijk’s performances that we’re trying to find similarities with, then Ridgewell might be a better candidate than Mings.
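Here is a small sketch of how the two ways of summarising can disagree. The distances and the 15.5 threshold are made up to mimic the two shapes described above; they are not the actual values for any player.

```python
from statistics import median


def summarise(distances, threshold):
    """Median distance and number of comparisons at or below the threshold."""
    return {
        "median": median(distances),
        "n_close": sum(d <= threshold for d in distances),
    }


# Made-up closest-match distances for two hypothetical players.
streaky = [15.0, 15.2, 14.8, 21.0, 22.0, 23.0, 24.0]  # some very close, rest far
steady = [17.0, 17.2, 16.8, 17.1, 16.9, 17.3, 17.0]   # consistently middling

streaky_summary = summarise(streaky, threshold=15.5)
steady_summary = summarise(steady, threshold=15.5)

print(streaky_summary)
print(steady_summary)
```

Ranked by median, the steady player comes out ahead. Ranked by the number of very close matches, the streaky player does. Which one you prefer depends on which of van Dijk’s performances you are trying to replicate.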