Finding Similarities

Last night the New Orleans Hornets moved up in the NBA Draft Lottery, snatching the top pick from the Charlotte Bobcats. Today draft season begins in earnest; the draft order is set and so are Hickory-High’s 2012 Draft Similarity Scores. These similarity scores are a system I developed to draw comparisons between each season’s draft prospects and players from previous seasons. Statistically speaking, is Thomas Robinson more like Kevin Love or Kenneth Faried? The prospects are compared across 21 different statistical categories, creating a similarity score between 1 and 1,000, with 1,000 indicating identical statistical production. The link above contains a more complete description of the rationale and method, as well as the results for 56 of this year’s draft prospects, but in this post I want to talk about some of the benefits and limitations of these numbers.
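For readers who like to see the idea in code, here is a minimal sketch of how a score like this could be built. It is not the actual formula behind these similarity scores; the full method is in the linked description. The sketch assumes each statistic is converted to standard deviations from the sample average, that two players are compared by their average gap across categories, and that the gap is mapped onto the 1 to 1,000 scale with a hypothetical cap (`max_gap`). The function names and the three-stat toy sample are invented for illustration.

```python
from statistics import mean, pstdev

# Toy numbers invented for illustration: [points, assists, steals] per 40 minutes.
sample = {
    "Prospect A": [20.0, 2.4, 2.9],
    "Veteran B": [19.2, 2.3, 2.5],
    "Veteran C": [9.0, 8.0, 1.0],
}

def standardize(players):
    """Convert every stat to standard deviations from the sample average."""
    names = list(players)
    n_stats = len(players[names[0]])
    columns = [[players[name][i] for name in names] for i in range(n_stats)]
    averages = [mean(col) for col in columns]
    # Fall back to 1.0 when a column has no spread, to avoid dividing by zero.
    spreads = [pstdev(col) or 1.0 for col in columns]
    return {
        name: [(players[name][i] - averages[i]) / spreads[i] for i in range(n_stats)]
        for name in names
    }

def similarity(z_a, z_b, max_gap=4.0):
    """Map the average gap (in standard deviations) onto a 1-1000 scale.

    Assumption: a gap of max_gap or more counts as completely dissimilar.
    """
    gap = mean(abs(a - b) for a, b in zip(z_a, z_b))
    score = 1000 * (1 - min(gap, max_gap) / max_gap)
    return max(1, round(score))

z = standardize(sample)
for name in ("Veteran B", "Veteran C"):
    print(name, similarity(z["Prospect A"], z[name]))
```

In this toy sample, Prospect A scores much closer to Veteran B than to Veteran C, and a player compared with himself gets the maximum of 1,000.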

Player comparisons are the shorthand of scouting, a way of putting a large chunk of information into an easily-digestible format. Every comparison made around draft time is a move towards the same purpose – identifying what impact a player will have in the NBA. But often these comparisons are made on factors largely unrelated to a player’s future, like skin color, the university he attended, physical similarities, or a single defining skill. My similarity scores were designed to be the cold, unfeeling, scientific answer to those comparisons.

I’ve been working with this system for three years now, and each year I hear many of the same questions and criticisms. Some are related to the nature of my system and may not be solvable without sacrificing the ultimate intent. Some are my fault for putting these numbers out each season with little or no explanation of how I envision them being used. The criticisms and questions I’ve received largely fall into four categories:

  1. The similarities must be flawed because they ignore obvious physical differences or similarities between players.
  2. The similarity scores for a certain player are very low so there must be something wrong with the system.
  3. A prospect has one large and obvious difference from his closest comparable so something must be wrong with the system.
  4. A prospect has the potential to be much more than their comparables, so something must be wrong with the system.

I’d like to address some of those concerns by doing something I haven’t done in the past – examining in depth the comparisons of a few players, looking at what the numbers say and what they don’t say.

In doing that, there are two principles that I think are important to keep in mind – the results of these similarity scores are comparisons, not projections; and they are comparisons of production, not talent. I received more than a few outraged comments last summer when Derrick Williams’ closest similarity came out to be Ike Diogu. The refrain was that Williams was way better than Diogu, so the system must have a flaw. But remember, the system wasn’t making observations on talent or future production. It was simply pointing out that Williams’ college production was very similar to that of Diogu.

Below are a few examples I’ve selected, which I think will clarify some of the confusion around the similarity scores. The players I’ve picked specifically address the concerns I laid out above. I’ve pulled out an example where physical similarities do not equate to statistical similarities. I’ve also pulled out an inverse example, where statistical similarity comes from very different physical packages. I’ve also highlighted some players on both ends of the spectrum, from very unique to very similar. Finally, there are examples of how these comparisons might mislead us in looking at a player as they are now, and as they could be in the future.


Perry Jones and Paul George

Perry Jones Similarity Scores

One of the comparisons that has started popping up frequently the past few days is Baylor’s Perry Jones and Paul George of the Indiana Pacers. Every time I’ve heard this pair share the same sentence it has been accompanied by mention of their similar builds, and a certain degree of “smoothness” in their game. I understand there is an aesthetic component to basketball, but aesthetics and production can exist in entirely separate realms. George and Jones certainly do have a similar look, and may wind up having similar careers, but their college production was vastly different.

Jones’ closest comparable was Luol Deng, with a similarity score of 912. To find George you’d have to scroll all the way down to the 158th player in my sample, where he came in with a similarity score of 787. Jones’ college production was closer to Derrick Rose, Roy Hibbert, and Gordon Hayward than it was to Paul George. The differences between George and Jones came in several areas, but the major chasms were at the free throw line, ball handling and usage. George shot 90.9% from the free throw line, Jones knocked down just 69.6%. George turned the ball over on 21% of his possessions, Jones on 14%. George used 23.1% of his team’s possessions, Jones used just 18.1%.

While both players possess powerful athletic tools, George had at least one full college season of experience utilizing them as an offensive focal point. In doing so he scored more efficiently and did a terrific job protecting the ball. Watching Paul George play in the NBA you’d have to agree that he lacks a certain amount of offensive polish. Despite their similar looks, the statistics suggest that Perry Jones may enter the league even a step or two behind him.


Jae Crowder and Mike Dunleavy

Jae Crowder Similarity Scores

In the comments I’ve received so far on this year’s scores, the Crowder/Dunleavy comparison has drawn the most criticism. I’m not surprised as this comparison defies all the criteria that draft prospect comparisons are usually made on – physical similarities, style of play, look, feel.

Although Dunleavy is three inches taller, their listed weights were almost identical. They were born less than two months apart. They also played an almost identical number of minutes. When we delve into their actual production, the similarities continue: 2.9 steals per 40 for Crowder, 2.5 for Dunleavy; 20.0 points per 40 for Crowder, 19.2 for Dunleavy; 2.4 assists per 40 for Crowder, 2.3 for Dunleavy. Crowder shot 60.2% on two-pointers, Dunleavy 59.6%.

Of the 21 statistical categories these comparisons are based on, Crowder and Dunleavy were less than 0.5 standard deviations apart in 13 of them. They were less than 1.0 standard deviation apart in every area except TO%, where Crowder turned the ball over on an absurdly low 9% of his possessions. Certainly Crowder does not appear to have Dunleavy’s natural shooting touch. What Crowder accomplished with bruising freneticism, Dunleavy did with calculating finesse. Their paths might be different, but during their last college season each was led to a very similar place.
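The "how many categories are within X standard deviations" count is easy to reproduce once each statistic has been expressed in standard deviations from the sample average. The sketch below is only an illustration: the helper names are hypothetical, and the two four-category profiles are invented numbers, not Crowder's or Dunleavy's actual figures.

```python
def count_within(z_a, z_b, threshold):
    """Count categories where two players differ by less than `threshold`
    standard deviations."""
    return sum(1 for a, b in zip(z_a, z_b) if abs(a - b) < threshold)

def outlier_categories(z_a, z_b, labels, threshold=1.0):
    """Name the categories where the gap is at least `threshold`."""
    return [
        label
        for a, b, label in zip(z_a, z_b, labels)
        if abs(a - b) >= threshold
    ]

# Invented standardized profiles for two hypothetical players.
labels = ["PTS/40", "AST/40", "2PT%", "TO%"]
player_a = [0.0, 1.0, -0.5, 2.0]
player_b = [0.3, 0.2, -0.6, -0.5]

print(count_within(player_a, player_b, 0.5))
print(count_within(player_a, player_b, 1.0))
print(outlier_categories(player_a, player_b, labels))
```

A pair can be a strong overall match while still having one category, like TO% here, that sits far apart.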


Most Unique

Kendall Marshall Similarity Scores

The most unique prospect this season is undoubtedly Kendall Marshall. His closest comparables came out to be Marcus Williams (827) and Deron Williams (823). No one else had a similarity score above 800. Marshall is one of just five players in my data set to have averaged at least 8.0 assists per 40 minutes. However, he’s not just unique because of his ability to distribute the basketball. Marshall also had the highest TO%, at 32%, and the lowest scoring average at just 9.0 points per 40 minutes. These similarity scores tell us not just how players are similar, but also how they are different. For a player at the extreme polar ends of so many statistical categories it’s difficult to find a match. What Marshall does on the basketball court, both good and bad, hasn’t really been wrapped up in quite the same package before.


A Small Separation

Moe Harkless Similarity Scores

In looking at the similarity scores you’ll find many players with no close similarities and a handful with very strong matches. Among the strongest comparisons this year were Darius Johnson-Odom to Maurice Ager (938), J’Covan Brown to Ben Gordon (924), Fab Melo to Robin Lopez (921) and Moe Harkless to Luol Deng (918). The last comparison, Harkless to Deng, is an example of one that seems questionable because of an extreme difference in one category. Both players share a tremendously versatile skill set, but Deng left college as a reliable outside shooter, making 36.5% of his three-pointers, while Harkless shot just 21.5% on three-pointers last season.

Three-point shooting is such a defining skill in the identity of many players that it’s hard to believe that Harkless is most similar to Deng, especially when other similarly versatile, but shooting-challenged, wings like Kawhi Leonard, Al-Farouq Aminu and Gordon Hayward were in the sample as well. As we saw with Jae Crowder and Mike Dunleavy, the match comes from the broad view of a player’s production. While Harkless and Deng were far apart in their long range shooting proficiency coming out of college, their production was less than 1.0 standard deviation away in 14 of the 21 statistical categories.


Luol Deng

If you spend any time looking through these similarity scores you’ll notice that Deng pops up over and over again. I found him in the top three comparables for ten different prospects before I stopped counting.

The reason he makes so many appearances is that his statistics were very average across the board. I don’t mean average compared to the entirety of college basketball, but average compared to other draft prospects. In fact, Deng was less than 1.0 standard deviation away from the data sample average in every statistical category except age. For players who don’t find themselves in the extreme in any statistical category, Deng will always pop up as a similarity.
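A small sketch shows why a profile near the sample average turns up so often. It assumes every statistic is measured in standard deviations from the sample average, so a perfectly average player sits at zero in each category. The four-player, two-category sample and the function name are invented for illustration.

```python
from statistics import mean

def average_gap_to_sample(profile, sample):
    """Average per-category gap between one profile and every player in the sample."""
    return mean(
        mean(abs(p - s) for p, s in zip(profile, player))
        for player in sample
    )

# Invented standardized sample: each inner list is one player's two categories.
toy_sample = [[1, -1], [-1, 1], [1, 1], [-1, -1]]

average_profile = [0, 0]  # right at the sample average in both categories
extreme_profile = [2, 2]  # two standard deviations above average in both

print(average_gap_to_sample(average_profile, toy_sample))
print(average_gap_to_sample(extreme_profile, toy_sample))
```

The average profile is never far from anyone, while the extreme profile is close to only one corner of the sample, which is the Kendall Marshall situation in reverse.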

For curiosity’s sake I ran Deng through the system. His closest comparable was Rudy Gay with a similarity score of 944.


Harrison Barnes’ Development

Harrison Barnes Similarity Scores

Barnes opted to return to school rather than enter last year’s draft, which means I didn’t include him in the 2011 Draft Similarity Scores. I did however run his numbers for a post at HoopSpeakU at the beginning of this college basketball season. At that point his closest comparable was Terrico White, with a similarity score of 928.

After this season Barnes’ closest comparable was Tobias Harris, with a similarity score of 919. Barnes separated himself from White by upping his per 40 minute scoring average from 19.5 to 21.2, getting to the line an extra 2.2 times per 40 minutes, increasing his 3PT% from 34.4% to 35.8% and being more selective about when he used the three-point shot, going from a 3PTA/FGA ratio of 0.39 to 0.26.

None of those changes is earth-shattering, but together they indicate a player who thoughtfully made himself more efficient offensively. The point is that these similarity scores are static representations of a fluid concept – player production. They are a snapshot of a moment in time. They give us some information about where a player might be headed, but everything is subject to change.


Even after all this explanation and analysis, I’m sure there are still plenty of questions and critiques out there. Feedback is always welcome, positive or negative, constructive or destructive.

You can find all the similarity scores for 56 different draft prospects here. If there’s anyone I’ve overlooked that you’d like me to include, leave a comment or let me know on Twitter, @HickoryHigh.

  • Eric

    Excellent clarifications and examples. Thank you! One question I have that you’ve probably answered somewhere but I must have missed it: what data from the NBA players are you using to compare? Are you using their production from this past season in the NBA or their production when they were in college however many years back? If it’s the latter, which years from their college careers are you using? Thanks.

    • Ian Levy

      Thanks Eric. All the data for both the draft prospects and the comparison prospects is from their last season in college. So the Moe Harkless/Luol Deng comparison is based on Harkless’ stats from this past season and Deng’s stats from his last season at Duke before being drafted. NBA stats don’t enter into the equation.
