2011 Draft Similarity Scores

The NCAA’s deadline to withdraw from the NBA draft has passed and the field is now set. Frustrated with the amount of subjective analysis surrounding the draft, especially comparisons based mostly on physical appearance, I attempted last spring to create a statistical means of comparing draft prospects to prospects from previous seasons. Here is a little explanation of the impetus for this project:

For years, every guard with exceptional leaping ability has been potentially the next Michael Jordan. Every long white player who can shoot is the next Larry Bird, Keith Van Horn or Adam Morrison, depending on the era. In some parts of Rhode Island, though, they’re referred to as the second coming of Austin Croshere. Every point guard from Gonzaga is the next John Stockton, every huge, awkward center is the next Greg Ostertag and every shot-blocking center with African roots is the next Dikembe Mutombo. These comparisons, based on skin color, position, the college they attended or one singular attribute, do a disservice to players and fans alike.

The idea was to create an objective method for comparing players, instead of having to rely so heavily on subjective observation. I recognize that this is largely an over-reaction which creates its own set of problems. This system is based on college statistics, and only those from the season before the player was drafted. Therefore, it doesn’t capture potential, patterns of development, personality, or athleticism, except insofar as athleticism is tangentially reflected in a player’s production. What these similarity scores are meant to identify are the players who produced at the most comparable level to each draft prospect. John Hollinger, of ESPN, and Kevin Pelton, of Basketball Prospectus, both have similar systems, which haven’t been released yet for this year’s draft. However, theirs compare draft prospects to NBA players and are much more valuable in predicting potential career arcs for each player. Mine is merely a snapshot: at this moment in time, Player A’s college production is most similar to Player B’s.

I’ve changed my technique a little since last year. I’ve refined the statistical categories and slightly changed the output. The categories I settled on this season are:

  1. Height
  2. Weight
  3. Minutes per Game
  4. Points per 40 minutes
  5. Offensive Rebounds per 40 minutes
  6. Defensive Rebounds per 40 minutes
  7. Assists per 40 minutes
  8. Steals per 40 minutes
  9. Blocks per 40 minutes
  10. Personal Fouls per 40 minutes
  11. 2PT Field Goal Percentage
  12. 3PT Field Goal Percentage
  13. Free Throw Percentage
  14. Free Throw Attempts per 40 minutes
  15. 3PT Attempts per Field Goal Attempt
  16. Assists per Field Goal Attempt
  17. Pure Point Rating
  18. Percentage of Team’s Possessions Used
  19. Points per Possession
  20. Field Goal Attempts per Possession
  21. Turnovers per Possession
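
Several of the categories above are rates rather than raw totals. As a rough illustration of what “per 40 minutes” and “per Field Goal Attempt” mean, here is a minimal Python sketch. The function names and the season line are made up for the example; the numbers I actually used come straight from Draftexpress, so this is only meant to show the arithmetic, not how their figures are produced.

```python
def per_40(total, minutes):
    """Scale a season total (points, rebounds, ...) to a per-40-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total * 40.0 / minutes


def per_attempt(count, field_goal_attempts):
    """Ratio of a counting stat to field goal attempts (e.g. 3PT attempts per FGA)."""
    if field_goal_attempts <= 0:
        raise ValueError("field_goal_attempts must be positive")
    return count / field_goal_attempts


# Made-up season line, for illustration only (not a real player).
season = {"minutes": 1000, "points": 600, "fga": 400, "three_pa": 100}

points_per_40 = per_40(season["points"], season["minutes"])
three_rate = per_attempt(season["three_pa"], season["fga"])
```

Putting everything on a per-40 or per-attempt basis keeps a player who logged 25 minutes a night comparable to one who logged 35, which is also why Minutes per Game is kept as its own category.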

My database for comparison uses every collegiate player selected in the 1st Round of the draft going back to 2001. A few 2nd Round picks from last year’s draft are included as well. Each prospect for this year’s draft is compared to every player in that set in each of the 21 categories. For each category, I took the absolute value of the standardized difference between the two players’ performance and multiplied it by 10. The total across the categories is then subtracted from 1000. Each similarity score is then on a scale from 0-1000, with 0 representing complete opposites and 1000 representing a perfect match. The technique was borrowed from this Basketball-Reference article. My system is not perfect: many of the comparisons are not as statistically close as I would have hoped, and others are certainly head-scratching. As usual, I’ll take what I’ve learned and try to improve on it in the future.
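
For anyone who wants to see the arithmetic, the calculation described above can be sketched in a few lines of Python. This is an approximation under stated assumptions, not a copy of my spreadsheet: it assumes “standardized difference” means the difference between two players’ z-scores, with the mean and (population) standard deviation taken over the comparison pool, and it assumes the category totals are summed with equal weight. The function names and the dictionary layout are hypothetical.

```python
from statistics import mean, pstdev


def category_stats(pool, categories):
    """Mean and population standard deviation of each category over the pool."""
    stats = {}
    for cat in categories:
        values = [player[cat] for player in pool]
        stats[cat] = (mean(values), pstdev(values))
    return stats


def similarity_score(prospect, comp, stats):
    """1000 minus 10 times the summed absolute z-score differences."""
    total = 0.0
    for cat, (mu, sigma) in stats.items():
        if sigma == 0:
            continue  # assumption: a category with no spread adds nothing
        z_prospect = (prospect[cat] - mu) / sigma
        z_comp = (comp[cat] - mu) / sigma
        total += abs(z_prospect - z_comp) * 10
    return 1000 - total


def most_similar(prospect, pool, stats, n=10):
    """The n highest-scoring comparisons as (score, name) pairs, best first."""
    scored = [(similarity_score(prospect, comp, stats), comp["name"])
              for comp in pool]
    return sorted(scored, reverse=True)[:n]
```

Each player here is a dictionary with a "name" key plus one numeric entry per category, so calling `most_similar(prospect, pool, category_stats(pool, categories))` returns a top-10 list like the ones on the player pages. Two identical stat lines score exactly 1000, and every standard deviation of separation in a single category costs 10 points.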

For this year’s list of prospects I took the first 40 collegiate players listed in Draftexpress.com’s 2011 Mock Draft. The list is below. My plan is to create a separate page for each prospect showing the results. On each page you will find links to draft profiles, and a comically over-sized table showing the stats for the 10 most similar players. I will be populating the list below with links as the pages are finished. However, all the calculations are done and the spreadsheet can be accessed here. As always, feedback and comments are welcome.

  • Crow

    I look forward to reading the player pages.

    Did you use variable weights for the categories? Ideally I think it is better to do so, as B-R did, since the categories are not all equally important. Being “neutral” on weights as some others have done (including Kevin Pelton in the past) is actually quite distorting in my view.

    The B-R list was pace neutral and looked at rebounding %s based on 100 opportunities. Your list does not. I prefer the B-R choice on these things. I guess your choice might have been partly based on time, but the data was probably available at KenPom.com if you want to take that path in the future.

    The graphics on the player pages involve a lot of columns. At full magnification they are still somewhat hard to read for me. Breaking the graph into say 2-4 pieces would allow bigger font and easier reading. It would especially be helpful to read the similar player and similarity score side by side (either by simply adding another name column on the right side or having a separate chart with just that).

    When all players are done it might be good to have a summary file by position of the top 10 or so draftees and say their top 5 comps and their similarity scores. Maybe with a few of the most important categories used or the whole list but broken up into 2-4 chunks for bigger font / easier reading.

  • Crow

    If you do use variable weights on the categories or do so in the future I’d also suggest trying different weights for different positions, different weights for different expected NBA roles (i.e. #1 scorer, second tier scorer vs low tier scorer) or different weights for combinations of position & role. I think that would produce better, perhaps much better “similar players” at least in terms of how they looked in college and maybe how they will look in the NBA. At least it would be another view to consider and compare against the first less “directed” cut.

  • Crow

    Some people like to ignore position almost entirely or entirely but in a similarity system I think position is a legitimate criterion to require in at least one cut of the analysis.

    High similarity scores of a SF with a PG and vice versa or a shooting PF and a traditional C and so on are somewhat interesting but mainly I want to know the best comparisons of a player with players who play the exact same position as the player currently plays or likely will play in the NBA. By listing 10 players you can see similar positions and different ones but for me “the most similar guy” is almost always going to be one with the exact same position.

  • ilevy

    Thanks for the comments Crow. As always your feedback is insightful and much appreciated.

    - I did look at using different stats, for example rebound percentages from KenPom, when I started this project last season. However his stats only go back to 2003, which would have only given me seven seasons for comparison when I put it together for last year’s draft. Using these numbers from Draftexpress let me start with 10 seasons’ worth of prospects. The statistics are probably a little less valuable but I chose the benefit of a bigger sample size. Switching over for next year might be a good idea.

    - I’ll see what I can do with the graphs, and will definitely try and put together a summary page as we get closer to the draft.

    - I did not weight my categories. I’ve thought about it both times I’ve done this. The original idea was to create a database where all those subjective judgements were removed, and therefore I didn’t feel comfortable assigning weights to any of the categories. I often feel like position designation or even role designation gets in the way of evaluating players. Roles and positions, and projections of each, can be pretty hard to pin down in the transition from college to the pros. If a college power forward’s production is most similar to a small forward’s production I think that tells us something significant. Weighting by role might eliminate some of those issues, but it also requires me to assign my subjective judgement on what each player’s role is or should be in the NBA, subjective judgements being what I was trying to avoid. I said in the piece that this system is an over-reaction to some of the things that frustrate me about other scouting and draft analysis. I recognize that this causes a new set of problems, one of which is that my system may not prove to have any sort of predictive power. As I refine it for next time, or maybe even between now and the draft, I probably need to really hone in on what purpose I most want it to serve: giving the best possible information about a player’s chances of success in the NBA, or taking a stand by trying to show the information in a new (possibly less informative) way.

  • Crow

    Alright. Thanks for saying thanks and the reply explaining your perspective. Our approaches differ somewhat but of course you should strive to meet your purpose by the method you think best for your purpose.

  • Mike

    Can I suggest you add wingspan as well? A player like Kevin Willis (7’0″ tall, wingspan ~6’6″) is very different to Bismack Biyombo (6’9″ tall, wingspan 7’7″).

  • ilevy

    I thought about that, Mike. Unfortunately reliable wingspan measurements are available for a much smaller pool of players, meaning fewer possible comparisons. It might be something I can add in the future though.

  • http://godismyjudgeok.com/DStats/ DSMok1

    One thing missing that I’d love to see included is age. Age seems to be a big indicator of future upside.

    Nice work, though!

    • ilevy

      Thanks DS! Age was a big brain fart on my part. If I get ambitious, I’ll work it in and redo the tables. If not, I might have to leave it for next season. You’re right though, there is no reason it shouldn’t be included.

  • Mike

    BTW: Want me to write a web based system to automate this for you? Happy to!

  • Crow

    Earlier I said the charts are somewhat hard to read for me. Actually it is very, very hard and I can’t read anything but the names off the charts even after blowing the font up repeatedly. Everything else is guesswork and too much trouble. I’d like to read them but I can’t. Bad eyes on my part I guess but I really am amazed that anyone can read them as is.

  • http://ducksonthewire.com Kevin Kroeger

    I think one big addition to the comparisons of college talent would be a category of “Average RPI of opponents”. This could definitely help compare Jimmer’s stats vs. Kemba’s or more importantly Jimmer vs. JJ Redick.
