Projecting the 2014 NBA Draft
There’s an overload of information when it comes to evaluating draft prospects: prospect profiles galore, big boards, DraftExpress strength/weakness videos. Sorting through and weighting all of it is a chore. And it’s mostly guesswork, because there’s no historical record of which factors matter and which don’t.
Draft analysts will talk about a player being a good perimeter defender, but there are so many different definitions of good and ways to judge it. And a huge part of player evaluation on the defensive end is narrative-driven confirmation bias. Just look at the sudden “Andrew Wiggins will be a defensive stud” theme.
There are takes on prospects even harder to prove and parse than that. Things like “passiveness” or “good shooting stroke.”
Despite their flaws, we know that many of these tape-based opinions matter: they have done an OK job of sorting through the draft and separating the best players from the worst.
But again, it’s difficult to tell which parts of eyeball-based evaluations matter and which parts don’t, which is why the fit on the above graph is nowhere near perfect.
So I’m going to go at things from a completely different angle, looking just at how players’ college statistics predict how they’ll play in the NBA.
A prospect’s numbers provide a relatively good baseline of how good a player will be, better than the baseline offered by how GMs have picked over the same time period.
One big argument against this approach is that there’s no context behind the stats, and that’s entirely the point. We have no idea whether or how most of the context surrounding draft prospects matters.
There’s weight in the fact that nobody with Doug McDermott’s block and steal rate has succeeded in the NBA. There’s not as much weight in saying “he plays good team defense.” There’s no easy way to prove that second statement and even if there were, we have no idea how or if it translates to the NBA.
Now, to explain how college stats are turned into something meaningful.
The general format of the draft model is a massive multiple regression relating players’ NBA success to their college statistics (and some other info). The variables are chosen with a stepwise regression, and cross-validation is used to check that they are statistically significant and that the model is not overfitting.
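To make the selection step concrete, here is a minimal sketch of forward stepwise selection scored by k-fold cross-validation. It is an illustration, not the model's actual code: the article does not say which stepwise criterion or fold count was used, so the function names, the five folds, and the 1% minimum-improvement rule are all assumptions.

```python
import numpy as np

def cv_mse(X, y, cols, k=5, seed=0):
    """Cross-validated mean squared error of a least-squares fit on `cols`."""
    n = len(y)
    # Design matrix: intercept plus the selected columns (possibly none).
    design = np.column_stack([np.ones(n), X[:, np.asarray(cols, dtype=int)]])
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    errors = []
    for fold in folds:
        train = np.ones(n, dtype=bool)
        train[fold] = False  # hold this fold out of the fit
        coef, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        errors.append(np.mean((y[fold] - design[fold] @ coef) ** 2))
    return float(np.mean(errors))

def forward_stepwise(X, y, k=5, min_gain=0.01):
    """Greedily add the column that most lowers held-out error.

    Stops when the best remaining column improves the cross-validated MSE
    by less than `min_gain` (an assumed threshold, as a fraction).
    """
    chosen = []
    best = cv_mse(X, y, chosen, k)  # intercept-only baseline
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = {c: cv_mse(X, y, chosen + [c], k) for c in remaining}
        c_best = min(scores, key=scores.get)
        if scores[c_best] > best * (1 - min_gain):
            break  # no candidate helps enough out of sample
        chosen.append(c_best)
        remaining.remove(c_best)
        best = scores[c_best]
    return chosen, best
```

Given a matrix with one column per candidate stat, `forward_stepwise(X, y)` returns the chosen column indices in the order they were added, along with the cross-validated error of the final model. Scoring on held-out folds is what guards against overfitting here: a variable that only fits noise in the training folds will not lower the held-out error.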
The Dependent Variable
The dependent variable–the one we’re trying to project–is how a player performs in his first eight seasons in the league. That span of time allows for the most complete and accurate data set.
The fit with college data is better the more NBA seasons are used to judge how productive a player was. The issue is that players taken in the 2007 draft or later have not yet played eight NBA seasons.
To get around this issue, I project how the players taken in the 2007-2010 drafts will perform in their remaining years until they reach year eight. My NBA projection engine uses league-average-adjusted player seasons going back to 1980. It looks at how statistically similar players within half a year of the player’s age have developed, then applies that change to the player in question.
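The similar-player idea can be sketched roughly as follows. This is a guess at the general shape of such an engine, not the actual one: the `PlayerSeason` fields, the Euclidean distance, and the choice of three comparables are all hypothetical.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple

@dataclass
class PlayerSeason:
    age: float                 # age during the season
    stats: Tuple[float, ...]   # league-average-adjusted box score rates
    rating: float              # overall rating that season (e.g. ASPM)
    next_rating: float         # rating in the following season

def project_next(age: float, stats: Tuple[float, ...], rating: float,
                 history: List[PlayerSeason],
                 n_similar: int = 3, age_window: float = 0.5) -> float:
    """Project next season's rating from how similar players developed."""
    # Only compare against seasons played at nearly the same age.
    pool = [h for h in history if abs(h.age - age) <= age_window]
    if not pool:
        return rating  # no comparables: assume no change
    # Rank comparables by distance between stat lines (assumed metric).
    pool.sort(key=lambda h: dist(h.stats, stats))
    nearest = pool[:n_similar]
    # Apply the comparables' average year-over-year change to this player.
    avg_change = sum(h.next_rating - h.rating for h in nearest) / len(nearest)
    return rating + avg_change
```

Calling `project_next` repeatedly, feeding each projected season back in with the age advanced by one year, would fill in the remaining seasons up to year eight.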
The stat used to measure performance is ASPM–an all-encompassing box score metric found to be the most accurate publicly available box score metric in an analysis by FiveThirtyEight’s Neil Paine.
Players who don’t play in a given season are given a value of -3, around where most estimate replacement level is. I also go through and adjust for injuries to make sure the Derrick Roses of the world don’t get a few replacement level seasons put into their impact estimate.
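Here is a small sketch of how that fill-in rule might work when building the eight-year number. The simple unweighted average and the choice to skip injured seasons entirely are assumptions made for illustration; the article does not spell out either detail.

```python
REPLACEMENT_ASPM = -3.0  # value assigned to seasons a player did not play

def eight_year_aspm(seasons, injured=frozenset(), n_seasons=8):
    """Average ASPM over a player's first `n_seasons` seasons.

    seasons: dict mapping season number (1-based) to that season's ASPM.
    injured: season numbers missed through injury; these are skipped
             instead of being counted at replacement level (assumed rule).
    """
    values = []
    for year in range(1, n_seasons + 1):
        if year in seasons:
            values.append(seasons[year])
        elif year in injured:
            continue  # do not penalize an injury absence
        else:
            values.append(REPLACEMENT_ASPM)  # out of the league
    return sum(values) / len(values) if values else REPLACEMENT_ASPM
```

With made-up numbers, a player with ASPMs of 1.0 and 3.0 over a four-season window averages -0.5 if the two missing seasons count as -3, but about 0.33 if one of those seasons is flagged as an injury.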
The Independent Variables
After testing around 40 different variables from scoring efficiency, to rate statistics, to combine stats, and interactions between all of these variables, the final model uses 8 pieces of information: Age, TeamSRS (point margin + strength of schedule), TS%, ORB%, AST%, STL%, BLK%, and fouls committed per 40 minutes. There are also interactions between these variables and other constants.
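For readers less familiar with these inputs, two of them are easy to compute from a raw box score. The sketch below uses the standard true shooting formula and a plain per-40-minute rate; the interaction helper and its column naming are hypothetical, since the article does not say which interactions made the final model.

```python
def true_shooting_pct(points, fga, fta):
    """TS%: points per scoring attempt, counting free throws at 0.44 each."""
    return points / (2 * (fga + 0.44 * fta))

def per_40(count, minutes):
    """Scale a counting stat (e.g. personal fouls) to a per-40-minute rate."""
    return count * 40 / minutes

def with_interaction(row, a, b):
    """Return a copy of `row` with the product of columns a and b added."""
    out = dict(row)
    out[f"{a}*{b}"] = row[a] * row[b]  # interaction term for the regression
    return out
```

For example, a player who commits 3 fouls in 30 minutes is at `per_40(3, 30)`, or 4.0 fouls per 40 minutes.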
There is one regression, with all positions included. While separating by position looks like it improves the fits of the regressions, it changes one large sample size into five smaller ones (or three, if you do wings/bigs/points) and it lessens the out of sample predictive ability. And, in general, the same stats from position to position are correlated with success, not just the ones that would be traditionally associated with each position. Rebounding is really important for guards and assists are really important for centers.
I use the last two seasons of players with multiple years in college. Seasons that don’t occur right before the draft still bring a lot of predictive ability.
I tested all combine stats, height, weight, etc. and none proved significant, even in interactions with other variables. That doesn’t mean these aren’t important. They can tell you very important things about specific players, but there are no general trends with the combine stats.
The sample size is 333 players drafted between 2003 and 2010.
The model’s projected eight year ASPM explains nearly 42% of the variance in actual eight year ASPM.
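“Explains 42% of the variance” corresponds to an R-squared of about 0.42. A minimal sketch of that calculation, with a hypothetical function name and no real player data:

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` accounted for by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained
    ss_tot = sum((a - mean) ** 2 for a in actual)                  # total
    return 1 - ss_res / ss_tot
```

A value of 1 would mean the projections match actual eight-year ASPM exactly, and a value of 0 would mean they do no better than guessing the average for everyone.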
Here are the in-sample (2003-2010) top-20 projected players:
And here are the out of sample top-10 projected players in the 2010 and 2011 drafts:
As you can see, it’s far from perfect. But it is objectively more accurate than the actual draft, and more importantly, looks at things from a completely different angle. It provides not only additional information, but a basis to skeptically dig deeper on consensus opinions.
So with all that in mind, here’s the projected top 30 for this year’s class:
First off, the model sees this draft as reallllly deep. The first guy below a minus-one ASPM projection (which is still solid role player levels) is past the 20th spot on the board. And there are eight players with positive projections. Some notes on individual players:
- Jordan Adams is the wacky name that sticks out at the top, as he goes in the 30s in most mock drafts. There’s nothing super notable about Adams–the guy just puts up really good numbers across the board. He’s undersized and generally thought of as unathletic, but Adams shines in two “applied athleticism” stats–offensive rebound rate and steal rate.
- Marcus Smart’s status from before the 2013 draft took a hit this year because of his team’s struggles, despite his numbers improving.
- Elfrid Payton has slowly been rising up most traditional draft boards, and the numbers like him. He put up lots of assists and un-point-guard-like block and offensive rebound rates.
- Joel Embiid would have the top projection in this class, if not for committing 5.8 fouls per 40 minutes.
- In a funny twist of fate, Jabari Parker’s defensive numbers are much better than his offensive ones.
- I hadn’t heard of Jarnell Stokes before running this, but he put up solid efficiency numbers and was a monster on the offensive glass.
- Andrew Wiggins put up middling stats. Not bad, by any means, but just nothing impressive.
- Noah Vonleh would be much higher if not for an insanely low assist rate.
- Doug McDermott and Nik Stauskas are two notables who don’t appear in the top 30. They come in at 35 and 33, respectively. Both are hurt by minuscule offensive rebound, block, and steal rates. And yes, nobody with combined block and steal rates as low as McDermott’s has been a valuable NBA player.
- Cleanthony Early is a guy who has risen since the NCAA tournament. But he’s very old, doesn’t excel in anything, and doesn’t appear here.