# Projecting the Draft Through Numbers


Editor’s Note: This project is a collaboration between Jacob Frankel (@jacob_frankel) and Hickory-High’s own Cole Patty (@Cole_Patty). They also received a data-grab assist from Jameson Draper (@JamDraper).

Introduction

The NBA Draft is one of the most important components in building a team, but how prospects are judged can be subjective, ambiguous, and quite often erroneous. Unquantifiable terms like “motor”, “upside”, and “athleticism” reign supreme. Very rarely do you hear advanced statistics used as a way to judge a player. Analysts rave over players’ wingspan while leaving Steal Rate by the wayside, despite the former being a slight negative indicator and the latter a clear positive. Attaching labels like “bad character” to guys we’ve seen on TV for 30 hours, because of how hard they dive after loose balls, isn’t always accurate.

Using statistics for everything in basketball is not the “right” thing to do, even though old-schoolers often believe that is a statistician’s goal. Data and the eye test should be used hand in hand, but there isn’t much data to complement the eye test when it comes to the draft. With that in mind, Cole Patty and I looked at data from past drafts and used it to build a predictive model for this year’s draft prospects.

Methodology

In short, we built the model using a regression on data from drafts of yore. The basics of a regression: we put in a bunch of independent variables and one dependent variable. The regression tells us how much of the variance in that dependent variable is explained by the independent variables, and gives us an equation for estimating the dependent variable when we have all the independent ones. You may see where this is going.

The independent variables we collected were a player’s advanced statistics (via KenPom.com), his team’s strength of schedule, and his combine measurements. The dependent variable was the player’s regularized adjusted plus/minus (RAPM) for his 4th season in the league (read this for a thorough review of RAPM). We then used the formula provided by the regression to predict the 4th-season RAPM of this year’s draftees.*

*There were separate regressions for bigs and smalls, with bigs performing better.
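To make the mechanics concrete, here is a minimal sketch of the idea, not our actual model. It assumes a single predictor (college steal rate) instead of the full set of variables, and every number in `past_players` is made up for illustration. The helper names `fit_line`, `r_squared`, and `project` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up training rows: (group, college steal rate %, 4th-season RAPM).
past_players = [
    ("small", 1.5, -2.0), ("small", 2.5, -0.5),
    ("small", 3.5, 1.0), ("small", 4.0, 2.5),
    ("big", 1.0, -3.0), ("big", 1.5, -1.0),
    ("big", 2.0, -0.5), ("big", 3.0, 2.0),
]

# One regression per group, mirroring the bigs/smalls split.
models = {}
for group in ("small", "big"):
    xs = [x for g, x, _ in past_players if g == group]
    ys = [y for g, _, y in past_players if g == group]
    intercept, slope = fit_line(xs, ys)
    models[group] = (intercept, slope, r_squared(xs, ys, intercept, slope))


def project(group, steal_rate):
    """Plug a new prospect's number into his group's fitted equation."""
    intercept, slope, _ = models[group]
    return intercept + slope * steal_rate


print(project("small", 3.0), project("big", 3.0))
```

A real version would use many predictors at once (a multiple regression), but the workflow is the same: fit on past drafts, then plug in this year's prospects.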

Another thing we wanted to incorporate was the expected value of a player picked with a certain pick. What’s better for a franchise, to find a +2.0 player with their second pick or a +1.0 player with their 30th? I went back and looked at RAPM data for players in drafts 1987-2003. Here’s what I found:

We used DraftExpress.com’s mock draft to get an idea of where each player in this draft would be picked and then found his marginal projected value: projected value minus the expected value of his expected pick. This can help make it clearer which players are steals. For example, Erik Murphy projects as a -3.5 player in his fourth season, which is pretty bad. Compared to what a team would expect to get out of a player in the area of the draft where he will probably be picked, though, this is pretty good. Thus, he has a high marginal value.
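The marginal value calculation is simple enough to sketch in a few lines. The expected-value table below is a placeholder with invented numbers, not the historical averages from the 1987-2003 data, and the pick slot used for Murphy is an assumption. Only his -3.5 projection comes from the text above.

```python
# Hypothetical expected 4th-season RAPM by pick slot. These are placeholder
# numbers, NOT the historical averages computed for the article.
EXPECTED_RAPM_BY_PICK = {2: 1.0, 30: -3.0, 45: -5.0}


def marginal_value(projected_rapm, expected_pick, expected_by_pick=EXPECTED_RAPM_BY_PICK):
    """Projected value minus the expected value of the expected pick."""
    return projected_rapm - expected_by_pick[expected_pick]


# A -3.5 projection looks bad in isolation, but against a (hypothetical)
# -5.0 expectation for a mid-second-round slot it is a positive margin.
murphy = marginal_value(-3.5, 45)

# The question from earlier: a +2.0 player at pick 2 or a +1.0 player at 30?
second_pick = marginal_value(2.0, 2)
thirtieth_pick = marginal_value(1.0, 30)

print(murphy, second_pick, thirtieth_pick)
```

With these placeholder expectations, the +1.0 player at pick 30 carries the larger margin, which is the kind of comparison the marginal value number is meant to support.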

Results

Before we look at this draft’s results, let’s look at how the model fared in previous years. Let’s go back to the first number the regression provided us, called the R-squared, which gives an estimate of how much of the variance in actual RAPM is explained by the independent variables.

As you can see, it does an acceptable job, explaining 57% of the variance. The correlation coefficient is 0.76. This isn’t perfect, but it isn’t a dud either. I also grouped the data into bins to see how well the model projected, in general terms, how good (or bad) players would be, rather than their exact RAPM numbers.
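Two quick checks on those numbers, sketched under assumptions. First, if 0.76 is the correlation between projected and actual RAPM on the same sample the regression was fit on, squaring it should land near the share of variance explained. Second, the binning idea can be illustrated with hypothetical bin edges and made-up (projected, actual) pairs; neither the edges nor the pairs are from our data.

```python
from collections import defaultdict

# Squared correlation vs. variance explained: about 0.58 here, close to the
# quoted 57%. The small gap is consistent with rounding of the correlation.
r = 0.76
implied_r_squared = r ** 2


def bin_label(projected):
    """Hypothetical bin edges on projected RAPM (an assumption)."""
    if projected >= 1.0:
        return "good"
    if projected >= -2.0:
        return "average"
    return "poor"


# Made-up (projected RAPM, actual RAPM) pairs, purely for illustration.
pairs = [
    (2.5, 3.0), (1.2, 0.5),
    (0.0, -1.0), (-1.5, 0.5), (-1.0, -2.0),
    (-3.0, -2.5), (-4.5, -5.0),
]


def mean_actual_by_bin(rows):
    """Average actual RAPM within each projected-RAPM bin."""
    grouped = defaultdict(list)
    for projected, actual in rows:
        grouped[bin_label(projected)].append(actual)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}


bin_means = mean_actual_by_bin(pairs)
print(round(implied_r_squared, 4), bin_means)
```

If the bins are doing their job, the average actual RAPM should fall as you move from the "good" bin to the "poor" bin, even when individual projections miss.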

As you can see, it performs very well at projecting, in general terms, who will be good and who won’t. Now, some actual player names from drafts 2004-2008.

So we ran our initial regression results on this year’s data and the results seemed a little off:

First of all, in the other four drafts we tried the method on, only three guys had ratings above +3.0 (Paul, Love, and Conley). For there to be two +3.0 guys (not even including Nerlens Noel) in what is supposedly the weakest draft in years is a little strange. So I dove into the data and found one root cause for the incredibly high ratings: Chris Paul. That’s right, the point god was so damn good in his fourth season that he on his own created a general tendency to overrate.

In Paul’s fourth year he was a +9.0, by far the highest in our dataset (Kevin Love was second with a +6.2) and a massive outlier. So what happens when I just delete the row labeled “Chris Paul”, run the regression again, and plug this class’ numbers into the equation?* At this point we also included non-combine players in our results, with a separate regression that took into account just college advanced statistics and basic measurements. The results of this regression weren’t quite as good, but they still work. The correlation between the actual RAPM and the projected was a solid 0.68. Interestingly enough, the smalls’ regression performed much better than the bigs’ (the opposite of what happened with combine data). This implies that athleticism may be a more significant factor in projecting a big man’s productivity than a small’s.

*During the process of re-doing the results two other things happened. First, we made Otto Porter a small (instead of having him as a big because of his height). Second, an error was found in Mike Muscala’s data, changing his ranking drastically.
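To see how a single extreme row can pull every projection upward, here is a small sketch with one predictor and invented data. Only the +9.0 value echoes the figure quoted above; the "composite rating" predictor, the other rows, and the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up (composite college rating, 4th-season RAPM) rows. Only the +9.0
# mirrors a number from the article; everything else is invented.
rows = [
    (1.0, -3.0), (2.0, -2.0), (3.0, -1.5), (4.0, 0.0), (5.0, 1.0),
    (5.5, 9.0),  # the extreme outlier
]


def project(training_rows, x_new):
    """Fit on the given rows, then project a prospect with rating x_new."""
    xs = [x for x, _ in training_rows]
    ys = [y for _, y in training_rows]
    intercept, slope = fit_line(xs, ys)
    return intercept + slope * x_new


with_outlier = project(rows, 5.0)
without_outlier = project(rows[:-1], 5.0)  # drop the outlier row and refit
print(f"with outlier: {with_outlier:+.1f}, without: {without_outlier:+.1f}")
```

In this toy data the same prospect projects several points higher when the outlier is left in, because one row steepens the whole fitted line.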

These results look much more reasonable. The marginal value in these charts is for each player’s projected draft slot, from DraftExpress, but things can change. Ian Levy was able to put each player’s RAPM projection into a visualization that shows the range of picks in which each player’s marginal value would make him an attractive selection:

Conclusions

The first thing of note was the high evaluation of a number of point guards. Even the projections of small-school studs at the position such as Ray McCallum, C.J. McCollum, and Nate Wolters compared very favorably against competitors at other positions. That result is partially due to the recent success point guards have achieved in the NBA. This can also be attributed to the fact that currently the data isn’t set up to separate point guards from swing players. Nevertheless, players like McCallum or Wolters could actually turn out to be cunning steals for teams if they fall into the second round (projected to go 41st and 39th, respectively).

That said, a lot of familiar faces from the top of draft boards are sticking around at the top of the results in projected 4th-year RAPM. Expected number one overall pick Nerlens Noel tested out with the highest projection, followed by fellow expected top-3 pick Otto Porter. Every other non-point guard projected to have a positive 4th-season RAPM, save Kentavious Caldwell-Pope, is tagged as a lottery pick in the DraftExpress mock draft. So, for the most part, the model agrees with common knowledge. The largest “bust” as defined by marginal value (other than the -6.2 duo of Tony Snell and Mike Muscala) was Kansas’ Ben McLemore, who is expected by many to be the second pick this year. McLemore isn’t projected to be a terrible NBA player (his -1.3 RAPM projection is above the NBA average), but that kind of production would definitely make him a “disappointment” as a number two pick. That’s a primary reason this class is being labeled as weak. Save Noel, all the expected top picks are projected to produce at a role-player level in their 4th seasons.

Given the relative success of this method in the past, these results should definitely be taken into account when considering the draft. They aren’t perfect, but they are entirely data-driven and provide a nice contrast to the majority of draft content.

Check back later in the week for more content on the draft based on this approach.

• steve

Question. Do you adjust in any way for a prospective NBA PG who never played PG in college? Am specifically thinking of CJ McCollum, who never played the point in college but is being looked at as an NBA PG due to his relatively small size.

• http://hickory-high.com/ Ian Levy

My understanding was that Cole and Jacob didn’t do any positional adjustments. They simply split the players into two groups, “Bigs” and “Smalls”. Each group had a different regression equation and R-squared, so McCollum’s projection would look different if he was grouped with the bigs. But as a small he’s grouped together with players who have played PG, SG and SF and may possibly move between those positions in the pros.

I’ll double check with them, but that’s my understanding.

• Andrew Johnson

Interesting stuff. I assume there will be more details released later. Your write-up doesn’t mention age. Was that included? My understanding is that that’s a pretty significant variable if you are comparing the on-court production of a senior to a freshman.

• Cole Patty

Age was included in the model, for exactly the reason you stated. We also used age in days, not years, since the draft is on a specific date. There should be a second part of this post coming soon, but it may not have exactly the details you were hoping for.

• nbacouchside

Wow this model really hates Tony Snell an awful lot. This seems to be true of just about every attempt to use stats and combine results to project the quality of a player in the NBA that I’ve seen thus far. Needless to say, as a Bulls fan, I am concerned. On the plus side, they got Erik Murphy who could be useful-ish as a deep bench player.

• http://hickory-high.com/ Ian Levy

Like I said, take these numbers with an ocean of salt. But yes, they definitely don’t like Snell. I’ve seen a few other numeric projections as well and no system that I saw thought very highly of him.

I was pretty surprised that the Bulls took him. From what I know of their basketball ops people they don’t seem like they would fall in love with a workout star. But I’m assuming they know something I don’t.

• nbacouchside

Yeah, they have taken some surprise guys in the past. I wasn’t thrilled when I heard the pick and then I did more digging and everyone who has even a slightly analytical bent doesn’t seem to like him. I can see why they think he can succeed, because of the shooting stroke and defensive tools, but there’s a whole lot wrong with his mindset. Doesn’t attack the rim or the glass very much at all for a guy with his physical gifts, which might have to do with his relative skinniness.

• http://hickory-high.com/ Ian Levy

He might be one of those guys who didn’t project well because he didn’t really fit into his college role. New Mexico needed him to be a primary scorer, it was more than he could handle, and his stats were correspondingly inconsistent. Hence the poor projections.

But maybe the Bulls feel like he can jump a level if they slot him into a supporting role.

• nbacouchside

The biggest red flag to me is his lack of rebounding or shots inside. That’s a lot to indicate that he’s just not aggressive enough. Also, as I’m sure you’re well aware, rebounding translates pretty strongly and if he wasn’t boarding in the college game where he was one of the best athletes on the floor, it’s hard to expect much of anything on the glass in the NBA.

• http://hickory-high.com/ Ian Levy

But my guess is the Bulls probably aren’t expecting much there. Stand in the corner and make threes. Stay in front of his man at the other end. Make the proper defensive rotations. They’re probably going to make it really simple for him.