### Offensive and Defensive Variance

Posted on Nov 15, 2013 | 2 comments

US Presswire

“Increase the variance” – Daryl Morey

The quote above was Morey’s answer in a Reddit AMA last year during the playoffs to a question about what changes his Rockets needed to make to prevail in their matchup with the Thunder. It’s an answer and an idea that have really stuck with me.

A theme that has cropped up in my analysis a lot lately is how misleading averages can be. Many of the most commonly used basketball statistics are averages – field goal percentages, per 100 possession efficiency measures, even per game and per 36 minute statistics. But averages combine values above and below, creating a numeric shorthand to sum up all the different data points in the neatest and tidiest way possible.

When we talk about variance we’re adding context to an average, measuring the spread between the highs and lows of the data. Circling back to Morey’s quote, what he was saying (I assume) was that for his team to have a chance they needed to increase the variance of their offensive and defensive efficiency and hope that they could push their highs high enough to catch a break and get past the Thunder’s superior talent. And that’s exactly where variance plays a role in wins and losses. If every team performed exactly to their averages the better team would win every single game. Upsets happen when those peaks and valleys overlap in unexpected ways.

Using the game logs from every team for last season we can actually look at not just the offensive and defensive efficiency from each team but also how much variance each displayed in their performance on each side of the ball. But, before we go any further I need to explain a twist of language. Variance is an actual mathematical term with an attached equation. However, when I use the word variance throughout the rest of this piece I’ll be referring to standard deviations, a related measurement that I think does a better job of illustrating the variations. I apologize to my high school math teachers for the imprecise use of terminology.
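To make the distinction concrete, here is a tiny sketch (with made-up game logs) of two teams that share the same average ORTG but have very different spreads:

```python
import statistics

# Two hypothetical teams with the same average ORTG (105) across five games
steady = [104, 105, 106, 105, 105]
streaky = [95, 115, 100, 110, 105]

for name, games in [("steady", steady), ("streaky", streaky)]:
    mean = statistics.mean(games)
    var = statistics.pvariance(games)  # variance: average squared deviation
    sd = statistics.pstdev(games)      # standard deviation: square root of variance
    print(f"{name}: mean={mean:.1f}, variance={var:.1f}, stdev={sd:.2f}")
```

The standard deviation is just the square root of the variance, which puts the spread back in the same units as the rating itself; that is why it illustrates the variation more intuitively.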

The visualization below shows each team’s performance from last season, marked by their Offensive Rating (points per 100 possessions, ORTG) and by the standard deviation in their ORTG (a bigger standard deviation means more variance). The tab at the bottom will let you switch over and see the same display, but with each team’s Defensive Rating (DRTG).

With this visualization we can begin to see some separation between teams with comparable levels of average efficiency but very different levels of variance. For example, the aforementioned Houston Rockets and the Denver Nuggets had very similar levels of offensive efficiency last season but the standard deviation in the Rockets’ ORTG was about 3 points per 100 possessions greater, meaning they were much more likely to drastically exceed or undershoot their season average on any given night. We see a similar example on defense where the Dallas Mavericks and Brooklyn Nets had almost identical DRTGs but standard deviations that differed by well over 3 points per 100 possessions.

Now that we’ve identified variance on both sides of the ball, the question becomes why is it there? What elements create variance?

Using the collected data I ran a series of correlations between different variables and each team’s offensive and defensive variance. These are between season-long numbers, not a game-by-game comparison. Here’s what I found:

The strongest relationship was between the percentage of a team’s shots that were three-pointers and offensive variance. This makes sense intuitively and numerically. Because of their distance from the hoop, three-point shots are less likely to go in than shots from anywhere else on the floor. However, that level of difficulty is rewarded with an extra point if the shot goes in (this is as simple an example of variance as I can think of). Turnovers, on both sides of the ball, also have strong relationships with variance, as does the percentage of a team’s shots that are mid-range jumpers.
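For anyone who wants to replicate this, the relationships here are plain Pearson correlations across team-seasons; a minimal sketch, with invented stand-in numbers:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Invented stand-ins: share of shots from three vs. std dev of game-to-game ORTG
three_rate = [0.18, 0.22, 0.25, 0.28, 0.33]
ortg_sd = [9.1, 9.8, 10.4, 11.0, 12.1]
print(round(pearson_r(three_rate, ortg_sd), 3))
```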

Interestingly though, the relationship between variance and pace was relatively small. It showed up on defense where teams that played at a slower pace were slightly more likely to be consistent defensively. But on the other side of the ball there was no statistically meaningful relationship – teams that played at a faster pace were essentially no more or less likely to have a lot of offensive variance.

I find this fascinating because pace and variance are often closely linked in these sorts of discussions. For years the idea persisted that reducing pace was the way to instill variance in a game – the fewer possessions there are in a game the more likely the outcome is to be influenced by a handful of random could-go-either-way events and the less likely the game is to be decided by the static margin in talent between the two teams. But over the past two years we’ve seen several teams (including last year’s Rockets and this year’s Sixers) chase variance in the opposite direction by relentlessly pushing the pace and trying to capitalize on the ensuing chaos with superior conditioning and athleticism. Obviously I’m working with a very small data set here, just a single season, and drawing conclusions for the league as a whole instead of individual teams, but the relationship between pace and variance seems to be much smaller than I would have anticipated.

Introducing variance into the context of offensive and defensive effectiveness raises plenty more questions. Obviously, being consistently bad is trouble on either side of the ball. But assuming a team is relatively effective, is there an advantage to having a certain level of performance variation?

The answer is probably not. More variance means more good AND bad to work with, and the ultimate goal is still to be as efficient as possible, as often as possible. There were small relationships between variance and efficiency, both on offense (a correlation of 0.110) and on defense (a correlation of 0.276). There was also a small relationship between total variance (the sum of ORTG and DRTG standard deviations) and Net Rating (-0.250). That means teams with a positive Net Rating had, on average, slightly less overall variance. But again, by comparing to Net Rating we are looking at an average measure of performance.

The one other place I looked was the difference between a team’s actual win total and their Pythagorean Win Total (a projection of how many games a team should win based on their Net Rating). This formula has been shown to be a very accurate method of projection, but every year teams under- and over-perform these projections by slight margins. I thought perhaps variance might help explain some of those differences.
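The Pythagorean formula itself isn't written out here; a common NBA version, sketched below, uses an exponent near 14 (I'm assuming Morey's 13.91, since the exact exponent isn't specified):

```python
def pythagorean_wins(ortg, drtg, games=82, exponent=13.91):
    """Expected wins from offensive and defensive ratings (points per 100 possessions)."""
    o = ortg ** exponent
    return games * o / (o + drtg ** exponent)

# A hypothetical team scoring 108 and allowing 104 per 100 possessions
print(round(pythagorean_wins(108, 104), 1))
```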

But again, nothing conclusive. The difference between a team’s actual win total and their Pythagorean Win Total had a -0.188 correlation with offensive variance, 0.070 with defensive variance and -0.072 with total variance. In the end this just reinforces that as a long-term strategy variance leads to chaos, which generally leads to losses. But in the set scenario of a single game, or a small handful of games like in the playoffs, variance (intentional or otherwise) is often what dictates the final outcome.

### Two Three-Point Shooters Equal Two Points

Posted on Nov 6, 2013 | 0 comments


The value of the three-pointer is a staple of basketball analytics. It is often referred to as ‘advanced,’ but really it is mostly the result of pretty simple algebra: a team has to make 50% more two-point shots to get the same number of points it would get off of three-point shots. Away from the rim, most players struggle to do that.
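To make that algebra concrete (the shooting percentages below are illustrative, not from the article):

```python
def points_per_shot(fg_pct, shot_value):
    """Expected points produced per shot attempt."""
    return fg_pct * shot_value

# A 36% three-point shooter and a 54% two-point shooter are equally productive;
# 54% is exactly 50% more makes than 36%.
print(points_per_shot(0.36, 3), points_per_shot(0.54, 2))
```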

Of course, there is more to it than that; all kinds of nuances can affect the percentages of any shot. Defenses can do the same math and prioritize guarding the three-point line. When misses do happen, there are different expected percentages of offensive rebounds. It is easier for the opposing team to score off of defensive rebounds, which add up with more misses from distance. Driving to the rim results in more free throws, which is a very efficient way to score.

Then there is the issue of ‘spacing,’ simply for its own sake. Kirk Goldsberry, in a Sloan paper, estimated the scoring area in basketball at 1,280 square feet, based on where the vast majority of shots are taken. From a geometry standpoint, taking shots from further out increases the percentage of that area the five defenders have to cover and increases the space between defenders, opening lanes to the basket. As Houston GM Daryl Morey or any old-time basketball hand would probably tell you, driving to the rim and spacing are complementary. Players need space to open driving lanes, and drives collapse defenses so the ball can be kicked back out.

A while ago I ran a number of k-means cluster analyses based on shot type using data from the suddenly defunct HoopData. The analysis grouped Guards, Wings and Bigs based on where they shoot in relation to the basket. Both the guards and the wings fell into Stretch and Slasher types, scoring from the three-point line and at the rim respectively. Meanwhile, Bigs fell into three clusters: Bigs who shot almost exclusively at the rim, those with more varied offensive games, and stretch bigs, who added a three-point shot.
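For readers curious what a k-means pass looks like, here is a bare-bones sketch clustering invented wing shot-location profiles (the real analysis used HoopData's full shot-type columns, not these made-up numbers):

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Bare-bones k-means over (share at rim, share from three) profiles."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each player to the nearest center (squared distance)
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[nearest].append(p)
        # recompute each center as the mean of its cluster
        centers = [tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers, clusters

# Invented wing profiles: (share of shots at the rim, share from three)
wings = [(0.35, 0.21), (0.38, 0.18), (0.33, 0.24),   # slasher-ish
         (0.19, 0.56), (0.17, 0.52), (0.21, 0.58)]   # stretch-ish
centers, clusters = kmeans(wings, k=2)
print(sorted(len(c) for c in clusters))
```

With two well-separated groups like these, the algorithm recovers the Slasher/Stretch split regardless of which points it starts from.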

For example, Stretch Wings averaged 56% of their shots from three point range and 19% at the rim, while Slashing Wings averaged 21% from three and 35% at the rim, according to my analysis.  The numbers for Stretch and Slashing guards were both very similar.

The cluster analysis allows me to group player types together to look for patterns in the data. For example, I found some interesting patterns in terms of style of play: Stretch backcourt players and wings have higher eFG%, but Slashers get more offensive rebounds and more free throws. Again, I will use the Wings as my example.

| Group | Count | Avg TS% | Avg eFG% | Avg ORB% | Avg DRB% | Avg AST% | Avg TOV% |
|---|---|---|---|---|---|---|---|
| Slash Wing | 39 | 52.7% | 49.1% | 5.84 | 16.09 | 10.79 | 11.95 |
| Stretch Wing | 25 | 53.7% | 51.2% | 3.20 | 13.08 | 10.03 | 11.15 |
| Grand Total | 64 | 53.1% | 49.9% | 4.81 | 14.91 | 10.50 | 11.64 |

But to take the analysis to the next step I wanted to look for an independent value of spacing, and its limits. I think that is another value of using categorization: Stretch players are measured as players who take more three-point shots, not necessarily ones who make them at a higher rate (though there is a correlation there too).

The methodology adopted was, first, to chart the Stretch players in five-man lineups against offensive rating (ORTG, points per one hundred possessions) and, second, to calculate an expected ORTG based on the individuals in a lineup, compare it to the actual ORTG generated by that five-man lineup, and then add spacing configurations to the mix. (Similar to an analysis done here by Jacob Frankel using a different definition of stretch.)

I used line up data from the last two years, courtesy of NBA.com‘s media stats site.

The chart below uses a cutoff of five-man lineups that played at least 250 minutes together in either of the last two years; however, I ran the same chart multiple times with different minute filters and got largely the same results. At this cutoff, the lineup of Tyson Chandler, Carmelo Anthony, Jason Kidd, Raymond Felton and J.R. Smith had the highest ORTG at 119.4 points per 100 possessions, and the lineup of Brandon Knight, Jason Maxiell, Greg Monroe, and Rodney Stuckey had the lowest at 93.6 points per 100 possessions.

I used a polynomial line to visualize the trend. Consistently, across different minute filters, the trend showed an increase in offensive rating going from 0 Stretch players in a lineup to 1 and 2, and then flattened. The same pattern held with Wing and Guard Stretch players only, as well as when using effective field goal percentage.

Then I ran the expected ORTG for each line up against the actual ORTG the line up had, with the 250 plus minute filter the R^2 was a decent 0.434.  The chart below shows the number of Stretch shooters in each line up against the residual error on the expected ORTG.

Again the same pattern can be seen: an increase in offensive rating with two Stretch players on the court and then a leveling off, indicating diminishing returns to spacing. Using that information I made a ‘dummy’ variable for each lineup that had two or more Stretch shooters and ran that variable in a stepwise regression along with the Expected ORTG and a linear count of Stretch shooters. In every minute filter the Stretch dummy came out as the most significant contribution in addition to the Expected ORTG.

The result of the model run with the 250 minute filter indicated that the value of having at least two Stretch shooters on the floor is 2.44 points per hundred possessions.

| Variable | B | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|
| (Constant) | 13.29 | 12.53 | 0.29 | -11.74 | 38.31 |
| Expected ORTG | 0.83 | 0.12 | 0.00 | 0.60 | 1.07 |
| StrechersDummy | 2.44 | 0.89 | 0.01 | 0.67 | 4.21 |

The results were similar with other filters on minutes, with the independent value of having two Stretch shooters in a lineup centered just above two points per hundred possessions. It might not seem like a lot; however, this value is above and beyond the simple algebra of shooting percentage multiplied by three points. And two net points per game equal about five wins over the course of the season.
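Plugging the fitted coefficients from the 250-minute model into a prediction function (the function name and example inputs are mine) shows the dummy at work:

```python
def predict_ortg(expected_ortg, n_stretch_shooters):
    """Fitted lineup ORTG: intercept + 0.83 * expected ORTG,
    plus a 2.44-point bump for having at least two Stretch shooters."""
    stretch_dummy = 1 if n_stretch_shooters >= 2 else 0
    return 13.29 + 0.83 * expected_ortg + 2.44 * stretch_dummy

# Two hypothetical lineups with the same expected ORTG of 105
print(round(predict_ortg(105, 1), 2))
print(round(predict_ortg(105, 2), 2))
```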

### Updating My Beliefs About Neon Jesus

Posted on Oct 29, 2013 | 0 comments


Last week I did some analysis and created a visualization of the correlation between rookie preseason and rookie regular season performance on a number of stats. Before the sell-by date of the analysis passed (basically as soon as the regular season starts and the preseason is expunged from our collective memory), I wanted to use that data to update my expectations of this year’s rookies.

At this point one could run a couple of multiple linear regressions with college stats and preseason stats, or maybe use one of the single-number metrics. But there is a huge amount of collinearity between the college stats and the preseason stats. Guys who rebound well in college, for example, tend to do so in preseason, meaning many of the variables end up getting confounded, especially when dealing with relatively small sample sizes.

In any case, Bayesian analysis lends itself much better than multiple regression to updating expectations with sequential, small data sets, and it’s analytically ‘on trend.’

A key to Bayesian analysis is to establish a prior belief, or starting point. To do that I used a combination of general rookie averages and the incoming draft class’s college stats and ranks. As an example, Celtics number thirteen pick Kelly Olynyk had an effective field goal percentage of 63.1% last year, using data from Draft Express. I don’t start with the expectation that he will be able to do that in the pros. On the other hand, the fact that he was significantly above the 54% college eFG% average of his fellow rookies (2.075 standard deviations above) in that category is important information.

In the last three years rookies with over 200 minutes of playing time in the regular season had an eFG% of 45.8%, four percentage points below the NBA average, via data from RealGM. So, taking that information, my Prior Expectation for Olynyk’s eFG% was formed as below:

Rookie Average + (Standard Deviations from Class * Standard Deviation) = Prior Expectation

In this case: 0.458 + (2.075 × 0.0506) = 56.3%
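A quick check of that arithmetic:

```python
rookie_avg = 0.458   # three-year rookie eFG% baseline
z_in_class = 2.075   # Olynyk's standard deviations above his draft class
class_sd = 0.0506    # the class's eFG% standard deviation

prior = rookie_avg + z_in_class * class_sd
print(f"{prior:.3f}")  # 0.563, i.e. 56.3%
```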

One could argue for tempering the adjustment away from the Rookie average, particularly for shooting. However, given that I am going to update these expectations with preseason observations, I think it forms a good informative starting point.

In eight games over the preseason Olynyk had an eFG% of 55.1%. Unfortunately for demonstration purposes, that means the Updated Expectation won’t move much. To update my expectations I weighted my prior expectation and Olynyk’s preseason performance using a Bayesian weighting formula (shown below), coming out at 56.11%.

So, to look at one stat where Olynyk did move the expectations meter in the preseason, we can look at personal fouls. Olynyk had already committed more fouls than average for his class, putting his expected fouls per 48 minutes above the rookie average of 4.9, at 5.63. But the big young Canadian surpassed expectations in the preseason, getting called for fouls at a rate of 8.01 per 48 minutes. Weighting the two together, Olynyk’s Updated Expectation comes to 6.81 fouls per 48 minutes (meaning they wouldn’t let him play 48 minutes in one game).

I applied the same methodology to all the 2013 rookies with college experience and significant playing time this preseason. Below are the preseason numbers, my ‘Prior’ belief and the Updated Belief for eFG%, Rebounds per 48 minutes, and Assists per 48.

| Player | Team | Pos | Pre eFG% | Prior eFG% | Updated eFG% | Pre REB | Prior REB | Updated REB | Pre AST | Prior AST | Updated AST |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Anthony Bennett | CLE | PF | 0.413 | 49.7% | 48.31% | 12.05 | 11.68 | 11.87 | 1.96 | 1.7 | 1.83 |
| Victor Oladipo | ORL | PG | 0.447 | 57.7% | 55.5% | 10.27 | 8.88 | 9.58 | 8.45 | 3.3 | 5.87 |
| Cody Zeller | CHA | PF | 0.443 | 48.5% | 47.8% | 12.87 | 10.88 | 11.88 | 3.41 | 2.2 | 2.81 |
| Ben McLemore | SAC | SG | 0.546 | 51.4% | 51.96% | 7.84 | 6.38 | 7.11 | 2.32 | 2.9 | 2.61 |
| Kentavious Caldwell-Pope | DET | SG | 0.375 | 45.6% | 44.25% | 8.92 | 8.28 | 8.6 | 1.96 | 2.5 | 2.23 |
| Michael Carter-Williams | PHI | PG | 0.388 | 35.9% | 36.35% | 8.07 | 5.48 | 6.78 | 7.58 | 8.7 | 8.14 |
| Steven Adams | OKC | C | 0.622 | 49.1% | 51.27% | 16.39 | 10.78 | 13.59 | 1.46 | 1.5 | 1.48 |
| Kelly Olynyk | BOS | PF | 0.551 | 56.3% | 56.11% | 9.3 | 10.98 | 10.14 | 3.87 | 2.9 | 3.39 |
| Tony Snell | CHI | SG | 0.362 | 42.9% | 41.79% | 5.85 | 3.38 | 4.62 | 5.85 | 4.1 | 4.98 |
| Mason Plumlee | BRK | PF | 0.409 | 52.3% | 50.39% | 14.63 | 11.48 | 13.06 | 2.72 | 2.6 | 2.66 |
| Solomon Hill | IND | SF | 0.24 | 46.3% | 42.58% | 5.09 | 6.48 | 5.79 | 3.52 | 3.7 | 3.61 |
| Tim Hardaway Jr. | NYK | SG | 0.482 | 42.8% | 43.72% | 5.13 | 5.18 | 5.16 | 1.56 | 3.2 | 2.38 |
| Andre Roberson | OKC | SG | 0.273 | 43.5% | 40.83% | 12 | 13.38 | 12.69 | 0.92 | 2.1 | 1.51 |
| Archie Goodwin | PHX | SG | 0.415 | 37.7% | 38.33% | 4.82 | 5.78 | 5.3 | 1.75 | 3.8 | 2.78 |
| Tony Mitchell | DET | PF | 0.733 | 40.1% | 45.63% | 12.62 | 10.38 | 11.5 | 1.1 | 1.3 | 1.2 |
| Jamaal Franklin | MEM | SG | 0.468 | 37% | 38.66% | 8.69 | 11.28 | 9.99 | 4.97 | 4.4 | 4.69 |
| Peyton Siva | DET | PG | 0.404 | 37.6% | 38.03% | 2.53 | 2.98 | 2.76 | 9.49 | 7.7 | 8.6 |
| Erik Murphy | CHI | PF | 0.293 | 54.8% | 50.56% | 9.69 | 8.38 | 9.04 | 1.79 | 2.5 | 2.15 |

The positive movers are in yellow and the negative in red, in terms of changes in expectations. Effective field goal percentage didn’t move much for anyone, primarily because of the high variability of scoring and the high standard deviation of eFG% in preseason, though Steven Adams moved up somewhat in eFG% while Erik Murphy moved down. On rebounds, Steven Adams looked like a beast on the boards all preseason, enough to significantly move his Updated Rebounds expectation, and Victor Oladipo’s facilitating role with Orlando moved his Updated Assists up as well.

In addition I ran the numbers for free throws, turnovers and personal fouls, all per 48 minutes.

| Player | Team | Pre FT% | Prior FT% | Updated FT% | Pre TOV | Prior TOV | Updated TOV | Pre PF | Prior PF | Updated PF |
|---|---|---|---|---|---|---|---|---|---|---|
| Anthony Bennett | CLE | 0.692 | 66.4% | 67.8% | 6.17 | 2.83 | 4.5 | 10.37 | 5.53 | 7.95 |
| Victor Oladipo | ORL | 0.839 | 71.4% | 77.7% | 5.94 | 3.33 | 4.63 | 4.11 | 5.63 | 4.87 |
| Cody Zeller | CHA | 0.548 | 72.5% | 63.7% | 3.68 | 3.13 | 3.4 | 4.73 | 5.13 | 4.93 |
| Ben McLemore | SAC | 0.692 | 83.8% | 76.5% | 3.19 | 2.73 | 2.96 | 3.77 | 4.43 | 4.1 |
| Kentavious Caldwell-Pope | DET | 0.7 | 76.5% | 73.3% | 2.18 | 2.53 | 2.35 | 5.44 | 4.63 | 5.03 |
| Michael Carter-Williams | PHI | 0.647 | 66.2% | 65.5% | 2.44 | 4.03 | 3.23 | 5.38 | 4.73 | 5.05 |
| Steven Adams | OKC | 0.529 | 41.1% | 47% | 4.39 | 2.03 | 3.21 | 7.9 | 4.93 | 6.41 |
| Kelly Olynyk | BOS | 0.7 | 74.8% | 72.4% | 3.62 | 3.83 | 3.72 | 8.01 | 5.63 | 6.82 |
| Tony Snell | CHI | 0.714 | 81.1% | 76.3% | 1.72 | 2.73 | 2.22 | 2.75 | 3.83 | 3.29 |
| Mason Plumlee | BRK | 0.593 | 64.9% | 62.1% | 4.76 | 3.43 | 4.09 | 3.74 | 5.03 | 4.38 |
| Solomon Hill | IND | 0.75 | 73.4% | 74.2% | 1.96 | 2.83 | 2.39 | 3.52 | 4.73 | 4.12 |
| Tim Hardaway Jr. | NYK | 0.615 | 66.3% | 63.9% | 1.56 | 2.23 | 1.89 | 2.9 | 4.33 | 3.62 |
| Andre Roberson | OKC | 0 | 51.9% | 26% | 4.62 | 3.03 | 3.82 | 8.31 | 5.13 | 6.72 |
| Archie Goodwin | PHX | 0.588 | 60.5% | 59.7% | 4.82 | 4.03 | 4.42 | 3.07 | 5.83 | 4.45 |
| Tony Mitchell | DET | 0.5 | 63.7% | 56.9% | 1.1 | 3.23 | 2.16 | 7.68 | 5.73 | 6.7 |
| Jamaal Franklin | MEM | 0.8 | 75.8% | 77.9% | 3.73 | 4.23 | 3.98 | 4.55 | 5.33 | 4.94 |
| Peyton Siva | DET | 0.833 | 83.5% | 83.4% | 6.64 | 3.53 | 5.08 | 5.38 | 5.43 | 5.4 |
| Erik Murphy | CHI | 0.5 | 75.2% | 62.6% | 1.08 | 2.23 | 1.65 | 9.69 | 6.33 | 8.01 |

Again, the most significant negative Updated Expectations are in red and the positive in yellow. It looks like my parochial New England interests almost let me bury the lede: Anthony Bennett struggled enough with turnovers and personal fouls to worsen his Updated Expectations. That said, with him coming off an injury and by all accounts not being in game shape, I could probably temper that adjustment somewhat. And I am not a ‘bash the number one pick because he isn’t the best guy in his draft class’ kind of guy. I wouldn’t have taken him there, but that’s on Chris Grant, not Anthony Bennett.

I also tracked blocks and steals, but neither showed much impact on expectations.

As promised above, here’s the expectations-updating formula I used. It is designed to diminish the impact of measures with high variability, as measured by standard deviation. When trying to formulate a continuous Bayesian update formula this weekend, I ran across this one via Daniel Myers.

Updated Expectation = (Prior Expectation / Prior StDev² + New Observation / New Observation StDev²) / (1 / Prior StDev² + 1 / New Observation StDev²)
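In code, that formula is just a precision-weighted average: each estimate counts in proportion to one over its variance, so noisier numbers move the belief less. The standard deviations in the example below are hypothetical, chosen to show a confident prior barely budging toward a noisy observation:

```python
def bayes_update(prior, prior_sd, obs, obs_sd):
    """Precision-weighted average of a prior belief and a new observation."""
    w_prior = 1.0 / prior_sd ** 2
    w_obs = 1.0 / obs_sd ** 2
    return (prior * w_prior + obs * w_obs) / (w_prior + w_obs)

# A confident prior (sd 0.03) barely moves toward a noisy observation (sd 0.10)
print(round(bayes_update(prior=0.563, prior_sd=0.03, obs=0.551, obs_sd=0.10), 3))
```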

### Projecting The 2013-14 NBA Season

Posted on Oct 24, 2013 | 2 comments


The start of the NBA season is under a week away, which means it’s time for the always fun task of figuring out exactly what will happen when the madness starts. I’ve been working on methods to forecast this season as accurately as possible; the methods and results are revealed below.

Calculating wins starts with figuring out how teams will do on offense and defense, and to get those numbers while accurately accounting for aging and experience, a player projection system is needed. The step-by-step process of everything is laid out below, if you’re interested. If not, just scroll down a bit to where the actual projections are housed.

The Player Projections

Player projections are based on how, historically, similar players developed from season to season at the same age. I enter baseline numbers for all the players I’ll be projecting; in this case the baseline is mean-regressed stats from 2012-13. The system finds how the statistics of similar players in the same age group changed the next season and applies those changes to the baseline stats of the player in question.
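As a toy illustration of the similarity idea (every number below is invented): find players of the same age with a similar baseline, average how their stat changed the next season, and apply that change.

```python
# (age, stat in year one, stat in year two) for a tiny invented history
history = [
    (24, 14.8, 15.9),
    (24, 15.3, 16.1),
    (24, 15.0, 14.7),
    (28, 18.0, 17.2),
]

def project(age, baseline, tolerance=1.0):
    """Apply the average year-over-year change of similar same-age players."""
    deltas = [y2 - y1 for a, y1, y2 in history
              if a == age and abs(y1 - baseline) <= tolerance]
    avg_delta = sum(deltas) / len(deltas) if deltas else 0.0
    return baseline + avg_delta

print(round(project(24, 15.0), 2))
```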

For players like Derrick Rose, Andrew Bynum, and Channing Frye, I projected what they would have done last year, then used that to project onto this season. There’s no production adjustment for injuries, but I do tweak minutes and games.

For fun, I also projected stats all the way through 2018, laying projections on projections on projections. The projected top five in ASPM, an all-encompassing box-score metric, for the 2017-18 season are Kevin Durant, LeBron James, Kyrie Irving, Anthony Davis, and James Harden. Passes the smell test.

For rookies, I used the prospect model Cole Patty and I built, and for internationals I assumed 85% of positional-norm production.

Projecting Team Defense

I project three different versions of defensive rating then blend the three together.

The first version involves no player statistics. It looks at what a team did the previous season and regresses this to the league average based on how much of the roster is coming back and whether or not there is a coaching change. This version is weighted at 0.4.

The next version uses Dean Oliver‘s individual defensive rating. This mostly gives an idea of how good a defensive team each player played on last year, which isn’t great, but helps out a bit. This is weighted at 0.1.

The third version uses defensive regularized adjusted plus-minus. This is basically just on court defensive impact that goes through some complicated math to try to isolate an individual’s impact from his teammates and opponents. This isn’t perfect either, but it can give us an idea of things that aren’t captured in the box score. This is weighted at 0.5.
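The blend of the three versions described above can be sketched as follows (the function name and the example team's three inputs are mine):

```python
def blended_drtg(continuity, oliver, drapm):
    """Weighted blend of the three defensive projections: 0.4 / 0.1 / 0.5."""
    return 0.4 * continuity + 0.1 * oliver + 0.5 * drapm

# A hypothetical team where the three versions disagree slightly
print(blended_drtg(104.0, 106.0, 102.0))
```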

Projecting Team Offense

The baseline for team offense is the usage-weighted sum of all the individual players’ offensive ratings. This is then adjusted based on the tradeoff between usage and efficiency. I then calculate the projected number of assists for the roster and the projected number of shots that will be assisted on. If a team doesn’t have enough assists to go around for the shots that need to be assisted on, I penalize the efficiency of those shots. The logic goes that these shots will probably come off the dribble, closer to a defender, instead of being open assisted shots.
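A sketch of just the usage-weighted baseline (lineup numbers invented; the real system then layers the usage-efficiency and assist adjustments on top):

```python
# (individual offensive rating, usage rate) for a hypothetical five-man core
players = [
    (112.0, 0.30),
    (108.0, 0.22),
    (105.0, 0.18),
    (103.0, 0.16),
    (101.0, 0.14),
]

# Each player's ORTG counts in proportion to the possessions he uses
total_usage = sum(usage for _, usage in players)
baseline_ortg = sum(ortg * usage for ortg, usage in players) / total_usage
print(round(baseline_ortg, 2))
```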

Projected Standings

Finally, the offensive and defensive efficiencies are converted into wins and the wins are forced to sum to 1230.
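One simple way to force league wins to sum correctly is proportional scaling; whether the author scales exactly this way isn't stated, so treat this as a sketch, shown with a four-team toy league whose wins must sum to 4 × 82 / 2 = 164:

```python
def force_total(wins, total):
    """Scale raw projected wins so they sum to the league's fixed win total."""
    scale = total / sum(wins)
    return [w * scale for w in wins]

raw = [52.1, 48.3, 40.0, 30.6]        # raw projections summing to 171
scaled = force_total(raw, total=164)  # four teams, 82 games each, one win per game
print([round(w, 1) for w in scaled])
```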

Let’s look at the East first:

This doesn’t do much to buck conventional wisdom. There are a clear top four teams who are all really close, a drop to the Knicks and Hawks, and another drop to the seventh and eighth seeds.

This model sees the last two slots being taken up by the Cavs and Raptors, but in previous iterations that are slightly different, the Pistons have overtaken the Raptors. The Wizards are hurt by the injury to Emeka Okafor and the lack of any offensive depth.

Now the West:

Well, so much for Kevin Martin and Russell Westbrook. Many people forget that the Thunder were a historically great team last season, and a lot better than the Heat in the regular season. That jump largely stemmed from the establishment of an elite defense, and the loss of Martin and Westbrook for a bit, both poor defenders, doesn’t look to hurt the D. The offense looks to take a small hit, but it stays afloat due to the continued brilliance of Kevin Durant, projected for a historical 65 TS% and 30 USG season.

Spots two through five are pretty much interchangeable, but the drop off to the Warriors may surprise some. The issue for the Warriors looks to be, surprisingly, on the offensive end. There are only four players on roster projected for an above average offensive rating and the bench looks really ugly offensively.

The bottom of the west is an absolute mess, with five different teams vying for the 7th and 8th spots. The Nuggets get in with an abysmal defense and good offense and the Mavs get in on the strength of a full season of Dirk and a competent usage soaking ball handler in Monta Ellis.

Here are projected team offensive and defensive ratings:

Playoff Simulations

Using projected efficiency differential, we can simulate the playoffs with Bill James’ log5 equation. I simulated the playoffs 1000 times, and here are the number of times each team won the title:
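For reference, the log5 step and a toy series simulation might look like this (the matchup strengths are invented; the actual 1,000 runs used each team's full projected efficiency differential):

```python
import random

def log5(p_a, p_b):
    """Bill James' log5: chance A beats B, given each side's win
    probability against an average (.500) opponent."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)

def wins_series(p_game, rng):
    """Simulate a best-of-seven at a fixed single-game win probability."""
    wins = losses = 0
    while wins < 4 and losses < 4:
        if rng.random() < p_game:
            wins += 1
        else:
            losses += 1
    return wins == 4

rng = random.Random(42)
p = log5(0.70, 0.55)  # hypothetical matchup strengths
titles = sum(wins_series(p, rng) for _ in range(1000))
print(round(p, 3), titles)
```

Note that against a .500 opponent log5 returns the team's own strength, which is a handy sanity check.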

One thing to keep in mind: there’s no adjustment for playoff rotations, and I eased off on the Heat’s best players’ minutes, which will surely go up in the playoffs.

The interesting thing here is the sheer number of teams with a greater than 5% chance of winning the title (remember the 5% theory). Teams like the Rockets, Spurs, and Grizzlies aren’t championship favorites, but if a few things fall their way, they can win the title.

More Rookie Projections

Because of the stats I was going to use in this project, I had to build out the draft model Cole Patty and I created a bit, and I might as well show the results of that here. These are the projected first-year usage rates, offensive ratings, and defensive ratings of all the players whose data we collected. One thing I found interesting was that usage (or offensive role) was a lot easier to project accurately than offensive rating (or offensive performance).

The results are naturally very pessimistic, just because of the nature of the model.

Using the player projection system, I’ve generated a table showing the leaders in four key statistics over the next five seasons.

Things get less accurate the farther into the future the projections are, but nothing here is super-shocking other than Kendall Marshall being top-10 in assist percentage in 2017 and 2018.

—–

Take everything above for what you will. This is nowhere near perfect and I don’t necessarily agree with everything my system tells me. But it is good to realize that stats can capture things that aren’t so easily apparent to the observer, and that’s clear when looking at who has done the best predicting the season over the past few years.

If you have any questions or issues with methodology, be sure to let me know.  I’ll check in after the season to see how these did, and compare them to other win projections from around the web.

### NBA Game Tracking Project: Volunteers Needed

Posted on Oct 22, 2013 | 0 comments

– Kevin Ferrigan

### Bending Efficiency

Posted on Oct 22, 2013 | 0 comments


It is a much different era of basketball. The amount of information available to the public is vast, and there is more and more added to it each day. The average fan has exponentially more knowledge about the numerical meanings behind each play than we could have even fathomed 20 years ago. This isn’t a shockingly new concept; the NBA is only a microcosm of the entire pool of issues we have added access to. It’s just that in basketball, like everything, the pursuit of knowledge is hopeless and eternal.

As a result, many have fallen in love with certain player types that we hadn’t even identified in the past, with names coined by their specific roles. The rim protector, the 3-and-D guys, and many other roles have emerged as NBA buzzwords. The chase for efficiency is a valiant one. What offense wouldn’t want to make a few tweaks to be more efficient? It would be silly to strive for anything else.

At the macro level of basketball, this is pretty black-and-white. The most efficient offense is the best. End of discussion. And that is how it should be.

At the micro level, the picture gets more gray.

As detailed here — and here, because giving the reader more samples is good — the relationship between usage and efficiency at a player level is a tricky one. A low-usage player finds it much easier to be efficient than a high-usage player, through simple factors such as shot difficulty, fatigue, and defensive attention. It is much more surprising to see Shane Battier take a shot than LeBron James; nothing breathtaking there.

But the point that seems to get lost — as pointed out in that first link — is the incredible difficulty of raising a player’s usage without watching the entire offense cave in like a white dwarf going supernova. Throw out two 3-and-D guys, two rim protectors, and a passing point guard, and you don’t just create a system of ball movement that becomes basketball utopia. Eventually the natural selection of the league will push someone into a higher-usage role, and that typically results in his abilities being put into question by the masses.

There are exceptions, as with most rules. James Harden raised his usage a couple notches to 29% and still put up a 60% TS% — though it’s easy to point out that his 66% true shooting the year before was far superior, but that is beside the point. And Tracy McGrady transcended statistical common sense, upping both his usage and his true shooting after he left Toronto for Orlando. While these cases prove the task at hand isn’t impossible, there are plenty of Paul George-type cases — he went from an effective 3-and-D player to an offensive focal point after the Danny Granger injury — that show that going down that road is certainly tough on both the player and the team.

So, why hate the gunner? Sure their unconscious aura with the basketball is frustrating, but somewhere down their basketball path their behaviors were reinforced. Is it frustrating to see Rudy Gay clank another 20-footer to the tune of 49.4% TS%? Yes, it is excruciatingly painful. However, for team reasons and for player reasons this is a cross they have to bear. It is an evil, but it is also a necessary evil.

Look around the league and there are plenty of examples. DeMarcus Cousins, Monta Ellis, Brandon Jennings, Greg Monroe, and Gay, to name a few, are all players who would be much better served in a different role, but whose teams couldn't fill that role any other way for the time being. Some chalk it up to a lack of data in the front office and the locker room, and better information certainly could curb this kind of behavior going forward, but the fact remains that these players were on their rosters last season.

If every team had a LeBron or a Kevin Durant to eat up possessions while scoring at Hall of Fame rates, they would certainly build their offenses around those players instead of a Monta Ellis type. But there are 30 teams in the league, and not even half of them come close to sniffing a title. Being the focal point of an NBA offense is a hard job, and the basic building blocks of the sport don't change just because a more mortal player tries to fill it. Someone has to do it, and doing it while serving as a human punchline is harder still.

Should you love the guy with the less substantial offensive role any less? Of course not. The potential such players show on both ends of the floor in short bursts makes you think harder about the sport than almost anything else it exhibits. But it isn't evil to hold on to the pure innocence we once had for the gunner, either. Gunners are the bane of the statistical movement in many ways, but they are also, in many ways, the facilitators for the stat-sheet-stuffing wunderkinds we adore.

### Kevin Durant Kills it from Everywhere: Introducing Scoring Versatility Index

Posted by on Oct 16, 2013 | 3 comments


There are a number of different stats to quantify and qualify scoring prowess in the NBA. Generally they fall into either volume statistics or efficiency stats with points per game and true shooting being the kings of their respective categories. In a Grantland article last week Kirk Goldsberry unveiled a new shooting statistic, ‘Shot Score.’ Essentially shot score is the inverse of Ian Levy’s XPPS, the expected points per shot given a player’s overall shot selection. Instead, Goldsberry emphasizes the differential between the player’s expected point production and actual point production, treating it something like a degree of difficulty in diving.

Goldsberry explains that someone like LeBron James who scores efficiently from a variety of places on the court is more valuable than a player like DeAndre Jordan even though Jordan has a higher true shooting percentage, the traditional advanced scoring stat, than James.

I think that is at least partly true; the ability to score efficiently from a variety of spots on the floor likely makes a player much harder to guard and allows the offense more flexibility. To some degree this shows up in the volume differential between Jordan and James. Because DeAndre Jordan can only score effectively in a limited amount of real estate, he took half as many shots per 48 minutes and got to the line three fewer times per 48 minutes than James, according to Hoopdata, even with his poor free throw shooting inviting Hack-a-DeAndre at times.

However, I noticed that the way shot score is calculated, adding up the net points a player generated over the expected points, could reward a specialist like DeAndre Jordan or Steve Novak. Goldsberry acknowledged as much in a podcast with Levy on Hickory, pointing to his earlier work at the Sloan Sports Analytics Conference.

In fact, it was probably the disproportionate efficiency James showed inside that induced the Spurs to adopt the defensive strategy they came one free throw or offensive rebound away from riding to a Finals victory. Instead of blanketing James with tight coverage, or bringing a double team on James, who is an excellent passer out of the double, the Spurs deliberately gave both James and his teammate Dwyane Wade space to shoot, with only a loose contest daring them to pull up and fire (also taking advantage of the fact that it is generally harder to shoot off the dribble).

I viewed the Spurs' strategy as another demonstration of the value of shooting versatility (as well as of just how effective James is at scoring in the paint). I also thought it would be very unlikely such a strategy could succeed against Durant, who is unusually lethal from just about everywhere on the court.

So I decided to try to build a metric that measures scoring versatility, which is different from the Goldsberry measure as I interpret it: shooting prowess adjusted for degree of difficulty. Instead, I wanted a measure that tracked how versatile players were in their scoring, including the ability to get to the line, which Tom Ziller at SB Nation has emphasized.

To build the index I used Hoopdata.com's shot location data, which I have used in the past, along with their free throw data. I divided each player's efficiency at each of Hoopdata's five shot locations by the average player's efficiency at that location. The five locations are At the Rim, Short (3-9 feet), Mid-Range (10-15 feet), Long Twos (16-23 feet), and Threes; I also included made free throws per field goal attempt. I then took the average of those ratios multiplied by the percentage of zones where the player attempted a shot.

The intent was to measure how many different locations a player could score from efficiently. Like any efficiency measure there were sample size issues, but limiting the selection to players with at least 250 shot attempts gave me very presentable results.
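The recipe above can be sketched in a few lines of Python. The league-average and player efficiencies below are invented for illustration (they are not real Hoopdata figures), and treating zero efficiency as "no attempts from that zone" is a simplification of the actual inputs:

```python
# Sketch of the versatility index described above, with hypothetical numbers.
# For each zone, player efficiency is divided by the league-average efficiency
# in that zone; the index is the mean of those ratios times the fraction of
# zones the player attempted from ("scoring zone coverage").

ZONES = ["rim", "short", "mid", "long_two", "three", "ftm_per_fga"]

# Hypothetical league-average efficiency per zone (points per shot; FTM/FGA
# for the free throw component)
league_avg = {"rim": 1.29, "short": 0.80, "mid": 0.79,
              "long_two": 0.77, "three": 1.07, "ftm_per_fga": 0.21}

def versatility_index(player_eff):
    """player_eff maps each zone to the player's efficiency there
    (0.0 standing in for a zone the player never attempted from)."""
    ratios = [player_eff[z] / league_avg[z] for z in ZONES]
    attempted = sum(1 for z in ZONES if player_eff[z] > 0)
    coverage = attempted / len(ZONES)
    return sum(ratios) / len(ratios) * coverage

# A hypothetical versatile scorer, above average everywhere
scorer = {"rim": 1.55, "short": 1.20, "mid": 1.15,
          "long_two": 0.95, "three": 1.55, "ftm_per_fga": 0.50}
print(round(versatility_index(scorer), 2))
```

A specialist who never shoots from three of the six zones gets both zeroed ratios and a coverage penalty, which is what drags the big men toward the bottom of the rankings.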

Below are the top twenty players in scoring versatility, the higher the ratio above one the more efficient the player was compared to the average player:

| Name | Rim Ratio | Short Ratio (3-9) | Mid Ratio (10-15) | Long Ratio (16-23) | Three Point Ratio | FTM/FGA Ratio | Versatility Index |
|---|---|---|---|---|---|---|---|
| Kevin Durant | 1.21 | 1.55 | 1.49 | 1.26 | 1.47 | 2.49 | 1.51 |
| LeBron James | 1.26 | 1.72 | 1.08 | 1.35 | 1.44 | 1.57 | 1.33 |
| Chris Paul | 1.12 | 1.58 | 1.33 | 1.47 | 1.16 | 1.76 | 1.32 |
| Dirk Nowitzki | 1.02 | 1.48 | 1.42 | 1.47 | 1.48 | 1.20 | 1.27 |
| Kobe Bryant | 1.12 | 1.40 | 1.37 | 1.17 | 1.16 | 1.73 | 1.27 |
| James Harden | 1.01 | 1.08 | 0.77 | 1.06 | 1.30 | 2.65 | 1.25 |
| Steve Nash | 1.14 | 1.21 | 1.34 | 1.47 | 1.55 | 1.19 | 1.24 |
| Tony Parker | 1.11 | 1.43 | 1.22 | 1.29 | 1.23 | 1.48 | 1.23 |
| Carl Landry | 1.14 | 1.10 | 1.21 | 1.11 | 1.18 | 1.95 | 1.23 |
| Deron Williams | 1.13 | 1.28 | 1.27 | 1.17 | 1.34 | 1.48 | 1.22 |
| Serge Ibaka | 1.25 | 1.35 | 1.54 | 1.38 | 1.25 | 0.97 | 1.22 |
| Darren Collison | 1.08 | 1.19 | 1.15 | 1.23 | 1.27 | 1.76 | 1.21 |
| Jason Terry | 0.95 | 1.63 | 1.37 | 1.35 | 1.32 | 0.93 | 1.19 |
| Amir Johnson | 1.07 | 1.60 | 1.17 | 1.09 | 1.37 | 1.18 | 1.19 |
| Ray Allen | 0.95 | 1.68 | 1.21 | 0.97 | 1.50 | 1.13 | 1.19 |
| Kevin Martin | 1.19 | 0.78 | 1.30 | 1.26 | 1.52 | 1.48 | 1.19 |
| Kyrie Irving | 0.95 | 1.30 | 1.31 | 1.32 | 1.39 | 1.22 | 1.18 |
| Jamal Crawford | 1.08 | 1.56 | 1.20 | 1.14 | 1.33 | 1.12 | 1.18 |
| Stephen Curry | 0.95 | 1.18 | 1.42 | 1.29 | 1.61 | 0.99 | 1.17 |

Kevin Durant and LeBron James were both masterful scorers, but Durant's superior mid-range game and his ability to get to the line pulled him ahead. James Harden wouldn't even make the list without the free throw component. Also of note: Carl Landry, who is likely to miss the season, was surprisingly versatile last year, above average from everywhere.

On the other end, here are the lowest scorers in terms of scoring versatility. The bottom of the list is dominated by frontcourt players, not surprisingly, and in many cases that may not be an issue for their team's offense, unless the lack of any mid-range game clogs the team's spacing.

| Name | Rim Ratio | Short Ratio (3-9) | Mid Ratio (10-15) | Long Ratio (16-23) | Three Point Ratio | FTM/FGA Ratio | Scoring Zone Coverage | Versatility Index |
|---|---|---|---|---|---|---|---|---|
| Josh Childress | 0.89 | 0.00 | 0.00 | 0.00 | 1.18 | 0.25 | 0.67 | 0.26 |
| Bismack Biyombo | 0.98 | 0.75 | 0.88 | 0.61 | 0.00 | 1.13 | 0.83 | 0.58 |
| Lavoy Allen | 1.08 | 1.31 | 0.80 | 1.00 | 0.00 | 0.50 | 0.83 | 0.61 |
| Kosta Koufos | 1.07 | 1.40 | 0.88 | 0.73 | 0.00 | 0.60 | 0.83 | 0.62 |
| Kevin Seraphin | 1.12 | 1.37 | 1.02 | 1.02 | 0.00 | 0.45 | 0.83 | 0.65 |
| Larry Sanders | 1.05 | 0.75 | 0.80 | 0.85 | 0.00 | 0.73 | 1.00 | 0.65 |
| Emeka Okafor | 1.13 | 1.23 | 1.03 | 0.85 | 0.00 | 0.84 | 0.83 | 0.67 |
| Reggie Evans | 0.86 | 0.93 | 0.00 | 1.23 | 0.00 | 2.28 | 0.83 | 0.68 |
| Kenneth Faried | 1.07 | 1.14 | 0.88 | 0.91 | 0.00 | 1.25 | 0.83 | 0.69 |
| Jason Maxiell | 1.10 | 1.40 | 0.78 | 0.94 | 0.00 | 1.04 | 0.83 | 0.69 |
| Elton Brand | 0.97 | 1.40 | 1.03 | 1.26 | 0.00 | 0.73 | 0.83 | 0.69 |
| John Henson | 1.06 | 1.14 | 0.60 | 0.73 | 0.00 | 0.88 | 1.00 | 0.70 |
| DeAndre Jordan | 1.18 | 1.55 | 1.05 | 0.38 | 0.00 | 1.04 | 0.83 | 0.71 |
| Nikola Pekovic | 1.01 | 0.99 | 0.97 | 0.67 | 0.00 | 1.71 | 0.83 | 0.71 |
| Lamar Odom | 0.96 | 0.99 | 0.58 | 1.14 | 0.71 | 0.29 | 1.00 | 0.72 |
| Charlie Villanueva | 0.97 | 1.18 | 0.29 | 0.50 | 1.23 | 0.31 | 1.00 | 0.72 |
| Keith Bogans | 1.21 | 1.71 | 0.00 | 1.35 | 1.20 | 0.20 | 0.83 | 0.73 |
| Omer Asik | 0.97 | 0.77 | 1.44 | 1.02 | 0.00 | 1.44 | 0.83 | 0.74 |
| Taj Gibson | 1.11 | 0.62 | 1.03 | 0.94 | 0.00 | 1.06 | 1.00 | 0.74 |
| Chris Kaman | 1.03 | 1.64 | 1.12 | 1.49 | 0.00 | 0.57 | 0.83 | 0.75 |

Bottom line: I think the degree to which a player who can score from a variety of spots on the floor is more valuable than a specialist is very much an open question. Efficient use of possessions is the ultimate goal and, to me, the starting place for any analysis. But given that every offense is trying to score against a live defense intent on denying open shots from preferred locations, it is probably helpful to make that defense worry about as many spots as possible.

### Making Connections: Explanation Percentages

Posted by on Oct 11, 2013 | 0 comments


Editor’s Note: This post is a collaboration between myself and Jeremy Conlin. He’s the man behind SuiteSports.com, as well as a contributor to Knickerblogger, ClipperBlog, BuzzFeed Sports and right here at Hickory-High. You can find Jeremy on Twitter, @jeremy_conlin.

Over the past few weeks Jeremy and I have been rolling out a series which looks at the relationships between the Four Factors, overall efficiency and Pace, on an individual team-by-team basis. The idea was that our understanding of the relationships between these factors has been, up to this point, built with multi-season, league-wide data, which means we are looking at averages. But averages flatten together relationships both weaker and stronger. We wanted to look at how these relationships manifested for each individual team to see what we could learn about the teams themselves and the relationships at large. These analyses were all performed using the game logs from each individual team from last season. Here’s what we’ve covered so far:

We’ve seen some really interesting patterns, but so far we’ve been looking at them grouped by relationship. Today I wanted to create a view of all the relationships, assembled by team. I ran a series of regressions on each team’s performance in the Four Factors and Offensive and Defensive Rating from each game last season, and used a technique I’ve used before (explained here and borrowed from the work of Evan Zamir and Daniel Myers) to convert those regression results into percentages. I’ve called them Explanation Percentages because they express, for that team, how much of the variation in overall efficiency can be explained by the variation in that particular statistic. For example, the Atlanta Hawks had an OTO% of 17.37%, which means that 17.37% of their game-to-game variation in Offensive Rating can be explained by the game-to-game variation in their Offensive Turnover Percentage.
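The conversion from regression output to percentages can be sketched roughly like this. The game logs here are simulated rather than real, and the decomposition used (standardized coefficient times correlation with the outcome, whose terms sum to R²) is one common way to apportion explained variance; the exact Zamir/Myers method referenced above may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated game log for one team (82 games): four offensive factors and an
# offensive rating driven mostly by shooting. All numbers are made up.
n = 82
efg  = rng.normal(0.50, 0.03, n)   # effective FG%
tov  = rng.normal(0.14, 0.02, n)   # turnover rate
oreb = rng.normal(0.28, 0.04, n)   # offensive rebound rate
ftr  = rng.normal(0.20, 0.03, n)   # free throw rate
ortg = 200*efg - 150*tov + 40*oreb + 30*ftr + rng.normal(0, 2, n)

X = np.column_stack([efg, tov, oreb, ftr])

# Standardize, fit OLS, then attribute variance: each factor's share of R^2
# is its standardized coefficient times its correlation with the outcome.
Xs = (X - X.mean(0)) / X.std(0)
ys = (ortg - ortg.mean()) / ortg.std()
beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
shares = beta * np.array([np.corrcoef(Xs[:, j], ys)[0, 1] for j in range(4)])

for name, s in zip(["eFG%", "TO%", "OREB%", "FTR"], shares):
    print(f"{name}: {100*s:.1f}% of ORTG variation")
```

Because the simulated rating is built mostly from shooting, the eFG% share dominates, mirroring the pattern in the charts below where shooting accuracy explains the bulk of efficiency for nearly every team.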

I’ve created two separate visualizations, one for Offensive Rating and one for Defensive Rating. For each, the explanation percentages are shown as stacked bars for each team. You can hover over the bar for any team to see the actual numeric percentages. The width of each bar represents their Offensive or Defensive Rating, depending on the graph.

Here’s the offensive graph:

Here’s the defensive graph:

The biggest surprise for me was the extremely small relationship free throw attempts had, relative to the other factors, for nearly every team on both sides of the ball. This is much smaller than the relationship Evan Zamir had found in his previous work on this subject. I’ve given it some thought but haven’t been able to arrive at a satisfying conclusion as to why this relationship looks so much smaller on a team-by-team basis than it does with league-wide data (If anyone has any ideas, please let me know).

Obviously eFG%, or shooting accuracy, was the biggest factor on both sides of the ball explaining, on average, about two-thirds of efficiency. What’s interesting though is that TO% and REB% fluctuated greatly (relatively) from team to team, often flip-flopping in order of importance. If anything this really highlights how many different ways there are to build offensive and defensive systems and how many different ways this can create nuanced differences in the statistics. These relationships simply aren’t static.

This post essentially wraps up this work Jeremy and I have been doing. We may revisit it during the upcoming season, but with new data so close on the horizon we’ll be leaving it for now. As always, if you see something we missed or a pattern we’ve overlooked, please let us know!

### Making Connections: ORTG to Pace

Posted by on Oct 9, 2013 | 0 comments


Editor’s Note: This post is a collaboration between myself and Jeremy Conlin. He’s the man behind SuiteSports.com, as well as a contributor to Knickerblogger, ClipperBlog, BuzzFeed Sports and right here at Hickory-High. You can find Jeremy on Twitter, @jeremy_conlin.

Over the past few weeks Jeremy and I have continued to (slowly) roll out the results of some analysis we’ve done on the Four Factors. The relationships between the Four Factors and efficiency have been fairly well studied and established; however, this has usually been done by looking at aggregate, league-wide data across multiple seasons. This helps establish how these statistical categories work together and overlap at a macro level, but we wanted to drill down to the micro.

We grabbed the game logs for each team from last season and calculated offensive and defensive efficiency, as well as the Four Factors and pace, for each team in each game. This lets us look at the way these statistics relate to each other at the individual team level, which can often be very different from the league-wide trends. For example, in our first analysis we found that the Memphis Grizzlies, one of the best offensive rebounding teams, actually had a negative correlation between Offensive Rebound Percentage (OREB%) and Offensive Rating (ORTG). Although hitting the offensive glass was a huge strength for the Grizzlies, it was a way of covering for their other offensive holes. The more they were hitting the glass, the more other things weren’t working, and their overall efficiency suffered.
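The per-team calculation is just a correlation over one team's game log. As a minimal sketch, the simulated data below is built to mimic the Memphis pattern, where poor shooting nights produce both more offensive rebounding chances and lower efficiency (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n_games = 82

# Hidden driver: how well shots fell that night
shooting = rng.normal(0.0, 1.0, n_games)
# Misses create offensive rebound opportunities, so OREB% rises when
# shooting falls; efficiency moves the other way
oreb_pct = 0.32 - 0.03 * shooting + rng.normal(0, 0.02, n_games)
ortg = 105 + 6 * shooting + rng.normal(0, 3, n_games)

r = np.corrcoef(oreb_pct, ortg)[0, 1]
print(f"OREB% vs ORTG correlation: {r:.2f}")  # negative, as with the Grizzlies
```

The negative correlation falls out even though rebounding itself is a strength, which is exactly the confound the team-level view is meant to surface.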

Today we’re looking at the relationship between Pace and Offensive Rating (ORTG), exploring how pushing the tempo affects overall offensive efficiency for certain teams. The individual team graphs are below.

### Making Connections: OeFG% to ORTG

Posted by on Aug 26, 2013 | 0 comments


This post is a collaboration between myself and Jeremy Conlin. He’s the man behind SuiteSports.com, as well as a contributor to Knickerblogger, ClipperBlog, BuzzFeed Sports and right here at Hickory-High. You can find Jeremy on Twitter, @jeremy_conlin.

Two weeks ago Jeremy Conlin posted the first set of results from an extended project he and I have been working on. The relationships between the Four Factors and efficiency have been fairly well studied and established; however, this has usually been done by looking at aggregate, league-wide data across multiple seasons. This helps establish how these statistical categories work together and overlap at a macro level, but we wanted to drill down to the micro.

We grabbed the game logs for each team from last season and calculated offensive and defensive efficiency, as well as the Four Factors and pace, for each team in each game. This lets us look at the way these statistics relate to each other at the individual team level, which can often be very different from the league-wide trends. For example, in our first analysis we found that the Memphis Grizzlies, one of the best offensive rebounding teams, actually had a negative correlation between Offensive Rebound Percentage (OREB%) and Offensive Rating (ORTG). Although hitting the offensive glass was a huge strength for the Grizzlies, it was a way of covering for their other offensive holes. The more they were hitting the glass, the more other things weren’t working, and their overall efficiency suffered.

Today we’re looking at the relationship between Offensive Effective Field Goal Percentage (OeFG%) and Offensive Rating (ORTG), one that we obviously expect to be extremely strong for all teams – you can’t score efficiently if you can’t make shots. The individual team graphs are below.


This next visualization combines all the individual team information into one graph. The height of each bar represents how strong the correlation was between OeFG% and ORTG. The color of each bar represents the team’s ORTG. The width of each bar represents their OeFG%.

As we would expect, the relationship between OeFG% and ORTG was extremely strong for all teams, with the 0.715 correlation for the Milwaukee Bucks being the smallest. We do find an interesting cluster of teams near the top: the Spurs, Rockets and Knicks. Everything the Spurs do comes down to precise execution of their system, with the ultimate output being the production of quality shot attempts. It makes total sense for them to be at the top of this list. If their system isn’t working, open shots aren’t being created, and the shots that are being created probably aren’t being made. The Spurs didn’t have a focus on attacking the glass or getting to the rim to scaffold their offensive efficiency when shots weren’t falling, which is understandable, because they didn’t often hit that roadblock.

The Knicks and Rockets are also interesting because they finished first and second in three-point attempts last season, separated from the third team in the league by nearly 300 attempts. Both teams often played smaller lineups and opted to go with a spread attack. The Rockets in particular, with their focus on nothing but three-pointers and shots at the rim, put all their offensive eggs in the shot-making basket. Three-pointers are a tool to buoy offensive efficiency and they can really push the boundaries of a team’s OeFG%. However, they also introduce a ton of variance into the offensive equation and making them at a consistent rate becomes even more important when you’re moving shots away from the rim and the free throw line.
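The variance trade-off above can be shown with a toy simulation: two 80-shot diets with identical expected points per shot, where the three-heavy diet swings noticeably more from game to game. The shot counts and percentages are hypothetical, chosen only so the expected value matches:

```python
import numpy as np

rng = np.random.default_rng(2)
games, shots = 10_000, 80

# Same expected points (1.0 per shot), different shot values
twos   = 2 * rng.binomial(shots, 0.50, games)   # 80 twos at 50%
threes = 3 * rng.binomial(shots, 1/3, games)    # 80 threes at 33.3%

print(f"twos:   mean {twos.mean():.1f}, std {twos.std():.1f}")
print(f"threes: mean {threes.mean():.1f}, std {threes.std():.1f}")
```

Both diets average the same points, but the three-point team's game-to-game standard deviation is roughly 40% larger, which is the extra variance a spread attack buys along with its efficiency.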

At the other end of the spectrum we find two of the most efficient offenses in the league last season, the Heat and the Thunder, with a smaller relationship between OeFG% and ORTG than the league average. With the tremendous individual scoring prowess and athletic gifts of players like LeBron James, Dwyane Wade, Kevin Durant and Russell Westbrook, these teams can afford to miss a few shots here and there, making up for it by leveraging turnovers into transition opportunities and piling up trips to the free throw line. It’s also interesting that the teams we think of as relying heavily on isolation (the Heat and Thunder, but also the Nets, Lakers and Bucks) all ended up on the bottom half of this scale, regardless of how efficient their overall offenses were.

If anything jumps out at you that I’ve missed, please let us know in the comments. We’ll be back later in the week to look at some more of these team-level relationships.