Adjusting Expected Points Per Shot
We’re proud to present this guest post from Ben Dowsett. Ben is a stats whiz and regular contributor at Salt City Hoops. You can also find him on Twitter, @Ben_Dowsett.
Like most progressive advanced metrics, Expected Points per Shot helps paint a distinct portion of the overall analytics picture. As its name suggests, unlike many measures that focus primarily on raw accumulation and percentages per time period or number of possessions, XPPS centers around expected values given league averages. The differential between XPPS and actual points generated per shot attempt is one of the finest numerical quantifications of shot-making skill on both an individual and team level.
But the nature of the data – again, based on expected values rather than concrete results – raises one particular question for me: what sort of increased value could we add by introducing various levels of actual results into the expected data as controlled variables?
The original idea focused on controlling the results for free-throw percentage; that is, if we alter the XPPS formula to reflect each individual player’s actual free-throw percentage rather than the standard league modifier of 1.511 (league average points generated per two-shot trip to the line), what would happen? When discussing the idea with Hickory-High boss man Ian, however, he took the idea one step further – what if we did this same sort of controlled variation for each of the five shot locations plus my original free-throw adjustment? We could create six different variations of the data and see which areas showed the largest effect on overall XPPS, along with all sorts of other fun trend-spotting. A full-on nerd-out? You don’t need to ask me twice. The ever-wise wizard Joe Kolassa and I got started right away.
A brief breakdown of our method, for those interested: as explained on the XPPS page in greater detail, the metric is calculated using a standard point-per-attempt value for each location. To test each individual variable (shot area), we kept these modifiers intact for every area but the control variable, which we altered to instead calculate each player’s actual points generated per shot from these areas for this season (this is an important distinction; the standard XPPS modifiers are calculated for league averages dating back to the 2000-01 season, but that type of long-term analysis would be very difficult given the varying ages and years in the league for different players, so we used only the current season). A quick example: LeBron James has attempted 217 shots from within the Restricted Area this season, and has made 173 of them, for a total of 346 points generated. So for our RA variation, instead of using the standard modifier of 1.183, in LeBron’s case we would use his actual points generated per shot, which ends up at 1.648. We did this same analysis for all five shot locations, as well as for free-throw points generated, and there were certainly some interesting results.
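For anyone who wants to follow the arithmetic, here is a minimal Python sketch of the adjustment. It assumes XPPS is an attempt-weighted average of per-location point values (the XPPS page has the precise definition), and it ignores free throws entirely. The function names, the two-bucket shot distribution, and the 1.0 value for the "other" bucket are hypothetical placeholders, not real league modifiers; only the 1.183 RA modifier and LeBron's 217 attempts and 346 points come from the text above. With exactly those counts the ratio comes out near 1.59, so the 1.648 quoted above may rest on slightly different underlying data.

```python
# Sketch of a location-adjusted XPPS.
# ASSUMPTION: XPPS is an attempt-weighted average of per-location values.

RA_MODIFIER = 1.183  # standard Restricted Area value quoted in the text


def actual_pps(points, attempts):
    """Actual points generated per shot from one location."""
    return points / attempts


def xpps(attempts, modifiers):
    """Attempt-weighted average of per-location expected point values."""
    total = sum(attempts.values())
    return sum(attempts[loc] * modifiers[loc] for loc in attempts) / total


def adjusted_xpps(attempts, modifiers, location, player_pps):
    """Swap one location's league modifier for the player's own PPS."""
    swapped = dict(modifiers)
    swapped[location] = player_pps
    return xpps(attempts, swapped)


# LeBron's Restricted Area numbers from the text: 217 attempts, 346 points.
lebron_ra_pps = actual_pps(346, 217)

# HYPOTHETICAL shot profile: "other" lumps every non-RA shot together,
# and its 1.0 value is a placeholder rather than a real modifier.
attempts = {"RA": 217, "other": 400}
modifiers = {"RA": RA_MODIFIER, "other": 1.0}

base = xpps(attempts, modifiers)
adjusted = adjusted_xpps(attempts, modifiers, "RA", lebron_ra_pps)

print(f"RA PPS: {lebron_ra_pps:.3f}")
print(f"XPPS change from RA adjustment: {adjusted - base:+.3f}")
```

The size of the change is simply the player's share of attempts from that location multiplied by the gap between his actual PPS and the league modifier, which is why volume and efficiency both matter.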
We decided to limit the scope to rotation players only (15 mins+/game, 13% minimum usage, played in roughly half their team’s games) for sanity’s sake and to reduce variance somewhat. The rough expectation going in was relatively simple: players who shoot high volume or high percentages from a particular area will likely see large variations from that area, but players who shoot both high volume and high percentages should, of course, see the largest upticks in their overall XPPS. This assumption mostly held true, with some interesting caveats here and there.
For our adjustment in the Restricted Area, it’s not a huge surprise that the 6’8”, 250-pound semi-truck who also happens to finish at about 80% at the rim tops the charts. LeBron’s XPPS spikes up an extra 0.152 when calculating using his actual RA PPS, to a 1.259 number that’s just plain unfair for a high-volume shooter. And while wings are scarce among the top 25 for this category (predictably populated mostly by low-block presences at PF and C), the others beyond King James all fit the bill perfectly – Manu Ginobili, Dwyane Wade, and Kevin Durant are all elite rim-attacking wings, and their RA-Adjusted XPPS reflects it. Likewise, while seeing a name like Jeremy Lin might be a surprise to some, SportVU data backs up his inclusion – he’s one of the most efficient and most frequent drivers to the hoop in the league.
There’s not nearly as much variation when moving to the In-the-Paint-Non RA adjustment (more on larger trends like this in a bit), and there’s a more even distribution among positions. Jonas Valanciunas leads the category with a boost of 0.076 to his overall XPPS, and several other guys with above-average touch for their positions show up as well – Chris Paul and Dwyane Wade (again) are both noticeable, along with crafty bigs like Dirk Nowitzki and Greg Monroe.
Not surprisingly, the Mid-Range adjustment is chock-full of “elbow” guys – Luis Scola tops the category (adding 0.111 PPS to his overall total), with players like LaMarcus Aldridge, Serge Ibaka, Al Horford and David West not far behind. But like RA-Adjusted XPPS, the guys near the top who don’t fit this trend (Arron Afflalo, Steph Curry, Kyle Korver, JJ Redick) represent the absolute cream of the crop for shooting at their positions.
Again, it shouldn’t come as a shock to anyone that the vast majority of players on the top of the list for both types of three-point attempts are elite distance shooters. Jose Calderon gains an added 0.174 PPS when using his actual Above-the-Break-3 numbers, the largest single-area increase for any player among all six variations. Curiously, the largest boosts in the Corner-3 adjustment don’t come anywhere close to those for ATB-3 – eight different players show an increase of at least a tenth of a point for the latter while not a single one does so from the corners. Volume likely plays a large role in this (teams shoot anywhere from double to triple as many ATB threes as they do corner threes), as does the fact that the standard modifier for corner threes (1.153) is already quite high given the value of these shots, mitigating the impact of elite shooting from here somewhat.
Unfortunately, while making our adjustment for free-throws had been my initial idea, the results for this category ended up being close to pointless, at least compared to all other areas. No player showed an XPPS increase of more than 0.014, and while the top of the list certainly did include high-volume, high-percentage free-throw shooters (Dirk, KD, Harden), the boosts just aren’t enough to consider particularly relevant. But our overall expectations were mostly met; with a few surprises in each area (the biggest: Andre Iguodala somehow appearing in the top-25 for Mid-Range-Adjusted XPPS after shooting 31% from there last season and 33% in 11-12), high-volume and high-efficiency guys saw the largest swings, while those who fit both descriptions tended to show the highest increases to their XPPS. Conversely, then, the guys who lowered their XPPS the most within certain ranges weren’t the ones who simply didn’t shoot from those areas, but rather the ones who continued to jack shots even with mounting evidence they couldn’t make them – I expected to see both Josh Smith and Gerald Wallace near the basement for at least one three-point category, and was not disappointed.
While individual player trends illuminate the studs and duds from the respective areas, the statistician in me couldn’t resist a look at some larger tendencies. This sort of data lends itself quite nicely to linear regressions and an examination of the results, provided the researcher has a keen eye for overlap and differences in variable measurement – I’d like to think I’m up to the task. Briefly, for those not familiar with linear regressions (longer description here): they’re a simple way of plotting two sets of data points on an x and y axis to determine the quantitative relationship between the variables. This relationship is expressed in multiple forms, but the easiest to understand and most frequently used is R-squared, or the coefficient of determination (in a simple two-variable regression, the square of the correlation coefficient). It can be anywhere from 0 to 1, with values closer to 1 indicating a stronger linear relationship and values closer to 0 a weaker one. For example, Effective Field-Goal % and True Shooting % are very similar metrics, with the only difference being that TS% accounts for the value of free-throws and eFG% does not. A linear regression between the two shows an R-squared value of 0.852, a very high value which we would of course expect given the similarities between the two measures. Keeping that in mind (and also, remember that these regressions are still only for qualified players, not the entire league), a few of the more interesting trends I noticed for our XPPS data:
- I’ll start with one trend that isn’t specific to the wizard Joe’s adjustments, but rather showcases the overall effectiveness of Hickory-High’s PPS data. As I mentioned in the opening, any metric like this can only tell so much of the overall story of what’s happening on the court. While comprehensive within its area, XPPS data obviously can’t in any way account for a player’s rebounding, passing, defense, or any of a number of other variables. That said, I decided to measure XPPS and its variants against Win Shares per 48 minutes, widely considered one of the more complete overall metrics judging player performance. And for an incomplete statistic, XPPS sure held its own: for differential between XPPS and actual points-per-shot (essentially a player’s shot-making ability above league average), an R-squared value of .413 may not seem astronomical on the surface, but considering all the other completely unrelated statistics that go into calculating WS/48 it is quite large. When using only APPS, the correlation rises all the way to .557. They’re not completely similar, of course, but the fact that eFG% shows a correlation of about .390 with WS/48 is telling, as eFG%, like XPPS variants, is a shooting-only metric. XPPS also compares favorably with a more unrelated measure, Net Point Differential (NetRtg), which posts a correlation of .363 with WS/48.
- The subconscious mind would likely associate the categories that showed the largest effect on overall XPPS (Restricted Area and Above-the-Break-3) with the largest correlation to XPPS in a linear regression…but this would, in fact, be backwards. When you pause to think about it, it makes complete sense; since these areas’ adjustments cause more variation from the original data, it stands to reason that the actual plotted data points would differ more (not correlate, in other words). For this reason, RA-Adjusted XPPS (.635) and ATB3-Adjusted XPPS (.591) have the two lowest correlations with original XPPS data.
- Conversely, In-the-Paint Non-RA and Corner-3 are the two adjusted areas that show the highest correlations with standard XPPS, both rounding to an R-squared value of .832. These are the two lowest-volume areas league-wide, so it makes sense that they’d match up most closely with the original values. Mid-Range-Adjusted XPPS actually groups in more closely with RA and ATB3, which is also interesting; it’ll be intriguing to check back in a year or two and see if this remains the case as the league trends toward cutting down mid-range jumpers and maximizing threes and layups.
- One last general regression-related bit: it’s not necessarily surprising, but it’s interesting to note how little XPPS and APPS correlate. A .164 R-squared value is even lower than I’d have anticipated, and it crystallizes the idea any knowledgeable basketball person will tell you: talent is still paramount to all else. For all our tweaks and calculations, there’s a reason why LeBron, Durant and the like seem to appear on the leaderboards for every variation we can come up with – the dudes can make shots, and they can do so way more effectively than the vast majority of their peers.
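For the curious, the regression logic in the bullets above can be sketched in a few lines of Python. The `r_squared` helper computes R-squared for a simple two-variable regression, and the rest illustrates why an adjustment that moves values a lot ends up less correlated with the original data than one that barely moves them. Everything here is synthetic: the 200 "players", the seed, and the spreads (0.05, 0.01, 0.10) are arbitrary assumptions chosen for illustration, not estimates from the actual XPPS data.

```python
import random


def r_squared(xs, ys):
    """R-squared of a simple linear regression of ys on xs.

    With a single predictor this equals the squared Pearson correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable

# SYNTHETIC stand-in for standard XPPS across 200 made-up players.
base = [rng.gauss(1.05, 0.05) for _ in range(200)]

# A low-impact adjustment nudges each value slightly;
# a high-impact adjustment moves each value much further.
small_adj = [b + rng.gauss(0, 0.01) for b in base]
large_adj = [b + rng.gauss(0, 0.10) for b in base]

r2_small = r_squared(base, small_adj)
r2_large = r_squared(base, large_adj)

print(f"small adjustment R^2: {r2_small:.3f}")
print(f"large adjustment R^2: {r2_large:.3f}")
```

The low-impact version should land much closer to 1 than the high-impact version, which mirrors the pattern of the low-volume areas tracking standard XPPS more tightly than Restricted Area and Above-the-Break-3 do.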
While none of these tweaks to XPPS are likely revolutionary by any means, they’re a fun way to add in a little more specificity to a unique metric in the analytics community. They’re also a simpler and more complete way of analyzing not only which guys shoot the best or the most from which locations, but which guys are helping their teams when they shoot and which guys aren’t. A big thanks to Ian and Hickory-High for allowing the wizard and me to play around with their data and make a small contribution, and a happy 2014 to you all.