Playing With The New Toys: Digging Further Into SportVU
Special Thanks to Darryl Blackport of BackboardBlues for his continuing help interpreting and visualizing much of the data herein. Also thanks to Brian Kopp of Stats, Inc. for taking the time to speak with me at some length about how Stats, Inc. is working with both the NBA’s media arms and the individual franchises to improve the SportVU system itself, and equally importantly, to bring as much new information as possible to the NBA-following public.
Recently, we took a deep dive into some of the stats available in NBA.com’s game-by-game player tracking boxscores. However, the game-by-game stats tell only a tiny portion of the story of how the publicly available interpretations of SportVU data* can illuminate player and team performance. (The public data itself is only the tip of the iceberg; for example, see this breakdown of pick-and-roll effectiveness or this one.) First of all, single games represent a tiny sample size, and separating the noise of luck from the signal of skill is virtually impossible over such a small sample. More importantly, the data captured on a season-long basis is much richer than that provided in the game boxscores. For example, the “assist chance” is not included for each game; as will be discussed below, this data allows for some highly informative looks at offense and shot creation, far beyond what standard box score stats and even play-by-play data allow.
* In actuality, the information presented on NBA.com and elsewhere gleaned from the SportVU system has already been heavily refined from the raw data. This raw data is contained in a database with entries representing the positions of players and the ball several times every second of an NBA game, and requires a fair amount of computing power to process into a form accessible and understandable in more familiar basketball terms. For ease of discussion, the filtered output which results in the columns of statistics seen on NBA.com will be referred to as “SportVU data” or “statistics” for the remainder of this article.
This isn’t to say that all the data presented under the “player tracking” tab on NBA.com represents information useful in determining the effectiveness of a particular player. To quote a presenter at the recently concluded Sloan conference (himself quoting Albert Einstein), “not everything that can be counted counts and not everything that counts can be counted.” Or, more directly, to quote Stan Van Gundy, who upon being informed that Paul George leads the league in distance traveled asked, “of what possible use is this information?” Kopp largely agrees with this assessment, but cautions that though in a vacuum the distance traveled and average speed categories presented might not mean much, by digging a little deeper there is great potential for analysis in terms of training, conditioning and injury prevention by analyzing the level of exertion of a player throughout a game or series of games.
Another SportVU-derived stat which gets cited with some regularity is “points per touch”. While this number on the surface appears to indicate something about the efficiency with which a given player scores, it is actually far more a reflection of how likely a player is to shoot than anything else. Nick Young has an exceptionally high points per touch because Nick Young shoots all balls; his overall efficiency is slightly above league average. But if you never pass, you are going to score on a high portion of your touches because you miss 100% of the shots you don’t take. This is not intended as an indictment of the SportVU system nor even the presentation of this data point, but more as a caution that interpretation of the data is in many ways more important than the data itself. An intriguing effort at contextualizing performance based on touches from Jared Dubin can be found here.
Much like Dubin’s analysis, many of the full season “Player Tracking” stats found on NBA.com can, with a bit of manipulation, tell us a great deal about how certain teams and individual players go about making their contributions on both ends of the court.
As an example of some new, evaluatively useful information presented as a result of the system, SportVU has provided much more detail about shot types than has been available in the past. The Shooting Efficiency tab, and especially the more detailed “Catch and Shoot” and “Pull Up” tabs, give a great deal more information about players’ shooting abilities in different situations than has ever been available previously. It has been well-accepted that most players shoot better in catch-and-shoot situations than off the dribble, and this data goes a long way toward quantifying the difference. To take a favored example, Kyle Korver is a ridiculously deadly catch-and-shoot player. He shoots a patently absurd 51.7 percent on catch-and-shoot threes, best in the league by a good distance. A surprising second is LeBron James at 48.9 percent. Make them put the ball on the floor, and Korver’s percentage falls to a still very respectable 37.0 percent while LeBron falls all the way to 33.0 percent. So running either of them off the three-point line would seem to be a good play, though in the case of LeBron, making him drive leads to an entirely different set of problems!
Innumerable other insights can be gleaned from this shot breakdown data, though it is itself not without some small problems. For example, a rhythm or “escape” dribble into a shot such as this:
is treated the same as a driving pull-up:
While these shots are similar, the strong intuition is that most players would shoot more effectively in the first situation than the second, Stephen Curry’s status as an alien notwithstanding.
While the shot breakdown metrics are somewhat prêt-à-porter in that they are useful as is, interesting information can be gleaned from other sections of the data with a bit of manipulation and interpretation.
Valuing the rim protection provided on the defensive end, especially by power forwards and centers, is a perfect example. Using stats found under the “Defensive Impact” tab (this header is a bit of a misnomer as the data presented best captures “rim protection” rather than the overall influence a player has defensively), and combining them with some more traditional metrics can give us a decent estimation of this important defensive skill. However, this is also one of the areas where citation to SportVU data starts to go awry. Most commentators simply cite the “opponent’s field goal %” as the defining statistic on a player’s ability to defend the rim, when that stat tells only half the story. Even a relatively poor “rim protector” is much better than no one contesting the shot at all.
Manipulating the Data
More specifically, an uncontested shot taken less than 5 feet from the rim is converted around 75-80 percent of the time. Shots contested by any player are converted at a rate in the mid 50s, whereas shots contested by a “big” are made at a rate of just under 50 percent. Among big men, the best at holding opponents to a low FG% at the rim, such as Roy Hibbert, achieve percentages in the low 40s. At the same time, the worst in this regard, such as Thad Young, give up around 60 percent at the rim. The gap between the best and worst is comparable to the gap between an uncontested shot and one taken over a challenge from the weakest rim protecting bigs. Thus, just being big and being there has tremendous value! To put it another way, the number of contests at the rim is almost as important as the percentage allowed.
Though somewhat unhelpfully labelled as “Opponent FGA at the Rim Per Game”, the public SportVU data does capture this number of contests. We can thus estimate the “points saved” by a player’s rim protection. If we use (for ease of calculation) an estimate of a 75 percent conversion rate on uncontested shots under 5 feet, we know that these shots are worth about 1.5 points each. Take a made-up example of a player who contests 4 shots per game, and on those shots the opponent scores 50% of the time. Simply subtracting the actual points scored (4) from the expected points scored (6), we can estimate that this player prevented 2 points.
To spell out the math:
Expected points = FGA * .75 (uncontested shooting percentage) * 2 (points per FG)
Actual points = FGA * .5 (actual shooting percentage versus player’s contests) * 2 (points per FG)
Points saved = Expected points - Actual points
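The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the formulas above: the function and parameter names are invented for this example, and the 75 percent uncontested rate is the rounded estimate used in the text rather than a measured value.

```python
# Rounded estimate from the text: uncontested shots within 5 feet
# are converted about 75 percent of the time.
UNCONTESTED_FG_PCT = 0.75
POINTS_PER_FG = 2


def points_saved(contested_fga, opp_fg_pct, uncontested_fg_pct=UNCONTESTED_FG_PCT):
    """Estimate points prevented on the shots a player contests at the rim."""
    expected = contested_fga * uncontested_fg_pct * POINTS_PER_FG
    actual = contested_fga * opp_fg_pct * POINTS_PER_FG
    return expected - actual


# The made-up player from the text: 4 contests per game, opponents shoot 50%.
print(points_saved(4, 0.50))  # 2.0
```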
This gives us a raw number of “points saved”, but we’re not quite done yet. First of all, how do we tell what is a good or bad number? Second, it’s not totally realistic to simply measure the player’s contribution against what would happen if we simply digitally removed him from the court. A solution to both problems is to estimate what a hypothetical “league average” player would have done given the minutes a player has played. By aggregating the defensive impact data from players leaguewide, it appears that an “average” big man would contest just over 8 shots per 36 minutes played and would allow just under 50% conversion rate versus his contests. Using the formula shown above this means the expected rim protection value of a big man is just over 4 pts/36. This sets an appropriate baseline against which to judge whether a player is actually making his team better or worse.
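As a rough sketch of that baseline comparison, the snippet below uses the rounded league-average figures quoted above (8 contests per 36 minutes at 50 percent). The function names are hypothetical, the way per-36 values are converted to per-game values is my assumption about the simplest reasonable approach, and no adjustment for team pace or opponent rim attempts is included.

```python
UNCONTESTED_FG_PCT = 0.75  # rounded estimate from the text
POINTS_PER_FG = 2

# Approximate league-average big man, rounded from the figures in the text.
AVG_CONTESTS_PER_36 = 8.0
AVG_OPP_FG_PCT = 0.50


def saved_per_36(contests_per_36, opp_fg_pct):
    """Raw points saved per 36 minutes on contested rim attempts."""
    return contests_per_36 * (UNCONTESTED_FG_PCT - opp_fg_pct) * POINTS_PER_FG


def saved_over_average_per_36(contests_per_36, opp_fg_pct):
    """Points saved per 36 minutes relative to the average big man."""
    baseline = saved_per_36(AVG_CONTESTS_PER_36, AVG_OPP_FG_PCT)
    return saved_per_36(contests_per_36, opp_fg_pct) - baseline


def per_game(value_per_36, minutes_per_game):
    """Scale a per-36-minute value to a player's actual minutes per game."""
    return value_per_36 * minutes_per_game / 36.0


# Baseline value of the hypothetical average big man.
print(saved_per_36(AVG_CONTESTS_PER_36, AVG_OPP_FG_PCT))  # 4.0
```

A player who contests more shots than average, or holds opponents to a lower percentage on those contests, comes out positive in `saved_over_average_per_36`; a player who does neither comes out negative.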
There is also a final adjustment to account for players on teams which, through pace, defensive performance or a combination of both, allow relatively more or fewer shots at the rim. A big man on the 76ers has more opportunities per minute to defend a shot at the rim than a big man on the Pacers, as Philly is terrible defensively while playing at an extremely elevated pace, and it would distort the results not to account for these increased opportunities to “save” points. In the results listed below, this adjustment was based on a comparison of the number of opponent attempts at the rim while each player was on the floor (compiled here), though using team data from NBA.com is a close enough approximation (on a per-minute basis, very few players in the league have been on the court for a disparity in opponents’ rim attempts greater than about 1.5 shots from their team’s per-48-minute average).
Rim Protection Values
Examining the whole rim protection story leads to some intriguing results. Unsurprisingly, through the All-Star Break, Roy Hibbert was by a decent margin the most valuable rim protector in the league, saving over 4 points per game more than the average big man. Hibbert not only holds opponents to a low percentage (41.3 percent) but also leads the league in the percentage of opponents’ shots at the rim he contests, at almost 60 percent. By comparison, the average big man contests around 38 percent of opponents’ short shots.
And in fact, this contest% shows how incomplete mere reference to opponent field goal percentage can be. To take one notable example, Anthony Davis, despite leading the league in blocks, is not actually an especially effective rim protector. Though he does hold opponents to around 45 percent shooting at the rim, he only contests around 28 percent of opponents’ rim attempts, on par with noted help stalwarts Carlos Boozer, Andrea Bargnani and Boris Diaw. Therefore, despite his lofty block totals and good ability to force misses when he’s present, his relative inability to contest shots renders him below average in this metric, costing New Orleans about .5 pts/gm. Below is a table listing the ten best and ten worst rim protecting big men so far this season on a per game basis:
The columns from left to right: Games played; Minutes per game; Opponent’s FG% at the rim on shots contested; Contests per 36 minutes – to define how well a player performs better or worse than average on this metric, production had to be translated into a per minute-based rate; contest% is the ratio of opponent close shots while the player is on the court which the player contests; Saved per 36 over average is a minute-based comparison of the points each player has “saved” with their rim presence compared with what the hypothetical league average big would have accomplished; and saved per game is that same number presented on a per game basis.
Of course, rim protection is not the sum total of a player’s (even a big man’s) defensive contributions. How much of Davis’ relative dearth of contests is a result of New Orleans’ schemes, which often have him aggressively trapping the pick-and-roll, and how much is simply Davis having just turned 21 and not yet being able to rotate quite as quickly as the NBA game demands? Unfortunately, the public data can’t tell us that yet. But with more seasons of data perhaps some light will be shed.
(I’ve written extensively about measuring rim protection value previously here and linked therein, with leaguewide statistics presented here.)
Playmaking, Offensive Roles, and Turnover Rates
While most of the data needed to answer some questions about rim protection are available either in the player tracking data or elsewhere on NBA.com, some selections can provide some answers while raising some new and interesting questions.
One of the long-standing mysteries in statistical analysis is the value of playmaking. Various box-score based metrics have wildly differing views on the value of an assist. After all, who should get credit between the guy who made the basket or the one who made the opportunity? The answer varies a great deal by shot attempt.
However, the SportVU data has begun to shed some light on the value, broadly speaking, of playmaking. SportVU measures “potential assists”, that is to say passes that lead to shots where, had the shot been made, the passer would have been credited with an assist. (An interesting note on this front: in discussing the SportVU system with Kopp of Stats, Inc., he reminded me there is no real “official” definition of what should be an assist, so the SportVU team had to come up with some guidelines on their own as to what would and would not count as a “potential assist” in the optical tracking data. He estimated that about 5 percent of awarded assists fall outside the parameters they selected. Which isn’t to say those are all illegitimately awarded assists; for example, a Kevin Love outlet pass leading to a player taking three dribbles and dunking on a breakaway should be an assist, but would not be captured in the SportVU definition.) Assisted shots tend to be more open:
As noted in the discussion of “rim protection”, contested shots are more difficult from any area of the court. Therefore, generating uncontested looks in general leads to better offense:
As an aside, the teams in the lower right corner, Oklahoma City, Houston, Minnesota and Toronto, don’t necessarily disprove the positive effect of creating for others, as those teams are among the league leaders in another important source of easy points: drawing fouls.
Since creating shots through passing is demonstrably valuable, determining who is most effective at doing so is also useful. Jared Dubin has manipulated this data to come up with some insights about who appear to be the better passers in the league, with the names being exactly who one might suspect – the top point guards and LeBron James.
Going the next step, how can we conceptualize offensive creation ability as not simply the ability to take shots (as expressed by usage rate) and passing ability (either through assist rate, or by reference to Dubin’s analysis above)?
Well, a combination of SportVU-generated stats and traditional box score numbers allows an estimation of a player’s total offensive role. I’ve named the metric “True Usage” as it is intended to be an addition to or replacement for traditional usage rate. Essentially, True Usage combines a player’s shots, turnovers and assist chances into one number to estimate the percentage of his team’s possessions with which the player is directly involved. Unsurprisingly, it’s a metric dominated by point guards:
Of the players with the top 24 TrueUsage rates, 20 are pure point guards, two (Brandon Knight and Tyreke Evans) are combo guards with significant ball-handling and playmaking responsibilities and two are LeBron and Kevin Durant. The second column for each player is a measure of “True TOV%”, which better represents a player’s propensity to turn the ball over (or lack thereof) than the more commonly used TOV%. Turnover percentage has always tended to overcredit high volume shooters relative to pass-first players, in that the stat is calculated by simply dividing turnovers by the total of “shooting possessions” and turnovers. Thus the only way a player could “lower” his turnover rate was to shoot the ball. This led to a situation in which the best point guards in the league often had some of the higher turnover rates. As these players became point guards in part because they took care of the ball well, a metric which purports to measure the ability to avoid turnovers but scores this class of players poorly is a metric of questionable validity.
Which leads to “True TOV%”, which accounts for a player’s turnovers as a percentage of shots and assist chances. Thus players like Chris Paul or Mike Conley, who take on a huge offensive burden in terms of both shooting and distributing the ball, are not overly penalized and in fact are shown to be excellent at taking care of the ball – both have sub 6% TrueTOV%, whereas the league average is around 8.9% and the league average just for point guards is around 8.2% (a benefit of calculating turnover rate this way is that players playing “bigger” positions on the 1 to 5 scale tend to turn the ball over a bit more, relatively speaking, which comports with even a casual understanding of basketball, as bigger players tend to be worse at dribbling in traffic, catching the ball on the move, and passing).
TrueTOV% also sheds light on the value of some “marginally efficient” cornerstone offensive players like LaMarcus Aldridge or Al Jefferson, who make up for middling shooting efficiency by soaking up a huge number of possessions while turning the ball over extremely rarely (both have TrueTOV% in the 5.6-5.7% range, much lower than the 9.7% average for big men).
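To make the difference between the two turnover rates concrete, here is a minimal Python sketch. The traditional TOV% follows the description above, with the conventional 0.44 weight on free throw attempts standing in for “shooting possessions”. The exact denominators for True TOV% and True Usage are my assumptions for illustration, and the two stat lines are invented, not real players.

```python
FT_POSSESSION_WEIGHT = 0.44  # conventional share of free throws that end a possession


def shooting_possessions(fga, fta):
    return fga + FT_POSSESSION_WEIGHT * fta


def traditional_tov_pct(fga, fta, tov):
    """Turnovers as a share of shooting possessions plus turnovers."""
    return 100.0 * tov / (shooting_possessions(fga, fta) + tov)


def true_tov_pct(fga, fta, tov, assist_chances):
    """Assumed form: turnovers as a share of shots, assist chances and turnovers."""
    involved = shooting_possessions(fga, fta) + assist_chances + tov
    return 100.0 * tov / involved


def true_usage(fga, fta, tov, assist_chances, team_possessions_on_floor):
    """Assumed form: share of team possessions ending in the player's
    shot, assist chance or turnover."""
    involved = shooting_possessions(fga, fta) + assist_chances + tov
    return 100.0 * involved / team_possessions_on_floor


# Hypothetical per-game lines: same turnovers, very different roles.
scorer = dict(fga=20, fta=5, tov=3, assist_chances=4)
passer = dict(fga=10, fta=5, tov=3, assist_chances=18)

for name, line in (("scorer", scorer), ("passer", passer)):
    trad = traditional_tov_pct(line["fga"], line["fta"], line["tov"])
    true_rate = true_tov_pct(**line)
    print(f"{name}: TOV% {trad:.1f}, True TOV% {true_rate:.1f}")
```

With identical turnover counts, the pass-first line looks much worse than the scorer under the traditional rate, while the two are far closer once assist chances enter the denominator.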
However, the SportVU data as yet doesn’t allow a total analog to the “Usage/Efficiency” curve traditionally seen in shooting. The problem is that there is as yet no especially good measure of “passing efficiency”. Following up the discussion of individual “player tracking” box scores from last week, one piece of data has been omitted from the individual game box score: the “assist chance”. The season-long data strongly suggests an overestimation of the efficiency of passing, perhaps with the “hometown assist” acting as a sort of one-way ratchet upwards, in that shots which might result in the award of a questionable assist are only counted as an assist chance if the shot goes in.
Examination of the season-long “playmaking” data discussed above somewhat confirms this theory, as the implied shooting percentages on “potential assist” plays are far higher than the actual shooting percentages for most players, increasing by more than what more traditional measures of “assisted shooting efficiency” might suggest. Further, the data implies approximately 1.75 players are “involved”, either shooting or potentially assisting, on each play. For a number of reasons, this seems high and inconsistent with other data, as league wide, approximately 1 in 4 made baskets is assisted, and the number of “secondary assists” awarded by the cameras is extremely low by comparison to overall assist totals (Chris Paul currently leads the league with 2.2 “hockey assists” per game). If the data concerning the ratio between points and assist chances is suspect, the estimation of passing efficiency is difficult if not impossible. So even though looking at overall offensive contributions including both shooting and passing can advance the understanding of relative contributions of players, there are important questions still to be asked.
(I’ve also written extensively about measuring “TrueUsage” starting here and linked therein, with leaguewide statistics through the All-Star Break available here.)
One area where the best way to interpret this data is an open question is rebounding. The research paper awarded top prize at the recent Sloan Sports Analytics Conference involved a much deeper look into separating rebounding into component skills. However, that paper involved analysis and manipulation of the raw data, and even this deep dive represents initial research towards establishing a framework within which to examine rebounding.
The snapshot given on NBA.com is a much smaller slice of already processed statistics which provides some intriguing and conversation-starting data points on what remains in many ways a mysterious art. Moving from these data points to actionable information for evaluating player and team performance is a sticking point. For example, many people (myself most definitely included) have often criticized Kevin Love’s defense by suggesting he is more worried about pumping up his rebounding stats than actually helping to force a miss in the first place. There is certainly some visual evidence of this, as Love is visibly reluctant to leave the basket area on defense, though how much of this is scheme and an acknowledgement of his relative lack of lateral quickness is unknown. He is also critiqued for “stealing” all the gimme defensive rebounds the opposition simply concedes.
These lines of attack are probably slightly unfair. The Timberwolves have a very conservative scheme with their bigs on defense because their primary big men, Love and Nikola Pekovic, are not well equipped for an aggressive hedging scheme. And to the extent Love does grab a bunch of “easy rebounds”, defensive rebounding is still a vital component of team defense. As approximately 173% of all AAU, middle school and high school coaches like to repeat ad nauseam, defense doesn’t end until we get the rebound.
However, there is still some support for the narrative critiques of Love within the public SportVU numbers – Love leads the league in “rebound opportunities per game” and is 3rd in the league in uncontested rebounds, both data points which could support the charges of “basket-hanging.” On the other hand, he is also third in the league in contested rebounds and is well above average in terms of securing the rebounds he pursues – of the 24 players averaging 15 or more “chances” per game, Love ranks a very respectable 8th in rebounding efficiency. So is he grubbing for stats or relentlessly crashing the glass? This being the first season of the data, it’s hard to draw conclusions. Unlike Rim Protection, where league averages can be calculated and translated into a unit of measurement (points saved in that case) which could be used as a measuring stick for individual players, there doesn’t appear to be an intuitive baseline for comparison for what represents “good” or “bad” in terms of rebounding. Are rebounding chances more important, or does rebounding efficiently suggest more value? How dependent on teammate and opponent skillsets are the answers to those two questions? We simply don’t know yet. But as with many topics, we now know much more about what we don’t know than we did before SportVU.