
What To Trust In Small Sample Size Theater

It’s an annual rite of November. With only around 10 games played by each team, it seems that there’s just enough data to start drawing conclusions. But not all stats are created equal; some numbers from the first 10 games can be trusted and others cannot.

I dug into data from NBA.com, comparing how teams did in a few different metrics over the first ten games of the season with how they did in those same metrics over the whole season.
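One common way to put a number on this kind of comparison is to correlate each team’s first-ten-games value with its full-season value: if the correlation is r, an early number would be expected to regress roughly 1 - r of the way toward the league average. The sketch below assumes that approach; the article does not say exactly how its percentages were computed, and the function name and sample numbers are hypothetical.

```python
from statistics import mean


def regression_to_mean_share(early, full):
    """Return 1 - r, where r is the Pearson correlation between
    each team's early-season value and its full-season value."""
    if len(early) != len(full) or len(early) < 2:
        raise ValueError("need two equal-length lists with at least 2 teams")

    mean_early, mean_full = mean(early), mean(full)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_early) * (y - mean_full) for x, y in zip(early, full))
    sxx = sum((x - mean_early) ** 2 for x in early)
    syy = sum((y - mean_full) ** 2 for y in full)
    if sxx == 0 or syy == 0:
        raise ValueError("values must vary across teams")

    r = sxy / (sxx * syy) ** 0.5
    return 1 - r


# Hypothetical corner-three attempts per game for five teams
# (made-up numbers, not NBA.com data).
first_ten = [5.1, 6.8, 4.2, 7.5, 5.9]
full_season = [5.4, 6.5, 4.6, 7.1, 6.0]
print(f"share regressing to the mean: {regression_to_mean_share(first_ten, full_season):.0%}")
```

One caveat with this sketch: the full-season figure includes the first ten games, which nudges the correlation upward. Comparing the first ten games with the remaining games would be a stricter test.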

The clearest thing here is that shot selection is incredibly consistent while shot-making ability isn’t. The number of corner three attempts a team takes per game will regress to the mean only ~15% after 10 games, while how a team shoots from the corner over the first 10 games is over 55% luck. This also gives a bit of a proxy for which shots are the highest variance. Three-pointers from both spots on the floor have huge luck components over the first 10 games, but field goal percentage in the restricted area over the first 10 can be a bit more representative of the team’s actual ability.
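Part of why shooting percentages are so noisy early is simple arithmetic: a percentage built from few attempts has a wide margin of error. Here is a rough sketch, treating every shot as an independent make-or-miss at a fixed true rate (a simplifying assumption) and using hypothetical attempt counts rather than real team data.

```python
import math


def shooting_pct_standard_error(true_pct, attempts):
    """Standard error of an observed shooting percentage, assuming each
    attempt is an independent make/miss with probability true_pct."""
    if not 0 <= true_pct <= 1:
        raise ValueError("true_pct must be between 0 and 1")
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return math.sqrt(true_pct * (1 - true_pct) / attempts)


# Hypothetical: a true 38% corner-three team with 60 attempts
# after ten games versus 500 attempts over a full season.
for attempts in (60, 500):
    se = shooting_pct_standard_error(0.38, attempts)
    print(f"{attempts} attempts: +/- {se:.1%}")
```

Under these assumptions the margin is roughly six percentage points after 60 attempts and about two after 500, which is why an early hot or cold streak from three says little about a team’s real shooting ability.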

More luck is involved in defensive rebounding percentage and defensive rating than in offensive rebounding percentage and offensive rating. This may indicate that a team can better control its own offensive performance than an opponent’s offensive performance. Good offense beats good defense (or, bad offense is worse than bad defense).

A team’s pace is fairly real after 10 games, likely due to the fact that pace is about play style, not results.

There are conclusions that can be drawn from the first ten games of a season, but it’s tough to use the first ten games to help predict the entire season. Excluding shot selection statistics and pace, every stat tested above was over 25% luck in the first 10 games of the season. This is a far cry from saying that statistics from the first 10 games mean nothing, though. They tell what has happened, and those numbers are important in analyzing the season so far. Their predictive ability is what’s lacking.

  • Clint Peterson

    I have a problem with early season data being construed as “luck.”

    Ten games’ worth of sample size for one team is indeed small within the context of 82-plus in a season. But when we multiply that by 30 teams it becomes considerably larger. And if we look back at, say, 10 years of that same data sample, it’s no longer a small sample size at all.

    We also have to consider the number of possessions in a single NBA game, then multiply that by games and again by teams. Again, the sample size suggesting luck diminishes greatly as the sample size within context has grown. That’s an awful lot of luck, good or bad, more than a bounce of the ball can account for.

    Is this season an aberration? Maybe, but I find that unlikely. And an aberration isn’t luck, necessarily, but more likely an anomaly within the scientific presentation of data, of which luck has very little place aside from psychology theory and random quantum mechanics — the latter based more so on randomness than luck itself.

    If we were to look back at a decade or more of the same early season data I’d expect we’d find very similar numbers within the context of the sample size for this one season. At that point, a pattern emerges, meaning we’ve eliminated luck from the equation entirely and have to find reasonable explanations for why play often tends to suffer early in seasons, of which there are many, none of which can be accounted for by voodoo or serendipity.

    In a lengthy Twitter discussion with Salt City Hoops’ Andy B Larsen on this topic I asked at what point in the season does data become reality and not luck. His answer? “There is no point in which data becomes reality.”

    Are we really willing to cite data, then present it scientifically as merely lucky or unlucky? If so, then the only conclusion we can arrive at by season’s end is that as numbers rise, players and teams got luckier and luckier, not more fluid, more experienced, smarter or better tuned to each other, the game and the court.

    If numbers and data are not reality, then what is?
