OTTAWA – Everyone who attended this weekend’s hockey analytics conference at Carleton University was presumably into fancy stats, but the expertise varied widely. The technical complexity of many of the presentations was such that my notes trying to make sense of it all occasionally devolved to, “MATH???” And at the other end of the scale, there were the guys sitting behind me, who had this exchange just before the first session of the morning:
Dude 1: “I’m just going to title everything ‘Corsi.’ I’ll consider today a success if I learn what PDO stands for.”
Dude 2: “It doesn’t stand for anything. It’s a guy’s name.”
Dude 1: “Really?!” (frustrated sigh)
The analytics community feels at this moment like all grassroots uprisings: small, tight-knit, run on camaraderie rather than competition, animated by the happily defiant knowledge that they’re viewed as upstart outsiders. And, yes, they’ve heard it all—there were a lot of jokes about people who don’t watch the game.
Analytics has become a culture war in hockey: The most strident fancy stats proponents cast traditional hockey media as hidebound, braying meatheads who repeat the same truisms without ever considering whether they hold any, you know, truth. And the most vociferous members of the old guard paint the analytics crew as a bunch of arrogant Poindexters too busy clacking away at their spreadsheets to watch a game, never mind darkening the door of a dressing room.
What the conference revealed is that the black magic of analytics sometimes simply confirms or quantifies what observers of the game already know, and sometimes turns that conventional wisdom on its head. The NHL is on the verge of launching its own flotilla of fancy stats, and even if it weren’t, this is a community motivated enough to track 500 zone entries a game by hand when the data they want doesn’t already exist. Analytics is not going anywhere.
The 2014 off-season was dubbed “the summer of analytics”: a handful of NHL teams hired numbers guys, and it seemed a movement that began a decade ago in a Yahoo! discussion group had really arrived. Rob Vollman, author of Hockey Abstract and one of the godfathers of fancy stats, noted to the capacity crowd of 225 in Ottawa that an event he organized in Edmonton a year ago drew just 20 people. The analytics community continues to flourish online, and many people at the Ottawa conference scrawled their Twitter handles on their name tags and introduced themselves in person to people they already knew online. The whole thing felt a bit like a summer camp reunion. The crowd looked overwhelmingly between the ages of 20 and 40, and almost universally male, to the point that I gave a little involuntary nod of acknowledgement when I made eye contact with one of only four other women I saw in the room (that includes conference organizer Shirley Mills, an associate professor of mathematics and statistics at Carleton).
Throughout the day, there were wry acknowledgements of when and why a fancy stats insight grabbed widespread attention: it starts with “Toronto” and ends with “Leafs.” Sam Ventura, a PhD candidate in statistics at Carnegie Mellon University, had taken shot attempts and weighted them by the probability they would result in a goal, based on distance from the net, shot type and so on, resulting in Expected Goals For (EGF) and Expected Goals Against (EGA). His results adhere to what he called the three laws of hockey analytics: Whatever stat you conjure up, Buffalo has to be so bad as to dangle off the charts by a hilariously terrible margin, Toronto has to fare poorly so the findings generate a lot of hype, and Detroit always lands in the top five.
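For readers who want to see the arithmetic, here is a toy sketch of the weighting idea in Python. It is not Ventura’s model: the distance cut-offs, the probabilities and the shot-type multipliers are all invented for illustration, and the names `Shot`, `goal_probability` and `expected_goals` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    team: str           # team that took the shot attempt
    distance_ft: float  # distance from the net, in feet
    shot_type: str      # e.g. "wrist", "slap", "deflection"


# Invented multipliers for illustration only; a real model would be
# fitted to historical shot data.
TYPE_MULTIPLIER = {"wrist": 1.0, "slap": 0.8, "deflection": 1.5}


def goal_probability(shot: Shot) -> float:
    """Rough chance that a shot attempt becomes a goal (made-up numbers)."""
    if shot.distance_ft < 15:
        base = 0.20
    elif shot.distance_ft < 35:
        base = 0.08
    else:
        base = 0.02
    return min(1.0, base * TYPE_MULTIPLIER.get(shot.shot_type, 1.0))


def expected_goals(shots, team):
    """Return (EGF, EGA) for `team`: the sum of goal probabilities
    on its own attempts and on its opponents' attempts."""
    egf = sum(goal_probability(s) for s in shots if s.team == team)
    ega = sum(goal_probability(s) for s in shots if s.team != team)
    return egf, ega


# Three made-up shot attempts from a made-up game.
sample_shots = [
    Shot("TOR", 10, "wrist"),
    Shot("TOR", 50, "slap"),
    Shot("BUF", 20, "wrist"),
]
egf, ega = expected_goals(sample_shots, "TOR")
```

The point of the exercise is that a close-in attempt counts for far more than a long-range one, so two teams with the same raw shot count can end up with very different expected goals.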
Analytics guys often work ridiculously hard to get the raw data they stuff into their fancy-stats sausage-making machine. David Johnson, the man behind HockeyAnalysis.com, explained how he painstakingly cross-references data available from the NHL to get what he needs. Event summaries provide player names and sweater numbers, which he matches up to the play-by-play sheets that log every event that happens on the ice, and those in turn are mapped onto shift reports so he knows who’s on the ice every time something happens. All this has added up to more than a billion individual stats on his site. But there are weird glitches that make life more difficult: Player names might arbitrarily switch from Matthew to Matt, so his code thinks one player is two different people, and it’s obvious the shift reports are less than accurate when they claim one team opted to play the entire second period without a goalie.
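A rough sketch of that kind of cross-referencing, in Python, might look like the following. This is not Johnson’s code. It assumes players are keyed by team and sweater number rather than by name (so a switch from Matthew to Matt does not create a second player), and that a shift covers an event if the event time falls at or after the shift start and before the shift end. The names `Shift`, `build_roster` and `on_ice`, and the sample player, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shift:
    team: str
    number: int  # sweater number
    start: int   # seconds elapsed in the game when the shift began
    end: int     # seconds elapsed in the game when the shift ended


def build_roster(event_summary_rows):
    """Map (team, sweater number) to one canonical name.

    The first spelling seen wins, so later variants of the same
    player's name are ignored rather than treated as a new player.
    """
    roster = {}
    for team, number, name in event_summary_rows:
        roster.setdefault((team, number), name)
    return roster


def on_ice(shifts, roster, event_time):
    """Names of the players whose shifts cover `event_time`."""
    names = []
    for s in shifts:
        if s.start <= event_time < s.end:
            # Fall back to a placeholder if the number is not in the roster.
            names.append(roster.get((s.team, s.number), f"{s.team} #{s.number}"))
    return sorted(names)


# Made-up data: the same sweater number appears under two spellings.
roster = build_roster([
    ("TOR", 81, "Phil Kessel"),
    ("TOR", 19, "Matthew Sample"),
    ("TOR", 19, "Matt Sample"),
])
shifts = [Shift("TOR", 81, 0, 45), Shift("TOR", 19, 30, 70)]
players_at_40s = on_ice(shifts, roster, 40)
```

Keying on sweater number sidesteps the name glitch but would not catch the other problem Johnson described, a shift report that is simply wrong; that still needs a sanity check, such as flagging any stretch where a team has no goalie listed.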
Alex Diaz-Papkovich, a master’s student at Carleton, added shifts and penalty expirations to play-by-play data, allowing him to pull apart fine details like the influence of individual players on a line. For the Leafs’ top line, Phil Kessel and James van Riemsdyk make good things happen, but Tyler Bozak? Not so much. Bozak is the punch line on skates for analytics right now—nearly every presenter took a swipe at Toronto’s first-line centre, and they got a laugh every time.
Stephen Burtch, who writes for Sportsnet and Pension Plan Puppets, wanted to unpack the influence of coaches, so he looked at the effects in Toronto and Ottawa when the men behind the bench changed mid-season. In Toronto, Corsi For Percentage (CF%) and Scoring Chances For Percentage (SCF%) have both improved drastically under Peter Horachek. But while Kessel and Bozak have had their scoring chances cut in half—which explains why Leafs scoring in general dried up—they’re also allowing fewer scoring chances, making them much better defensively.
Andrew Berkshire, managing editor of the Canadiens blog Eyes On The Prize, examined whether P.K. Subban is in fact a defensive liability. Subban and Ottawa’s Erik Karlsson both fall victim to the same trope, he said, but it stems from a fundamental misunderstanding of what makes good defence. Defencemen are traditionally judged on blocking shots and levelling hits, but that rewards you for not having the puck, when the best defence is keeping the puck away from your opponent, as the two former Norris Trophy winners do, given how often they have possession. Berkshire looked at Subban’s stats before and after his 2013 Norris win, when the Habs and Michel Therrien vowed to “fix” him. He found that they broke him instead. Subban’s instinctive and dynamic playing style was forced into a more traditional dump-it-in system, and he’s now allowing more shot attempts as a result. “Playing safe has actually made him a worse player,” Berkshire said.
Michael Schuckers, an associate professor of statistics at St. Lawrence University in Canton, N.Y., who helped organize the conference, said the most exciting thing on the horizon is player tracking, which the NHL is testing right now. “We’ve made some big steps in the last 12 months, and it’s just going to continue to grow,” he said.
Hovering around the edge of the conference all day was the idea of “making it”—who among their ranks had been hired and transformed into insiders, and what it will take to convince teams to pay more attention to insights like these. The last panel of the day featured some of the biggest names in hockey analytics musing on that. In baseball, there were a few teams that were very open about what they were doing and loudly championed analytics in the early days, Vollman said, but in the NHL, teams that have had success, such as Chicago and L.A., have refused to talk about it. And once someone from the fancy stats world gets scooped up to advise a team, they’re put under a gag order, he said, so it’s impossible to know how much influence they’re having.
Two years ago at the MIT Sloan Sports Analytics Conference in Boston, Brian Burke snorted, “Statistics are like a lamppost to a drunk: Useful for support but not for illumination.” (He was paraphrasing Scottish poet Andrew Lang, who used those words a century ago to mock dishonest politicians, but anyway, it sounded like a classic Burke-ism.) There’s still plenty of screaming over how to use the insights provided by analytics—and even whether there are useful insights there at all. But it looks like more and more hockey teams are willing to see the light, and the growing roster of fancy stats experts just can’t wait to flick the switch.