Challenging the Hockey Statisticians - Part 1
Our own Justin Azevedo has run a great series explaining advanced stats to the uninitiated. The advanced stats perspective of hockey is a tough hurdle when you first come upon it.
Kind of like walking into a Calculus class before you have done Algebra.
Over the last couple years I've seen three potential reflex reactions to them in numerous hockey forums, blogs and in on-line hockey conversation.
(1) The fan who goes this is complicated therefore it must be right and accepts it blindly without unpacking the epistemological foundations.
(2) The fan who goes this is too complicated for me to understand therefore I am going to disregard all of it.
(3) The fan who is silenced by the advanced stat conversation. The thought being: this is complicated, yet I can't disregard it because it is rationally based on empirical evidence, but I also can't challenge it because I do not know enough about it.
Number 3 would be me and I read more and more on the topic to further my understanding. Hopefully one day I can contribute something meaningful to the conversation. (That day is not today by the way, this is a different kind of exercise to keep us occupied in the off-season)
The folks who do work in this area are well aware of its limitations. Justin immediately caveats all 4 of his articles with:
Advanced stats are not perfect and are mostly useless as evaluators of talent (like pretty much any other stat) without context, and the bigger the sample size the more accurate a stat will become-ideally.
Justin Azevedo - M&G
So how will advanced stats improve? The same way just about every idea in human history has: by being questioned through dialectic.
A dialogue between two people holding different views with a rational foundation. So I am going to forward some Socratic style simplistic questions here.
The challenge for our advanced stat advocates at M&G, be they our contributors like Justin, Ryan P and Arik, our regular readers, or our mentors *cough* Kent *cough*, is to answer these simple questions on advanced stats. As far as I can tell, the advanced hockey stat conversation entered public fan discourse a little over a decade ago. NHL clubs had of course been using these stats privately for much longer.
Some of you may be wondering why I didn't post these questions on another blog I write for, which is well known for its advanced stats writers. The reason is simple, I want to write these simplistic, yet foundational questions, from an introductory perspective for all M&G readers.
This is meant as a counter at the level of Justin's introductory articles on advanced stats. As I mentioned, advanced hockey stats have been discussed in the public fan forum for over a decade. I suspect that these types of questions have circulated before and are tiresome or uninteresting for some stat writers who have been around forever.
Although it would not upset me in the least if anyone in the SBNation hockey stats universe wanted to weigh in from anywhere else. That is the whole point, to get a dialogue going and share the knowledge.
Perspective
This is not an anti-stats article. I acknowledge the validity of the advanced stats paradigm. It is irrational to completely disregard it but I question the scope of its predictive power and with that its authority to do much more than describe past events and contextual circumstances.
It is an interesting conversation, certainly; it helps enrich and provide fuller perspective on the players we watch. The Sedins are good, but the high % of O-zone starts really helps them, doesn't it? Hence, we get a fuller perspective of players and the game by listening to the advanced stats conversation, but I have also seen advanced stats wielded like a sledgehammer to crush the optimism of the casual fan. Is that justified when those who create the analysis themselves qualify its very limitations?
Certainly the broad predictive strokes have merit. The Canucks will finish in the NHL's top ten next season; the Senators will miss the playoffs. But can you thread a needle absolutely on a bubble team and predict the Flames or Blue Jackets in or out of the playoffs? Can you state the Canucks will be top 3 again this season?
Certainly there is some general predictive power here, but is it significant? Is it any different from the perspective of the experienced hockey fan on the couch, beer in hand? This is not due to flaws in the existing computations or quantifications of Corsi, Qualcomp and so forth but rather due to the nature of the game of hockey. These are the foundational challenges that will be put forth for the Statistician's rebuttal.
Some sports like Baseball slide extremely well into statistical analysis; I would postulate that others resist it significantly, and hockey is among the latter.
The hockey gods, or what may more properly be called the X factor, are the reason I watch so many games year after year. Let's challenge the Statisticians to respond on how they deal with these various X factors.
(1) The Ice
Does any other sport have the diversity of surface for the game that hockey does? Ice in the southern markets has been referred to as worker's ice, slower ice, bouncy ice. This is a consistent geographical effect. Are ice differentials noted in any analysis? Are there types of ice that certain players will prosper on, such that a player traded to a northern team could be expected to decline in performance, or vice versa?
Is it reasonable to hypothesize that Jay Bouwmeester's performance in Calgary is a result of not being able to perform as well on faster ice? Is Roberto Luongo's stay in the crease approach cultivated by the fast ice and boards in Vancouver?
This is more than a home ice effect. I am considering it as a macro principle. If all 30 teams had the same players who were perfectly cloned the effect I would expect would be lower scoring and lower goals allowed on the slower southern ice. Higher scoring and more goals allowed on the faster ice.
Is this a legitimate component for analysis or should teams simply continue to look at things on a rink by rink basis as visitors? Would coaches on slower ice in the south be better off considering trapping and defensive systems and focusing on acquiring those types of players, while northern teams on faster ice would be better served picking up highly skilled, fast forwards?
As a macro principle should ice quality be a foundational element of analysis? Do statisticians disregard it and if so what is the explanation?
(2) The Goal by Deflection or Luck
Hardly an uncommon event. A goal by deflection is almost something that you expect to happen at least once in most games but how would you weight the skill vs luck factor of a deflected goal? I am not sure if one can fairly state this is too small a % of goals scored to say it is insignificant.
Can you blame a goalie for being beat on a deflection? Can you credit a player facing the wrong way, not even seeing the puck and the puck hitting his skate and going in fairly with this along with every other player on the ice? Do the numerous goals by deflection unfairly skew player stats upwards for the scoring team and unfairly skew stats downward for the opposing team who is defensively perfectly positioned, including the Goalie?
Or do we just say the hockey gods bless all teams equally with the luck or misfortune of deflection goals? (Remember the bouncy ice of the south.) It would not bother me in the slightest to see goals given a sub-category of XDeflection goal, representing an evaluation of more luck than skill.
It is not to state that deflections are not practiced and not something that many players are highly skilled at. The player that jumps to my mind is Ryan Smyth who has made a career out of the art. But I have seen just as many goals go in through skill deflections as I have by luck deflections - off player's skates, their body and I hesitate to give them the same credit as a Ryan Smyth type player who knows position and deflection to the point that it is hard to ignore.
Same with a pure luck goal where a player scores on his own net for example or an incredibly bad goal the Goalie has let in. A goal that is so bad that it is a clear bungle and again why should the stats of the players on the ice be moved up or down in these cases? Not all teams were lucky enough to have the opposing team score on their own net last year.
http://www.youtube.com/watch?v=Ygi2SUWwk98
This type of pure error luck goal by Johnson is rare but in principle is it any different than a goal that hits a skate and deflects in - neither goal is really planned is it? Does any player shoot with the intention of deflecting a goal in off another player's skate?
(3) Injury Analysis in Advanced Stats
Hockey is the one sport where injury is almost expected. Yet to the best of my knowledge players are not valued more positively because of their sturdiness.
In an analysis of the value of players is this factored in anywhere? A player, who may excel when he is healthy, can not be considered as valuable for his team if he rarely plays a full season without injury. Is there any calculation that considers this? If not, why not? Players have histories of being sturdy and injury prone and this should be a factor considered somehow in the strength of a team going into the season, should it not?
Hockey is not the type of game where injury is the exception; it is the rule. How do statisticians account for the injury element in their analysis of a team's strength? One would intuit that it is reasonable in some cases to predict a higher level of injuries on a team overall, given the injury history of its players and its style of game.
As knowledge of concussions increases and players take longer and longer to return to active play, this element of analysis may be an area for advanced statisticians to consider. After the first concussion, another becomes more and more likely, as does the probability of missing more and more game time.
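To make the question concrete, here is one naive way sturdiness could be folded into a valuation: scale a player's scoring rate by the share of games he has historically been available for. This is only a sketch. The function names and both players are invented for illustration, and the big assumption (that past availability says something about future availability) is exactly what is up for debate.

```python
SEASON_LENGTH = 82  # NHL regular-season games

def availability(games_played, games_possible):
    """Share of possible games a player actually dressed for."""
    return games_played / games_possible

def durability_adjusted_points(points_per_game, games_played, games_possible):
    """Points over a full season if the player misses games at his historical rate.

    Assumption (not established by the article): past availability carries forward.
    """
    return points_per_game * availability(games_played, games_possible) * SEASON_LENGTH

# Two hypothetical players over their last three seasons (246 possible games).
sturdy = durability_adjusted_points(1.0, games_played=246, games_possible=246)
fragile = durability_adjusted_points(1.2, games_played=164, games_possible=246)

print(f"sturdy player:  {sturdy:.1f} projected points")
print(f"fragile player: {fragile:.1f} projected points")
```

In this made-up case the better per-game scorer projects to fewer points over a season once the missed games are counted, which is the intuition behind asking whether sturdiness should be valued.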
(4) Predictive Power?
The final question and perhaps the most important one. The question that cuts away all the complexity of advanced stats and brings the paradigm under the interrogator's light.
Keep in mind I have already acknowledged the validity of the advanced stats paradigm in enriching our perspective of the game and enhancing our perspective of players. It gives us a fuller understanding. At the bottom line though when someone states Jarome Iginla is over the hill, look at his declining X, Y or Z statistical quantifications, it is a powerful statement. One that due to the mathematical structure of the assertion tends to steam roll over those who want to counter-argue optimistically for the future. I don't like this common use of stats by some of its followers.
Last season Jarome Iginla crushed the pessimistic predictions of the statisticians. Matt Stajan was a massive bust and under-performer of their predictions. Let's review a posted article from Flames Nation.
In fairness, the authors at Flames Nation, as all advanced stats writers usually do in some way, qualify their predictions if they make them. Rob Vollman immediately states how difficult it is to predict points for a player in an upcoming season but he uses a good methodology and he has a lot of little beads flying across the abacus in the background. Let's look at the results of his work to challenge the statistical paradigm he represents.
6 of 9 players missed his statistical point production range: Iginla, Tanguay, Giordano, Stajan, Hagman and Bouwmeester.
Vollman hit the predicted point range for White, Bourque and Jokinen. That is a 33% success rate.
Statistically speaking, should we consider this significant? When all the dust has settled, when all the buttons have been pushed, when all the gerbil-driven Excel spreadsheets have stopped churning, if you are not at least above 50% in your predictions are you not better off flipping a coin?
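The tally above can be checked directly. The short Python sketch below uses only the counts given here (3 hits, 6 misses) and takes the 50% coin-flip baseline at face value. That baseline is an assumption: a point range is not a two-sided bet, so a fair coin is only a rough yardstick. The function names are made up for illustration.

```python
from math import comb

def hit_rate(hits, total):
    """Fraction of predictions that landed inside the predicted range."""
    return hits / total

def prob_at_most(hits, total, p=0.5):
    """Chance of `hits` or fewer successes in `total` independent tries,
    each succeeding with probability p (binomial tail)."""
    return sum(comb(total, k) * p**k * (1 - p)**(total - k)
               for k in range(hits + 1))

hits, total = 3, 9  # White, Bourque, Jokinen hit; the other six missed
rate = hit_rate(hits, total)
tail = prob_at_most(hits, total)

print(f"hit rate: {rate:.0%}")
print(f"chance a fair coin gets 3 or fewer of 9: {tail:.1%}")
```

Under these assumptions a fair coin would do this badly or worse about a quarter of the time, so nine players is a thin basis for judging the method either way.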
I applaud Vollman for doing this by the way, he is exactly the kind of stats writer I like, one that really tests methodologies and actually crunches out numbers for predictive results. Even though he misses, I respect his gusto for doing what we all want to see from advanced stats, predictive power, our hockey pools await.
Kent Wilson mentions in the evaluation article something we hear endlessly from hockey statisticians; I used to hear it from financial analysts as well when their stock picks failed: I did not have enough data, my data was flawed or, in the case of hockey stats, the sample size is too small and if you bundle my projections all together it all averages out perfectly. I am actually right.
No, the math is right but its ability to predict the future is limited.
For financial analysts, bundling is fine if the task is to pick a list of stocks and package them into a mutual fund. But the task at hand was not to make a projection for the Calgary Flames as a bundled mutual fund of players; it was to predict individual players' scoring. The article is entitled "Pre-season PLAYER projections review", not average scoring for all Flames forwards, unless I misread or misunderstood the original articles Vollman wrote.
It is like playing Monopoly with someone who has a permanent "Get out of Jail Free" card. It is true, of course, that the larger the sample size, the better the predictive power, but when a prediction goes wrong the response is, well, the sample size is too small for reliable results. You rarely hear sample size mentioned when they get a prediction right.
What is a sufficient sample size?
If the old vet forwards of the Flames last year could not provide an extensive enough history, and they are into the final years of their careers, perhaps advanced stats simply can not predict the future at all. Perhaps they really can only make the most general predictions and, if so, well, remember the old hockey fan with beer in hand on the couch: he can do that as well.
And therein lies the rub: if advanced hockey stats are limited in their predictive power and are primarily descriptive of past events, are they not overvalued at the current moment, until they evolve to show some consistent predictive results?
Anyone still want to say the Flames will miss the playoffs this year? ; )
________
Next week's preparatory reading
Ludwig Wittgenstein's "Remarks on the Foundations of Mathematics" & Gödel's Incompleteness Theorems.
Comments
Luck (or variance) is absolutely a huge factor in all statistics, and nobody should ever disagree. Look at Ovechkin last year: his play didn’t change, but his totals did (fewer points, etc.) because of a season of bad bounces, basically.
I’d say his play was different. For about half the year, he was playing with around 3 Corsi Rel with the same easy competition and favorable zone starts he normally gets (in 09-10 he posted a 19 Corsi Rel in similar minutes). Theories why range from being heartbroken to injury.
From January on he was the same old player. The Caps’ PP not scoring didn’t help either.
Scott Gomez and Michal Frolik are two other great examples of guys who had bad numbers for little reason. On ice shooting at about half his previous level for Gomez, individual shooting under 5 for Frolik.
Red Line Station and @RedArmyLine, featuring coverage of the most frustrating team in the NHL
I believe in next year.
by red army line on Sep 3, 2011 1:30 PM PDT
Regarding ice, it’s not a persistent geographical effect (it’s humidity, whether another team uses the arena, preparedness of the arena and what conditions they favor, etc) and the team is a huge confounding variable. The Caps tend to have lots of GF and GA on maybe the worst ice in the league. The Coyotes are down the list despite great ice. Many people use road data or compare data between home/road to develop adjustments.
You can break down goal types. By definition, though, you can’t model luck. Just look at player data from other years and guess how much is skill. That’s essentially what a lot of this boils down to.
Injuries are absolutely a factor. They’re not intertwined with Corsi, but it’s easy enough to see that Jordan Staal and Travis Zajac don’t miss games while Alex Semin and Ales Hemsky often do. Thing is, the past, upon which these stats are based, includes injuries, ice, etc. as well. So they are built in.
Score tied Corsi has an r squared of over .5 with wins last time I ran the numbers, and according to Hawerchuk predicts just under 60% of series wins (with the actual score tied Corsi battle in the series being won by the series winner over 70% of the time). JLikens has a model that accounts for other factors, but he’s only been able to do it for two playoffs so far.
As for the final few paragraphs, I’d say that modelling complex systems accurately is almost impossible. I think we’re getting closer to determining how much skill and luck was involved in generating results in the past, which can give us a range of expectations, but one simply cannot predict many of the variables that will happen in the future.
For example, one of the reasons Jarome’s points jumped this year is the Flames drew a lot more penalties and he spent more time on the PP. The Flames PP also went from a bottom third unit to 8th overall by mid-February. Those are two things that simply can’t be assumed by anyone heading into the year. Also, you can make educated guesses about what coaches are going to do in terms of ice time and match-ups according to the past, but they find ways to change those up all the time based on X few weeks or results or Y injuries.
The closest I think we can come in the end is “if”, “then” predictions within a given range of expectations.
Wow. Bravo. Most excellent article my friend.
by Jeremywilhelm on Sep 4, 2011 1:44 PM PDT
Any hockey pool going on this year with M&G?
Really interested in a keeper
I don’t know if we’ll be doing a keeper league, but I’ll have something up about it in the next couple of days.
by Justin Azevedo on Sep 4, 2011 5:37 PM PDT
Interesting questions, Mitch. I’ll take a whack at em tomorrow.
by Justin Azevedo on Sep 4, 2011 5:38 PM PDT
Does any other sport have the diversity of surface for the game that hockey does?
I would say baseball absolutely does. Teams model their games to suit the ballparks in which they play.
by J.J. from Kansas on Sep 5, 2011 5:19 AM PDT
???
You’re going to have to explain that a bit more.
The type of differential I am suggesting is quite significant. It would be like a ball park where it is constantly raining / misting. Not quite raining hard enough to have the game called but misting enough to affect the surface of the field and play.
A player running to first base will be affected in speed. A base stealer would be affected by higher risk of slipping in take off. The ball would move significantly different in a grounder in the infield. The pitcher would be distracted by making sure the ball was dry as well as delivering the pitch etc…
The ball park of perpetual rain would be one where different players may be advantaged or disadvantaged. Catchers may be able to throw out base runners who are stealing more easily etc. That is the kind of differential I am suggesting be considered.
Ice quality is not a minor issue – imo, it directly affects speed and movement of the puck. It affects the outcome of games. I am suggesting here that it goes beyond the concept of home team advantage.
A homer 5 feet over the left wall in Toronto isn’t going to be a homer in Boston, thus creating a tangible effect on the outcome of the game. That’s why there are stats that take into account each ball park in baseball.
by Justin Azevedo on Sep 5, 2011 9:08 AM PDT
This
Also, the way groundskeepers manage their grass to be either “fast” or “slow” makes a large difference too. Whether they play on old turf, new turf, or natural grass, the varying persistent weather conditions of the area.
I would say that baseball is very comparable in how much the outcome of games is affected by field conditions.
by J.J. from Kansas on Sep 5, 2011 2:03 PM PDT
Ok
So the differential is accounted for in baseball. Doesn’t surprise me, those baseball guys have really got all the bases covered statistically : )
Is there anything comparable in the advanced hockey stats conversation? Any attempt even at it anywhere that anyone knows about?
No, but that’s because a) it doesn’t seem to be quantifiable, b) conditions can change extremely rapidly and c) as far as I/we know it doesn’t have a large enough tangible effect on the game, so we don’t know where to start.
by Justin Azevedo on Sep 5, 2011 6:16 PM PDT
Not that I've come across
As for those who say it doesn’t really matter, I would tend to disagree.
While I don’t think a team will turn down a great puck-handler due to playing on choppier ice, I’m fairly certain there’s a certain level of cart/horse argument that could be had about both the Ducks and the Stars’ organizational makeup giving trouble to finesse teams during spring playoff hockey games.
by J.J. from Kansas on Sep 5, 2011 6:18 PM PDT
The nature of the ice is so volatile, though, that even if we were able to quantify it there’s no guarantee the stat would be accurate on a game to game basis, and for whatever reason that doesn’t sit well with me.
by Justin Azevedo on Sep 5, 2011 6:26 PM PDT
That’s what I love about advanced statistics and hockey. ;)
by J.J. from Kansas on Sep 5, 2011 6:51 PM PDT
And that validates that inescapable X factor in hockey.
I think we can agree that some rinks are better and more consistent than others. Some rinks are terrible and unknowns, it may be just a general thing to note but I think there may be something to targeting certain players that may perform better on your home ice. (Provided of course you know your home ice is consistently maintained)
Provided of course you know your home ice is consistently maintained
See, but you don’t know that. If you’re acquiring players based on what you perceive your ice quality is, then you’re probably making a mistake. Don’t have stats to back that up, though, just a really strong gut feeling.
by Justin Azevedo on Sep 5, 2011 7:35 PM PDT
1. The Ice
Even in the past 10 years, there have been massive advances in the quality and maintenance of ice surfaces all around the league. No doubt there are some places that have worse or better quality ice, but the relative skill level of the player in the NHL combined with the variance of playing surfaces (remember, there’s a maximum of 41 games on the same ice surface) suggests to me that there isn’t a tangible effect on the outcome of games, or at least not one large enough that we have to account for the arena like they do in baseball. I’ve played on some shitty shitty surfaces before, but even at my relatively low skill level, it didn’t really impact the pace or tone of the game.
2. Luck
There are stats that account for luck; PDO is the first that comes to mind. As for your example, goals are actually extremely rare events. If you look at Jarome’s Raw Corsi/60 last year, only about 1% of the time did a shot directed towards the net actually result in a goal, meaning that those “lucky” goals probably only account for a hundredth or a thousandth of a percentage point in terms of shots directed at the net. Point being, the actual importance/significance/occurrence of a “lucky goal” is nominal at best.
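For readers who have not met PDO: as it is commonly defined, it adds on-ice shooting percentage to on-ice save percentage, and values far from 100 are usually read as a sign of good or bad bounces. A minimal sketch follows; the function name and the shot and goal totals are invented for illustration and do not describe any real player.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """On-ice shooting % plus on-ice save %, scaled so 100 is break-even."""
    shooting_pct = goals_for / shots_for
    save_pct = 1 - goals_against / shots_against
    return 100 * (shooting_pct + save_pct)

# Hypothetical on-ice totals, made up for this example.
example = pdo(goals_for=10, shots_for=100, goals_against=8, shots_against=100)
print(round(example, 1))
```

Here 10% shooting plus 92% goaltending comes out a little above 100, which a stats writer would typically flag as a player who may have been riding some luck.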
3. Injury risk
There’s not really any way for us to account for injury risk. The only constant between every player in the NHL right now is that they’re all male. Everyone’s body reacts differently to a stimulus. We can quantify how many shots a guy gets, but we can’t quantify how strong an ACL is. Not sure where else to go with this one. There’s no reason a guy who played 10 games last year can’t play 82 and vice versa.
4. Predictive Power
Personally, I don’t like to use “comparables” when predicting player performance. I kind of find it fruitless to compare two different players in different situations in different eras to try and figure out how many goals someone might get. Now, using stats to compare players against each other in the context of the same season is cool, but to use said stats to try and predict anything aside from regression to the mean can be a bit dangerous. It’s probably impossible for us to predict anything with certainty, but that won’t stop me from saying that Glencross probably won’t score 20 goals next year.
If there’s stuff in there that doesn’t make sense, I’m fucking tired.
You don’t think Glencross will score 20? He was pretty much a PP staple player last year when the PP was going strong. I doubt Sutter is gonna change that.
by Jeremywilhelm on Sep 5, 2011 12:56 PM PDT
I don’t think he’ll shoot 6% above his career average again, no. If he gets 15 at evens I’ll be shocked.
by Justin Azevedo on Sep 5, 2011 1:20 PM PDT
1. Have you played a game in the southern states? What a Canadian may call bad ice at a local small town rink in Canada is different from the ice in the hot and humid south. Not disagreeing, just exploring the question.
I’ll leave it at the ice level for this discussion but boards, lighting and other elements can be at play as well. Are they significant enough for teams to consider in acquiring or not acquiring certain players – perhaps not, but it is just something I wanted to hear a response on.
2. Luck – I’ll accept your counter on the % of luck goals being insignificant, but I may count them this year for the Flames anyway and judge how many goals the team puts in on what I consider a luck deflection. Is 1% your threshold of insignificance? Or would, say, 5% or less of the goals scored by the Flames this season also be too small for significance, or is it some other number?
3. Injury can strike anyone, anytime, for sure. But I would approach injury history on players who have 5-6 seasons under their belt. Jbo, Jarome and Olli have over the course of their careers proven to be very sturdy. Part of this may be how they play the game. A hard checking, shot blocking player would be more likely to take injury. Is it worth adding to an analysis, can a proper methodology be created? I don’t know this is all part of the discussion.
4. I want to be reasonable on this. I can’t predict how many leaves are going to grow on my trees next spring but I could probably come up with a reasonable range. That is all I am looking for. Lots of things to account for but I think predictions within ranges can improve. I think the methodologies can get better in hockey. I hope to see that in the future.
1. One in June 2009 in Anaheim, but even without that I’m confident in saying a brand new ice maker in PHX probably makes better ice than a 50 year old arena in London.
2. I’d say 5% of goals, but like David said, being able to deflect goals in front of the net is a repeatable skill, so I wouldn’t count something like that as a lucky goal.
3. While that may be true, what if Matt Cooke knees JBo in the first game of the season? Then his lack of injuries means nothing. Banking anything on/trying to quantify something as common and random as an injury strikes me as silly.
4. They can and with time they probably will.
by Justin Azevedo on Sep 5, 2011 7:33 PM PDT
Let me expand a little more on 3 and 4…
The probability that a player could get injured in any game is 100%, and there’s no way for us to reduce that probability, unlike for a goal (PP time, shifts, comp, etc.). Those things don’t matter when it comes to injuries.
What do you think is a “reasonable range”?
by Justin Azevedo on Sep 5, 2011 7:44 PM PDT
I probably shouldn’t use the term "range" because it is a statistical term with precise meaning. Predictive power is all I am talking about. Statistics is a huge field; all I am looking at is the bottom line on the results.
When I read an advanced stats article that predicts something I expect a success rate above 50%, hopefully at the 70% level.
Sample isn’t large enough to get confidence intervals that small, I’d think.
Plus you run into confirmation bias all the time—the stats guy seeing a correct prediction validating the stats, the anti-stats reader seeing an incorrect prediction proving stats are worthless and “hockey can’t be quantified”
by red army line on Sep 9, 2011 9:05 AM PDT
A few of my thoughts...
1. The Ice
Though I haven’t done any work to study the issue, I don’t think the quality of the ice is a major factor and I have seen no evidence that suggests it is. For example, I don’t see teams from the south avoiding skilled puck handlers because their poor ice hurts a player’s ability to handle the puck. Maybe they do, but I haven’t seen anything to suggest it.
2. Luck
I actually don’t believe pure luck is a major factor in hockey. If a deflection occurred, it was a result of someone getting in front of the net causing traffic (which is a repeatable skill/style of play). If you are not there, you are not going to get a deflection.
Most of what people call ‘luck’ in statistics is really just an inability to identify talent/skill due to lack of information or insufficient sample size.
3. Injury Risk
As Justin said, there is really no way for us to account for injury risk. There is probably no doubt that some players are more injury prone than others, but if luck comes into the game at all it might come into the game most in the form of injuries. Hitting the boards at a slightly different angle might be the difference between skating away unharmed and a separated shoulder. Players with a past injury may have an increased chance of re-injury but I am not sure how to quantify this. I think we account for injuries qualitatively with ‘given more or less equal talent, choose the player with a less significant injury history’ rather than quantitatively.
4. Predictive Power
Based on past performance I’d bet on Ovechkin being one of the top goal scorers in the NHL this upcoming season. OK, that was a simple one. The problem arises in refining the predictions to more specific values such as ‘Ovechkin will score 42 goals next season.’ We don’t know a lot of things about the future, and even if we knew exactly what Ovechkin’s talent level is, we wouldn’t be able to predict his goal total. Will Ovechkin get injured (nagging or more significant)? Who will Ovechkin play with? Will any of Ovechkin’s linemates get injured? How many minutes of ice time will Ovechkin get? How many penalties will the Capitals take or draw that would affect Ovechkin’s PP/SH/ES ice times? Will the Capitals struggle out of the gate, resulting in a coaching change and a new coach implementing a more defensive style of play? Predictions get even more difficult when you have middle-of-the-road players who could play either a more offensive second-line role or a more defensive third-line role. What role they settle into can affect their results on the ice dramatically. Unlike baseball, which is more of an individual game, hockey has many factors that influence a player’s performance (i.e. how many goals he will score) that are beyond the player’s talent level. This is why predicting goal totals is so difficult. If you could predict all these other factors, I could probably come up with a reasonably reliable predictive model.
But just because we haven’t come up with a reliable model for predicting all those other factors doesn’t mean we shouldn’t use advanced statistics to attempt to evaluate individual players’ skill levels. If you can maximize your team’s individual skill levels, the likelihood that you will see success on the ice is greatly increased.
For me, statistics are best used to identify players who have performed somewhat above or somewhat below what basic statistics or conventional wisdom would tell us. For example, most Leaf fans believe Luke Schenn to be a quality defensive defenseman, but I showed through advanced stats that he probably isn’t a very good defensive defenseman at all and that coach Ron Wilson didn’t generally rely on him in defensive situations. Looking at his basic statistics wouldn’t tell us this. Study of advanced statistics can also help us identify who had particularly good or particularly bad results in large part because of the on-ice situations they have been given. A player might get 50 points on a bad team but can only expect 30 points on a good team, because he won’t get nearly as much ice time or nearly as many minutes on the PP when the good team has better players playing ahead of him and the bad team didn’t.
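The 50-versus-30 point example can be worked through as a per-60-minute rate. This is only a sketch: the ice-time totals are made up for illustration, the function names are hypothetical, and it assumes the player scores at the same rate regardless of team or situation, which real PP and ES minutes would not respect.

```python
def points_per_60(points, minutes):
    """Scoring rate normalized to 60 minutes of ice time."""
    return 60.0 * points / minutes

def projected_points(rate_per_60, minutes):
    """Points expected at a given rate over a given number of minutes."""
    return rate_per_60 * minutes / 60.0

# Hypothetical numbers: 50 points in 1500 minutes on the bad team,
# but only 900 minutes available on the good team.
rate = points_per_60(50, 1500)
on_good_team = projected_points(rate, 900)
print(rate, on_good_team)
```

The player is no worse on the good team in this sketch; the drop in points comes entirely from the drop in minutes.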
----------------------------------------------------------------------------------
HockeyAnalysis.com - Taking a Deeper Look at the World of Hockey
by HockeyAnalysis on Sep 5, 2011 11:13 AM PDT
1. Ice – Fair enough, but a hypothesis can be proven false. If ice doesn’t matter, someone can create a research design based on historical data to show it. I could look at veteran players with a lot of games under their belts, like Jarome and several others, and go through their performance at each rink. I would hypothesize that Jarome has scored more goals in the northern rinks with better ice. The results may turn out to be insignificant, with his goals spread across rinks without meaningful deviation.
Just because no one has done this doesn’t make it automatically invalid. It needs to be disproven.
2. Luck – Not sure I am fully convinced. No player shoots or passes and plans for that "pass" or "shot" in all cases to hit that skate and go into the net. Many deflections are planned and practiced plays, for certain, but how many are pure ‘throw it at the net and hope for the best’? (I’ll count those this year for the Flames – it very well may be too small a number to be significant.)
3. Injury – I believe you can create a methodology for the probability of injury risk for players at a certain point in their careers – 5-6 seasons, maybe ~500 games. This methodology would include their style of play, their shot blocking, their hits etc. I feel Cal Clutterbuck is far more likely to get injured than Jbo, our positional, non-checking D-man. Once a player has a concussion, his next concussion is far more probable, and so on.
I think stat methodologies could be created and tested in this area if there was a will to do it. Don’t ask me for the details, I’m wondering if real GMs would consider it in a trade and find it useful.
4. Predictive power – I am not asking for anything more than a reasonable range for a prediction. Exact predictions are impossible, and I am not expecting them.
My point is only that we should, at least at this point in the evolution of stats, keep at the forefront that they are still evolving and are not absolute. They can improve and the days of designing better methodologies and computations are still here.
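The rink comparison proposed in point 1 could be set up as a simple permutation test. What follows is a minimal sketch, not something anyone in this thread has run: the per-game goal counts are made up, the split into "northern" and "southern" rinks is an assumption, and the function names are hypothetical.

```python
import random

def goals_per_game(games):
    """games: list with the goals scored in each game."""
    return sum(games) / len(games)

def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Share of random relabelings whose gap in goals per game
    is at least as large as the observed gap between the groups."""
    rng = random.Random(seed)
    observed = abs(goals_per_game(group_a) - goals_per_game(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        gap = abs(goals_per_game(pooled[:n_a]) - goals_per_game(pooled[n_a:]))
        if gap >= observed:
            hits += 1
    return hits / trials

# Made-up goal counts per game, for illustration only.
northern = [1, 0, 2, 0, 1, 0, 0, 1, 1, 0]
southern = [0, 1, 0, 0, 1, 0, 2, 0, 0, 1]
p = permutation_p_value(northern, southern)
print(p)
```

A large value would suggest the rink split explains little; a small one would be a reason to look closer, though it would still not separate ice quality from travel, opponents, or home advantage.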
Ice – True, I am just not convinced the effects are as great as field factors in baseball or even the older small-ice arenas of the 1980s.
Luck – Tomas Holmstrom made a career of standing in front of the net and I am sure he had a lot of ‘lucky’ goals deflect off of his various body parts and into the goal. But, shouldn’t Holmstrom get some credit for being in front of the net and essentially creating his own luck? Just as he should get some sort of credit for a goal scored while he was screening the goalie.
Injury – I am sure GMs take past injuries into account when acquiring or trading away players. I am just not sure they quantify it in any methodical way.
"My point is only that we should, at least at this point in the evolution of stats, keep at the forefront that they are still evolving and are not absolute. They can improve and the days of designing better methodologies and computations are still here."
I agree. We should always try to improve our methodologies and there is still more to learn. That said, I think the greatest gains might actually come from improved stat tracking by the NHL rather than from enhancing current techniques on current data. I’d love to get my hands on more accurate shot location data with shot speed (bring back the Fox puck, but not the glow) as well as passing data and maybe even individual attribute data such as skating speed and acceleration. I’d love to analyze how slower skaters perform against faster skaters or investigate how quality passers can influence a shooter’s shooting percentage.
----------------------------------------------------------------------------------
by HockeyAnalysis on Sep 5, 2011 8:15 PM PDT
Great article Mitch. I wholeheartedly agree that the way theories get proven out is through some antagonistic debate.
My thoughts on your main questions:
1. I think most of the people here have covered it. I’m not sure the ice effects are as pronounced or tangible as they are in baseball, but it would be interesting to see if someone could prove it out. The quality of ice is definitely different, but how much it affects the outcome of the game is what’s in question.
2. I would say that players in front of the net are sometimes in essence “creating their own luck”. However, we can track it. We look at a player’s on-ice shooting percentage. If a player had a “lucky” year and that number was 5% above his career average, we would theorize that it was an uncharacteristically lucky year. However, if his on-ice sh% was 3% higher than his teammates’ but in line with his career average, it may be an indicator that our player knows how to create goals better than most.
3. Injuries lend themselves to total value vs per-minute value discussions. I tend to look at counting metrics (e.g. GVT) to see total value while looking at per-minute value using Corsi, etc. However, it’s easy enough to take health into account. Ranking players by GVT per game or per minute would show the best-value players per ice time, but filtering for minimum games, or then further sorting on total, would account for health. I don’t think we’ve explicitly tailored metrics to injury history as a community, but the tools are there to incorporate it.
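The filter-then-sort idea in point 3 might look something like the sketch below. The players and GVT figures are invented, the 40-game cutoff is an arbitrary assumption, and the names `Season` and `rank_by_rate_then_total` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Season:
    player: str
    games: int
    gvt: float

def rank_by_rate_then_total(seasons, min_games=40):
    """Drop players below the games cutoff, then rank by GVT per game,
    using total GVT only to break ties in the rate."""
    eligible = [s for s in seasons if s.games >= min_games]
    return sorted(eligible, key=lambda s: (s.gvt / s.games, s.gvt), reverse=True)

# Invented seasons, for illustration only.
seasons = [
    Season("Player A", games=20, gvt=6.0),   # best rate, but missed most of the year
    Season("Player B", games=80, gvt=16.0),
    Season("Player C", games=60, gvt=15.0),
]
ranking = [s.player for s in rank_by_rate_then_total(seasons)]
print(ranking)
```

Player A has the best per-game rate but falls out at the cutoff, which is the crude way this approach prices in health.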
Ryan Popilchak
Matchsticks & Gasoline, Arctic Ice Hockey, & Hockey Prospectus. My twitter handle is @sprtopinionated
question #4
I hit the post button too early, sorry.
Predictions are the holy grail of analysis, partly because they’re the most difficult. Like Kent mentioned above, there are a myriad of situations and variables to consider, many of which we can’t properly account for. However, that doesn’t mean we should just give up. It means that knowing where we’re deficient gives us a chance to develop the knowledge we need.
As for how much predictive power we need, I’ll pose these questions to you.
If I could give you information that helped you pick a correct stock more often than the average person, how much would be useful? Would 1 out of 100 times be enough? Would you need to be right every time, or just more than the average investor?
The idea is simple. Predictions using advanced stats will never be 100% correct. But if that information could help a GM, or even a guy in his hockey pool, get even the slightest edge over his competition, then there is value to that advantage.
Even with all the great work done over the last few years, we’re a long way off the knowledge we’d like to have. That said, we’ve also come a long way.
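To put a rough number on "the slightest edge", here is a sketch that compares two pickers over a run of independent picks using exact binomial probabilities. The hit rates and the number of picks are assumptions chosen for illustration, and the function names are hypothetical.

```python
from math import comb

def binom_pmf(n, k, p):
    """Probability of exactly k correct picks out of n at hit rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_ahead(n, p_edge, p_base):
    """Probability the picker with hit rate p_edge ends up with strictly
    more correct picks than the picker with hit rate p_base."""
    base_below = 0.0  # P(baseline picker has fewer than k correct)
    total = 0.0
    for k in range(n + 1):
        total += binom_pmf(n, k, p_edge) * base_below
        base_below += binom_pmf(n, k, p_base)
    return total

# Assumed hit rates: 55% with the extra information, 50% without.
no_edge = prob_ahead(100, 0.50, 0.50)
small_edge = prob_ahead(100, 0.55, 0.50)
print(no_edge, small_edge)
```

Two equal pickers each finish ahead less than half the time because ties are possible; the edge shifts that balance, and the shift grows with the number of picks.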
by SO_RyanP on Sep 6, 2011 7:42 AM PDT
Great Discussion!!!
It really exceeded my expectations. I did my best to take a contrary position, but there were very good counters from all who responded.
I’m not going to lie though, I do take some satisfaction in digging up these articles, as Iggy even beat the 3-2-1 projection for last year.
http://www.arcticicehockey.com/2010/9/10/1669695/reasonable-expectations-jarome