CORSI
Also Known As: Shot Attempts | Shots | Shot Volume | Possession
When you see a "shot count" on a scoreboard or in a box score, that number represents only pucks that either go into the net (and become goals) or are stopped by a goaltender. If you watch even a few minutes of hockey, you'll notice that far more pucks are directed toward the net than go in or are played by a goalie. That's where Corsi comes in. Corsi doesn't just count shots in the traditional sense; it also counts missed shots (pucks that miss the net) and blocked shots. You can come up with this number on your own simply by adding the first three columns on an NHL scoresheet. Corsi can be measured "for" a team or player (CF) or "against" a team or player (CA), as an on-ice measure or an individual count.
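As a rough illustration of the arithmetic, here is a minimal Python sketch. The class and function names are made up for this example, and the CF% helper (Corsi for as a share of all attempts in the game) is a commonly used companion number rather than something defined above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShotCounts:
    """One team's shot attempts in a game (hypothetical container)."""
    shots_on_goal: int  # the scoreboard "shot count": goals plus saves
    missed: int         # attempts that missed the net
    blocked: int        # attempts blocked before reaching the net


def corsi(counts: ShotCounts) -> int:
    """Corsi is every shot attempt: on goal + missed + blocked."""
    return counts.shots_on_goal + counts.missed + counts.blocked


def corsi_for_pct(cf: int, ca: int) -> Optional[float]:
    """Share of all attempts that belonged to 'our' team, in percent."""
    total = cf + ca
    if total == 0:
        return None  # no attempts by either side, so no share to report
    return 100.0 * cf / total


# Made-up numbers, purely for illustration.
home = ShotCounts(shots_on_goal=30, missed=12, blocked=15)
away = ShotCounts(shots_on_goal=25, missed=10, blocked=10)
cf, ca = corsi(home), corsi(away)  # 57 and 45
print(cf, ca, corsi_for_pct(cf, ca))
```

From the home team's point of view, its own total is CF and the away team's total is CA.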
Why is Corsi helpful? First, it's a more complete representation of the offense that a team is generating - it's all the pucks being sent towards the net - and it also represents the workload a goaltender faces.
Second, and most importantly, Corsi has been shown statistically to be one of the strongest predictors of winning a game: the team that directs more pucks toward the net is more likely to get the ideal outcome! Next time you hear "the team tilted the ice in their favor" or "the team controlled possession," it's likely rooted in Corsi.
FENWICK
Also Known As: Unblocked Shot Attempts | Unblocked Shots | Unblocked Shot Volume
Now that we understand why Corsi is good, we can also understand where it has flaws. Currently, public shot data from the NHL is tracked by humans recording shot location and outcome. For blocked shots, this means the NHL marks where a shot was blocked, not where it was shot from (there's only so much we can capture in real time!). So, Fenwick is a measure that removes the data that doesn't truly represent what happened: it's Corsi without the blocked shots. In other words, Fenwick is shots on goal (goals and saves) plus missed shots.
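The relationship between the two measures can be sketched in a few lines of Python. The function names and the numbers are invented for illustration only.

```python
def fenwick(shots_on_goal: int, missed: int) -> int:
    """Unblocked attempts: shots on goal (goals and saves) plus misses."""
    return shots_on_goal + missed


def fenwick_from_corsi(corsi_total: int, blocked: int) -> int:
    """Equivalent view: take Corsi and remove the blocked attempts."""
    return corsi_total - blocked


# Made-up numbers: 30 shots on goal, 12 misses, 15 blocked attempts.
shots_on_goal, missed, blocked = 30, 12, 15
corsi_total = shots_on_goal + missed + blocked  # 57

# Both routes give the same Fenwick total.
assert fenwick(shots_on_goal, missed) == fenwick_from_corsi(corsi_total, blocked)
print(fenwick(shots_on_goal, missed))
```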
Fenwick isn't as statistically predictive as Corsi, but it does help us understand the differences in performance at a team or player level as it relates to blocked shots. Fenwick is also a big piece of more complex analytical measures, so understanding what it is matters.
EXPECTED GOALS
Also Known As: Shot Quality | xG
Expected goals is a measurement based on the idea that not all shots are created equal, and this makes sense, no? It would seem far more likely that a goal comes from a puck shot from in close to the net as compared to a shot that was fired from far away at the blue line, right?
That's the question expected goals calculations try to answer. Using a mathematical model of which kinds of shots become goals, expected goals accounts for a variety of factors including, but not limited to: shot distance, shot type, time since last shot, game state (even strength versus power play or penalty kill), and shooter.
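To make the idea concrete, here is a toy Python sketch of how a model might turn shot features into a goal probability and add those probabilities up. It assumes a simple logistic formula with only two of the factors listed above, and every coefficient is a made-up placeholder. It does not reproduce any published expected goals model.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    distance_ft: float   # distance from the net when the shot was taken
    is_power_play: bool  # game state, reduced to a single flag


# Hypothetical coefficients, chosen only so the example behaves sensibly.
# A real model would estimate these from historical shot data.
INTERCEPT = -1.0
COEF_DISTANCE = -0.05   # farther away -> less likely to score
COEF_POWER_PLAY = 0.4   # power play -> somewhat more likely to score


def shot_xg(shot: Shot) -> float:
    """Estimated probability (0 to 1) that this shot becomes a goal."""
    z = (INTERCEPT
         + COEF_DISTANCE * shot.distance_ft
         + COEF_POWER_PLAY * float(shot.is_power_play))
    return 1.0 / (1.0 + math.exp(-z))


def total_xg(shots: list) -> float:
    """A team's expected goals: the sum of its per-shot probabilities."""
    return sum(shot_xg(s) for s in shots)


shots = [Shot(10.0, False), Shot(60.0, False), Shot(25.0, True)]
for s in shots:
    print(s, round(shot_xg(s), 3))
print("team xG:", round(total_xg(shots), 3))
```

The point of the sketch is the shape of the calculation: each shot gets its own probability, and a shot from in close counts for more than one from the blue line.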
While expected goals feels like a great measure, it's still not necessarily the best one we have, for a few key reasons. First, not all expected goals models are created equal. Every model has its own formula, so it's important to understand what each does and does not include. Second, because these models are based on publicly available data, some pieces of information are assumptions. For example, we don't know for sure if a shot is a rebound, so we decide that if two shots happen in a certain location within a certain amount of time, it's a rebound.
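The rebound assumption can be sketched as a simple rule in Python. The time window, the distance limit, the same-team check, and the field names are all hypothetical choices made for this example. Each model picks its own definition.

```python
import math
from dataclasses import dataclass


@dataclass
class ShotEvent:
    team: str
    game_seconds: float  # time of the shot, in seconds since puck drop
    x: float             # recorded shot location, in feet
    y: float


# Hypothetical thresholds; real models choose their own.
REBOUND_WINDOW_SECONDS = 3.0
REBOUND_MAX_FEET = 20.0


def flag_rebounds(shots: list) -> list:
    """Return one True/False flag per shot, in chronological order.

    A shot is treated as a rebound when the previous shot came from the
    same team, shortly before, and from a nearby recorded location.
    """
    flags = []
    prev = None
    for shot in sorted(shots, key=lambda s: s.game_seconds):
        is_rebound = (
            prev is not None
            and prev.team == shot.team
            and shot.game_seconds - prev.game_seconds <= REBOUND_WINDOW_SECONDS
            and math.dist((prev.x, prev.y), (shot.x, shot.y)) <= REBOUND_MAX_FEET
        )
        flags.append(is_rebound)
        prev = shot
    return flags


shots = [
    ShotEvent("A", 100.0, 80.0, 0.0),
    ShotEvent("A", 102.0, 85.0, 3.0),   # two seconds later, a few feet away
    ShotEvent("A", 130.0, 60.0, 10.0),  # much later, so not a rebound
]
print(flag_rebounds(shots))
```

Changing either threshold changes which shots get flagged, which is one reason two models can disagree about the same game.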
Expected goals is a valuable tool, but always take the time to know which model you are using and what that model represents. A few to check out:
Evolving Hockey; MoneyPuck; HockeyViz; Natural Stat Trick.
WINS ABOVE REPLACEMENT / GOALS ABOVE REPLACEMENT
Also Known As: WAR / GAR
If you are a fan of baseball, you've likely heard the term WAR, or "wins above replacement." WAR looks at a lot of different data points to distill a player's value into one single number: how many "wins" a player adds to (or subtracts from!) their team compared to a "replacement level" player. A replacement level player is a conceptual baseline: a player who neither adds nor subtracts value, so their contribution is zero. Goals above replacement does the same thing as WAR, but looks at how many goals a player contributes. GAR can also be broken out into sub-measures including offensive GAR, defensive GAR, etc. Just like expected goals, there are a few WAR and GAR models, and just like expected goals, the data we have access to today limits how robust these models can be.
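As a simplified Python sketch of how the pieces could fit together: total GAR is treated here as the sum of its sub-measures, and goals are converted to wins with a goals-per-win factor. That conversion step, the factor's value, and the component numbers are assumptions made for illustration, not the method of any particular model.

```python
from typing import Mapping


def total_gar(components: Mapping[str, float]) -> float:
    """Add up a player's GAR sub-measures (offense, defense, and so on)."""
    return sum(components.values())


def gar_to_war(gar: float, goals_per_win: float) -> float:
    """Convert goals above replacement into wins above replacement.

    goals_per_win is a hypothetical conversion factor supplied by the
    caller; each model derives its own.
    """
    if goals_per_win <= 0:
        raise ValueError("goals_per_win must be positive")
    return gar / goals_per_win


# Made-up player: strong offensively, slightly below replacement defensively.
player = {"offense": 6.0, "defense": -1.5}
gar = total_gar(player)                    # 4.5 goals above replacement
war = gar_to_war(gar, goals_per_win=6.0)   # placeholder factor of 6.0
print(gar, war)
```

A replacement level player in this sketch would have components that sum to zero, matching the baseline described above.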
Given the complexity of the game of hockey, it's certainly fair to question the validity of a "one number captures all" measure. Think of WAR and GAR as good places to start understanding a player's contribution, and as pointers toward the follow-up questions you may have about how to truly evaluate that player.
MICRO STATS
Also Known As: Passing Data | Zone Entries / Exits | Player Tracking
Everything we've talked about to this point has been "shot-based," but the next exciting batch of data we can explore looks at things that are happening leading up to a shot. This information is currently lumped into a catch-all category called "micro stats" and falls into a few categories:
Passing data: Who is making passes on a team? Where is the pass? What is the outcome of the pass?
Transition data (zone exits and entries): How does a team get out of their own zone / into their offensive zone? Who makes this happen? Who tries to keep it from happening on the other team? How often do they do it? What is the outcome?
We are just at the beginning stages of understanding the true value of this kind of information, but we are already learning what kinds of passes are most "dangerous" (most likely to lead to a goal), and what the best ways are to get the puck into the offensive zone. The only drawback to this data is that today, it is not made publicly available by the league and must be manually tracked. This means that getting our hands on this information is a much slower process than working with shot-based data.
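Once someone has tracked a game by hand, the tallying itself is straightforward. Here is a minimal Python sketch that answers "who makes entries happen, how often, and with what outcome?" for zone entries. The record layout (player, whether the entry was made with control of the puck, whether a shot followed) is a hypothetical format for this example, not a league or tracker standard.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ZoneEntry:
    player: str        # who brought or sent the puck into the offensive zone
    controlled: bool   # True if carried or passed in, False if dumped in
    led_to_shot: bool  # outcome: did a shot attempt follow the entry?


def summarize_entries(entries: list) -> dict:
    """Count entries, controlled entries, and entries followed by a shot."""
    totals = defaultdict(lambda: {"entries": 0, "controlled": 0, "with_shot": 0})
    for entry in entries:
        row = totals[entry.player]
        row["entries"] += 1
        row["controlled"] += int(entry.controlled)
        row["with_shot"] += int(entry.led_to_shot)
    return dict(totals)


# Made-up tracking data for two invented players.
tracked = [
    ZoneEntry("Player A", controlled=True, led_to_shot=True),
    ZoneEntry("Player A", controlled=False, led_to_shot=False),
    ZoneEntry("Player B", controlled=True, led_to_shot=False),
]
for player, row in summarize_entries(tracked).items():
    print(player, row)
```

The same pattern would work for zone exits or passes by swapping in the fields that were tracked.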