The Hockey Stick Effect: Wayne Gretzky’s Goals, Part 1

1 Apr

What is the measure of greatness? How about 894 records, one for each of the goals driven home by the National Hockey League’s Wayne Gretzky, aka the Great One?

That spreadsheet is as large as it gets for NHL scorers, and Tableau ace Ben Jones has infused the goal count with lots of supplementary background about each and every one of the 894, archiving the data for download on the data.world site here.

In fact the workbook makes itself available in both Excel and CSV mode, the latter requiring a text-to-columns parsing that brings it into line with the former. Either way, a few organizational points need to be entered.

For one thing, you’ll note that what’s called the Rank field in column A numerically ids Gretzky’s goals, in effect sorting them by newest to oldest. That is, Gretzky’s first goal – scored on October 14, 1979 – has received id 894, with the numbers decrementing ahead in time until his final score – tallied almost exactly 20 years ago on March 29, 1999 – has bottomed out with the number 1. It seems to me – and I suspect you’ll share the opinion – that the enumeration should have pulled in the opposite direction, with Gretzky’s last goal more properly checking in at 894. With that determination in mind I reversed the sequence via a standard autofill, entering 894 in cell A2, 893 in A3, and copying down.
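For readers who would rather script the renumbering than autofill it, here is a minimal Python sketch. It assumes the rows arrive in the file's original order, newest goal first; the three toy rows and the helper name reverse_rank are illustrative only, and just the first and last dates come from the data described above.

```python
# Toy rows in the file's original order: newest goal first, so Rank 1 is the final goal.
# Only the first and last dates come from the post; the middle row is made up.
rows = [
    {"Rank": 1, "Date": "1999-03-29"},
    {"Rank": 2, "Date": "1999-03-01"},  # hypothetical row
    {"Rank": 3, "Date": "1979-10-14"},
]


def reverse_rank(rows, field="Rank"):
    """Return copies of the rows renumbered from len(rows) down to 1, top to bottom."""
    total = len(rows)
    return [{**row, field: total - i} for i, row in enumerate(rows)]


renumbered = reverse_rank(rows)
for row in renumbered:
    print(row["Rank"], row["Date"])
```

On the full sheet the same logic would hand 894 to the top row and 893 to the next, which is what the autofill does.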

You’ll also be struck by the unremittingly monotonic entries in the Scorer field, comprising 894 iterations of the name Wayne Gretzky. We’ve seen this before in other data sets, of course; the field was likely dragged in as an accessory to some generic download protocol. You can either ignore the field or delete it. Either way, you’re not going to use it.

And your curiosity will be stirred anew by the blank column-heading cells idling atop columns D, F, and G. It’s difficult to believe that Ben Jones, who doubtless knows whereof he speaks, would allow these most rudimentary oversights to escape his notice, but alternative explanations notwithstanding, the headings aren’t there and must be supplied.

Column D reports a binary datum – whether a Gretzky goal was scored at his team’s arena or at the rink to which his team traveled for an away game. I’ll thus entitle the field Home/Away and proceed to do something about the data themselves, whose cells remain empty when signifying a home goal and register an @ for “at”, that is, a goal netted at someone else’s arena. A pair of finds and replaces – the first substituting an H for the blank cells, the second supplanting the @ signs with a companion, alphabetic A – should sharpen the field’s intelligibility.
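The same pair of substitutions can be sketched in Python. The function name recode_venue and the sample values are hypothetical; the only assumption carried over from the sheet is that a blank means home and an @ means away.

```python
def recode_venue(value):
    """Map the raw column-D marker to H (home) or A (away)."""
    marker = value.strip()
    if marker == "":
        return "H"
    if marker == "@":
        return "A"
    # Anything else is unexpected, so fail loudly rather than guess.
    raise ValueError(f"unexpected Home/Away marker: {value!r}")


raw_column_d = ["", "@", "", "@"]  # made-up sample cells
home_away = [recode_venue(v) for v in raw_column_d]
print(home_away)
```

Raising on an unknown marker is a design choice for the sketch: a silent default would hide any third value lurking in the column.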

The headless column F archives game outcomes, i.e. wins, losses, or ties, and so I’ll call the field Result, or something like it. Column G denotes the phase of a game when the goal was scored, either during regulation time or overtime – or so I assumed. But a second thought soon followed on the heels of that hunch, if I may mangle the metaphor: it occurred to me that the Regulation/Overtime opposition simply recalls whether or not the game itself swung into an overtime period, irrespective of the actual times at which Gretzky scored. Could that uncertainty be relieved?

I think so, and I played it this way: first, I named the doubtful field Reg/OT, and ran a find and replace at the G column, substituting Reg for any empty cell therein. I then moved toward a pivot table:

Row Labels: Date (ungrouped, in order to exhibit each date)

Columns: Reg/OT

Values: Date (Count)

What I found is that no game date featured a value for both a regulation and overtime goal, a discovery that goes quite some way toward clinching the second speculation – namely, that the Reg/OT field entries do no more than inform us if the games necessitated an overtime period.

After all, if we confine the analysis momentarily to the games that spilled into overtime, one could most reasonably imagine that a scorer with Gretzky’s gifts would have occasionally lodged a goal in both the regulation and overtime phases of the same game; but the pivot table uncovers no such evidence. For any given date, Gretzky’s score(s) appear in either the OT or the Reg column. Moreover, some of the games – for example, November 27, 1985 – record two overtime goals, a unicorn-like impossibility in a sport in which overtime ends when the first goal is scored. (You’ll note by the way that the overtime-column goals only begin to appear in 1983, when a five-minute overtime period was instituted.)
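The pivot-table check can also be approximated in a few lines of Python, offered here as a sketch rather than a record of what I actually ran. It assumes the blank Reg/OT cells have already been filled with Reg; apart from the November 27, 1985 pair mentioned above, the rows are made up, and the names phase_counts_by_date, mixed_dates and multi_ot_dates are mine.

```python
from collections import Counter, defaultdict

# (date, Reg/OT) pairs. Only the two 1985-11-27 overtime entries reflect the post;
# the remaining rows are hypothetical filler.
goals = [
    ("1985-11-27", "OT"),
    ("1985-11-27", "OT"),
    ("1985-12-01", "Reg"),
    ("1985-12-01", "Reg"),
    ("1985-12-05", "Reg"),
]


def phase_counts_by_date(goals):
    """Count goals per date and per Reg/OT value, like the pivot table's grid."""
    table = defaultdict(Counter)
    for date, phase in goals:
        table[date][phase] += 1
    return table


table = phase_counts_by_date(goals)

# Dates showing goals in both columns: none would support the "game-level flag" reading.
mixed_dates = sorted(d for d, c in table.items() if c["Reg"] and c["OT"])

# Dates with more than one OT goal, which sudden-death overtime should not allow.
multi_ot_dates = sorted(d for d, c in table.items() if c["OT"] > 1)

print(mixed_dates, multi_ot_dates)
```

An empty mixed_dates list together with a non-empty multi_ot_dates list is the pattern that points toward Reg/OT describing the game and not the goal.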

Thus I’d aver that the Reg/OT field conveys little understanding of Gretzky’s scoring proclivities; all it does is identify games that happened to have extended themselves into overtime, and in which he scored – some time.

The Strength field cites the demographic possibilities under which Gretzky accrued his goals: EV refers to even strength, when both teams’ numeric complements on ice were equal; PP, or power play, marks goals scored while the scoring team temporarily outnumbered the other after a player was remanded to the penalty box; and SH, or shorthanded, is the rarest eventuality – when Gretzky scored while his team was outnumbered.

I do not, however, know with certainty what the EN entry in the Other field represents even though I probably should, and I see nothing in data.world’s data dictionary that moves to define it. It may very well stand for end, as in end of game, however; each of its 56 instances is joined to a goal that was scored with fewer than two minutes left in its game. EN may then stand for scores achieved after the opposing goalie skated off in a losing cause and was replaced by an offensive player, in order to buttress a desperate try at equalizing the game. Indeed – all 56 of the EN goals were scored in wins by Gretzky’s team.

As a matter of fact, I think I’m right. Filter the Other field for its ENs and look leftward at the Goalie field in L. There’s nothing there.
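That filter-and-look test is easy to put in Python as well. The sketch below is an assumption-laden illustration: the three rows are invented, the goalie name is a placeholder, and I am guessing that wins are coded W in the Result field.

```python
# Made-up rows using the field names from the post; "W"/"L" result codes are assumed.
goals = [
    {"Other": "EN", "Goalie": "", "Result": "W"},
    {"Other": "", "Goalie": "Hypothetical Goalie", "Result": "L"},
    {"Other": "EN", "Goalie": "", "Result": "W"},
]

# Keep only the goals flagged EN in the Other field.
en_goals = [g for g in goals if g["Other"] == "EN"]

# Every EN goal should have a blank Goalie cell if no goalie was in the net.
all_unguarded = all(g["Goalie"].strip() == "" for g in en_goals)

# Every EN goal should also belong to a game Gretzky's team won.
all_wins = all(g["Result"] == "W" for g in en_goals)

print(len(en_goals), all_unguarded, all_wins)
```

If both flags come back true across all 56 EN rows of the real sheet, the absent-goalie reading looks hard to dispute.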
