Simulating the 2019 NHL Draft 1 Million Times

By guest Contributor Tyler @TylemakerV

Last year, I wrote a program to simulate the first round of the NHL draft 1 million times based on public rankings. I posted the results on social media and people appeared to enjoy them. I also really enjoyed putting it together and seeing the results, so I decided to do it again. Rig City Sports was kind enough to let me post a writeup here. Feel free to skip to the results.

Intro and Purpose

The goal is to see, given publicly available information, the likelihood of each player getting selected at any given spot in the draft. Rather than just consolidating a list of rankings, I wanted to run the draft over and over to really account for where players might go. This helps capture the true range of expected draft results (given the public rankings), including potential drops. Essentially, I forced a computer to do 1 million mock drafts.

Rankings

For the purpose of this simulation, I used 17 different rankings. To protect IP, I did not link or post rankings behind a paywall: Future Considerations, Russ Cohen (Sportsology), Craig Button (TSN), Prospect Pipeline, ISS Hockey, Hockey Prospect, Ryan Kennedy (THN), McKeens Hockey, Corey Pronman (Athletic), Cam Robinson (Dobber Prospects), Sam Cosentino (SN), Bob McKenzie (TSN), Scott Wheeler (Athletic), EliteProspects, Tony Ferrari (Dobberprospects), Will Scouch (Scouching), Steve Kournianos (Draft Analyst).
I tried to get a good range of opinions, as the more diverse the lists, the better this works. This year actually has considerably more diverse rankings than last year. The top 2 are clear cut. After that, there is a ton of disparity (such as Zegras as high as 3rd and as low as 21st!). I was really curious how this would play out in the simulation.

Methodology

This model works by simulating the draft 1 million times. Teams are randomly assigned rankings to use in each iteration.
Each simulation essentially works like this: Every team walks into Vancouver on draft day and randomly selects a piece of paper containing one of the aforementioned rankings. They then select the highest-ranked player still available to them at their pick. Once all 31 picks are made, we restart. 31 picks, repeated 1 million times for 31 million total selections.
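The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual program: the function names, the toy players A through D, and the three toy lists are all made up, and it assumes each pick draws a list uniformly at random with replacement (with 31 picks and 17 lists, some lists must be reused).

```python
import random
from collections import Counter, defaultdict


def simulate_draft(rankings, num_picks, rng):
    """Run one mock draft and return the players in the order they were picked."""
    taken = set()
    order = []
    for _ in range(num_picks):
        # The team "draws a piece of paper": one ranking list, chosen at random.
        ranking = rng.choice(rankings)
        # It then takes the highest-ranked player on that list who is still available.
        pick = next(player for player in ranking if player not in taken)
        taken.add(pick)
        order.append(pick)
    return order


def run_simulations(rankings, num_picks, num_sims, seed=0):
    """Tally how often each player goes at each pick number."""
    # A list with at least num_picks names can never be exhausted mid-draft.
    if any(len(ranking) < num_picks for ranking in rankings):
        raise ValueError("every ranking needs at least num_picks players")
    rng = random.Random(seed)
    counts = defaultdict(Counter)  # counts[player][pick_number] -> occurrences
    for _ in range(num_sims):
        order = simulate_draft(rankings, num_picks, rng)
        for pick_number, player in enumerate(order, start=1):
            counts[player][pick_number] += 1
    return counts


# Toy input (hypothetical players and lists, for illustration only).
toy_rankings = [
    ["A", "B", "C", "D"],
    ["A", "B", "D", "C"],
    ["A", "C", "B", "D"],
]
toy_counts = run_simulations(toy_rankings, num_picks=3, num_sims=10_000)
for player in sorted(toy_counts):
    print(player, dict(sorted(toy_counts[player].items())))
```

Player A tops every toy list, so A goes 1st in every run, just as Hughes does in the real simulation. Everything after that depends on which lists get drawn.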

Benefits and Shortfalls of this Analysis

The entire idea behind this sort of analysis is to try and capture the potential fallers in a draft, something that isn’t captured by an average of public rankings. Consider the following scenario:

Suppose the internal rankings of the Blackhawks, Avs, and Kings look like this:

Rank  Blackhawks  Avalanche  Kings
#1    Hughes      Hughes     Hughes
#2    Kakko       Kakko      Kakko
#3    Turcotte    Podkolzin  Zegras
#4    Byram       Byram      Byram

These lists are reasonable, as each of these players is ranked as high as 3rd on at least one of the aforementioned lists. On draft day, with these three teams picking 3rd, 4th, and 5th, Turcotte will go 3rd, Podkolzin 4th, and Zegras 5th. Byram would then fall to 6th. So despite all teams agreeing that Byram is the 4th best player in the draft, the hockey world will be stunned as they watch him fall to 6th (or beyond).
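The scenario can be traced step by step. The short sketch below assumes Hughes and Kakko are already gone at 1st and 2nd and that the three teams then pick in the order Blackhawks, Avalanche, Kings; the variable names are only illustrative.

```python
# Each team's internal list, best player first (from the scenario above).
team_lists = {
    "Blackhawks": ["Hughes", "Kakko", "Turcotte", "Byram"],
    "Avalanche": ["Hughes", "Kakko", "Podkolzin", "Byram"],
    "Kings": ["Hughes", "Kakko", "Zegras", "Byram"],
}

# Assumption: the consensus top 2 are taken with picks 1 and 2.
taken = ["Hughes", "Kakko"]

for team in ["Blackhawks", "Avalanche", "Kings"]:
    # Each team takes the best player still available on its own list.
    pick = next(player for player in team_lists[team] if player not in taken)
    taken.append(pick)
    print(f"Pick {len(taken)}: {team} select {pick}")

# Byram is 4th on every list but is still on the board after five picks.
byram_earliest = len(taken) + 1
print(f"Byram is still available; the earliest he can go is pick {byram_earliest}")
```

No team ever passes on a player it ranks higher, yet the player every team has 4th cannot go before 6th.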

This is the reason we see certain players, such as Zadina, Veleno, and Wahlstrom, fall well below where anyone imagined. Last year, when I ran this simulation, all 15 rankings I used had Wahlstrom in the top 10. One would assume this would’ve made him a lock for a top-10 pick. I saw people online claiming “there is NO WAY he could fall to 10th overall, look at the rankings!”. Nonetheless, using those same rankings, he fell outside the top 10 in a whopping 1.24% of the simulations. Lo and behold, on draft day, he slipped to 11th. We see this with Caufield this year, who is ranked 13th or higher in all 17 rankings but falls past the 13th pick in over 1% of simulations. This is something only a simulation could catch (there is theoretically a way to compute the exact probability at each selection mathematically, but this would be incredibly complex after the first few selections). Furthermore, you will see some players who are ranked in similar ranges by most rankings yet have drastically different simulation results. This is because order matters. It can be quite tricky to see where a player might go based solely on rankings (even more so on consolidated ranking averages).
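To show what the exact calculation looks like, and why it does not scale, here is a small sketch that enumerates every possible assignment of lists to picks for a tiny draft. It reuses the three four-player lists from the scenario above as if they were the public rankings, and assumes each pick draws a list uniformly at random with replacement, so every assignment is equally likely. The names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Three toy "public rankings"; Byram is 4th on every one of them.
rankings = [
    ["Hughes", "Kakko", "Turcotte", "Byram"],
    ["Hughes", "Kakko", "Podkolzin", "Byram"],
    ["Hughes", "Kakko", "Zegras", "Byram"],
]
NUM_PICKS = 4


def draft(assignment):
    """Play out one draft, where assignment[i] is the list used at pick i + 1."""
    taken = []
    for list_index in assignment:
        best = next(p for p in rankings[list_index] if p not in taken)
        taken.append(best)
    return taken


# Every way to hand one of the 3 lists to each of the 4 picks: 3**4 = 81 cases.
assignments = list(product(range(len(rankings)), repeat=NUM_PICKS))

# Count the cases where Byram is not among the first 4 players taken.
falls = sum("Byram" not in draft(a) for a in assignments)
p_fall = Fraction(falls, len(assignments))
print(f"P(Byram falls past 4th) = {p_fall}")

# The same enumeration for 17 lists and 31 picks would need this many cases.
print(f"Full-size problem: {17 ** 31:.2e} assignments")
```

In this toy draft Byram falls past 4th two times out of three, even though every list has him 4th. The full problem has roughly 10^38 assignments, which is why sampling a million of them is the practical route.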

Conversely, this analysis has no ability to predict potential “reaches”, where a player goes higher than expected. Last year we saw Hayton go 5th overall, which was deemed “impossible” by my simulation as no ranking had him 5th or higher. It’s very likely this happens again. For example, I have seen quite a bit of Philip Tomasino hype online. There are a few rankings that came out after my simulation that have him in the top 12. I wouldn’t be completely shocked to see him go in the 12-16 range. However, as none of the rankings used in this simulation have him higher than 17th, the simulation cannot predict him going higher than this.

Results

Let’s cut the preamble and get to the results. A spreadsheet of the results can be found here. Likewise, an album of all the players’ individual results can be found here.

The “Top 11”

[Figure: The “Top 11”. Note that Hughes and Kakko are just lines.]

After running the sim, I found there was a fairly defined “Top 11”. Each of these 11 players was picked 11th or earlier in at least 75% of the simulations. No other player cracked 11th or higher more than 25% of the time. Let’s break that down further.
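The cutoff behind this grouping is a cumulative share: add up how often a player went at or before a given pick and divide by the number of simulations. A minimal sketch follows, using made-up counts for two placeholder players (these are not the real results) and a hypothetical helper name.

```python
def share_picked_by(position_counts, cutoff, num_sims):
    """Fraction of simulations in which a player went at pick `cutoff` or earlier."""
    hits = sum(n for pick, n in position_counts.items() if pick <= cutoff)
    return hits / num_sims


# Hypothetical tallies, pick number -> number of simulations (illustration only).
NUM_SIMS = 1000
toy_counts = {
    "Player A": {9: 600, 10: 250, 12: 150},
    "Player B": {11: 200, 13: 500, 15: 300},
}

CUTOFF = 11
THRESHOLD = 0.75
core_group = [
    player
    for player, counts in toy_counts.items()
    if share_picked_by(counts, CUTOFF, NUM_SIMS) >= THRESHOLD
]

for player, counts in toy_counts.items():
    print(player, share_picked_by(counts, CUTOFF, NUM_SIMS))
print("Core group:", core_group)
```

Here Player A clears the 75% bar (85% at 11th or earlier) while Player B does not (20%), which mirrors the gap between the Top 11 and everyone else.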

The big 2

There isn’t much to say about these 2. Hughes and Kakko go 1st and 2nd respectively in all 1,000,000 simulations. I expect the same on Friday.

The rest of the top 11

After the big 2, there is a bit of contention around 3rd overall. The recent general consensus is it will be either USNTDP forward Alex Turcotte or WHL defenseman Bowen Byram. The simulations agreed. Notably, however, these players both have a reasonable chance of falling to 5th or beyond. There’s even a vanishingly small chance Turcotte falls outside the top 10, as seen in 21 out of the 1 million simulations. After this is a big pack that includes 3 WHL forwards (Dylan Cozens, Kirby Dach, Peyton Krebs) and 3 USNTDP forwards (Trevor Zegras, Cole Caufield, Matt Boldy). Finally, there is the Russian enigma Vasili Podkolzin, who ended up higher in this sim than I expected. Overall, after Hughes and Kakko, it gets interesting.

The Middle Pack

[Figure: The middle pack]

The next somewhat defined group is the 13 players seen above.

6 of the players in this group sneak into the top 10 in at least one simulation. BCHL Forward Alex Newhook and Swedish D-men Philip Broberg and Victor Soderstrom are the only players in this group that crack the top 15 in over 50% of simulations.

Soderstrom is a particularly interesting one: despite going top 20 in approximately 99% of simulations, he fell to the mid and late twenties in a handful of sims. He had the widest range of selections (as high as 8th). He even dropped to the second round in one single sim. You can now confidently say there is a 1 in a million chance that Victor Soderstrom drops to the second round.

The Wild Cards

Also in this group of 13, we find 3 wild cards. OHL winger Arthur Kaliyev is arguably the most polarizing prospect this year. Some rankings have him as high as 11th, and others have him outside their top 31. Yet it looks as though the 2 extremes will balance out nicely, as he’s projected to go in the middle of the first round. Notably, he fell outside the first round in 4 sims. American goaltender Spencer Knight is heralded as the best goalie prospect since Vasilevsky. He could honestly go anywhere. His results here are likely skewed by the fact that some of the rankings used separate their goaltenders from their skaters. Finally, Russian forward Pavel Dorofeyev is all over the rankings, and this resulted in a cool-looking graph.

The Rest of the Expected First Rounders

[Figure: The best of the rest]

Admittedly, the results are pretty blurry around this range. As often seen in real life, players ranked 40th or 50th or later will squeeze into the late first round. Likewise, guys considered “late first round picks” will drop outside. But here are the only other players who are first round picks in at least 30% of the simulations. I’ll touch on a few of these players.

Nils Hoglander and Philip Tomasino could probably have mixed in with the previous group, as they are both very likely first round picks. Hoglander, a Swedish forward, only fell to the second round in 7.87% of simulations but also isn’t projected to go much higher than 20th. Tomasino, as mentioned earlier, has many fans, and there is a realistic chance he jumps into the top 15. Likewise, he only fell to the second round in about 3% of simulations.

One of the widest ranges belonged to QMJHL forward Jakob Pelletier, who went anywhere from 14th (in 0.0023% of sims) to falling out of the first round (in 23.77% of sims). Swedish forward Simon Holmstrom also had a large range (as high as 16th), but is more likely (~53%) to go in round 2.

Could this type of analysis help an NHL team?

The entire point of this analysis is to show just how random the NHL draft is. It’s chaotic. Accurately projecting where a player might go is extremely risky for NHL teams. However, we can also find some order in the randomness. Work on draft optimization has been done before, and could potentially be combined with this sort of analysis, which does a better job of predicting where a player might go than looking at rankings alone. The problem is that NHL teams’ rankings vary wildly from public opinion and from one another, especially after the first 30 or 40. At that point anything can happen, and we see players get picked seemingly out of nowhere.

Nonetheless, this analysis could be useful in trade decisions. Let’s take the Oilers for example, who are rumored to be very high on Philip Broberg. They might be planning on taking him at 8th. Suppose there’s an offer on the table to move down 5 spots in exchange for an asset. The Oilers might look at their internal lists, think there’s no way Broberg falls past 11th or 12th, and thus turn down the offer. However, given the information we know, Broberg is likely still available at pick 13. He falls that far in over 63% of simulations. Armed with this knowledge, the Oilers could more accurately weigh the costs and risks of moving down in the draft against the asset being offered.

Implications for the Oilers

This being an Edmonton blog, we should definitely touch on how this affects the Oilers.

Picking 8th, the Oilers are likely to have several high-quality options available. In the best-case scenario, Alex Turcotte falls all the way to them, but this only happens in 0.6951% of simulations. Recent rumors suggest they might look at Philip Broberg in this range; he will almost surely be available, as he gets picked in the top 7 in only 0.4% of simulations. I personally hope the Oilers target one of the highly talented North American forwards. Several of Cozens (39.51%), Dach (53.82%), Boldy (95.64%), Caufield (61.51%), Zegras (29.86%), and Krebs (88.16%) will still be available at 8th overall. It is very possible one of these is a future Oiler.
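One way to put a number on “several” is to add the six percentages. This assumes each figure is the share of simulations in which that player is still on the board at 8th, which is my reading of the numbers above. By linearity of expectation, the sum is the average number of these forwards available, even though the individual outcomes are not independent. The variable names are illustrative.

```python
# Share of simulations in which each forward is still available at pick 8
# (figures quoted in the paragraph above).
available_at_8 = {
    "Cozens": 0.3951,
    "Dach": 0.5382,
    "Boldy": 0.9564,
    "Caufield": 0.6151,
    "Zegras": 0.2986,
    "Krebs": 0.8816,
}

# Expected number of the six still on the board when the Oilers pick.
expected_available = sum(available_at_8.values())
print(f"Expected forwards available at 8th: {expected_available:.3f}")
```

On average, between three and four of these six are still there at 8th. There is also a hard floor: Hughes and Kakko take two of the first seven picks in every simulation, so at most five of the six forwards can be gone, and at least one is always available.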

As for the Oilers’ second round pick at 38th overall, some interesting names may still be available. Obviously the pipe dream is to see one of the big names like Suzuki, Tomasino, Heinola, or Lavoie slip all the way to 38th, but as seen above, this is highly unlikely. However, others could fall to 38th, including McMichael, Holmstrom, Poulin, and Honka. Each of these players slipped to round 2 in about half or more of the 1 million simulations, which means 38th is somewhat possible.

FAQ

Here’s a quick rundown of questions people have asked in the past, or that you may have.

  • Why only simulate the first round?
    • A couple of reasons. Firstly, many of these rankings only go to 31 spots and I don’t want to over-saturate on the longer lists. Secondly, many of these rankings are public to #31 but behind a paywall for everything afterwards. Thirdly, there is so much variety in lists beyond the first 20-30 picks that it’s basically useless projecting beyond that.
  • How did you do this?
    • I modified and re-used my Python program from last year for the actual simulations. The rankings and results were compiled in Excel. I used R (including the ggplot2 and dplyr packages) to create the graphs and charts.
    • The Python script took 28 minutes and 48.57 seconds to complete the 1 million simulations.
  • Can I see or use the code?
    • Yes, the code is available here on GitHub.
    • Feel free to download it or view it. I’m not a software developer so it’s not very nice code, but it could be modified for other drafts (NBA/NFL).

Other Links and Closing Thoughts

In case you missed it, here are some links to the results as well as a neat read:

Thanks for reading. I’m excited for Friday to see what might happen. I also may be posting occasionally on this site from now on. Feel free to toss me a follow on Twitter at @TylemakerV if you want.
