
Using Pythagorean Expectation to predict remainder of VHL S66


DMaximus


As we move past the midway point of the VHL season, I thought it would be a good time to use some statistics to help predict the remainder of the season. Pythagorean Expectation is one of the better methods for building a predictive model.[1] It takes its name from our beloved Pythagorean Theorem, which the formula resembles, and produces an expected win percentage based on Goals For and Goals Against. Researchers found that a more precise model could be created using an exponent other than Pythagoras’ 2. To determine the proper exponent, take the total goals scored per game (both teams combined) and raise it to the power of .458.[2]

 

For the VHL this year, there have been 1282 goals over 218 games, which works out to 5.88 goals per game. Raising 5.88 to the power of .458 gives us a Pythagorean exponent of 2.251.
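If you want to check the exponent yourself, here is a minimal Python sketch of that step. The variable names are just illustrative, and the .458 constant is the NHL-derived value from footnote [2], assumed rather than fitted for the VHL.

```python
# League-wide totals quoted in the post.
total_goals = 1282
total_games = 218

# Goals per game, both teams combined.
goals_per_game = total_goals / total_games

# 0.458 is the NHL-derived constant from footnote [2];
# it is an assumption for the VHL, not a fitted value.
exponent = goals_per_game ** 0.458

print(f"goals per game: {goals_per_game:.2f}")
print(f"Pythagorean exponent: {exponent:.3f}")
```

Run as written, this should reproduce the 5.88 goals per game and the 2.251 exponent quoted above.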

Now we take that exponent, which is based on this season's scoring rate, and apply it to each team’s goals for (GF) and goals against (GA) to figure out its expected winning percentage: GF^2.251 / (GF^2.251 + GA^2.251).[3]
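For anyone who wants to check a single team, the expected winning percentage can be sketched in a few lines of Python. The function name expected_win_pct is just illustrative, and the default exponent is the 2.251 worked out for this season.

```python
def expected_win_pct(goals_for, goals_against, exponent=2.251):
    """Pythagorean expected winning percentage from goals for and against."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


# Helsinki Titans: 148 goals for, 102 goals against.
helsinki = expected_win_pct(148, 102)
print(f"Helsinki expected winning %: {helsinki:.3f}")
```

A team that scores exactly as many goals as it allows comes out at .500 no matter what exponent is used, which is a handy sanity check.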

 

Here are our results for the current season:

| Team | GP | Wins | Goals For | Goals Against | Expected Winning % | Actual Winning % | Win differential |
|---|---|---|---|---|---|---|---|
| Helsinki Titans | 44 | 33 | 148 | 102 | 0.698 | 0.750 | 2.286 |
| Riga Reign | 44 | 26 | 123 | 105 | 0.588 | 0.591 | 0.123 |
| Calgary Wranglers | 43 | 24 | 124 | 110 | 0.567 | 0.558 | -0.382 |
| HC Davos Dynamo | 44 | 22 | 135 | 122 | 0.557 | 0.500 | -2.496 |
| New York Americans | 44 | 23 | 113 | 110 | 0.515 | 0.523 | 0.334 |
| Vancouver Wolves | 43 | 23 | 133 | 132 | 0.504 | 0.535 | 1.317 |
| Toronto Legion | 43 | 21 | 129 | 139 | 0.458 | 0.488 | 1.303 |
| Malmo Nighthawks | 43 | 21 | 121 | 120 | 0.505 | 0.488 | -0.701 |
| Moscow Menace | 44 | 15 | 153 | 177 | 0.419 | 0.341 | -3.424 |
| Seattle Bears | 44 | 10 | 103 | 165 | 0.257 | 0.227 | -1.315 |

 

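Win differential is a team's actual wins minus its expected wins (expected winning % times games played). A positive win differential means the team has won more games than expected, so it is over-performing. A negative one means it has won fewer than expected, so it is under-performing.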
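As a rough sketch, the win differential column can be recomputed like this. The function name is hypothetical, and because it uses the rounded percentages from the table, the last decimal can differ slightly from the posted values.

```python
def win_differential(wins, games_played, expected_pct):
    """Actual wins minus the wins the Pythagorean model expected so far."""
    expected_wins = expected_pct * games_played
    return wins - expected_wins


# Helsinki Titans: 33 wins in 44 games, expected winning % of 0.698.
print(f"Helsinki: {win_differential(33, 44, 0.698):+.3f}")

# Moscow Menace: 15 wins in 44 games, expected winning % of 0.419.
print(f"Moscow:   {win_differential(15, 44, 0.419):+.3f}")
```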

 

Using the expected winning percentage, we can figure out the expected number of wins each team should get over the remaining games in the season:

| Team | Expected wins for remainder of season |
|---|---|
| Helsinki Titans | 19.545 |
| Riga Reign | 16.467 |
| Calgary Wranglers | 16.443 |
| HC Davos Dynamo | 15.589 |
| New York Americans | 14.424 |
| Vancouver Wolves | 14.623 |
| Toronto Legion | 13.284 |
| Malmo Nighthawks | 14.635 |
| Moscow Menace | 11.724 |
| Seattle Bears | 7.201 |
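The remaining-wins figures are just the expected winning percentage multiplied by the games each team has left. Here is a small sketch of that step. The 72-game season length is an assumption on my part, inferred from the wins and losses in the projected standings adding up to 72, and the function name is illustrative.

```python
SEASON_GAMES = 72  # assumption: inferred from wins + losses in the projection


def expected_remaining_wins(expected_pct, games_played, season_games=SEASON_GAMES):
    """Expected wins over the games a team has not yet played."""
    games_left = season_games - games_played
    return expected_pct * games_left


# Helsinki Titans: 0.698 expected winning %, 44 games played (28 left).
print(f"Helsinki: {expected_remaining_wins(0.698, 44):.3f}")

# Calgary Wranglers: 0.567 expected winning %, 43 games played (29 left).
print(f"Calgary:  {expected_remaining_wins(0.567, 43):.3f}")
```

Because the inputs here are the rounded percentages from the table, the results can be off from the posted numbers by a thousandth or so.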

 

Adding those expected wins to each team's current win total gives us a crude prediction for the final standings. I say crude because we’re not accounting for OT losses, which means we’ll underestimate the total points most teams have at the end of the year.

 

Here’s the final standings prediction using this model:

| Team | Wins | Losses | Points |
|---|---|---|---|
| Helsinki Titans | 53 | 19 | 106 |
| Riga Reign | 42 | 30 | 84 |
| Calgary Wranglers | 40 | 32 | 80 |
| HC Davos Dynamo | 38 | 34 | 76 |
| Vancouver Wolves | 38 | 34 | 76 |
| New York Americans | 37 | 35 | 74 |
| Malmo Nighthawks | 36 | 36 | 72 |
| Toronto Legion | 34 | 38 | 68 |
| Moscow Menace | 27 | 45 | 54 |
| Seattle Bears | 17 | 55 | 34 |
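Putting the whole method together, the projection can be sketched end to end in Python using the GP, wins, goals for and goals against from the current-season table. A few things here are assumptions rather than anything stated in the post: the 72-game season (inferred from wins plus losses), rounding to the nearest whole win, and the names TEAMS and project_standings. Points are counted as two per win with nothing for overtime losses, which is the crude part mentioned above.

```python
# (team, games played, wins, goals for, goals against) from the current-season table.
TEAMS = [
    ("Helsinki Titans", 44, 33, 148, 102),
    ("Riga Reign", 44, 26, 123, 105),
    ("Calgary Wranglers", 43, 24, 124, 110),
    ("HC Davos Dynamo", 44, 22, 135, 122),
    ("New York Americans", 44, 23, 113, 110),
    ("Vancouver Wolves", 43, 23, 133, 132),
    ("Toronto Legion", 43, 21, 129, 139),
    ("Malmo Nighthawks", 43, 21, 121, 120),
    ("Moscow Menace", 44, 15, 153, 177),
    ("Seattle Bears", 44, 10, 103, 165),
]

SEASON_GAMES = 72     # assumption: inferred from wins + losses in the projection
NHL_CONSTANT = 0.458  # from footnote [2]; not fitted for the VHL


def project_standings(teams, season_games=SEASON_GAMES):
    """Return (team, wins, losses, points) rows sorted by projected points."""
    total_goals = sum(gf for _, _, _, gf, _ in teams)
    # Every game shows up in two teams' GP, so halve the sum.
    total_games = sum(gp for _, gp, _, _, _ in teams) / 2
    exponent = (total_goals / total_games) ** NHL_CONSTANT

    rows = []
    for name, gp, wins, gf, ga in teams:
        pct = gf ** exponent / (gf ** exponent + ga ** exponent)
        # Current wins plus expected wins over the remaining games,
        # rounded to the nearest whole win (an assumption).
        final_wins = round(wins + pct * (season_games - gp))
        losses = season_games - final_wins
        rows.append((name, final_wins, losses, 2 * final_wins))
    return sorted(rows, key=lambda row: row[3], reverse=True)


standings = project_standings(TEAMS)
for name, wins, losses, points in standings:
    print(f"{name:<20} {wins:>3} {losses:>3} {points:>4}")
```

Swapping in updated totals later in the season only requires editing the TEAMS list, since the exponent is recomputed from the league-wide goals each time.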

 

Seeing how close the projected standings are makes me wish I had taken overtime losses into account, because they will make a huge difference in how the actual final standings turn out. But I still think this will be a fairly accurate projection of the actual final results.

 

* I apologize for the table formatting; it didn't copy over nicely and I can't get it to copy with the borders properly.

 

[1] See this paper for an analysis of predictive models in hockey. The authors conclude that a Poisson model gives the best results. Computing that is well outside my current capabilities. http://www.hockeyanalytics.com/Research_files/Win_Probabilities.pdf

 

[2] Why .458? It was determined to be the optimal rate for the NHL and it’s beyond me to figure out what it should be for the VHL.  http://www.hockeyanalytics.com/Research_files/Win_Probabilities.pdf

 


58 minutes ago, DMaximus said:

Pythagorean Expectation is one of the better methods to use to create a predictive model

So did you do this all manually? Because this seems like a lot of work.


55 minutes ago, Beaviss said:

Interesting take but one thing you didn't take into account is STHS being STHS.


This is kinda sad. 

What were the results of the internal testing for FHM5 as a sim engine again? lol


2 minutes ago, Peace said:


This is kinda sad. 

What were the results of the internal testing for FHM5 as a sim engine again? lol

 

never did FHM5


3 minutes ago, Peace said:


This is kinda sad. 

What were the results of the internal testing for FHM5 as a sim engine again? lol

 

Just now, Beaviss said:

 

never did FHM5

 

Would EHM (Eastside Hockey Manager) have been tested? It is older but better; I know some of us play it and enjoy it.


9 minutes ago, Rayzor_7 said:

So did you do this all manually? Because this seems like a lot of work.

 

The formulas are already established and used by many statisticians across many sports. I took their formulas, threw in the VHL numbers and spit out the results.


5 minutes ago, Rayzor_7 said:

 

 

It is older but better; I know some of us play it and enjoy it.

 

I disagree. 

EHM's user interface is better - much better - but FHM5's sim engine is superior to EHM's. I honestly think a lot of people wear rose-tinted glasses when comparing EHM to FHM5, but if we were talking about FHM4 I'd be leaning towards EHM as well. FHM5 brought a lot of significant upgrades to the series, and beyond the user interface it's a treat to play. Now obviously I can only base my opinion on observation, but I've put in over five hundred hours on both titles since their release and I honestly believe FHM5's sim engine is the better choice if VHL went down that route.


38 minutes ago, Rayzor_7 said:

 

 

Would EHM (Eastside Hockey Manager) have been tested? It is older but better; I know some of us play it and enjoy it.

 

It was attempted just recently in the NSHL. Player import and stat exports are difficult, to say the least. And because player attributes change on a daily basis, you basically have to re-upload the player file each day. Add to that the difficulties with modifying schedules and adding/removing teams and the crash-happiness of EHM as a whole and the league was a mess.


37 minutes ago, Enorama said:

 

It was attempted just recently in the NSHL. Player import and stat exports are difficult, to say the least. And because player attributes change on a daily basis, you basically have to re-upload the player file each day. Add to that the difficulties with modifying schedules and adding/removing teams and the crash-happiness of EHM as a whole and the league was a mess.


This is why I think building an FHM5 test would be beneficial - you can do all of that using the built-in commissioner mode.


(Hi still here stalking the VHL)

 

I really liked this piece, always appreciate people applying statistical analysis to sim leagues. Have played around with it in crude fashion myself as well, and was curious - is there any sort of strength of schedule component in what you ran? That's one thing I've run into with STHS in particular - scheduling can be really wonky at times, with Riga playing Seattle 8 of its first 10 games or something like that, which I feel like would make limited sample size projections difficult. (Also asking without actually looking at the index, it may be more balanced with 10 teams for all I know.)


20 minutes ago, CowboyinAmerica said:

(Hi still here stalking the VHL)

 

I really liked this piece, always appreciate people applying statistical analysis to sim leagues. Have played around with it in crude fashion myself as well, and was curious - is there any sort of strength of schedule component in what you ran? That's one thing I've run into with STHS in particular - scheduling can be really wonky at times, with Riga playing Seattle 8 of its first 10 games or something like that, which I feel like would make limited sample size projections difficult. (Also asking without actually looking at the index, it may be more balanced with 10 teams for all I know.)

 

Thank you for the kind words. For what I posted here, I used the basic Pythagorean Expectation to calculate the expected win ratio. That is just using Goals For and Goals Against. It does not take anything else into account.

 

You're correct that using strength of schedule would create a more precise model. In fact, sabermetricians use 2nd-order wins (based on expected runs scored and allowed) and 3rd-order wins (which also factor in strength of schedule) to more accurately predict future wins in baseball. https://en.wikipedia.org/wiki/Pythagorean_expectation#"Second-order"_and_"third-order"_wins

 

