Introducing the Update_Skate World Ranking (Episode II of II)

Yesterday, I introduced the Update Skate World Ranking. Many people were curious about how it works and wanted more details about individual distances. This article covers exactly those things, in an interview format where I totally didn’t make up all the questions myself.

Scroll down for the distance rankings!

Are you a programmer?
The ranking runs on a plethora of formulas in a spreadsheet program. I do know a little bit about programming, but not enough to make something like this.

How are the scores calculated?
The score of a skater in an individual distance is based on the three best corrected races in that particular event. I include a limited set of competitions. The lists presented this week are based on the competitions in the “Skating Bubble”, a few national tournaments and championships, the 2020 World Championships and the 2018 Olympics; for less frequently organized events, older tournaments are included as well. The faster the corrected times, the better the score. This is not linear, though: a 5-second improvement from 13:00 to 12:55 in a 10km will give a smaller score improvement than a 5-second improvement from 12:40 to 12:35.
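To make the non-linearity concrete, here is a minimal Python sketch. It is not the spreadsheet's actual formula: the curve in `race_points`, its `scale` and `exponent` values, the 12:40 reference time and the choice to average the three best races are all assumptions made up for illustration.

```python
def race_points(corrected_seconds, reference_seconds, scale=400.0, exponent=10.0):
    """Hypothetical convex curve: each second gained is worth more at faster times."""
    return scale * (reference_seconds / corrected_seconds) ** exponent


def distance_score(corrected_times, reference_seconds):
    """Score a distance from the three fastest corrected times.

    Averaging the three races is an assumption; the article only says the
    score is "based on" the three best corrected races.
    """
    if not corrected_times:
        raise ValueError("at least one corrected time is required")
    best_three = sorted(corrected_times)[:3]  # fastest times first
    points = [race_points(t, reference_seconds) for t in best_three]
    return sum(points) / len(points)


REFERENCE_10K = 760.0  # 12:40 in seconds, an illustrative reference time

# The same 5-second improvement, once at 13:00 and once at 12:40
gain_slow = race_points(775, REFERENCE_10K) - race_points(780, REFERENCE_10K)
gain_fast = race_points(755, REFERENCE_10K) - race_points(760, REFERENCE_10K)
print(gain_fast > gain_slow)
```

Any curve that gets steeper towards faster times would show the same effect; the power law is just one convenient choice.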

Correcting times, what do you mean?
The times are not corrected based on air pressure, temperature or other factors one has to look up. That’s just impractical for the number of competitions I’m using. Instead, the correction formulas behind the Ranking are based on standard deviations, the international level, differences between the top and subtop and other values which can be taken from the results themselves. The time ranking is calibrated with a reference time, and all other competitions in that distance are modelled as if they were skated in conditions that result in the same reference time. The model uses three different sets of formulas to calculate three different corrected times, of which an average is taken to produce the actual corrected time. Each of these sets of formulas is influenced by different values, such as the mutual differences among the top skaters, the difference between the calibrated times and the actual times, or the comparison between the predicted level and the actual level.
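The averaging step can be sketched in a few lines of Python. The three correction functions below are placeholders with invented numbers; the real formula sets live in the spreadsheet and are not reproduced here. Only the "three estimates, then average" structure comes from the description above.

```python
from statistics import mean


def corrected_time(raw_seconds, corrections):
    """Average the estimates of three independent correction functions."""
    if len(corrections) != 3:
        raise ValueError("the model described above uses exactly three formula sets")
    estimates = [correct(raw_seconds) for correct in corrections]
    return mean(estimates)


# Placeholder corrections (hypothetical), one per formula set
placeholder_corrections = [
    lambda t: t * 0.990,        # e.g. driven by spread among the top skaters
    lambda t: t - 1.2,          # e.g. driven by calibrated vs. actual times
    lambda t: t * 0.995 - 0.3,  # e.g. driven by predicted vs. actual level
]

print(round(corrected_time(105.0, placeholder_corrections), 2))
```

Averaging three estimates that respond to different inputs means one oddly behaving formula set cannot dominate the corrected time.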

How did you develop this?
By repeatedly running competition results through the WIP model and then tweaking and adding formulas, I could let it produce more and more credible outcomes over time. Of course, this isn’t self-learning or anything, but it possibly does fit the definition of an algorithm, since I fed the model new formulas based on the previous outcome. From now on, the changes will be limited, so previous scores are comparable with new ones.
Some parts are not that complicated to explain, such as calibrating the reference time or determining the threshold of the international level value. Other formulas show clear effects of repeated adjustments, and I have to admit that I’m guilty of some spaghetti code here and there.

That’s indeed complicated. Maybe I can ask something else – why is Sven Kramer so low down?
Because he was only good in one event last season.

And why is Lorentzen so high up?
He scores a decent number of points in three events: 500m, 1000m and sprint samalog. The allround and sprint samalogs and the pursuit are, by the way, calculated in a similar way to the traditional distances.

And the mass start?
That one is the simplest of all since it already has a points system. I modified that to make sure that winning the semifinal and the final always results in 180 points, so that winning three times is 540 points, which, along with a bonus for the highest score, results in a score of 600 points.
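The arithmetic behind those numbers, as a small Python check. The 60-point bonus is simply what the figures above imply (600 minus 3 × 180); how the bonus scales for skaters who did not win three times is not stated, so `mass_start_score` takes it as an explicit argument rather than guessing a rule.

```python
WIN_POINTS = 180   # winning the semifinal and the final
MAX_SCORE = 600    # three wins plus the bonus

# Bonus implied by the figures in the text: 600 - 3 * 180
IMPLIED_BONUS = MAX_SCORE - 3 * WIN_POINTS


def mass_start_score(best_three_results, bonus):
    """Add up three mass start results and a bonus (bonus rule not specified)."""
    if len(best_three_results) != 3:
        raise ValueError("expected exactly three results")
    return sum(best_three_results) + bonus


print(IMPLIED_BONUS)
print(mass_start_score([WIN_POINTS] * 3, IMPLIED_BONUS))
```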

So the numbers mean something?
For the individual distances as presented today, yes, they do. Skaters with a distance score between approx. 0 and 200 points are subtop in the world. Between approx. 200 and 400 are podium candidates, between 400 and 600 are gold candidates, and higher than 600 (which is possible in time-based events) is very rare. Scores below 0 are currently not supported. Usually, between 20 and 30 skaters per distance are present in the rankings. For the overall scores, I just add up the distance scores. For women, I then multiply the total by 5/6, since the women’s distances are closer together in length than the men’s distances, and therefore easier to combine. For a skater to be included in a distance ranking, he or she must have at least one race that meets a certain limit score.
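As a sketch in Python, the score bands and the overall score could look like this. The band edges are approximate in the text, so the exact boundary handling (for example, which band a score of exactly 200 falls into) is an assumption, as are the function names.

```python
def score_band(distance_score):
    """Map a distance score to the rough bands described above."""
    if distance_score < 0:
        raise ValueError("scores below 0 are currently not supported")
    if distance_score < 200:
        return "subtop"
    if distance_score < 400:
        return "podium candidate"
    if distance_score <= 600:
        return "gold candidate"
    return "very rare"


def overall_score(distance_scores, women=False):
    """Add up the distance scores; women's totals are multiplied by 5/6."""
    total = sum(distance_scores.values())
    return total * 5 / 6 if women else total


example = {"1000m": 300, "1500m": 300}  # made-up scores
print(score_band(example["1000m"]))
print(overall_score(example, women=True))
```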

Why is Miho Takagi so low down in the individual distances?
After the calculation of the score, a penalty is applied for skaters who haven’t competed in international races for more than six months, which increases per month. This makes sure injured or retired skaters don’t stay at the top of the ranking for too long. A side effect is that all skaters who haven’t been able to compete in the bubble get a penalty. I deem this reasonable, since it’s hard to compare skaters to one another when the only references are other skaters from the same country. Without this penalty, Miho Takagi would have easily topped the 1000m and 1500m distance rankings. Despite all of this, she leads the overall ranking! Without the penalty, she’d have 572 pts in the 1000m and 590 pts in the 1500m.
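A minimal Python sketch of such a penalty. The text only says it starts after six months and grows per month; the linear shape and the 25 points per month used here are invented for illustration, and the floor at 0 follows from scores below 0 not being supported.

```python
GRACE_MONTHS = 6  # no penalty up to and including six months


def penalised_score(score, months_inactive, points_per_month=25.0):
    """Subtract a penalty that grows with each month beyond the grace period.

    points_per_month is a hypothetical value, not the ranking's real rate.
    """
    months_over = max(0, months_inactive - GRACE_MONTHS)
    penalty = months_over * points_per_month
    return max(0.0, score - penalty)


# Using the unpenalised 1000m score quoted above with a made-up absence
print(penalised_score(572, months_inactive=8))
```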

Last question: is it a coincidence that you wrote about Roest and Takagi earlier this month, and they are the ones who top your rankings?
No. Absolutely not.

Distance Rankings

I plan on updating these rankings throughout this season.

Stay tuned!
