Haha, I'm laughing my head off: they compute their Elo from simple position tests.
From the Introduction:
The idea of Intrinsic Performance Ratings (IPR’s) is to judge skill based on the quality of decisions
made rather than the outcomes of contests. Aside from the issue that the outcome depends on the
skill of opponents and on factors variously called “luck,” there is a simple sample-size motivation.
A chess professional may play 50 games in a given year and call that a lot, but as a statistical
sample this is scant. However, those games may average 30 important move decisions, yielding a
healthy sample of 1,500 moves. Analysis of those moves by computer programs to sufficient depth
to be stronger than the player can then provide both an objective measure of skill, and reasonably
informative confidence intervals on the assessment.
A common feature of chess magazines or columns, one long called “Solitaire Chess” in magazines
of the US Chess Federation, involves pausing before each move (usually those by the winning
side) of a selected game, and choosing from several plausible alternatives. A strong player composing
the puzzle has provided point values for each choice. At the end the reader adds up the
points for all of his/her choices, and there is a table giving corresponding skill levels. The levels
are often given as ratings on the international Elo scale, where for instance 2200 is commonly the
threshold for “master,” or it may give prose names master, expert, amateur, etc. for those levels.
We do not know of any attempt to make this correspondence scientific.
The IPR model is basically “Solitaire Chess” done scientifically, using suitably-scaled differences
in values given to moves by authoritative chess programs as the “points.” Although the
differences are negative, the model would be unchanged if we declared that the best move is always
worth 5 points and differences in the usual pawn/centipawn units of chess engines were subtracted
from it. The correspondence between points and Elo rating is first established by training the model
on large sets of games by players with established Elo ratings. The model generates projections of
how many points a player with a given Elo rating would score on a standardized “Solitaire Set.”
To generate an IPR for a player’s performance in an event, or for a whole event, or for any set
of games, we run the training process in reverse: First we train the model on that set of games.
Then we take the parameter values that were fitted in the training, and use them to generate a
projected points value on the Solitaire Set. The corresponding Elo value is then read off. We do
not go directly from the parameters to Elo because there is more than one dependent parameter
in the model, and the tradeoff between the two parameters called s and c in the current simple
form already seems difficult to assess. The “Solitaire” step also affords a reasonable way to project
confidence intervals that currently seem to be no worse than about 30% too narrow—i.e., modeling
error requires no more than a 1.4 multiplier on them.
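
To make concrete how I read the scoring idea in the quoted passage, here is a minimal Python sketch. It is not the authors' model: the paper fits parameters (s and c) and uses "suitably-scaled" differences, whereas this sketch just subtracts the raw eval loss in pawns from 5 points per move and then reads an Elo off a calibration table by linear interpolation. The function names, the clamping of negative losses to zero, and the calibration numbers in the usage lines are all my own assumptions, made up purely for illustration.

```python
from bisect import bisect_left

BEST_MOVE_POINTS = 5.0  # "the best move is always worth 5 points" (from the quote)


def move_points(best_eval_cp, played_eval_cp):
    """Points for one move: 5 minus the eval loss in pawns.

    Evals are engine scores in centipawns from the mover's point of view.
    Clamping the loss at 0 is an assumption of this sketch, not from the paper.
    """
    loss_pawns = max(0.0, (best_eval_cp - played_eval_cp) / 100.0)
    return BEST_MOVE_POINTS - loss_pawns


def total_points(evals):
    """Sum the points over (best_eval_cp, played_eval_cp) pairs."""
    return sum(move_points(best, played) for best, played in evals)


def elo_from_points(points, calibration):
    """Read an Elo value off a (points, elo) table sorted by points.

    Linear interpolation between neighbours; values outside the table
    are clamped to the nearest end. The table itself is hypothetical.
    """
    xs = [p for p, _ in calibration]
    if points <= xs[0]:
        return calibration[0][1]
    if points >= xs[-1]:
        return calibration[-1][1]
    i = bisect_left(xs, points)
    (x0, y0), (x1, y1) = calibration[i - 1], calibration[i]
    return y0 + (y1 - y0) * (points - x0) / (x1 - x0)


# Usage with made-up numbers (NOT data from the paper):
evals = [(30, 30), (50, -50), (0, -250)]       # best move, 1-pawn loss, 2.5-pawn loss
calibration = [(10.0, 2000.0), (12.0, 2400.0)]  # hypothetical points-to-Elo table
score = total_points(evals)
rating = elo_from_points(score, calibration)
```

The point of going through a fixed "Solitaire Set" in the paper is that every player is projected onto the same scale; the sketch skips that step and scores the played moves directly.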
This pleases me all the more because it shows how these higher mathematicians simply lump everything together: individual positions and moves on one side, whole games with their full and half points on the other. I wonder what the eng-eng-eng hardliners will make of that?

And on top of that, from the Conclusion:
Acknowledgments. Foremost we thank the programmers of the Arena chess GUI for full scripting
and recording of computer analysis, and those of TOGA II and RYBKA 3 for their engines and
advice.
So on top of everything they apparently use Rybka 3 and Toga II as their "reference", and what they judge the quality of the GMs' moves by is simply the numerical evals of these engines; those evals determine the points to be won per position.
Or am I mathematically too dumb for this paper and have I fundamentally misunderstood something?
(To be honest I only skimmed it, because the Introduction and the Conclusion were already quite enough for me. Just so that I'm not misunderstood in turn: I find this brilliantly simple, or perhaps even simply brilliant, but I'm only an old position tester and not really a true "Celolist".)