I'm happy to announce that our paper on Leela's BT4 architecture, the "Chessformer", has been accepted for publication at the International Conference on Learning Representations! (ICLR 2026, see
https://openreview.net/forum?id=2ltBRzEHyd).
To summarize our findings, we use a 64-token encoder-only transformer with a position encoding called the Geometric Attention Bias (previously referred to as Smolgen) to achieve state-of-the-art searchless chess strength and human move-matching with 20x fewer parameters and FLOPs. We also trained transcoders on the MLP activations of these models, finding features corresponding to several nontrivial human chess concepts, such as "a square on b3, b6, f3, or f6 in the opening that has been weakened by the absence of a bishop pawn". The improvement in modeling strong human play is particularly notable: prior approaches required search to model this caliber of play well, but a raw Chessformer is enough to outperform them by up to 5%.
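To give a rough intuition for the idea, here is a minimal numpy sketch of single-head attention over 64 square tokens where an additive bias is generated dynamically from the token embeddings themselves, rather than taken from a fixed positional encoding. This is an illustrative assumption-laden toy, not the paper's implementation: the shapes, the `W_bias` projection, and the single-head setup are all hypothetical simplifications of the actual Geometric Attention Bias.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_tokens, d_model = 64, 32  # one token per board square (toy width)

x = rng.normal(size=(n_tokens, d_model))              # square embeddings
Wq = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
Wk = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
# Hypothetical projection that lets each square's embedding emit
# one row of a 64x64 additive attention bias.
W_bias = rng.normal(size=(d_model, n_tokens)) / np.sqrt(d_model)

q, k = x @ Wq, x @ Wk
logits = (q @ k.T) / np.sqrt(d_model)                 # content logits (64, 64)
bias = x @ W_bias                                     # position-dependent bias (64, 64)
attn = softmax(logits + bias, axis=-1)                # rows sum to 1
```

The key design point the sketch tries to capture is that the bias term is a function of the current position, so the attention pattern can adapt to the board state instead of being tied to fixed square coordinates.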
We are in the process of open-sourcing the code repository, but for now the big takeaways are:
(1) The Chessformer is capable of achieving state-of-the-art results on a variety of chess modeling tasks, giving it the potential to either greatly increase output quality or greatly decrease resource cost compared to prior architectures.
(2) Strong, searchless, human-like play will now be cheaply available to projects like the Leela Odds bots, which aim to produce results that are strong *and* human-geared. The 2% increase in move-matching accuracy at ratings below 2000 from taking move history into account may also prove useful.
(3) Having interpretable, square-attributable features across a range of skill levels could enable a better understanding of how humans process chess and allow for the design of better features for NNUE architectures.
(4) More broadly, strong tokenization and position-encoding choices are critical to the success of a transformer.