Lc0 T2 net is stronger than the T1 net

By Lothar Jung Date 2023-08-25 16:49 Upvotes 1

**Match:** T2 vs. T1
**LC0 version:** v0.31.0-dev, dag-update_03890b
**LC0 options:** Backend=cuda-fp16, MinibatchSize=40, MoveOverheadMs=0, StrictTiming=true
**Hardware:** Ryzen 5 3600 (6x3.6GHz) + RTX 4090@300W
**Time control:** 12s/game+0.2s/move
**Book:** unbalanced 4-moves book (80-100cp)
**Speed:** t2: 10949 nodes/move (median), t1: 10824 nodes/move (median)
**Tablebases:** 6-man
**Adjudication:** 6-man TBs, -resign movecount=3 score=300, -draw movenumber=1 movecount=4 score=5
**Software:** Cutechess-CLI, restart=on, timemargin=1000
```
   # ENGINE                            : RATING  ERROR  POINTS  GAMES  DRAWS(%)  CFS(%)
   1 lc0.net.t2-768x15x24h-swa-5230000 :   25.1    8.1  1583.5   3000      53.0   100.0
   2 lc0.net.t1-768x15x24h-swa-4000000 :    0.0   ----  1416.5   3000      53.0     ---

White advantage = 178.9 +/- 4.1
Draw rate (equal opponents) = 93.9 % +/- 2.7
```

By Lothar Jung Date 2023-09-03 09:22

New optimized parameters for the T2 nets:

Tune of search parameters (T2)
**LC0 version**: v0.31-dag (https://github.com/Ergodice/lc0/commit/9137daaf)
**LC0 options**: Network: t2-768x15x24h-swa-4832500.pb.gz, MinibatchSize=216, MoveOverheadMs=0, CPuctFactor=3.973, CPuctBase=45669, PolicyTemperature=1.4, WDLDrawRateReference=0.61
**SF options**: SF16, Threads=16, Move Overhead=0
**Tuning ranges**: CPuct: [1.0, 5.0], FpuValue: [0.0, 2.0], WDLCalibrationElo: [2700, 4000]
**Tuning configuration**: acq function: ts/mes, 268 iterations/24120 rounds/48240 games
**Hardware**: EPYC 7443 + NVIDIA A100
**Time control**: lc0: 30+0.3, SF: 20+0.2
**Book**: UHO_4060_v2
**Tablebases**: 6-man
**Adjudication**: 6-man TBs, -resign movecount=3 score=550, -draw movenumber=1 movecount=10 score=8
**Software**: chess-tuning-tools
**Optimum found**: `{'CPuct': 2.484687655049582, 'FpuValue': 0.8585997118897314, 'WDLCalibrationElo': 3268.6729444511716}`
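To try the tuned values in a GUI or script, the optimum can be turned into UCI `setoption` commands. The following is a minimal sketch: the option names are taken from the post above, while the helper name `to_setoption_lines` and the rounding to three decimals are my own assumptions, not part of chess-tuning-tools.

```python
# Optimum reported by chess-tuning-tools in the post above.
optimum = {
    "CPuct": 2.484687655049582,
    "FpuValue": 0.8585997118897314,
    "WDLCalibrationElo": 3268.6729444511716,
}


def to_setoption_lines(params, digits=3):
    """Format each parameter as a UCI 'setoption' command.

    Hypothetical helper: rounding to `digits` decimals is an assumption,
    chosen only to keep the commands readable.
    """
    return [
        f"setoption name {name} value {round(value, digits)}"
        for name, value in params.items()
    ]


for line in to_setoption_lines(optimum):
    print(line)
# setoption name CPuct value 2.485
# setoption name FpuValue value 0.859
# setoption name WDLCalibrationElo value 3268.673
```

The other options listed in the post (CPuctFactor, CPuctBase, PolicyTemperature, WDLDrawRateReference) were held fixed during the tune and would have to be set the same way to reproduce the configuration.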

By Lothar Jung Date 2023-09-03 10:22

Here is the new engine for TCEC:

https://ci.appveyor.com/project/Ergodice/lc0

By Stefan Pohl Date 2023-09-08 08:21

The test run of Lc0 with the TCEC 25 binary and the TCEC 25 net is finished and online:

https://www.sp-cc.de/nn-vs-sf-testing.htm

By Lothar Jung Date 2023-09-08 08:40

Thanks!
Good improvement from the new engine and net.

By Stefan Pohl Date 2023-09-08 11:16

Lothar Jung wrote:

Thanks!
Good improvement from the new engine and net.

Yes, but the gamepair evaluation shows that Stockfish 15.1 is still quite dominant: it won 148 gamepairs, Lc0 only 85. So the probability that Lc0 wins the Superfinal is microscopically small, especially since Stockfish has become even stronger since 15.1.

Code:


   # PLAYER                             :  RATING  ERROR  PLAYED     W     D    L   (%)  CFS(%)
   1 Stockfish 15.1 avx2                :       0   ----    6500  2951  2980  569  68.3     100
   2 Lc0 0.31dev TCEC 25                :     -44     22     500    85   267  148  43.7      99
   3 Lc0 0.30dev T1-4000 (15x768)       :     -79     22     500    62   265  173  38.9      59
   4 Lc0 0.30dev 811107 (19x512)        :     -83     22     500    53   278  169  38.4      61
   5 Lc0 0.30dev TCEC 24                :     -87     23     500    56   266  178  37.8      58
   6 Lc0 0.30rc1 T1-4000 (15x768)       :     -90     22     500    62   250  188  37.4      56
   7 Lc0 0.30dev T1-30875 (15x768)      :     -93     23     500    60   251  189  37.1      54
   8 Lc0 0.30dev BT2-4510 (15x768)      :     -94     21     500    60   249  191  36.9     100
   9 Lc0 0.30rc2 814174 (15x768)        :    -168     24     500    28   221  251  27.7      67
  10 Lc0 0.30dev 813207 (15x768)        :    -176     26     500    21   226  253  26.8      78
  11 Lc0 0.30dev TCEC 20                :    -190     24     500    25   203  272  25.3      75
  12 Lc0 0.30dev T1-2432500 (10x256)    :    -202     25     500    20   200  280  24.0      58
  13 Lc0 0.30dev TCEC 22                :    -206     26     500    25   186  289  23.6     100
  14 Lc0 0.30dev TCEC 18                :    -315     32     500    12   118  370  14.2     ---

------------------------------------------------------------------- 
--- Number of all Gamepairs          : 6500 
--- Number of drawn Gamepairs overall: 2980 (= 45.85%) 
--- Number of 1:1 drawn Gamepairs    : 1516 (= 23.32%) 
--- Number of 2-draws drawn Gamepairs: 1464 (= 22.52%) 
-------------------------------------------------------------------
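
As a cross-check of the gamepair numbers, the score and an Elo difference for the TCEC 25 row can be recomputed from its W/D/L columns. This is only a sketch: it uses the standard logistic Elo formula on a single row, whereas the rating tool fits all players jointly, so other rows may differ by a point or so. The function names are my own.

```python
import math


def gamepair_score(wins, draws, losses):
    """Fraction of points scored, counting a drawn gamepair as half a point."""
    total = wins + draws + losses
    return (wins + 0.5 * draws) / total


def elo_from_score(score):
    """Elo difference implied by a score fraction under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# Row 2 of the table: Lc0 0.31dev TCEC 25 against Stockfish 15.1.
score = gamepair_score(wins=85, draws=267, losses=148)
elo = elo_from_score(score)
print(f"{score:.1%}  {elo:+.0f}")
# 43.7%  -44
```

Both values agree with the (%) and RATING columns of that row.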

By Lothar Jung Date 2023-09-08 13:58

I could imagine that the T1 distilled net mentioned further down will do better against Stockfish, precisely because of the GPU and the very short TC.

By Andreas Wutzke Date 2023-09-08 08:56

Thanks!
I have downloaded the new engine, but where do I get the new net for TCEC?

Regards
Andi

By Lothar Jung Date 2023-09-08 09:01 Edited 2023-09-08 09:05

Here is the link:

https://drive.google.com/file/d/154iaRCC4BEZnJfDnpGfpwoxMR0C9HOpA/view?usp=drivesdk

Under Discord/Lc0/bot-spam you will find many download links to nets and engines, plus further info.

If you need an invitation to Discord, let me know.

Regards

Lothar

By Andreas Wutzke Date 2023-09-08 09:17

Thanks, I'll test it over the weekend, looking forward to it....

By Lothar Jung Date 2023-09-03 12:43 Edited 2023-09-03 12:53

Here are the changes in the new Lc0 version:

https://github.com/Ergodice/lc0/tree/boosting

From the information in this thread's posts, the (theoretical) Elo gain of the new engine together with the current T2 net can be calculated.

By Lothar Jung Date 2023-09-08 10:00 Edited 2023-09-08 10:03

Here is some more info on minimax boosting (optional):

Pushed some changes to https://github.com/Ergodice/lc0/tree/boosting, an experimental hybrid between minimax and AlphaZero which "boosts" the weight on the best move. The branch also allows what is reported as the nodes/nps values to be specified by `--reported-nodes` among
```
"playouts" or "legacy" (what we used to report as node count)
"nodes" (actual nodes, or LowNodes in the code)
"queries" (neural network queries)
```
If possible, I would greatly appreciate if someone with a GPU more powerful than my Geforce MX230 compared the query speed of the early game vs the late game. If my hunch is correct we aren't getting as many queries in while searching transposition-heavy positions since a large portion of the batches are just edges, which don't require an NN eval.

Minimax Boosting

At each node, rather than using an average of the playouts, a "boost" is applied to the best moves which increases the weight. This may assist with the problem of evaluations changing slowly when a new line is found while retaining the stability of PUCT. The new parameters are

MinimaxBoostPriorWeight (default 10.0)
MinimaxBoostScale (default 1.0)
The prior weight allows this effect to be smaller at nodes with fewer visits, i.e., lower confidence. The scale is the amount the best node's weight is scaled by in the average calculation.
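
To make the description above concrete, here is one possible reading of it in Python. This is not the code from the boosting branch: the formula, the function name `boosted_value`, and the way the prior weight damps the boost are all assumptions on my part. I assume the best child's visit weight is multiplied by a factor that is 1 at `scale=1.0` (so the default is a plain visit-weighted average) and that approaches `scale` as the node's visit count grows relative to `prior_weight`.

```python
def boosted_value(children, prior_weight=10.0, scale=1.0):
    """Visit-weighted average of child values with a boosted best child.

    children: list of (visits, q) tuples for one node.
    ASSUMPTION: the multiplier formula below is illustrative only and is
    not taken from the lc0 boosting branch.
    """
    visited = [(n, q) for n, q in children if n > 0]
    if not visited:
        raise ValueError("need at least one visited child")

    total_visits = sum(n for n, _ in visited)
    # Confidence in [0, 1): small at nodes with few visits, near 1 with many.
    confidence = total_visits / (total_visits + prior_weight)
    # Multiplier is 1.0 when scale == 1.0, so the default leaves the average unchanged.
    multiplier = 1.0 + (scale - 1.0) * confidence

    best_index = max(range(len(visited)), key=lambda i: visited[i][1])
    weights = [float(n) for n, _ in visited]
    weights[best_index] *= multiplier

    weighted_sum = sum(w * q for w, (_, q) in zip(weights, visited))
    return weighted_sum / sum(weights)


node = [(10, 0.2), (30, 0.5)]
print(boosted_value(node))             # plain visit-weighted average
print(boosted_value(node, scale=2.0))  # best child counts more, value moves toward 0.5
```

With a boost above 1, a newly found strong line pulls the node's value toward it faster than plain averaging would, which is the effect the post describes.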

By Lothar Jung Date 2023-09-08 10:24 Edited 2023-09-08 10:31

For weaker hardware or a short TC, the T1 distilled net shows better results:

https://cdn.discordapp.com/attachments/430695662108278784/1146261100707446884/t1Dist512.png

https://storage.lczero.org/files/networks-contrib/t1-512x15x8h-distilled-swa-3395000.pb.gz

Speaking of nets, so far I am impressed with **masterkn6**'s t1-smolgen-512x15x8h-distilled-swa-3395000.pb.gz https://discord.com/channels/425419482568196106/456137110609592325/1104788507089719467 on my 1x 3080:
Snips below from https://discord.com/channels/425419482568196106/530486338236055583/1143845024656797788 and https://discord.com/channels/425419482568196106/530486338236055583/114553174565192915

By Peter Martan Date 2023-09-09 07:59 Edited 2023-09-09 08:03

I find it remarkable that the latest engine versions and nets show signs of also benefiting from MultiPV mode in certain positions.
With the previous engine-net combination I did not save it in the list because it was within the error bar; here I have left both the run with single primary and the one with MultiPV=4 (MV4) in for now. The screening test with the 256 positions

https://www.dropbox.com/s/8k2xzm550ox7lw0/2562.epd?dl=0

is not very selective when it comes to versions or nets that lie close together, and a single additional solved position is nothing to brag about, but at least the solution-time indices are also a touch better.

    Program                                    Elo   +/-  Matches  Score   Av.Op.   S.Pos.   MST1    MST2   RIndex

  1 HypnoSIccf-NN240623-Set1                 : 3561    2  10474    59.3 %   3495   206/256    1.6s    2.2s   0.84
  2 SugaRXPrOIccf240623-Set1                 : 3559    2  10506    59.1 %   3495   200/256    1.4s    2.2s   0.84
  3 CrystalMZ040823-Set1                     : 3557    2  10461    58.8 %   3495   203/256    1.6s    2.3s   0.83
  4 HypnoSIccf-NN240623-Set0                 : 3556    2  10370    58.6 %   3496   199/256    1.5s    2.3s   0.83
  5 SugaRXPrOIccf040823-Set1                 : 3548    2  10225    57.5 %   3496   190/256    1.6s    2.4s   0.81
  
 25 Lc0v0.30.0-dag+git.1842e13d-3650M         : 3513    3   9043    51.9 %   3500   153/256    1.4s    2.9s   0.70
 
 27 Lc0v0.31.0-dag-5230M-MV4                  : 3509    3   9133    51.3 %   3500   147/256    1.3s    2.9s   0.66
 28 Lc0v0.31.0-dag+git.f4d40b15-5230M         : 3506    3   8944    50.8 %   3501   146/256    1.4s    2.9s   0.69

 30 Lc0v0.31.0-dag+git.8138ee5-4000M          : 3502    3   8819    50.1 %   3501   146/256    1.5s    3.0s   0.67
 31 Lc0v0.31.0-Sep23-4000M                    : 3501    3   8838    50.0 %   3501   145/256    1.5s    3.0s   0.67
 32 Stockfish15                               : 3495    3   8458    49.2 %   3501   135/256    1.3s    3.1s   0.72
 33 Dragon3.2byKomodoChess                    : 3495    3   8513    49.1 %   3501   135/256    1.3s    3.1s   0.71
 34 Berserk20230818                           : 3484    3   8340    47.2 %   3503   128/256    1.4s    3.2s   0.67
 35 Berserk11.1                               : 3475    3   8172    45.9 %   3504   118/256    1.2s    3.2s   0.64
 
 49 Lc0v0.31.0-dag+git.dd64c7ec-CPU30T        : 3421    3   8051    37.9 %   3506    96/256    2.3s    4.0s   0.23
 50 DeepHIARCS15.1                            : 3401    3   7694    35.3 %   3507    84/256    2.3s    4.1s   0.26

MST1  : Mean solution time (solved positions only)
MST2  : Mean solution time (solved and unsolved positions)
RIndex: Score according to solution time ranking for each position

Prompted by Thomas Plaschke's posting

https://forum.computerschach.de/cgi-bin/mwf/topic_show.pl?pid=165928#pid165928

I ran the 128

https://www.dropbox.com/s/804b7chwli13laf/1284.epd?dl=0

single-threaded. Even for SF dev., 15 s/position is too little thinking time, and in a new list with few entries in particular the error is too large. Still, it shows nicely that even there the fishes benefit from MultiPV (HypnoS internally), and LC0 gains noticeably with MultiPV=4 (MV4).

 Program                                    Elo   +/-  Matches  Score   Av.Op.   S.Pos.   MST1    MST2   RIndex

  1 HypnoSIccf-NN240623-Set1                 : 3557   12    738    59.3 %   3491    72/128    3.4s    8.5s   0.65
  2 HypnoSIccf-NN240623-Set0                 : 3547   12    713    57.8 %   3493    70/128    3.9s    8.9s   0.61
  3 Lc0v0.31.0-dag-5230M-MV4                 : 3533   14    744    55.4 %   3495    63/128    3.3s    9.2s   0.52
  4 Lc0v0.31.0-dag+git.f4d40b15-5230M        : 3512   14    690    52.1 %   3498    54/128    2.6s    9.8s   0.47
  5 SunSE-MV4                                : 3499   13    660    49.6 %   3502    57/128    5.3s   10.7s   0.46
  6 CorChess4dev-20230906-MV4                : 3496   13    654    49.2 %   3501    55/128    4.8s   10.6s   0.44
  7 Stockfishdev-20230903-MV4                : 3490   13    649    48.2 %   3502    56/128    5.6s   10.9s   0.42
  8 CorChess4dev-20230906                    : 3461   14    607    43.5 %   3507    44/128    4.5s   11.4s   0.37
  9 SunSE                                    : 3461   14    598    43.4 %   3507    43/128    4.8s   11.6s   0.43
 10 Stockfishdev-20230903                     : 3406   14    547    35.3 %   3511    30/128    3.7s   12.4s   0.33

MST1  : Mean solution time (solved positions only)
MST2  : Mean solution time (solved and unsolved positions)
RIndex: Score according to solution time ranking for each position
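
The relation between MST1 and MST2 can be sketched as follows. I am assuming here that every unsolved position is counted with the full time limit (15 s per position in this list); that is my reading of the legend, not something the list tool documents in this thread, and the function name is made up.

```python
def mean_time_all_positions(solved, total, mst1, time_limit):
    """Estimate MST2 from MST1.

    ASSUMPTION: unsolved positions are charged the full time limit.
    solved/total: number of solved positions and size of the suite.
    mst1: mean solution time over solved positions, in seconds.
    """
    unsolved = total - solved
    return (solved * mst1 + unsolved * time_limit) / total


# Row 1 of the 128-position list: 72/128 solved, MST1 = 3.4 s, limit 15 s.
print(f"{mean_time_all_positions(72, 128, 3.4, 15.0):.2f} s")
```

For the first row this gives about 8.5 s, in line with the MST2 column; small deviations in other rows are expected because MST1 itself is rounded to one decimal.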

As always the 3070ti GPU; A-B engines with 30 threads of the 16x3.5GHz CPU in the 256-position list, single thread in the new 128-position one; 6-man Syzygy tablebases; for LC0 always 2 threads and 1 GB NN cache at 5 s, 2 GB at 15 s.