Here on Discord:
Big Transformer 3 (BT3): a new network architecture using the smolgen-augmented self-attention from BT2. It has an embedding size of 768, an FFN projection size of 1024, 24 heads per layer, and 15 smolgen encoder layers with mish activation. CUDA optimizations are also available, which should reduce latency by 10 to 15%. It has three policy heads: vanilla, optimistic, and soft. Vanilla and optimistic can be used for play, while soft helps speed up training. The optimistic policy head drastically improves policy predictions in tactical positions. It has three value heads: winner, q, and st. The winner head is trained on game outcome, while the q head is trained on position Q-value. The ST value head is trained on a weighted average of short-term future value from the current position. The ideas come from KataGo's methods: <
https://github.com/lightvector/KataGo/blob/master/docs/KataGoMethods.md#optimistic-policy>.
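To illustrate the idea behind the ST value head, here is a minimal sketch of a short-term value target as a decayed weighted average of future value estimates. The decay constant and the exact weighting scheme are assumptions for illustration; the announcement does not specify them.

```python
def st_value_target(future_values, decay=0.9):
    """Hypothetical short-term (ST) value target: a weighted average of
    value estimates at successive future positions, with geometrically
    decaying weights. `decay` and the weighting form are assumptions,
    not the actual BT3 training formula."""
    weights = [decay ** i for i in range(len(future_values))]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, future_values)) / total
```

With a decay below 1, nearby positions dominate the target, which is the point of a short-term head: it reacts faster to local swings than the game-outcome head.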
Quickstart:
The new engine version can be found at <
https://github.com/Ergodice/lc0/tree/uncertainty-weighting>
Executable files can be found at <
https://ci.appveyor.com/project/Ergodice/lc0/builds/48245606>
BT3 can be found at <
https://drive.google.com/file/d/1J_eUuulV3HB9om8dy9Ltmmtocr2JjaDx/view?usp=sharing>
To enable all the new features, put the following in the config file:
```
--uncertainty-weighting-cap=1.03
--uncertainty-weighting-coefficient=0.13
--uncertainty-weighting-exponent=-0.88
--use-uncertainty-weighting=true
```
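As a rough sketch of how a cap/coefficient/exponent triple like the one above could shape a search weight, the function below scales an uncertainty estimate by the coefficient, raises it to the exponent, and clamps the result at the cap. This is only one plausible reading of the three parameters; the actual lc0 formula on the `uncertainty-weighting` branch may differ.

```python
def uncertainty_weight(uncertainty, cap=1.03, coefficient=0.13, exponent=-0.88):
    """Hypothetical weighting: coefficient * uncertainty ** exponent,
    clamped at cap. With a negative exponent, low-uncertainty evaluations
    get larger weights, up to the cap. Illustration only, not the exact
    lc0 implementation."""
    return min(cap, coefficient * uncertainty ** exponent)
```

Note how the default parameters interact: an uncertainty of 1.0 yields a weight equal to the coefficient, while very small uncertainties are clamped to the cap.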
If you are using a single GPU, add
--backend-opts=policy_head=vanilla,value_head=winner
Otherwise, check the GitHub page for instructions.
Last updated: 10/11/23