SCCT – Tactical Test Suite's Results
- - By Sedat Canbaz Date 2015-07-19 15:10


Hello there,

SCCT – Tactical Test Suite's Current Results:
http://www.sedatcanbaz.com/chess/?page_id=1956

Note also that I plan to continue working on the current tactical test suite
And probably later I will test the top chess engines with 30 sec per position

Have fun,
Sedat
Parent - - By Bernhard Traven Date 2015-07-19 19:39
thank you Sedat!! 
Parent - - By Sedat Canbaz Date 2015-07-20 20:53
Not at all, it is my pleasure Bernhard )
Parent - - By Sedat Canbaz Date 2015-07-20 21:02
Btw, I've just updated the ranking list and more engines are included ...

It is interesting to note that
2 (two) newer versions performed worse than the older ones
*Note: to be 100% sure, I ran the engines below twice, and on each attempt I got almost the same results

For example:


Engine                    : Points
Alfil 15.4 x64            : 248
Alfil 15.7 x64            : 145

Fire 3 x64                : 206
Fire 4 x64                : 124


And I hope the next versions will be better in tactics!

Good luck....

Best,
Sedat
Parent - - By Horst Sikorsky Date 2015-07-20 22:13
three seconds for one position
I find that extremely funny
Parent - - By Thomas Müller Date 2015-07-21 06:51
yes, totally funny

But the 3 sec apply to both (all) versions.
And if the newer one then scores 100 points less, that is not necessarily "progress"

Regards
thomas
Parent - - By Frank Qy. Date 2015-07-23 02:36
Hello Thomas,

I find that these test sets are very individual, put together according to one's "own feeling".

I rather believe the following:
The fewer pieces there are on the board, the less we can really speak of a positional or a tactical move. We think the move is positional or tactical because there is some combination in it. We think that way because our chess knowledge does not allow for more. In reality it will probably be the case that, with fewer and fewer moves, there are no tactical or positional moves anymore, because the moves that change the result are then rather logical moves. As I said, to us they do not look logical at all but rather fantastic, and we are inclined to say ... a tactical bomb or a positional bomb. They are not even grenades, rather logical sequences.

I believe we can speak of tactics or positional moves when, with many pieces on the board, the outcome of the game is unclear. And as a rule it is almost always unclear when many pieces are on the board.

Maybe we have to rethink a little. Programs like Stockfish or Komodo are extremely strong at finding logical sequences and therefore also produce their very high ratings in the transition to the endgame, because here their advantage over the pursuers appears almost overwhelming. With many pieces on the board, programs like Stockfish or Komodo are also very strong, but the Elo difference to the pursuers is clearly smaller.

In a test set, the positions are usually ones that lead directly to an improved result through a combination. Many speak of a best move (so do I). Just from my opening analyses for my book development I could contribute positions that e.g. Komodo or Stockfish do not play at all, but that other programs rate as the best move because the evaluation jumps up so extremely. ICE, for example, does this very well. The ICE programmer wrote something about this at the release. As a rule ICE is also right 80% of the time when, after the book moves, the evaluation suddenly stands at +1.0 or higher while others are in the 0.5 range or below. As the game goes on the evaluation keeps rising ... as I said, what ICE outputs is correct more than 80% of the time.

Now I look at the collected ICE evaluations with tactical programs like Hakkapeliitta and notice ... oops, the engine fully responds to them. It does not evaluate as drastically, but still clearly higher. It is even more extreme with Alfil. Then the play really becomes aggressive, and out of the game come the most beautiful combinations, which cannot be produced at all with best-move sequences as in a test set.

Playing out of the game and a test set are therefore different things. The journey is the destination, and on the way to the destination programs get stuck (Hiarcs is the expert here). My Hiarcs plays its way into wonderfully beautiful positions but then does not follow through, gets tangled up and quite often loses from a favourable position it earned itself. Junior is also an expert in such situations.

What is important to me is the following:
To see that engines play differently and work with different ideas. Whether they want to get to Greenland and, going from Trier to Greenland, make a short detour via Uganda, or look for the direct route. The detour via Uganda can be interesting too.

What bores me is ...
When engines under different names then choose the Uganda route.
That is quite strikingly noticeable if you work with the games and constantly compile statistics. I compile quite a few; only one statistics idea is on my pages (middlegame, transition to endgame, and endgame). Unfortunately I cannot put them all on my web pages, because no one on earth can keep updating the statistics after the many test runs. I would have to hire 10 people for that, and hardly anyone would look at it anyway.

But e.g. in the Hakkapeliitta thread ...
"I once heard that Critter is tactically good" ...
The many statements that were made here and there, which were then picked up, and with that knowledge people watched their own games and found it confirmed ...
No fan of a program, e.g. Critter, ... will look at what Critter does well alongside 40 other programs in order to perhaps form a new opinion.

Basically such things only stand out when someone simply deals with all programs, and the rating list operators often do that, or have to ... they test the programs after all and therefore also see it ... if they look closely.

Regards
Frank
Parent - - By Sedat Canbaz Date 2015-07-24 09:42 Edited 2015-07-24 10:18
Hello dear Frank,

First of all,
Thank you for your interest...you are one of my real chess friends !!
Yes... a few months ago, Graham informed me about your posting in the CCC forum
But in those days I was too busy with my NON-Chess life, sorry that I could not reply to you...

Let's come back and talk about the current testing,
Of course (as you know) my German is too bad, that's why I do not have much idea of what exactly is being discussed
But using Google Translate, I noticed that some people are not satisfied with my current testing

No problem... my work is not perfect and I often make mistakes, but this is also true:
- As far as possible, I try not to repeat the same mistake twice
Actually those who never make mistakes never make anything... !)

I know too that
3 seconds is not a long time...
But if they check the time controls, I mean the games played at 3 min, 5 min, 40/4 etc...
Then they will notice that at least 80-90 % of the world's games are played at similarly fast speeds (similar to 3 sec/move)

Yes... 3 seconds per move is not a perfect time control for analysis,
But sorry... I have no free time to test more than 100 engines at 30 sec/move or 1 min/move etc...

And to those who are not satisfied,
I suggest checking rankings at 30 sec or 5 min per move, or other available rankings (NON SCCT) !)

For example, one of my favorites is the STS ranking (great work by Swaminathan and Dann Corbit)
https://sites.google.com/site/strategictestsuite/test-results

And I cannot leave this unmentioned:
TTS conditions (3 sec/move + 64-bit + SSE4.2, i7 980X) are more or less equal in chess speed to 10 sec/move + Q6600, 32-bit

In short, in my opinion,
My current testing is not ideal, but it is quite a good indicator of which engines are the best in tactics!

To be honest,
I see the current TTS testing as similar to IQ testing !)

For example, we have two great chess players who have some of the highest IQs in history:
Garry Kasparov – IQ 194
Judit Polgar – IQ 170

Btw, we should not forget to mention that
Houdini's author created a really very successful chess engine for us!
Of course, the authors of Komodo, Stockfish and many others belong to this group too!

And last,
Elo is one of the most important factors...
And I think tactics should also be considered a very important factor in chess !)

Btw, I often visit your rankings and read your news...!

Keep up the good work...

Best Regards,
Sedat
Parent - - By Frank Qy. Date 2015-07-24 13:07
Hi Sedat,

you are welcome at any time.
Nice that you are OK and available for us.
I have lost too many people in recent years, maybe today I worry too much.

Again a new idea of yours. For many years I have been thinking of creating a test set too, with my own game material from SWCR and FCT events.
All this needs a lot of work and time.
Maybe more time than developing the opening book I am working hard on.

You can use different time controls for such a test, all is OK.

Only this, from my point of view:
A tactical test suite should include positions with many pieces on the board.
I wrote why I hold this opinion, in German, in the message before.

Remember ...
our discussion at the start of the year!
Now Rybka and Houdini go into my FCT League System.
You won ... I lost.



Rybka is very boring in the beginning of the game, I don't like such computer chess.
As worried as I am in matters of life ... so is Rybka in matters of middlegames.



Have a look at the stats on my page. 0 games won by mate before move 50 and 4 games lost.
Place 16 in the current list of 21 TOP engines (in middlegame).

Houdini will be better, the engine is great in tactics.
It will be interesting to compare Houdini's stats in "Transposition into endgame" with those of Stockfish and Komodo.

Thanks for your friendly answer!!
Have a nice day!

Best
Frank
Parent - By Sedat Canbaz Date 2015-07-25 19:20
Hello Frank,

You are welcome at any time too
And thanks a lot for your kind words....

About Rybka issue,
Congrats... finally, you made the right choice!

Right now I am a little bit busy...later we will talk more

Have a nice weekend,
Sedat
Parent - - By Sedat Canbaz Date 2015-07-21 14:15
Hello Chess Friends,

Here are the latest TTS results at 30 sec per move:
http://www.sedatcanbaz.com/chess/?page_id=1975

You can view 'Comprehensive Output in Report file':
http://www.sedatcanbaz.com/chess/files/tts_30sec.txt

Note: Very soon I plan to test some of the top engines at 300 sec (5 min) per move
*I will use the same tactical test suite, but only the positions left unsolved at 30 sec/move

Best Wishes,
Sedat
Parent - By Sedat Canbaz Date 2015-07-22 17:11
Update:

Many engines have been added and now the ranking has 110 participants !)
Note: As far as possible I am interested in testing engines of 2500 Elo and above

Top 3 (5 min/move):
http://www.sedatcanbaz.com/chess/?page_id=1993
Note also that the above Top 3 standings at 5 min/move are the same as at 3 sec/move

All Versions (3 sec/move):
http://www.sedatcanbaz.com/chess/?page_id=1956

Top 20 (3 sec/move):
http://www.sedatcanbaz.com/chess/?page_id=2015

BTW,
Unfortunately the latest Cheng4 release 0.39 could not perform better...
Strange and just another fiasco:

Cheng4 0.38 x64           : 179
Cheng4 0.39 x64           : 164

Another note:
The older one, Pepito 1.59 as UCI (Intel), performed as expected...
But as we see, the latest JA version as WB performed even worse:

Pepito 1.59 UCI           :  88
Pepito 1.59.2 WB JA       :  56

Greetings,
Sedat
Parent - - By Sedat Canbaz Date 2015-07-23 19:32
Dear Chess Friends,

I've just tested Ginkgo (by Frank Schneider), which is "private" so far
Note: This engine participated recently in WCCC and WCSC (Leiden/Netherlands)

And here are the latest standings with Ginkgo 1.0e x64:
http://www.sedatcanbaz.com/chess/?page_id=1956

Best,
Sedat
Parent - By Sedat Canbaz Date 2015-07-24 12:37 Edited 2015-07-24 13:27
Hello again )

Here is one more strategy test competition, which I ran approx. 2 weeks ago:

Conditions:
Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
Hash: 128, Threads: 1, time/pos: 12 Seconds
Test duration (per engine): 24 Minutes
Strategic Test Suite Results: Higher is Better
The Ranking is generated by STS Rating v3

Engine                Score
Houdini 4 x64         64/120
Komodo 9.01 x64       63/120
Stockfish 090715 x64  59/120
SugaR v5.4 x64        57/120


*Note that I tested the chess engines with 120 positions (based on STS1-STS15_LAN.epd)
Another note: I selected the positions that the above engine versions did not solve at 1 sec/move
In short: none of the positions used were solved at 1 sec/move by the currently tested engines

And this should be known too,
My 300 positions used at 5 sec per move:
http://www.sedatcanbaz.com/chess/?page_id=1742
There I selected the positions not solved at 3 sec/move by Rybka 4.1 x64

This is probably the reason why Houdini 4 could not win at 5 sec/move, where SugaR's performance was great...
Because almost all of us know that Houdini is based mainly on IPPO (Rybka)
For example, (in my private testing) Rybka 4.1 x64 at 5 sec/move managed to get just 40 points
It seems that with 2 extra seconds (from 3 to 5), Rybka scored better than zero )

But here, Komodo 9.01 and Houdini 4 managed to win at slow time controls:
http://www.sedatcanbaz.com/chess/?page_id=1849

Of course here with all 1500 positions, Houdini is again Number ONE:
http://www.sedatcanbaz.com/chess/?page_id=1761

Yes...I'd just decided to share my thoughts....

Best,
Sedat
Parent - - By Sedat Canbaz Date 2015-07-29 10:24
Last Updates:

Engine                    : Points
Stockfish 6 x64           : 310
Stockfish 5 x64           : 287
Stockfish 6 w32           : 286
Houdini 1.03a x64         : 256
Protector 1.8.0 x64       : 243
Houdini 1.03a w32         : 242
Rybka 3 x64               : 215
Rybka 2.3.2a x64          : 162
Cerebro 3.03d w32         : 126
Trace 1.37a w32           : 109
Delphil 3.1 x64           :  99
GLC 3.034 w32             :  96
King of Kings 2.56 w32    :  96
Leila 0.53h w32           :  93
Kiwi 0.6d JA x64          :  91
Sage 3.53 w32             :  91
Dorky 4.3 x64             :  89
Counter 1.2 w32           :  87
Genius 7 w32              :  85
Dragon 4.6 w32            :  83
Patzer 3.80 w32           :  82
Bringer 1.9 w32           :  82
Abrok 6.0 w32             :  81
Resp 0.19 JA x64          :  80
Nejmet 3.07a w32          :  79
Comet B68 w32             :  79
Myrddin 0.86 JA x64       :  78
Quark 2.35 w32            :  73 
Capture R1 w32            :  72
Diablo 0.51 JA x64        :  71
LambChop 10.99 w32        :  71
Plisk 0.2.7d x64          :  71
Fridolin v2.00 x64        :  69
Amy 0.87b w32             :  64
Uralochka 1.1b w32        :  62


Notes:

- Both top engines (Stockfish 6 and Houdini 4) are now the current leaders!
* Very interesting... SF 6 performed better than the latest new SF versions
- Protector 1.8.0 x64 and Stockfish 5 x64 are tested via the Polyglot adapter !)
* Because both chess engines have the same analyze bug under Arena 2.01 GUI
* Note also that Stockfish 6 and later SF versions don't have this bug
- Some buggy Winboard chess engines are tested via the Wb2Uci adapter too
- Arena 2.01 GUI seems to be the most stable for analysis (compared with other Arena versions)
- A few older versions of Rybka, Komodo, Stockfish and Houdini are tested
* Now you can compare the points difference between older... 64-bit vs 32-bit
- I am also impressed by Cerebro 3.03d's performance: 126 tactical points!
* Cerebro is much weaker in Elo than many others, but in tactics: very good...!
- Genius 7 is tested on a virtual operating system (on Windows XP Pro 32-bit)
- More chess engines are coming soon!

All Versions:
http://www.sedatcanbaz.com/chess/?page_id=1956

Top 20:
http://www.sedatcanbaz.com/chess/?page_id=2015

Kind Regards,
Sedat
Parent - - By Sedat Canbaz Date 2015-07-30 16:52
New Update:

Engine                    : Points
Houdini 4 Tactical x64    : 363
Gaia 3.5 x64              : 110
Muse 0.899b w32           :  95
KnightDreamer 3.3 w32     :  89
Capivara LK009b02c x64    :  88
Rodin 7 w32               :  87
Pupsi2 0.09 w32           :  87
OliThink 5.3.0 x64        :  86
JikChess 0.01 x64         :  86
SpiderChess 070603 w32    :  85
Sungorus 1.4 JA x64       :  83
RomiChess P3L x64         :  83
Simplex 098 JA x64        :  83
LittleThought 1.052 x64   :  80
CuckooChess 1.12 x64      :  79
Matacz 1.4 w32            :  79
Ges 1.36 w32              :  79
Anechka 008 w32           :  78
Typhoon 1.00-358 w32      :  77
CyberPagno 2.2 w32        :  76
Terra 3.4 w32             :  73
Ares 1.005 x64            :  73
Scidlet 3.61b2 JA x64     :  71
Ayito 0.2.994 w32         :  71
Chezzz 1.0.3 w32          :  71
TJchess 1.1 x64           :  69
Zeus 1.29 w32             :  68
Buzz 008 w32              :  68
ANT 2006f w32             :  68
Beowulf 2.4a w32          :  67
Queen 4.02 w32            :  66
Horizon 4.4 w32           :  66
Arion 1.7 w32             :  66
Snitch 1.6.2 w32          :  64
Averno 0.81 w32           :  60
Homer 2.01 w32            :  60
Aice 0.99.2 w32           :  60
Ifrit m1.8 x64            :  59
EveAnn 1.70 w32           :  55
Alex 2.14a x64            :  55


Some Details:
- I am very, very impressed: an incredible performance by Houdini 4 Tactical x64
- Houdini 4 Tactical leads by more than 50 points over Stockfish, Komodo etc...
* Note: This time, Houdini 4 x64 is tested with Tactical Mode ON
- Sad and strange that a few chess engines do not support Analyze mode
* For example: Joker 1.1.14 (by HGM) belongs to those buggy engines
* Another interesting example: even Rebel 6 (1994) supports Analyze mode
- And I hope the next new releases will be more stable and better in tactics...

All Versions:
http://www.sedatcanbaz.com/chess/?page_id=1956

Top 20:
http://www.sedatcanbaz.com/chess/?page_id=2015

Best Regards,
Sedat
Parent - - By Sedat Canbaz Date 2015-07-31 13:54
Update - 31.07.2015:

Engine                    : Points
Stockfish 150715bt x64    : 303
SugaR v5.4b x64           : 303
StockfishTS 120715 x64    : 292
Hannibal 1.4b x64         : 161
Arasan 18 x64             : 154
Gothmog 10b10 w32         : 123
Cyclone xTreme w32        :  94
Adam 3.3 w32              :  91
FireFly 2.70 x64          :  82
BBChess 1.3 x64           :  78
GreKo 12.8a WB w32        :  78
Orion 02 w32              :  77
GreKo 12.8a UCI w32       :  72
AliChess 4.25 w32         :  70
Caligula 0.7b w32         :  61
Esc 1.16 CIPS w32         :  58
Chesser 2 w32             :  51


Some Notes:
- The tactical performance of the newly tested Top 3 engines (SF derivatives) is good...
* However, their points are not enough to catch/pass Houdini 4 Tactical x64
* Houdini 4 is an almost 3-year-old chess engine and still number ONE in tactics!!
- The older version Hannibal 1.4b x64 is tested too: 161 tactical points
* For comparison, the latest release (Hannibal 1.5 x64) managed to get much less: 131
- Not a bad performance by Adam 3.3, as we know its Elo performance is not too high...
- What a pity that Chesser 2 is ranked in last place (tested with a 4MB hash table)
* Chesser's author is Syed Fahad, and as far as I know he is just 14 years old
* I hope the next Chesser release will be better and able to use larger hash tables
- GreKo 12.8a is tested as WB and UCI (WB performed slightly better than UCI)
- The latest Arasan 18 UCI's performance is good: 21 points better than Arasan 17.5 UCI
* Congrats to Jon Dart and to all the other programmers who are able to improve their engines!

Note also that,
I like very much the current tactical testing method - fast, reliable...

And I hope you like it too... !)

All Versions:
http://www.sedatcanbaz.com/chess/?page_id=1956

Top 20:
http://www.sedatcanbaz.com/chess/?page_id=2015

*Note: The Top 20 standings remain the same as in the previous update!

Best Wishes,
Sedat
Parent - - By Sedat Canbaz Date 2015-08-01 18:18
Update - 01.08.2015:

Engine                    :  Points
Stockfish 160715 x64      :  306
Stockfish 150715bt x64    :  303
Houdini 2.0c x64          :  299
SF 130715MZ x64           :  296
Houdini 2.0c Kayra2 x64   :  294
Stockfish 300715 x64      :  292
StockfishTS 120715 x64    :  292
Stockfish 090715 x64      :  291
SugaR v5.4 x64            :  289
Bright 0.5c x64           :  142
T.Logic 20100131x x64     :  130
Pedone 1.2 x64            :  128
Ktulu 8 w32               :  109
Gibbon 2.69a x64          :   84
ZChess 2.22 w32           :   83
Feuerstein 0.461 w32      :   75
Gromit v3.82 w32          :   72
Xpdnt 091007 JA w32       :   70
PostModernist 1016 w32    :   70
ChessKISS 1.7c w32        :   68
EnginMax 5.24 x64         :   61
Gerbil 02 JA x64          :   61
Kurt 0.922b JA x64        :   60
Smash 1.0.3a w32          :   59
Warrior 1.03 w32          :   53
Taktix 2.23x w32          :   53
Vice 1.1 w32              :   51
Tscp 1.81 JA w32          :   47
Alarm 0.93.1 w32          :   36


More Details:
- The results of the experimental SF versions are not so stable, for example: Stockfish 300715
* So from today, only the strongest SF version (per author) will be published on my official site
* However, I plan to publish each SF test result on my favorite computer chess forums
* Otherwise, the current TTS ranking will look as if I am a Stockfish beta tester !)
- Surprisingly, Houdini 2.0c Kayra2 x64 performed 5 points worse than Houdini 2.0c x64 default
* By contrast, in SCCT tours Houdini 2.0c Kayra2 performed approx. 20-30 Elo better than default
* Note: For a more reliable ranking, the Houdini 2.0c and Kayra2 engine versions are tested twice too
* And each time the same engine version is repeated, the difference is usually 1 point
- Pedone 1.2 x64 seems to be stronger in tactics than the older version (well done to F. Gobbato)
- And just the opposite, the older Ktulu 8 version performed better than Ktulu 9
- Plus the older TwistedLogic 20100131x x64 is almost equal in tactics to Hannibal 1.5, interesting...
- What a pity that Alarm 0.93.1's performance is much lower than expected (tested twice too)
- And probably I will continue testing the engines which are mainly 2200 Elo and above...
* Because many amateur engines (weaker than 2200 Elo) have serious bugs in Analyze mode
* If some engines of 2200 Elo and above are missing in my current TTS ranking, then most likely
  those have serious bugs in Analyze mode or they simply don't support Analyze mode

All Versions:
http://www.sedatcanbaz.com/chess/?page_id=1956

Top 20:
http://www.sedatcanbaz.com/chess/?page_id=2015

*Note: As before, the Top 20 standings remain the same as in the previous update!

Best,
Sedat
Parent - By Sedat Canbaz Date 2015-08-02 22:50 Edited 2015-08-02 23:49
Dear Chess Friends,

Here is my latest TTS MP (1 Min/Move) competition:
http://www.sedatcanbaz.com/chess/?page_id=2081

You can view 'Comprehensive Output in Report file':
http://www.sedatcanbaz.com/chess/files/tts60sec.txt

More Details:
- The Top 5 engines are tested as MP (using 6 cores per engine)
- 22 unsolved positions are used (not solved at 6 sec/move by the current MP engines)
- The tactical/strategic database currently used is derived from 2100 positions
* The Top 5 MP engines managed to solve approx. 99% of the positions (at 6 sec/move)
* During that run (at 6 sec/move), the MP engines also played with 6 cores (for solving)
* Actually my goal was to use more positions, but as we see: 2078 positions were solved
* Just imagine how strong the top chess engines are in strategy and tactics
- And Houdini 4 is again the winner. My congratulations to Robert Houdart!
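
The figures in the list above can be cross-checked with a few lines of Python. This is only a minimal sketch: the variable names are made up for illustration, and the two input numbers are simply the ones quoted in the post.

```python
# Numbers quoted in the post; the variable names are illustrative only.
total_positions = 2100   # size of the source database
unsolved = 22            # positions not solved at 6 sec/move

solved = total_positions - unsolved
solved_share = solved / total_positions

print(solved)                  # 2078
print(f"{solved_share:.1%}")   # 99.0%
```

So the "approx 99%" and "2078 positions were solved" statements agree with each other.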

Best Regards,
Sedat
Parent - - By Michael Scheidl Date 2015-08-03 14:35
Thanks for your tests! I draw two major conclusions:

1. The interpretation of Komodo as being "not so tactical" has clearly been proven wrong.
2. Surprising tactical strength of the new Alfil version(s).


As we find the top-3 virtually on par in terms of tactics on the top, I wonder if this particular element of an engine's strength profile has again gained importance, in contrast to one or two decades ago where the top engine(s) sometimes could afford to be of somewhat slower tactical abilities than the competition. For example, when Shredder 9 was undisputed at the top, I think it was certainly not the best combinator.

(The same could be observed in the 1980s but that is not more than an insignificant historical remark.)

Anyway, it suits my taste
Parent - - By Sedat Canbaz Date 2015-08-05 11:12 Edited 2015-08-05 11:20
Hello Michael,

Not at all, it's my pleasure, and thanks for your interest...

Just my 2 cents on this issue:
Whoever claims that Komodo 9.01 is not so strong in strategy and tactics
is completely wrong...

For those who still have some doubts, I suggest checking the links below
For example, in 2 competitions (with slower time controls), Komodo managed to win:
http://www.sedatcanbaz.com/chess/?page_id=1849

Another example is here:
http://www.sedatcanbaz.com/chess/?page_id=2081
Komodo even scored slightly better than SF, if we check the report file

Komodo's Rated time: 19:04 = 1144 Seconds
Stockfish's Rated time: 19:36 = 1176 Seconds
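
As a quick check of the two conversions above, here is a minimal sketch. The helper name `to_seconds` is hypothetical and not part of any tool used for the test; it only redoes the minutes:seconds arithmetic.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '19:04' to total seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

komodo = to_seconds("19:04")      # 1144
stockfish = to_seconds("19:36")   # 1176

# A lower rated time is better, so a positive gap favours Komodo.
print(stockfish - komodo)         # 32
```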

In other words,
According to my overall strategic/tactical test suite's results,
Komodo, Stockfish, Houdini's performances are just great (Best Top 3 engines in solving) !

Btw, I am not very satisfied with Shredder 12 x64's tactical performance: as we see, it scored 118 points

So soon I will test an older Shredder version, maybe v10... and I wonder what the tactical improvement between the two versions is

And according to CEGT (40/4), the difference is more than 200 Elo:
Deep Shredder 12 x64 1CPU   2800
Deep Shredder 10 x64 1CPU   2588

Best,
Sedat
Parent - - By Sedat Canbaz Date 2015-08-05 19:15
Update - 05.08.2015:

Engine                    : Points
Shredder 10 x64           : 129
Zct 032500 JA x64         : 125
WChess 2000 v1.2 w32      :  97
ECE 12.01 w32             :  89
Flux 2.2 w32              :  85
Giraffe 20150801 x64      :  64
Chessterfield i4b w32     :  58


More Details:
- The old Shredder 10 seems to be stronger in tactics than Shredder 12, any opinions?
* For example, Shredder 12 is approx. 200 Elo stronger, but scored 118 points (11 points worse...)
- Not a bad performance by Zct and WChess, better than expected...

All Versions:
http://www.sedatcanbaz.com/chess/?page_id=1956

Top 20:
http://www.sedatcanbaz.com/chess/?page_id=2015

Regards,
Sedat
Parent - - By Sedat Canbaz Date 2015-08-07 01:34 Edited 2015-08-07 01:41
Update - 07.08.2015:

Engine                    : Points
TwinFish 0.07 x64         : 316
Critter 1.6 x64           : 314  
Stockfish 290114 x64      : 282  
Houdini 1.5a x64          : 282
Strelka 6 w32             : 274
Vitruvius 1.14a x64       : 263
Equinox 3.30 x64          : 263
RobboLito 021Q x64        : 247
Mars 3.38 x64             : 246
Bouquet 1.8 x64           : 245
BlackMamba 2.0 x64        : 240
Firenzina 2.3.2 x64       : 239
DeepSaros eXp x64         : 238
LEOpard 0.7c x64          : 234
IvanHoe 9.46b x64         : 221
Naum 4.6 x64              : 213


More Details:
- This time I decided to test some IPPO engines, and a few SF versions too
- For more reliable standings, the top chess engines are tested twice
* I mean: TwinFish 0.07, Stockfish 290114, Critter 1.6, Houdini 1.5a
* And after repeating the same engine, the difference is usually 1 point
- TwinFish 0.07 performed extremely strongly... better than all SF versions so far, why ??
* I really wonder: how can a clone engine be better in tactics than all SF versions ??
*Note: TwinFish is based on Stockfish dev 14 01 29 6:02PM (TimeStamp:1391014933)
- Critter 1.6a performed incredibly strongly too, better than Houdini 1.5a and all SF versions (except TwinFish)
- TTS testing is currently paused, but I am planning to resume it after SCCT CS V

Btw,
I've run a few more tests, now you can compare the differences with/without adapters

A little note (as UCI):
- TwinFish 0.07 and Stockfish 290114 have the same Analyze bug as Protector 1.8 and Stockfish 5


Test 1:
TwinFish 0.07 x64 via Polyglot adapter     : 316
TwinFish 0.07 x64 as UCI                   : 286

Test 2:
Stockfish 290114 x64 via Polyglot adapter  : 282
Stockfish 290114 x64 as UCI                : 264

Test 3:
Houdini 1.5a x64 as UCI                    : 282
Houdini 1.5a x64 via Polyglot adapter      : 276
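
To make the three comparisons above easier to read, the point differences can be computed as follows. This is just an illustrative sketch: the dictionary and the function name `adapter_gain` are made up, and the scores are copied from Tests 1 to 3.

```python
# (points via Polyglot adapter, points as native UCI), copied from Tests 1-3.
results = {
    "TwinFish 0.07 x64": (316, 286),
    "Stockfish 290114 x64": (282, 264),
    "Houdini 1.5a x64": (276, 282),
}

def adapter_gain(polyglot: int, uci: int) -> int:
    """Points gained (positive) or lost (negative) by running via the adapter."""
    return polyglot - uci

for engine, (polyglot, uci) in results.items():
    print(f"{engine:<22}: {adapter_gain(polyglot, uci):+d}")
```

The two engines with the Analyze bug gain 30 and 18 points through the adapter, while Houdini 1.5a, which does not have the bug, loses 6.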


All Versions:
http://www.sedatcanbaz.com/chess/?page_id=1956

Top 20:
http://www.sedatcanbaz.com/chess/?page_id=2015

*Note: The Top 20 and All Versions standings remain the same as in the previous update!

Kind Regards,
Sedat
Parent - - By Sedat Canbaz Date 2015-08-30 15:05 Edited 2015-08-30 15:09
Update - 30.08.2015:

Engine                    : Points
Stockfish 170815 x64      : 309
Synapse RZ4 WSET Tact x64 : 301
KingAsad D100815 x64      : 301
SugaR PrO v1.0 x64        : 300
DeathDanger 3 x64         : 298
Komodo 9.2 x64            : 291
Ginkgo 1.2 x64            : 216
Andscacs 0.82 x64         : 191
Tornado 7 x64             : 116
Arminius 060815 x64       :  97
Giraffe 280815 x64        :  93


More Details:
The official TTS ranking includes only (since 30.08.2015):
1) Public and strongest versions
2) Engines with up to 55% similarity (56% and above are out)
* Exceptions will be made for:
- Private engines, in case of better performance than the leader (Houdini 4 Tactical)
- Derivative engines, if they score at least 20 points better than the original engine they are based on
- Clone engines, if they score at least 100 points better than the original engine they are based on
* Houdini 4 Tactical (based mainly on Rybka) managed to do that... why not other engines?!
* Otherwise, the TTS rankings would consist mainly of Stockfish/Rybka derivatives or clones
* Thanks for your understanding!
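
One possible reading of these inclusion rules, written as a small Python sketch. Everything here is an assumption for illustration: the `Entry` fields, the function name `is_listed`, and the order in which the exceptions are checked are not taken from the actual TTS procedure, and the "strongest version only" rule is left out because it needs the full version history.

```python
from dataclasses import dataclass
from typing import Optional

LEADER_POINTS = 363   # Houdini 4 Tactical x64, from the earlier update
MAX_SIMILARITY = 55   # percent; 56% and above is excluded

@dataclass
class Entry:
    points: int
    is_public: bool = True
    similarity: int = 0                # percent similarity to another engine (assumed field)
    kind: str = "original"             # "original", "derivative" or "clone"
    base_points: Optional[int] = None  # points of the engine it is based on, if known

def is_listed(e: Entry) -> bool:
    """Return True if the entry would appear in the official ranking (one reading of the rules)."""
    if not e.is_public:
        # Private engines are listed only if they beat the current leader.
        return e.points > LEADER_POINTS
    if e.similarity <= MAX_SIMILARITY:
        return True
    if e.base_points is None:
        return False
    # Too similar: listed only if clearly better than the engine it is based on.
    margin = {"derivative": 20, "clone": 100}.get(e.kind)
    return margin is not None and e.points - e.base_points >= margin

print(is_listed(Entry(points=300, similarity=70, kind="derivative", base_points=290)))  # False
print(is_listed(Entry(points=310, similarity=70, kind="derivative", base_points=290)))  # True
```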

A few more notes,
Stockfish 170815 performed better than all previous SF development versions
*However, even this new SF release could not do better than Stockfish 6
Synapse RZ4 WSET Tactical x64 has been tested with 'Tactical mode' enabled
Ginkgo 1.2 x64 performed 14 points better than the previous Ginkgo 1.0e x64
Giraffe 280815 x64 performed 29 points better than Giraffe 010815 x64
Congrats to all engine authors who managed to improve their engines...

Unexpected and disappointing tactical results by some newer versions:
- Komodo 9.2 x64 (17 points less than Komodo 9.01 x64)
- Tornado 7 x64 (12 points less than Tornado 5 x64)
- Andscacs 0.82 x64 (8 points less than Andscacs 0.81 x64)
- Arminius 060815 x64 (1 point less than Arminius 2014 x64)

Some new engines which have analyze bugs:
Alfil 15.8 and Fizbo 1.5 have the same analyze bug as Protector 1.8 and Stockfish 5
I don't know exactly, but it seems some authors also copy or study the bad parts...
Also, from today,
I will not use any Polyglot adapter for those buggy UCI engines, because we are in 2015

And I hope the newer versions can analyze and will be better in tactics !)

SCCT - TTs (3 seconds per position) - All Versions:
https://sites.google.com/site/computerschess/tts-3sec

SCCT - TTs Conditions/Rules:
https://sites.google.com/site/computerschess/tts-3sec-conditions

Best,
Sedat
Parent - - By Benno Hartwig Date 2015-08-30 15:13
What do you think:
What does this test really show and tell us, if several newer and in practice stronger engines only reach fewer points than their weaker ancestors?

Benno
Parent - - By Klaus S. Date 2015-08-30 15:31
Benno Hartwig wrote:

What do you think:
What does this test really show and tell us, if several newer and in practice stronger engines only reach fewer points than their weaker ancestors?

Benno

... and what / where are those tactical test positions?
Parent - By Sedat Canbaz Date 2015-08-30 15:56
Wilfried Lübkemann wrote:

Benno Hartwig wrote:

What do you think:
What does this test really show and tell us, if several newer and in practice stronger engines only reach fewer points than their weaker ancestors?

Benno

... and what / where are those tactical test positions?


On my computers ))

Ok, as I mentioned before,
• The TTS ranking is quite a good indicator of which engines are the best at tactics
* To be honest, I see my current TTs testing as similar to IQ testing !)
• I chose 600 test positions, most of which can be called 'Tactical'
* Note: I am considering publishing my EPD database after the TTS testing has ended

Ah, one more thing:
Engine                    : Points
Houdini 4 Tactical x64    : 363
Houdini 4 x64             : 310


What does that mean ?)

It looks like this shows that my database is indeed a tactical one

Best,
Sedat
Parent - - By Sedat Canbaz Date 2015-08-30 15:46
Benno Hartwig wrote:

What do you think:
What does this test really show and tell us if several newer, and in practice stronger, engines score fewer points than their weaker predecessors?

Benno


Just my 2 cents on this issue:
We should not mix apples with oranges !
Maybe some of them are stronger in Elo, but at the same time they are weaker in tactics !
And now, which is more important... ?
I think it all depends on the user's choice...!)

For more details about why some newer engines are weaker...
You need to direct this question to those programmers
They can probably explain it better than I can...

Best,
Sedat
Parent - By Benno Hartwig Date 2015-08-30 17:55 Edited 2015-08-30 18:12

> For more details about why some newer engines are weaker...


Yes, you are right.
It is interesting to see how stronger engines could gain that strength although they might be weaker in tactics.
Perhaps there is a chance for improvement here.

Benno
Parent - - By Sedat Canbaz Date 2015-09-22 18:25 Edited 2015-09-22 18:33
Update: 22.09.2015

Engine                    : Points
Shark 150915 x64          : 315
Stockfish 190915 x64      : 312
SF TPA0918MZ x64          : 310
SugaR 160915 x64          : 304
Orka 150915 x64           : 300
Stockfish 300815IP x64    : 298
Ultron 1.1 x64            : 240
Judas 1.02 x64            : 211
Mars T34 090545 x64       : 252
Bitfoot 1.0.d9aeb43 x64   :  76
Laser 0.1 x64             :  56
Zurichess Freibourg       :  54
Sayuri 040915 x64         :  49
Supra 21.0 Pro x64        :  25


More Details,
Finally, 2 (two) new SF versions performed better than Stockfish 6
Note that Stockfish 6's tactical result was 310 points

I really wonder who will be the first engine programmer
able to pass the 3 old Houdini 4... will we have to wait long...?)

Some engines that don't support analyze mode or that I could not run:
Fischerle 0965, Embla-0.4, Satana.2.1.11

SCCT - TTs (3 seconds per position) - All Versions:
https://sites.google.com/site/computerschess/tts-3sec

SCCT - TTs Conditions/Rules:
https://sites.google.com/site/computerschess/tts-3sec-conditions

Regards,
Sedat
Parent - By Sedat Canbaz Date 2015-09-25 10:51
Update: 25.09.2015

Engine                     : Points
Houdini 3 Tactical x64     : 356
StockfishTS 240915 x64     : 317
Schooner 1.4.2 x64         :  87
Clubfoot 1.0.048995d x64   :  57
Galjoen 0.30 w32           :  53


More Details,
- Houdini 3 Tact performed incredibly strongly, only slightly weaker than Houdini 4 Tact
- StockfishTS 240915: best performance so far, compared with other SF versions

Some engines that crashed on my systems:
Embla 0.5, Neurone XXIII

SCCT - TTs (3 seconds per position) - All Versions:
https://sites.google.com/site/computerschess/tts-3sec

SCCT - TTs Conditions/Rules:
https://sites.google.com/site/computerschess/tts-3sec-conditions

Best,
Sedat
Parent - - By ? Date 2015-09-25 19:48
Sedat Canbaz wrote:

Update: 22.09.2015
Some engines that don't support analyze mode or that I could not run:
Fischerle 0965, Embla-0.4, Satana.2.1.11


Fischerle 0.9.65 64-bit should work perfectly if you have a current Java Runtime Environment installed on your computer; just follow the instructions included in its distribution http://www.stuckardt.de/index.php/component/docman/doc_download/53-fischerle096564or32.html

Fischerle 0.9.65 64-bit is currently being tested in a CCRL-40/40 gauntlet (port 16065) and runs just fine. Fischerle should be used via its UCI interface and preferably under the Arena GUI (3.5 or 3.0).

It goes without saying that it supports the analysis mode.

Roland
Parent - - By Sedat Canbaz Date 2015-09-26 10:06
Sedat Canbaz wrote:

Update: 22.09.2015
Some engines that don't support analyze mode or that I could not run:
Fischerle 0965, Embla-0.4, Satana.2.1.11


Fischerle 0.9.65 64-bit should work perfectly if you have a current Java Runtime Environment installed on your computer; just follow the instructions included in its distribution http://www.stuckardt.de/index.php/component/docman/doc_download/53-fischerle096564or32.html

Fischerle 0.9.65 64-bit is currently being tested in a CCRL-40/40 gauntlet (port 16065) and runs just fine. Fischerle should be used via its UCI interface and preferably under the Arena GUI (3.5 or 3.0).

It goes without saying that it supports the analysis mode.

Roland


Hmm... I don't know exactly why, but both the 64-bit and the 32-bit builds of Fischerle 0.9.65 crash on my PC system

For example here is the screenshot:



And here is the Java version that I have installed
* Note: I managed to test several other Java engines, with the exception of Fischerle



Best,
Sedat
Parent - - By Roland Date 2015-09-26 15:55
Hi Sedat,

(1) did you try to start Fischerle from within Arena?

(2) Did you try to start it via the .bat script instead of the .exe? Of course, they should be equivalent, but maybe they behave differently on your system.

If both (1) and (2) fail, you could try the following: open a console window, navigate to the Fischerle installation directory (i.e., the folder that contains the Fischerle exe) and type in "java". Does it find your Java installation? If so, then you could copy / paste the following command line into the console window, which should then start Fischerle:

java -Xms1400m -Xmx1400m -XX:+UseParallelGC -jar "dist\Fischerle.jar" uci

Of course, you won't see anything; since it is started in the uci mode, this is just fine. In case the startup fails, however, an error message should be displayed in the console window. If so: could you please post a screenshot of this message here?
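
The same console check can also be scripted. What follows is only a minimal Python sketch under a few assumptions, not something shipped with Fischerle: the helper name `uci_handshake`, the constant `FISCHERLE_CMD` and the 10-second timeout are made up for illustration. It starts the engine, sends the standard UCI commands `uci` and `quit`, and returns whatever the process printed, so a Java startup error would show up in the captured lines instead of scrolling past in a console window.

```python
import subprocess

# Command line quoted from the post above; it must be run from the
# Fischerle installation directory (pass that directory as cwd).
FISCHERLE_CMD = ["java", "-Xms1400m", "-Xmx1400m", "-XX:+UseParallelGC",
                 "-jar", r"dist\Fischerle.jar", "uci"]


def uci_handshake(cmd, cwd=None, timeout=10.0):
    """Start an engine, send 'uci' then 'quit', and collect its output.

    Returns (ok, lines). ok is True if the engine answered 'uciok';
    lines holds every non-empty line it printed (stderr is merged in,
    so a startup error message ends up here too).
    The timeout value is an arbitrary assumption.
    """
    proc = subprocess.Popen(cmd, cwd=cwd,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            text=True)
    try:
        out, _ = proc.communicate("uci\nquit\n", timeout=timeout)
    except subprocess.TimeoutExpired:
        # Engine neither quit nor crashed in time: stop it, keep its output.
        proc.kill()
        out, _ = proc.communicate()
    lines = [ln.strip() for ln in out.splitlines() if ln.strip()]
    return "uciok" in lines, lines
```

Calling `uci_handshake(FISCHERLE_CMD, cwd=<Fischerle folder>)` would then give `ok == True` on a working installation; if `java` itself is not on the PATH, `subprocess.Popen` raises `FileNotFoundError`, which answers the "does it find your Java installation?" question directly.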

Roland
Parent - - By Sedat Canbaz Date 2015-09-27 00:02
Roland wrote:

Hi Sedat,

(2) Did you try to start it via the .bat script instead of the .exe? Of course, they should be equivalent, but maybe they behave differently on your system.



Hello Roland,

Thanks for the useful info... Finally I managed to run it via the .bat script

And very soon I will test the Fischerle engine

Best Regards,
Sedat
Parent - - By ? Date 2015-09-27 02:46
Sedat Canbaz wrote:

Roland wrote:

Hi Sedat,

(2) Did you try to start it via the .bat script instead of the .exe? Of course, they should be equivalent, but maybe they behave differently on your system.



Hello Roland,

Thanks for the useful info... Finally I managed to run it via the .bat script

And very soon I will test the Fischerle engine

Best Regards,
Sedat


Great news, thanks!

And thanks a ton for directing my attention to the issue with running the .exe starters under your environment, since this indicates a possible problem with the tool that I'm employing for compiling them from the .bat scripts.

Which operating system are you using: Windows 8?

Best,
Roland
Parent - By Sedat Canbaz Date 2015-09-28 12:25
Quote:


Great news, thanks!

And thanks a ton for directing my attention to the issue with running the .exe starters under your environment, since this indicates a possible problem with the tool that I'm employing for compiling them from the .bat scripts.

Which operating system are you using: Windows 8?

Best,
Roland


Not at all...it is my pleasure

It was on Windows 8.1

Btw, sorry that I have not tested your engine yet, but in a few days I plan to test it under the current TTs conditions

Good luck,
Sedat
Parent - By Sedat Canbaz Date 2015-09-29 22:34
As promised, I've just tested Fischerle 0.9.65 x64. For the standings, see:
https://sites.google.com/site/computerschess/tts-3sec

Best,
Sedat
Parent - - By GS Date 2015-09-26 10:23
Fischerle works perfectly under Shredder-Classic as well.
We are currently running a test for our CEGT 40/4 list under Shredder-Classic on an Intel i5:
http://cegt.forumieren.com/t412-testing-fischerle-0-9-65-x64
Parent - By Roland Date 2015-09-26 16:01
Thanks for this information! Nice to hear that Fischerle works perfectly under Shredder Classic!

And thanks, of course, for testing it!
Parent - - By Sedat Canbaz Date 2015-10-29 13:29
Update: 29.10.2015

Engine                     : Points
Stockfish 251015 x64       : 304
NirvanaChess 2.2 x64       : 207
Protector 1.9.0 x64        : 157
Pedone 1.3 x64             : 125
ProDeo 2.0 w32             : 112
Maverick 1.5 x64           : 104
CyberPagno 2.3 w32         :  84
NanoSzachy 4.1 x64         :  78


Details,
- Stockfish 251015 (Author: lucasart) performed 13 points less than StockfishTS 240915
- NirvanaChess 2.2 x64 performed 18 points better than NirvanaChess 2.1c x64
- Protector 1.9.0 is as buggy as its previous versions (and performed much worse: 157 points)
* Protector 1.8.0 was tested via the Polyglot adapter, whereas v1.9.0 was run natively (UCI)
- The latest Donna 3.1 x64 version does not support analyze mode either
- Pedone 1.3 x64 performed 3 points less than Pedone 1.2 x64
- ProDeo 2.0 w32 performed 5 points less than ProDeo 1.87 w32
- Maverick 1.5 x64 performed 5 points better than Maverick 1.0 x64
- NanoSzachy 4.1 x64 performed 1 point less than NanoSzachy 4 x64
- CyberPagno 2.3 w32 performed 8 points better than CyberPagno 2.2 w32

* CyberPagno, Maverick and NirvanaChess are the only ones in this update
that managed to score better than their older versions... congrats to their developers !
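
For anyone who wants to re-check the comparisons above, here is a small Python sketch. The points and differences are copied from this update; the predecessor scores it prints are reconstructed from those stated differences and are not quoted from the ranking page. Protector 1.9.0 is left out because no point difference was given for it, and the names `UPDATE`, `predecessor_points` and `improved` are made up for illustration.

```python
# Points from the 29.10.2015 table plus the stated change versus the
# previous version (new minus old). Stockfish 251015 is compared with
# StockfishTS 240915, as in the list above.
UPDATE = {
    "Stockfish 251015 x64": (304, -13),
    "NirvanaChess 2.2 x64": (207, +18),
    "Pedone 1.3 x64": (125, -3),
    "ProDeo 2.0 w32": (112, -5),
    "Maverick 1.5 x64": (104, +5),
    "CyberPagno 2.3 w32": (84, +8),
    "NanoSzachy 4.1 x64": (78, -1),
}


def predecessor_points(new_points, change):
    """Reconstruct the older version's score from the stated change."""
    return new_points - change


def improved(update):
    """Names of engines whose new version scored more points, sorted."""
    return sorted(name for name, (_, change) in update.items() if change > 0)


for name, (points, change) in UPDATE.items():
    old = predecessor_points(points, change)
    print(f"{name:<22} : {points:>3}  ({change:+d}, predecessor: {old})")

print("Improved:", ", ".join(improved(UPDATE)))
```

The reconstructed predecessor for Stockfish 251015 comes out at 317 points, which matches the StockfishTS 240915 result posted on 25.09.2015.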

More Details,
I will probably publish my EPD database once some chess engine passes Houdini !)
Until then, the TTs test positions will be kept private (as before)
And I really wonder: who will be this hero ?) Who will be able to add the best tactical improvement?

SCCT - TTs (3 seconds per position) - All Versions:
https://sites.google.com/site/computerschess/tts-3sec

SCCT - TTs Conditions/Rules:
https://sites.google.com/site/computerschess/tts-3sec-conditions

Best Regards,
Sedat
Parent - - By Michael Scheidl Date 2015-10-29 15:51
Thanks. - It would be interesting to know which result the standard Houdini 4 (as seen e.g. in TCEC) can achieve.
Parent - By Michael Scheidl Date 2015-10-30 13:28
Never mind, in the meantime I have found it in this thread (310).