We plan to publish here evaluation results, i.e., aggregated charts as well as actual results (not particularly useful as far as the results are concerned because these are yes/no answers).
These results may be updated as soon as new systems are available or improvements are observed.
We run the experiments below on a Debian Linux virtual machine configured with four processors and 20GB of RAM running under a Dell PowerEdge T610 with 2*Intel Xeon Quad Core 2.26GHz E5607 processors and 32GB of RAM, under Linux ProxMox 2 (Debian).
The solvers were not genuinely reentrant. Hence, each test case has been run in a separate process; after that, the first case of each suite has been run as a warm up.
All solvers are Java programs. The Java virtual machines were run with a maximum heap size of 2024MB and a timeout at 20s (20000ms). Raising memory size to 1GB and timeout to 40s does not change timeout results. The μ-calculus solvers take advantage of a native BDD library. Using the native implementation doubles the speed of these solvers, however, it also brings large initialization time (in spite of warm-up set up).
Reported figures are the average of 5 runs (we run the tests 7 times and ruled out each time the best and worst performance).
Note that for each of these charts, the tests have been run again, so
their actual results may slightly differ.
These are the results of [ISWC2013], after patching the BDD library and improving the initialisation strategy of LMU. They show far less timeouts from TreeSolver (only on p23, p24). The full result set is results-2013-07.zip.
This version shows changes only for the μ-calculus solvers. The UCQProj results were also in the submitted version of [ISWC2013]. These give slower results but fewer timeouts. However, it also shows a near-constant and slow behaviour from TreeSolver. This is a hint of the initialisation of the solver dominating the results.
These results were in the submitted version of [ISWC2013]. They show many timeouts from TreeSolver (on nop3, nop4, nop15, nop16, p3, p4, p15, p16, p23, p24, p25, p26, rdfs7, rdfs8) and AFMU (on nop3, nop4, nop15, nop16, p3, p4, p9, p10, p15, p16, p23, p24, p25, p26, rdfs9). These were mostly due to a bug in the BDD library (libbuddy).
The initial version reported in [RR2012] was not accurate. It included too much virtual machine startup and systems were not compared in the exact same way. It is not reproduced here.