nissy-core

The "engine" of nissy, including the H48 optimal solver.
git clone https://git.tronto.net/nissy-core
Download | Log | Files | Refs | README | LICENSE

commit da31fbb09fec50c127bd16a99fca81bdd013c40d
parent 39e736f68c0d92b47bd124357a33c02d2bd2fdc5
Author: Sebastiano Tronto <sebastiano@tronto.net>
Date:   Thu, 19 Jun 2025 16:31:58 +0200

Add benchmark results

Diffstat:
Abenchmarks/benchmarks.md | 275+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Rutils/benchmark_scrambles/scrambles-16.txt -> benchmarks/scrambles/scrambles-16.txt | 0
Rutils/benchmark_scrambles/scrambles-17.txt -> benchmarks/scrambles/scrambles-17.txt | 0
Rutils/benchmark_scrambles/scrambles-18.txt -> benchmarks/scrambles/scrambles-18.txt | 0
Rutils/benchmark_scrambles/scrambles-19.txt -> benchmarks/scrambles/scrambles-19.txt | 0
Rutils/benchmark_scrambles/scrambles-20.txt -> benchmarks/scrambles/scrambles-20.txt | 0
Abenchmarks/scrambles/superflip.txt | 1+
7 files changed, 276 insertions(+), 0 deletions(-)

diff --git a/benchmarks/benchmarks.md b/benchmarks/benchmarks.md @@ -0,0 +1,275 @@ +# Benchmarks for the H48 optimal solver and comparison with vcube + +This page contains some benchmarks for some of the possible +configurations of the H48 optimal solver. For comparison we also +include similar measurements for Andrew Skalski's impressive optimal +solver [vcube](https://github.com/Voltara/vcube), which has been taken +as a benchmark reference throughout the development of H48. The two +solvers differ both in method (although they both use an A* search) +and implementation. + +Similar benchmarks can be found: + +* In Chen Shuang's + [dedicated github page](https://github.com/cs0x7f/cube_solver_test/wiki), + which includes many more solvers. At the time of writing, the version + of H48 used for these benchmarks is a rather old one. Note that + cube48opt is a re-implementation of the H48 solver by Chen Shuang. +* In Enrico Tenuti's [thesis](https://github.com/enricotenuti/h48thesis), + also using an old implementation of H48, but including more data and + nice plots. + +## Setting + +*All movecounts are in Half Turn Metric* + +For the benchmarks we used some sets of random positions that can be found +in `benchmarks/scrambles`. We divide them by optimal solution length, +because the time to solve a scramble grows exponentially on the number of +moves, so mixing up positions with long and short solutions is going to +make the shorter solution almost irrelevant. Even within the same solution +length the time to solve a random position can vary significantly, +so we have used sets of 25 scrambles and taken the average of those, +except for the [Superflip](https://en.wikipedia.org/wiki/Superflip), +which is a single scramble. + +For short solutions, the very low numbers we get with the largest solver +may be particularly inaccurate, because writing the solution (together +with some log messages) to standard output may take a significant portion +of the time, depending on the terminal emulator used and other factors. + +We ran two different tests: finding a single optimal solution and finding +*all* optimal solution. Both problems involve an exhaustive search at +depth `n-1`, but the second also involves and exhaustive search at depth +`n`. Comparisons with vcube are only done in the first case, because +vcube's current configuration does not allow for finding multiple +solutions, as far as I am aware. + +Since the size of the pruning table used by the solver is of utmost +importance, we include two statistics: time per cube and time per cube +*adjusted by table size*. The adjustment we took simply consists in +multiplying the time per cube by the size of the table. Empirically +this leads to more even results across the board, as the speed of the +solvers of the same family seems to scale linearly with the size of the +pruning table. + +All benchmark tests were done on the following configuration: + +* CPU: AMD Ryzen 7 7700 (8 cores, 16 virtual threads) +* Memory: 2x Corsair Vengeance 32GB 5600MHz +* Motherboard: Gigabyte B650M K +* Operating system: Debian 12 (Linux kernel 6.1.0) +* Compiler: GCC 12.2.0 for H48 and Clang 14.0.6 for vcube + +## Single solution + +Average time for finding a single optimal solution. + +### Single thread + +Time per cube (in seconds, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves|Superflip| +|:---------|:-------|-------:|-------:|-------:|-------:|--------:| +|H48 h11 k2|56.1GiB | 0.23 | 1.15 | 5.08 | 31.30 | 278.47 | +|vcube 404 |31.8GiB | 0.30 | 1.25 | 6.87 | 57.49 | 291.31 | +|H48 h10 k2|28.1GiB | 0.34 | 1.80 | 7.77 | | | +|vcube 308 |21.2GiB | 0.20 | 1.11 | 6.92 | | | +|H48 h9 k2 |14.1GiB | 0.42 | 2.84 | 12.86 | | | +|vcube 208 | 7.3GiB | 0.57 | 4.41 | 20.75 | | | +|H48 h8 k2 | 7.1GiB | 0.86 | 6.66 | 27.40 | | | +|H48 h7 k2 | 3.6GiB | 1.47 | 8.90 | 42.46 | | | +|vcube 112 | 2.4GiB | 1.01 | 9.39 | | | | +|H48 h6 k2 | 1.8GiB | 2.28 | 16.89 | | | | + +Time per cube adjusted for tables size (in seconds \* GiB, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves|Superflip| +|:---------|:-------|-------:|-------:|-------:|-------:|--------:| +|H48 h11 k2|56.1GiB | 12.90 | 64.51 | 284.99 |1755.93 |15622.17 | +|vcube 404 |31.8GiB | 9.54 | 39.75 | 218.47 |1828.18 | 9263.66 | +|H48 h10 k2|28.1GiB | 9.55 | 50.58 | 218.34 | | | +|vcube 308 |21.2GiB | 4.24 | 23.53 | 146.70 | | | +|H48 h9 k2 |14.1GiB | 5.92 | 40.04 | 181.33 | | | +|vcube 208 | 7.3GiB | 4.16 | 32.19 | 151.48 | | | +|H48 h8 k2 | 7.1GiB | 6.11 | 47.29 | 194.54 | | | +|H48 h7 k2 | 3.6GiB | 5.29 | 32.04 | 152.86 | | | +|vcube 112 | 2.4GiB | 2.42 | 22.53 | | | | +|H48 h6 k2 | 1.8GiB | 4.10 | 30.40 | | | | + +### Multithread (4 threads) + +Time per cube (in seconds, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves|Superflip| +|:---------|:-------|-------:|-------:|-------:|-------:|--------:| +|H48 h11 k2|56.1GiB | 0.06 | 0.31 | 1.31 | 7.96 | 77.81 | +|vcube 404 |31.8GiB | 0.10 | 0.38 | 1.88 | 16.98 | (a) | +|H48 h10 k2|28.1GiB | 0.10 | 0.47 | 2.00 | 13.54 | 114.35 | +|vcube 308 |21.2GiB | 0.06 | 0.42 | 1.95 | 17.73 | (a) | +|H48 h9 k2 |14.1GiB | 0.14 | 0.83 | 3.82 | 25.98 | 162.72 | +|vcube 208 | 7.3GiB | 0.17 | 1.49 | 5.88 | | (a) | +|H48 h8 k2 | 7.1GiB | 0.27 | 2.02 | 7.94 | | | +|H48 h7 k2 | 3.6GiB | 0.35 | 2.59 | 12.41 | | | +|vcube 112 | 2.4GiB | 0.29 | 3.15 | 12.06 | | (a) | +|H48 h6 k2 | 1.8GiB | 0.65 | 4.79 | 23.91 | | | + +Time per cube adjusted for tables size (in seconds \* GiB, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves|Superflip| +|:---------|:-------|-------:|-------:|-------:|-------:|--------:| +|H48 h11 k2|56.1GiB | 3.37 | 17.39 | 73.49 | 446.56 | 4365.92 | +|vcube 404 |31.8GiB | 3.80 | 12.08 | 59.78 | 539.96 | (a) | +|H48 h10 k2|28.1GiB | 2.81 | 13.21 | 56.20 | 380.47 | 3213.24 | +|vcube 308 |21.2GiB | 1.27 | 8.90 | 41.34 | 375.88 | (a) | +|H48 h9 k2 |14.1GiB | 1.97 | 11.70 | 53.86 | 366.32 | 2294.35 | +|vcube 208 | 7.3GiB | 1.24 | 10.88 | 42.92 | | (a) | +|H48 h8 k2 | 7.1GiB | 1.92 | 14.34 | 56.37 | | | +|H48 h7 k2 | 3.6GiB | 1.26 | 9.32 | 44.68 | | | +|vcube 112 | 2.4GiB | 0.70 | 7.56 | 28.94 | | (a) | +|H48 h6 k2 | 1.8GiB | 1.17 | 8.62 | 43.04 | | | + +(a) vcube cannot parallelize on a single scramble, the results for the +Superflip are going to be the same as in the single thread case. + +### Multithread (16 threads) + +Time per cube (in seconds, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves|Superflip| +|:---------|:-------|-------:|-------:|-------:|-------:|--------:| +|H48 h11 k2|56.1GiB | 0.02 | 0.10 | 0.43 | 2.48 | 26.29 | +|vcube 404 |31.8GiB | 0.03 | 0.16 | 0.67 | 6.36 | (a) | +|H48 h10 k2|28.1GiB | 0.03 | 0.16 | 0.74 | 4.43 | | +|vcube 308 |21.2GiB | 0.04 | 0.22 | 0.89 | 9.53 | (a) | +|H48 h9 k2 |14.1GiB | 0.04 | 0.26 | 1.18 | 8.31 | | +|vcube 208 | 7.3GiB | 0.08 | 0.80 | 2.38 | | (a) | +|H48 h8 k2 | 7.1GiB | 0.08 | 0.60 | 2.48 | | | +|H48 h7 k2 | 3.6GiB | 0.11 | 0.81 | 3.91 | | | +|vcube 112 | 2.4GiB | 0.15 | 1.66 | 5.18 | | (a) | +|H48 h6 k2 | 1.8GiB | 0.21 | 1.53 | 7.82 | | | + +Time per cube adjusted for tables size (in seconds \* GiB, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves|Superflip| +|:---------|:-------|-------:|-------:|-------:|-------:|--------:| +|H48 h11 k2|56.1GiB | 1.12 | 5.61 | 24.12 | 139.13 | 1474.87 | +|vcube 404 |31.8GiB | 1.08 | 5.09 | 21.31 | 202.25 | (a) | +|H48 h10 k2|28.1GiB | 0.84 | 4.50 | 20.79 | 124.48 | | +|vcube 308 |21.2GiB | 0.85 | 4.66 | 18.87 | 202.04 | (a) | +|H48 h9 k2 |14.1GiB | 0.56 | 3.67 | 16.64 | 117.17 | | +|vcube 208 | 7.3GiB | 0.58 | 5.84 | 17.37 | | (a) | +|H48 h8 k2 | 7.1GiB | 0.57 | 4.26 | 17.60 | | | +|H48 h7 k2 | 3.6GiB | 0.40 | 2.92 | 14.07 | | | +|vcube 112 | 2.4GiB | 0.36 | 3.98 | 12.43 | | (a) | +|H48 h6 k2 | 1.8GiB | 0.38 | 2.75 | 14.08 | | | + +(a) vcube cannot parallelize on a single scramble, the results for the +Superflip are going to be the same as in the single thread case. + +## All optimal solutions (TODO) + +Average time for finding all optimal solutions. + +### Single thread + +Time per cube (in seconds, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves| +|:---------|:-------|-------:|-------:|-------:|-------:| +|H48 h11 k2|56.1GiB | | | | | +|H48 h10 k2|28.1GiB | | | | | +|H48 h9 k2 |14.1GiB | | | | | +|H48 h8 k2 | 7.1GiB | | | | | +|H48 h7 k2 | 3.6GiB | | | | | +|H48 h6 k2 | 1.8GiB | | | | | + +Time per cube adjusted for tables size (in seconds \* GiB, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves| +|:---------|:-------|-------:|-------:|-------:|-------:| +|H48 h11 k2|56.1GiB | | | | | +|H48 h10 k2|28.1GiB | | | | | +|H48 h9 k2 |14.1GiB | | | | | +|H48 h8 k2 | 7.1GiB | | | | | +|H48 h7 k2 | 3.6GiB | | | | | +|H48 h6 k2 | 1.8GiB | | | | | + +### Multithread (4 threads) + +Time per cube (in seconds, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves| +|:---------|:-------|-------:|-------:|-------:|-------:| +|H48 h11 k2|56.1GiB | | | | | +|H48 h10 k2|28.1GiB | | | | | +|H48 h9 k2 |14.1GiB | | | | | +|H48 h8 k2 | 7.1GiB | | | | | +|H48 h7 k2 | 3.6GiB | | | | | +|H48 h6 k2 | 1.8GiB | | | | | + +Time per cube adjusted for tables size (in seconds \* GiB, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves| +|:---------|:-------|-------:|-------:|-------:|-------:| +|H48 h11 k2|56.1GiB | | | | | +|H48 h10 k2|28.1GiB | | | | | +|H48 h9 k2 |14.1GiB | | | | | +|H48 h8 k2 | 7.1GiB | | | | | +|H48 h7 k2 | 3.6GiB | | | | | +|H48 h6 k2 | 1.8GiB | | | | | + +### Multithread (16 threads) + +Time per cube (in seconds, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves| +|:---------|:-------|-------:|-------:|-------:|-------:| +|H48 h11 k2|56.1GiB | | | | | +|H48 h10 k2|28.1GiB | | | | | +|H48 h9 k2 |14.1GiB | | | | | +|H48 h8 k2 | 7.1GiB | | | | | +|H48 h7 k2 | 3.6GiB | | | | | +|H48 h6 k2 | 1.8GiB | | | | | + +Time per cube adjusted for tables size (in seconds \* GiB, lower is better). + +| Solver | Size |17 moves|18 moves|19 moves|20 moves| +|:---------|:-------|-------:|-------:|-------:|-------:| +|H48 h11 k2|56.1GiB | | | | | +|H48 h10 k2|28.1GiB | | | | | +|H48 h9 k2 |14.1GiB | | | | | +|H48 h8 k2 | 7.1GiB | | | | | +|H48 h7 k2 | 3.6GiB | | | | | +|H48 h6 k2 | 1.8GiB | | | | | + +## Other notes + +* Missing values in the table mean that the test is very slow and I did + not want to wait for it to finish. I may add these values in the future. +* All the measurements above exclude the time needed to load the pruning + tables into memory, which can be quite significant for large tables. To + repeat these measurements, one can use the tool `301_solve_file`. For + example: + +``` +./build tool solve_file h48h7k2 ./benchmarks/scrambles/scrambles-16.txt +``` + + To find all solutions, add something like `99999 0` at the end of the + command. This will tell the tool to find up to `99999` solutions that + are at most `0` moves longer than optimal. +* The measurements also excluded the one-off computation of the pruning + tables which, for reasons related to the cube coordinates used, is + significantly slower for H48 compared to vcube. +* H48's and vcube's approach to multithreading are extremely different: + H48 parallelize the search for each cube individually, vcube solves + multiple cubes in parallel by dedicating a single thread to each of + them. Both apporaches have pros and cons: vcube's approch has less + overhead in coordination between the threads, but often some threads + may be left without work when there are no more cubes left to solve. +* Per-cube parallelization means that H48 will always be faster than + vcube when solving a single cube. +* vcube only supports x86 processors (Intel, AMD), while H48 runs on any + architecture, including e.g. ARM (Macbook M series, android phones) + and can be compiled to WebAssembly as well. diff --git a/utils/benchmark_scrambles/scrambles-16.txt b/benchmarks/scrambles/scrambles-16.txt diff --git a/utils/benchmark_scrambles/scrambles-17.txt b/benchmarks/scrambles/scrambles-17.txt diff --git a/utils/benchmark_scrambles/scrambles-18.txt b/benchmarks/scrambles/scrambles-18.txt diff --git a/utils/benchmark_scrambles/scrambles-19.txt b/benchmarks/scrambles/scrambles-19.txt diff --git a/utils/benchmark_scrambles/scrambles-20.txt b/benchmarks/scrambles/scrambles-20.txt diff --git a/benchmarks/scrambles/superflip.txt b/benchmarks/scrambles/superflip.txt @@ -0,0 +1 @@ +U R2 F B R B2 R U2 L B2 R U' D' R2 F R' L B2 U2 F2