commit d5c60236c06e67e8da45e63843b37e5d619e9d7b
parent ff68d44aefada070b756a1bdbb0d4d722779440d
Author: Sebastiano Tronto <sebastiano@tronto.net>
Date:   Wed, 17 Dec 2025 20:54:23 +0100

Improve performance of H48 by 10-30% using interleaved fallback tables.

(See the last 12 commits or so)

H48 now uses interleaved fallback tables, similarly to nxopt / vcube.
This allowed us to simplify the code (we no longer use k != 2 for H48)
and gives a nice performance improvement, although repeated measurements
show somewhat inconsistent results. The actual speed-up depends on the
table, and it is more pronounced on larger versions of the solver.

The nasty part about using interleaved tables is that they are most
efficient when aligned in memory to 512 bits, but the core library
defers memory management to the implementor, so there is no way to
ensure this. For example, any application that wants to save the tables
to a file and then re-load them on a subsequent run (such as our shell
and tools) should make sure to load the data into a 512-bit aligned
memory buffer. The best we can do on the library side is to have our
main lookup table 512-bit aligned within the whole solver data (which
includes e.g. cocsepdata and a preamble). See the sketches at the end
of this message.

I have also moved some conditionals around in the various checks in the
search DFS, hoping to improve performance further, but the effect has
been barely noticeable.

The benchmarks have been updated. Moreover, there is now a way to
re-run them with a single script (see the updates in the benchmarks
folder for details).

Further attempts at optimizing the code via known techniques (such as
prefetching) have failed, but I'll come back to this.
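
To illustrate the caller-side requirement, here is a minimal sketch in C
of how an application could re-load saved tables into a 512-bit
(64-byte) aligned buffer. The function name and the file handling are
assumptions for illustration only, not the actual shell / tools code:

    /*
     * Sketch only (hypothetical helper, not the nissy API): read
     * pre-generated solver data from a file into a 64-byte (512-bit)
     * aligned buffer, so the interleaved tables inside keep their
     * intended alignment.
     */
    #include <stdio.h>
    #include <stdlib.h>

    static void *
    read_tables_aligned(const char *path, size_t *size_out)
    {
    	FILE *f;
    	long size;
    	void *buf;

    	if ((f = fopen(path, "rb")) == NULL)
    		return NULL;

    	/* Determine the file size */
    	fseek(f, 0, SEEK_END);
    	size = ftell(f);
    	rewind(f);

    	/* Round the allocation up to a multiple of 64 bytes, as
    	   required by aligned_alloc(), and align it to 64 bytes. */
    	buf = aligned_alloc(64, ((size_t)size + 63) & ~(size_t)63);
    	if (buf == NULL ||
    	    fread(buf, 1, (size_t)size, f) != (size_t)size) {
    		free(buf);
    		fclose(f);
    		return NULL;
    	}

    	fclose(f);
    	*size_out = (size_t)size;
    	return buf;
    }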
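
And a sketch of the library-side idea, with hypothetical names: pad the
offsets inside the solver data so that the main H48 table starts on a
64-byte boundary, assuming the whole buffer is itself 512-bit aligned
by the caller.

    #include <stdint.h>

    /* Round x up to the next multiple of 64 bytes. */
    #define ALIGN64(x) (((x) + UINT64_C(63)) & ~UINT64_C(63))

    /* Offset of the main H48 table within the solver data, placed
       after the preamble and cocsepdata (sizes are illustrative
       parameters, not the actual layout code). */
    uint64_t
    h48data_offset(uint64_t preamble_size, uint64_t cocsepdata_size)
    {
    	return ALIGN64(preamble_size + cocsepdata_size);
    }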