h48

A prototype for an optimal Rubik's cube solver, work in progress.
git clone https://git.tronto.net/h48
Download | Log | Files | Refs | README | LICENSE

commit 69535c84b64ea9d19e0e06f35b5866d2ca352d38
parent bdc7004014d3b62a72867bc78961feb7c4f73c2a
Author: Sebastiano Tronto <sebastiano@tronto.net>
Date:   Tue, 23 Apr 2024 11:04:07 +0200

Updated TODO

Diffstat:
MTODO.txt | 80+++++++++++++++++++++----------------------------------------------------------
1 file changed, 21 insertions(+), 59 deletions(-)

diff --git a/TODO.txt b/TODO.txt @@ -1,6 +1,11 @@ -TODO big pruning table looks correct, improve speed? - use small table to prune, visited array...? - numbers so far (h=0, k=0): +TODO pruning tables: + - go back to nissy-style BFS for both cocsep and eoesep + - compute selfsim and keep list of representatives + - four different ruotines for k=4,2,1 for eoesep + - try: do not compute CO, but use its binary representation + (x8 memory for cocsep) + + numbers so far (eoesep h=0, k=0): 0 1 1 1 2 4 @@ -10,13 +15,12 @@ TODO big pruning table looks correct, improve speed? 6 41605 7 474128 8 4953846 + 9 34776317 -TODO checkdata and hash check for cocsep +TODO checkdata (available from cube.h) and hash check for cocsep TODO benchmarks for solve and table generation -TODO optimization for transform edges only (and test) -TODO ARM NEON part -## H48 optimal solver +## H48 optimal solver (some has already been implemented) First compute co + csep. Use csep as a binary number (2^7 instead of 70, loose a factor of 1.8 but still fits in a few megabytes or less). Use @@ -46,16 +50,11 @@ switch. Here NISS may be useful. ## Other solvers -* ptable should contain some extra data at the beginning: an integer (size) - a checksum some summary info, maybe even in text form -* multisolve -* use threads * nxopt (various sizes, for comparison; also use base value probing and benchmarking) * Coordinate solver for replacing nissy backend (specify in the comments that coordinates return 0 if solved) -* improve light solver with no table; consider using a small, hard-coded - table, e.g. H48 corner table? +* simple solver with small table for short solutions ## ARM NEON intrinsics and other architectures @@ -65,50 +64,32 @@ switch. Here NISS may be useful. ## Optimizations -* check which is faster: foreach_move or simple for loop? same for trans - -### General things - +* use threads: how to detect at runtime? what is sane number to default to? + pthreads or threads.h? +* multisolve with adaptive threading +* transform edges only for h48 coord calculation * Moves: don't do full compose for U*, D*, *2 (I removed this because I was using shuffle intructions wrong, should re-do it) * Trans: don't do full compose, for some trans composing perm is enough. Split out sumco() as a separate function and refactor, optimize. * Use multi-move (up to 4/5 moves at once) -* CO is the worst part of moving, transforming and inverting. Try basing - everything on representing the cube without CO and apply it only at the - end to check that it is actually solved. * see if vcube's method to flip all corners is better * find a better way for computing the inverse? -* Improve avx2 instructions in general - -### Threading - -* THREADS build time option for the number of threads. If set to one, - do not include any threading library or code. Try detecting at build - time, or set to a sane default (e.g. 8? 16?) for generic builds. -* pthread or threads.h? I am more familiar with pthread, but threads.h - is standard (from C11, so it requires switching to it from C99). - Does using threads.h help in any way (e.g. building on Windows)? ## Improvements and other things * Rename to libnissy (prefix public functions with nissy_?) -* add centers (and moves...) +* add centers (and slice moves and rotations) + for avx2: centers in the same lane as corners, numbered from 9 to 14 * for CO: move to bits 5 and 6, no need for padding bit * manipulate move sequences (invert, unniss, cleanup, mirror / transform...) * NISS: Add mask to moves (e.g. U | NISS where NISS = 32 or something); adapt readmoves and writemoves. -* Consider adding centers and other moves (for avx2: centers in the - same lane as corners, numbered from 9 to 14) -* More I/O: +* More I/O formats: + reid format nissy - ascii art (color = 1 letter) + ascii art (color = 1 letter? color print?) twizzle binary https://www.experiments.cubing.net/cubing.js/spec/binary/ - reid? -* print ptables (or layout data in such a way that can be printed - easily, e.g. first bytes are null-terminated strig and can be - printed by user) -* remove writetrans? ## "Front-end" @@ -135,25 +116,6 @@ For example, to apply the transformation RBm (mirrored RB) to a cube C: 3. Apply the cube C to the transformed solved cube 4. Apply the transformations of step 1a and 1b in reverse -## Roadmap - -* Tests for multisolve -* More tests for simple solver? -* Benchmarks? -* More complex optimal solvers, pruning tables -* (More) benchmarks -* Multithreading (build-time option number of threads) -* Other optimizations -* Extend cube and moves to include centers -* NISS -* Move manipulation utilities -* Coordinate solvers and other steps -* More output formats (see https://experiments.cubing.net/cubing.js/3x3x3-formats) -* Adapters for other languages (at least python) -* More documentation (or keep all in cube.h?) -* Rename to libnissy -* Release 1.0 - ## Future work? * A* on GPU? https://github.com/mwarzynski/uw_parallel_a_star