Moved TODO.txt to personal folder - h48 - A prototype for an optimal Rubik's cube solver, work in progress.

commit f25a10e19eca294c4e6a99e4f80ce5cfd11a0e5f
parent 4a016679d1f5f33ab0679dfc315222797075dc65
Author: Sebastiano Tronto <sebastiano@tronto.net>
Date:   Sun, 18 Aug 2024 10:41:52 +0200

Moved TODO.txt to personal folder

Diffstat:
D TODO.txt  | 193 -------------------------------------------------------------------------------

1 file changed, 0 insertions(+), 193 deletions(-)
diff --git a/TODO.txt b/TODO.txt
@@ -1,193 +0,0 @@
-# Documentation and cleanup
-
-- Internal documentation
-  - h48 solver and co
-    - what is h and k
-  - cube representation and base routines?
-- Split files
-  - h48 into:
-    - h48_base (coordinate computation)
-    - h48_map (only the hashmap thing)
-    - h48_gendata (table generation)
-    - h48_solve (including stats solver)
-
-# H48 table generation
-  - compute all tables for h<11
-    x compute visited up to a fixed depth 8
-    - compute additional step (if needed) to fill <=base
-    - brute-force the last 2 steps (only 18 moves + 18*15 move pairs, + sim)
-    - compare with known h0 results (from long-running test)
-  - compute table for h=11
-    - can it be unified to the other computation, or is it much better
-      to do it ad hoc?
-    - derive small tables from the large one to check correctness
-      - this requires too much ram, but I can print the summary of the table
-  - Add long-running test for h0k4 (maybe as a tool?)
-    - tests for other sizes?
-  - gendata tool: save tables?
-  - optimize
-    - use only transform_edges (need compose trans)
-    - parallelize with pthread
-
-# Solver (Enrico)
-
-- Add a solver for h=0
-- check if this (or equivalent) works:
-    ./run solve -solver H48 -options "2;20" -n 1 -M 10 -cube \
-         "$(./run frommoves -moves "UFRUFU")"
-
-table base for k=2 (4 most common values start at)
-  0	8
-  1	8
-  2	8
-  3	8 or 9 (very close)
-  4	9
-  5	9
-  6	9
-  7	9 or 10 (very close)
-  8	10
-  9	10
-  10	10
-  11	11
-
-Solver
-  - cleanup h48 solver
-    - do not copy dfsarg, change and undo
-    - implement and use premove (and test) instead of inverting
-    - improve name of tables file in shell.c (include h value, maybe k, max)
-  - benchmark for solve
-    table generation, where to keep tables? in benchmark folder or in tables/?
-  - more tricks for solver, optimize, try larger tables
-  - remove solve_simple and maybe the whole solve_generic
-  - shell: silently accept other formats too?
-  - shell: allow generating multiple tables for different options
-  - gendata: move info at start of tables?
-
-Cleanup cube_public and interface
-  - remove options, use only solver name
-    cleanup also benchmark
-  - write a generic parse + dispatch to solver
-    (maybe use helper function from solve_h48 for parsing hXkY)
-
-Goal: find out which k value is best
-  - temporarily call current table and solver "k4" instead of h48
-  - write table generation and solver for k2 and k1
-  - benchmark for different sizes!
-
-Improvements
-  - check hash of generated data
-  - use interleaved tables (e.g. big table with k=2 or k=1 and interleaved
-    small table with k=4 for better backup pruning)
-
-small things
-  - maybe move part of the logic for coord_h48 (and its inverse) to
-    utils.h (subsettoindex-like)
-  - rename TYPE build switch to something more intuitive like ARCH
-    remove one of TYPE and CUBE_TYPE (why do I have two?)
-  - merge constants and utils?
-
-## H48 optimal solver (some has already been implemented)
-
-First compute co + csep. Use csep as a binary number (2^7 instead of 70,
-loose a factor of 1.8 but still fits in a few megabytes or less). Use
-co + csep as an index in a table whose entries have: 6 bits for ttrep,
-12 bits for rep, 4 bits for pruning. Optionally, 4 more bits could be
-used for the base of the pruning table, if we want to have a different
-base for each corner state; but probably not useful.
-
-If the first pruning is enough, or if the base value of the pruning table
-(see below) is too low, do not compute the full coordinate (which includes
-epsep + partial EO, 12 different sizes depending on how many edges).
-
-Otherwise, transform edges only using ttrep and compute full coordinate.
-Look up in table. 3 types of table:
-1. 4 bits per entry, full pruning table
-2. 3 bits with base value (let's try, why not)
-3. 2 bits with base value, nxopt style
-4. 1 bit per entry, telling only if more or less than mid value
-Types 2-4 require benchmarks, a lot of them.
-
-Inverse probing (no need to compute inverse, compute one at the beginning
-and keep adding premoves); better do first part of pruning for both
-normal and inverse and only then search in the full table.
-
-If inverse probing gives tight bound, reduce branching factor, optionally
-switch. Here NISS may be useful.
-
-## Other solvers
-
-* nxopt (various sizes, for comparison; also use base value probing
-  and benchmarking)
-* Coordinate solver for replacing nissy backend (specify in the comments
-  that coordinates return 0 if solved)
-* simple solver with small table for short solutions
-
-## ARM NEON intrinsics and other architectures
-
-* For ARM: use two uint8x16_t (or uint8x16x2_t) and vqtbl* instructions;
-  see https://developer.arm.com/architectures/instruction-sets/intrinsics
-* Implement also SSE? Why not...
-
-## Optimizations
-
-* Moves: don't do full compose for U*, D*, *2 (I removed this because I
-  was using shuffle intructions wrong, should re-do it)
-* transform edges only for h48 coord calculation
-* ptable: since it is fully symmetric, do only U or U2 at depth 1
-* use threads: how to detect at runtime? what is sane number to default to?
-* multisolve with adaptive threading
-* Trans: don't do full compose, for some trans composing perm is enough.
-  Split out sumco() as a separate function and refactor, optimize.
-* Use multi-move (up to 4/5 moves at once)
-* see if vcube's method to flip all corners is better
-* find a better way for computing the inverse?
-* Transform with big table: make static cube actually static (how?)
-* Use selfsim: in generating some tables, it is in thery possible to only check
-  the few transformations that give self-similarity instead of all 48.
-  The performance drop is almost insignificant, but I would like to figure out
-  the mistake I made previously.
-
-## Improvements and other things
-
-* Rename to libnissy (prefix public functions with nissy_?)
-* add centers (and slice moves and rotations)
-  for avx2: centers in the same lane as corners, numbered from 9 to 14
-* for CO: move to bits 5 and 6, no need for padding bit
-* manipulate move sequences (invert, unniss, cleanup, mirror / transform...)
-* NISS: Add mask to moves (e.g. U | NISS where NISS = 32 or something);
-  adapt readmoves and writemoves.
-* More I/O formats:
-  reid format
-  nissy
-  ascii art (color = 1 letter? color print?)
-  twizzle binary https://www.experiments.cubing.net/cubing.js/spec/binary/
-
-## "Front-end"
-
-* nissy shell: make more usable, meaningful error messages etc
-  for example, it should check that the solver is valid
-* Write adapter code for other languages:
-  python
-  hare (see blog post 2023-12-01 for ffi)
-  rust, go
-  dart ffi, js
-  java
-
-## More documentation?
-
-* Add documentation comments inside cube.c?
-* Copy this to cube.c
-
-Transformations can be either simple rotations or a rotation composed
-with a mirroring.  A composed rotation + mirror is obtained by applying
-the corresponding rotation to the solved cube mirrored along the M plane.
-
-For example, to apply the transformation RBm (mirrored RB) to a cube C:
-	1. Apply a mirror along the M plane to the solved cube
-	2. Rotate the mirrored cube with z' y2
-	3. Apply the cube C to the transformed solved cube
-	4. Apply the transformations of step 1a and 1b in reverse
-
-## Future work?
-
-* A* on GPU? https://github.com/mwarzynski/uw_parallel_a_star

	h48 A prototype for an optimal Rubik's cube solver, work in progress.
	git clone https://git.tronto.net/h48
	Download \| Log \| Files \| Refs \| README \| LICENSE