h48

A prototype for an optimal Rubik's cube solver, work in progress.
git clone https://git.tronto.net/h48
Download | Log | Files | Refs | README | LICENSE

commit f148aa65dfe587e7b5d2f48b5005be670788dbe6
parent ce5239e50d800289f516d4d17c37bee4bd77e98f
Author: Sebastiano Tronto <sebastiano@tronto.net>
Date:   Mon,  5 Aug 2024 17:20:44 +0200

Updated README.md, added help command to shell

Diffstat:
MREADME.md | 215+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
MTODO.txt | 30+++++++++++++++++++++++-------
Mshell.c | 212++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------
Rtools/gendata_h48/gendata_h48.c -> tools/001_gendata_h48/gendata_h48.c | 0
Rtools/stats_tables_h48/stats_tables_h48.c -> tools/002_stats_tables_h48/stats_tables_h48.c | 0
Dutils/cube_h_with_format_documentation | 147-------------------------------------------------------------------------------
6 files changed, 399 insertions(+), 205 deletions(-)

diff --git a/README.md b/README.md @@ -1,46 +1,229 @@ -# Prototype for a new optimal solver +# H48: prototype for a new optimal solver and nissy backend -Work in progress. Everything is in a state of flux and can change without -notice. +**Warning**: this library is work in progress, breaking changes can +happen without notice. -## Building and running tests +H48 is an experimental Rubik's cube solver. The main goal is experimenting +with various optimal solving methods and pruning tables, with +[nxopt](https://github.com/rokicki/cube20src/blob/master/nxopt.md) and +[vcube](https://github.com/Voltara/vcube) as inspiration and benchmark +reference. -First run +In the future this project may evolve as a new "back-end" for the classic +[nissy](https://github.com/sebastianotronto/nissy-classic). + +## Building + +First run the configuration script to detect the system +configuration. This is going to select a C compiler and +architecture-specific optimizations. ``` $ ./configure.sh ``` -or `TYPE=AVX2 ./configure.sh` if you want to use AVX2 instructions. +These settings can be overridden, for example: + +``` +$ CC=clang ./configure.sh # Force use of clang instead of default cc +``` + +The support for ARM-specific optimizations (NEON instructions) is +incomplete. To compile correctly on these processors (e.g. Mac M1/M2/M3) +you need to manually disable optimizations: + +``` +$ TYPE="" ./configure.sh # Can be combined with CC=... +``` + +Once the configuration is done, you can build with make -Then +``` +$ make +``` + +## Running tests + +This project includes a suite of "unit" test. They can be run with: ``` $ make test ``` -to run the tests. You can also run only the tests that match a chosen -regex, for example: +To run only a subset of the tests, set the `TEST` variable to a regular +expression that matches only the name of the tests you want to run: ``` $ TEST=coord make test ``` -Due to ongoing changes, benchmarks are currently broken. +Each subfolder of the test folder contains a test. A test can consist +of multiple test cases (.in files). Running a test means compiling and +running the corresponding test against each test case. When a test case +is run, the .in file is read a the output of the program is compared +to the corresponding .out filei using diff(1). If the two differ, the +difference is printed out and no other test is run. + +The results of the last test case run is saved in test/last.out (standard +output, the results compared with the .out files) and test/last.err +(standard error). + +Tests are always run in "debug mode": this means that optimizations are +disabled and some extra logging is enabled. + +See the test folder and test/test.sh for details. + +## Running "tools" + +In the tools folder there are some small programs that test various +functionality of the H48 library. They work similarly to test, but they +are not run in debug mode. + +To run a tool you must select it with the environment variable `TOOL`. +For example the command: + +``` +TOOL=stats make tool +``` + +Will run the stats_tables_h48 tool. Like for tests, the value of the +`TOOL` variable can be any regular expression matching the name of the +tool. Unlike tests, one and only one tool will be selected for each run. + +Each tool run is automatically timed, so these tools can be used as +benchmark. The output as well as the time of the run are saved to a +file in the tools/results folder. -## Solving +## Running commands manually -Notes for myself while this is work in progress +This project also includes a rudimentary shell that can be used to run +commands manually. To build the shell use: ``` $ make shell -$ ./run frommoves -moves (scramble) ``` -copy the result, then +This will create an executable called `run`. Then you can for example +get a cube from a sequence of moves: + +``` +$ ./run frommoves -moves "R' U' F" +JLQWSVUH=ZLCUABGIVTKH +``` + +Or you can get a random cube + +``` +$ ./run randomcube +WDSQREVX=VBKYDUCJXWAb +``` + +If you don't like this format, you can convert it: + +``` +$ ./run convert -fin B32 -fout H48 -cubestr "WDSQREVX=VBKYDUCJXWAb" +UL1 UB0 BL0 FR1 DF0 UR1 DB0 FL0 DR1 DL1 UF0 BR1 DFR2 DBR0 DFL2 UFR2 UBL2 UFL0 UBR2 DBL2 +``` + +To solve a cube (experimental) you can use: ``` -$ ./run solve -solver "H48" -options "2;20" -n 1 -M 10 -cube (paste here) +$ ./run solve -solver "h48" -options "0;20" -n 1 -M 4 -cube "JLQWSVUH=ZLCUABGIVTKH" +Found 0 solutions, searching at depth 0 +Found 0 solutions, searching at depth 1 +Found 0 solutions, searching at depth 2 +Found 0 solutions, searching at depth 3 +Solution found: F' U R +F' U R ``` -Options can be changed from `2;20` to `n;20` for larger tables. +For a full list of available command, use `run help`. + +## Cube formats + +The cube is represented as a string in one of the following formats, +all explained below: + +* H48 +* LST +* B32 (the default) + +In all of the formats, the permutation of the center pieces is not +stored. This means that the cube is assumed to be in a fixed orientation. + +More formats will become available in the future. + +### Cube format: H48 + +In the H48 format, each edge is represented by two letters denoting the +sides it belongs to and one number denoting its orientation (0 oriented, 1 +mis-oriented). Similarly, each corner is represented by three letters and +a number (0 oriented, 1 twisted clockwise, 2 twisted counter-clockwise). + +The solved cube looks like this: + +``` +UF0 UB0 DB0 DF0 UR0 UL0 DL0 DR0 FR0 FL0 BL0 BR0 UFR0 UBL0 DFL0 DBR0 UFL0 UBR0 DFR0 DBL0 +``` + +The cube after the move F looks like this: + +``` +FL1 UB0 DB0 FR1 UR0 UL0 DL0 DR0 UF1 DF1 BL0 BR0 UFL1 UBL0 DFR1 DBR0 DFL2 UBR0 UFR2 DBL0 +``` + +Whitespace (including newlines) between pieces is ignored when reading the +cube. A single whitespace character is added between pieces when writing. + +You can find more examples of this format in the utils/cubes folder. + +## Cube format: LST + +In the LST format, a cube is represented by a comma-separated list of +integers. Each piece is represented by an (unsigned) 8-bit integer. The 4 +least-significant bits determine which piece it is, the other 4 determine +the orientation. + +Edges are numbered as follows (see also constants.h): + +UF=0 UB=1 DB=2 DF=3 UR=4 UL=5 DL=6 DR=7 FR=8 FL=9 BL=10 BR=11 + +Corners are numbered as follows: + +UFR=0 UBL=1 DFL=2 DBR=3 UFL=4 UBR=5 DFR=6 DBL=7 + +The orientation of the edges is with respect to F/B, the orientation of +corners is with respect to U/D. + +In this format, the solved cube looks like this: + +``` +0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 +``` + +The cube after the move F looks like this: + +``` +36, 1, 38, 3, 66, 5, 64, 7, 25, 1, 2, 24, 4, 5, 6, 7, 16, 19, 10, 11 +``` + +### Cube format: B32 + +This format is a "base 32" encoding of the cube. It is not meant to be +human-readable, but it is compact while still being plain text. Each +piece, including the orientation value, is encoded as a number from 0 +to 31, and this number is then converted to an uppercase letter (0-26) +or to a lowercase letter (27-31). Edges and corners are separated by a +single = character. + +In this format, the solved cube looks like this: + +``` +ABCDEFGH=ABCDEFGHIJKL +``` + +The cube after the move F looks like this: + +``` +MBODSFQH=ZBCYEFGHQTKL +``` diff --git a/TODO.txt b/TODO.txt @@ -1,7 +1,19 @@ -H48 table generation - - fix gendata tool - - use to test generation of full h0k4 table - - remove ifdef and old code +# Documentation and cleanup + +- Add comments to cube.h + - one for each function? +- Internal documentation + - h48 solver and co + - what is h and k + - cube representation and base routines? +- Split files + - h48 into: + - h48_base (coordinate computation) + - h48_map (only the hashmap thing) + - h48_gendata (table generation) + - h48_solve (including stats solver) + +# H48 table generation - compute all tables for h<11 x compute visited up to a fixed depth 8 - compute additional step (if needed) to fill <=base @@ -10,17 +22,21 @@ H48 table generation - compute table for h=11 - can it be unified to the other computation, or is it much better to do it ad hoc? + - derive small tables from the large one to check correctness + - this requires too much ram, but I can print the summary of the table - Add long-running test for h0k4 (maybe as a tool?) - tests for other sizes? + - gendata tool: save tables? - optimize - use only transform_edges (need compose trans) - parallelize with pthread -(OLD: - - Fails for UFRUFU, try command +# Solver (Enrico) + +- Add a solver for h=0 +- check if this (or equivalent) works: ./run solve -solver H48 -options "2;20" -n 1 -M 10 -cube \ "$(./run frommoves -moves "UFRUFU")" -) table base for k=2 (4 most common values start at) 0 8 diff --git a/shell.c b/shell.c @@ -13,10 +13,36 @@ #define SOLUTIONS_BUFFER_SIZE 500000 /* Should be enough */ #define MAX_PATH_LENGTH 10000 /* Should be enough */ +#define _flag_cube "-cube" +#define _flag_perm "-perm" +#define _flag_command "-command" +#define _flag_str_cube "-cubestr" +#define _flag_format "-format" +#define _flag_format_in "-fin" +#define _flag_format_out "-fout" +#define _flag_moves "-moves" +#define _flag_trans "-trans" +#define _flag_solver "-solver" +#define _flag_options "-options" +#define _flag_nisstype "-nisstype" +#define _flag_minmoves "-m" +#define _flag_maxmoves "-M" +#define _flag_optimal "-O" +#define _flag_maxsolutions "-n" + +#define _info_cubeformat(cube) cube " must be given in B32 format." +#define _info_movesformat "The accepted moves are U, D, R, L, F and B, " \ + "optionally followed by a 2, a ' or a 3." +#define _info_transformat "The transformation must be given in the format " \ + "(rotation|mirrored) (2 letters), for exmple " \ + "'rotation UF' or 'mirrored BL'." +#define _info_formats "The available formats are H48, B32 and SRC." + typedef struct { int command_index; char cube[22]; char cube_perm[22]; + char *str_command; char *str_cube; char *str_format; char *str_format_in; @@ -45,6 +71,7 @@ static int64_t randomcube_exec(args_t *); static int64_t datasize_exec(args_t *); static int64_t gendata_exec(args_t *); static int64_t solve_exec(args_t *); +static int64_t help_exec(args_t *); static int parse_args(int, char **, args_t *); static bool parse_int8(char *, int8_t *); @@ -52,6 +79,7 @@ static bool parse_int64(char *, int64_t *); static bool set_cube(int, char **, args_t *); static bool set_cube_perm(int, char **, args_t *); +static bool set_str_command(int, char **, args_t *); static bool set_str_cube(int, char **, args_t *); static bool set_str_format(int, char **, args_t *); static bool set_str_format_in(int, char **, args_t *); @@ -68,24 +96,6 @@ static bool set_maxsolutions(int, char **, args_t *); static bool set_id(int, char **, args_t *); static uint64_t rand64(void); - -#define COMMAND(N, E) { .name = N, .exec = E } -struct { - char *name; - int64_t (*exec)(args_t *); -} commands[] = { - COMMAND("compose", compose_exec), - COMMAND("inverse", inverse_exec), - COMMAND("applymoves", applymoves_exec), - COMMAND("applytrans", applytrans_exec), - COMMAND("frommoves", frommoves_exec), - COMMAND("convert", convert_exec), - COMMAND("randomcube", randomcube_exec), - COMMAND("datasize", datasize_exec), - COMMAND("gendata", gendata_exec), - COMMAND("solve", solve_exec), - COMMAND(NULL, NULL) -}; #define OPTION(N, A, S) { .name = N, .nargs = A, .set = S } struct { @@ -93,23 +103,118 @@ struct { int nargs; bool (*set)(int, char **, args_t *); } options[] = { - OPTION("-cube", 1, set_cube), - OPTION("-perm", 1, set_cube_perm), - OPTION("-cubestr", 1, set_str_cube), - OPTION("-format", 1, set_str_format), - OPTION("-fin", 1, set_str_format_in), - OPTION("-fout", 1, set_str_format_out), - OPTION("-moves", 1, set_str_moves), - OPTION("-trans", 1, set_str_trans), - OPTION("-solver", 1, set_str_solver), - OPTION("-options", 1, set_str_options), /* TODO: remove, use only solver */ - OPTION("-nisstype", 1, set_str_nisstype), /* TODO: remove, use flags */ - OPTION("-m", 1, set_minmoves), - OPTION("-M", 1, set_maxmoves), - OPTION("-O", 1, set_optimal), - OPTION("-n", 1, set_maxsolutions), + OPTION(_flag_cube, 1, set_cube), + OPTION(_flag_perm, 1, set_cube_perm), + OPTION(_flag_command, 1, set_str_command), + OPTION(_flag_str_cube, 1, set_str_cube), + OPTION(_flag_format, 1, set_str_format), + OPTION(_flag_format_in, 1, set_str_format_in), + OPTION(_flag_format_out, 1, set_str_format_out), + OPTION(_flag_moves, 1, set_str_moves), + OPTION(_flag_trans, 1, set_str_trans), + OPTION(_flag_solver, 1, set_str_solver), + OPTION(_flag_options, 1, set_str_options), /* TODO: remove, use only solver */ + OPTION(_flag_nisstype, 1, set_str_nisstype), /* TODO: remove, use flags */ + OPTION(_flag_minmoves, 1, set_minmoves), + OPTION(_flag_maxmoves, 1, set_maxmoves), + OPTION(_flag_optimal, 1, set_optimal), + OPTION(_flag_maxsolutions, 1, set_maxsolutions), OPTION(NULL, 0, NULL) }; + +#define COMMAND(N, S, D, E) { .name = N, .syn = S, .desc = D, .exec = E } +struct { + char *name; + char *syn; + char *desc; + int64_t (*exec)(args_t *); +} commands[] = { +/* TODO: add synopsis and description here */ + COMMAND( + "compose", + "compose " _flag_cube " CUBE " _flag_perm " PERM", + "Apply on CUBE the permutation defined by PERM. " + _info_cubeformat("CUBE and PERM"), + compose_exec + ), + COMMAND( + "inverse", + "inverse " _flag_cube " CUBE ", + "Compute the inverse of the given CUBE. " + _info_cubeformat("CUBE"), + inverse_exec + ), + COMMAND( + "applymoves", + "applymoves " _flag_cube " CUBE " _flag_moves " MOVES", + "Apply the given MOVES to the given CUBE. " + _info_cubeformat("CUBE") " " _info_movesformat, + applymoves_exec + ), + COMMAND( + "applytrans", + "applytrans " _flag_cube " CUBE " _flag_trans " TRANS", + "Apply the single transformation TRANS to the given CUBE. " + _info_cubeformat("CUBE") " " _info_transformat, + applytrans_exec + ), + COMMAND( + "frommoves", + "frommoves " _flag_moves " MOVES", + "Return the cube obtained by applying the given MOVES " + "to a solved cube. " _info_movesformat, + frommoves_exec + ), + COMMAND( + "convert", + "convert " _flag_str_cube " CUBESTR " + _flag_format_in " FORMAT_IN " _flag_format_out " FORMAT_OUT", + "Convert the cube described by CUBESTR from FORMAT_IN to " + "FORMAT_OUT." + _info_formats " " + "CUBESTR must be a valid cube in the FORMAT_IN format.", + convert_exec + ), + COMMAND( + "randomcube", + "randomcube", + "Returns a random cube in B32 format.", + randomcube_exec + ), + COMMAND( + "datasize", + "datasize" _flag_solver " SOLVER " _flag_options " OPTIONS", + "Return the size in bytes of the data table used by " + "SOLVER when called with the given OPTIONS.", + datasize_exec + ), + COMMAND( + "gendata", + "gendata" _flag_solver " SOLVER " _flag_options " OPTIONS", + "Generate the data table used by " + "SOLVER when called with the given OPTIONS.", + gendata_exec + ), + COMMAND( + "solve", + "solve" _flag_solver " SOLVER " _flag_options " OPTIONS " + "[" _flag_minmoves " n] [" _flag_maxmoves " N] " + _flag_cube " CUBE", + "Solve the given CUBE using SOLVER with the given OPTIONS, " + "using at least n and at most N moves. " + _info_cubeformat("CUBE"), + solve_exec + ), + COMMAND( + "help", + "help [" _flag_command " COMMAND]", + "If no COMMAND is specified, prints some generic information " + "and the list of commands. Otherwise it prints detailed " + "information about the specified COMMAND.", + help_exec + ), + COMMAND(NULL, NULL, NULL, NULL) +}; char *tablepaths[] = { "tables/", @@ -397,6 +502,33 @@ solve_exec(args_t *args) return 0; } +static int64_t +help_exec(args_t *args) +{ + int i; + + if (args->str_command == NULL || args->str_command[0] == '\0') { + printf("This is a rudimentary shell for the H48 library.\n"); + printf("Available commands and usage:\n\n"); + for (i = 0; commands[i].name != NULL; i++) + printf("%-15s%s\n", commands[i].name, commands[i].syn); + printf("\nUse 'help COMMAND' for more information.\n"); + } else { + for (i = 0; commands[i].name != NULL; i++) + if (!strcmp(args->str_command, commands[i].name)) + break; + if (commands[i].name == NULL) { + printf("Unknown command %s\n", args->str_command); + return 1; + } + printf("Command %s\n\n", commands[i].name); + printf("Synopsis: %s\n\n", commands[i].syn); + printf("Description: %s\n", commands[i].desc); + } + + return 0; +} + static int parse_args(int argc, char **argv, args_t *args) { @@ -507,6 +639,14 @@ set_cube_perm(int argc, char **argv, args_t *args) } static bool +set_str_command(int argc, char **argv, args_t *args) +{ + args->str_command = argv[0]; + + return true; +} + +static bool set_str_cube(int argc, char **argv, args_t *args) { args->str_cube = argv[0]; @@ -602,7 +742,8 @@ set_maxsolutions(int argc, char **argv, args_t *args) return parse_int64(argv[0], &args->maxsolutions); } -void log_stderr(const char *str, ...) +void +log_stderr(const char *str, ...) { va_list args; @@ -611,7 +752,8 @@ void log_stderr(const char *str, ...) va_end(args); } -int main(int argc, char **argv) +int +main(int argc, char **argv) { int parse_error; args_t args; diff --git a/tools/gendata_h48/gendata_h48.c b/tools/001_gendata_h48/gendata_h48.c diff --git a/tools/stats_tables_h48/stats_tables_h48.c b/tools/002_stats_tables_h48/stats_tables_h48.c diff --git a/utils/cube_h_with_format_documentation b/utils/cube_h_with_format_documentation @@ -1,147 +0,0 @@ -/****************************************************************************** -Cube type definition - -Each piece is represented by an (unsigned) 8-bit integer. The 4 -least-significant bits determine which piece it is, the other 4 determine -the orientation. - -Edges are numbered as follows (see also cube.c): -UF=0 UB=1 DB=2 DF=3 UR=4 UL=5 DL=6 DR=7 FR=8 FL=9 BL=10 BR=11 - -Corners are numbered as follows: -UFR=0 UBL=1 DFL=2 DBR=3 UFL=4 UBR=5 DFR=6 DBL=7 - -The orientation of the edges is with respect to F/B, the orientation of -corners is with respect to U/D. - -The permutation of the center pieces is not stored. This means that the -cube is assumed to be in a fixed orientation. - -TODO: define EO and CO better, explain how to use them -TODO: encode centers? - -The exact cube type structure depends on your system's configuration. If -you operate on the cube only via the functions provided below, you don't -need to worry about this. -******************************************************************************/ - -/* Apply the second cube on the first as a move sequence */ -cube_t compose(cube_t, cube_t); - -/* Invert the cube */ -cube_t inverse(cube_t); - -/* Check if a cube represent a valid state (possibly unsolvable) */ - -/* TODO comment on these and the format for moves and trans */ -/* For trans, only one trans is supported */ -cube_t applymoves(cube_t, const char *); -cube_t applytrans(cube_t, const char *); - -/****************************************************************************** -Read / write utilities - -Reading and writing is not done directly via stdin / stdout, but via an -array of char (called buf in the prototypes below). - -Multiple representations of the cube as text are supported: - -- H48: a human-readable format. - Each edge is represented by two letters denoting the sides it - belongs to and one number denoting its orientation (0 oriented, 1 - mis-oriented). Similarly, each corner is represented by three letters and - a number (0 oriented, 1 twisted clockwise, 2 twisted counter-clockwise). - - The solved cube looks like this: - - UF0 UB0 DB0 DF0 UR0 UL0 DL0 DR0 FR0 FL0 BL0 BR0 - UFR0 UBL0 DFL0 DBR0 UFL0 UBR0 DFR0 DBL0 - - The cube after the moves R'U'F looks like this: - - FL1 BR0 DB0 UR1 UF0 UB0 DL0 FR0 UL1 DF1 BL0 DR0 - UBL1 DBR1 UFR2 DFR2 DFL2 UBL2 UFL2 DBL0 - - Whitespace (including newlines) between pieces is ignored when reading the - cube. A single whitespace character is added between pieces when writing. - -- SRC: format used to generate code for internal use. - If OUT is the output in SRC format, one can use `cube_t cube = OUT` to - declare a new cube object. - -- LST: a format for internal use and generating code. - The cube is printed as a comma-separated list of 20 integers, as they appear - in cube_t. Corners come first, followed by edge (unlike H48). -******************************************************************************/ - -cube_t readcube(const char *format, const char *buf); -void writecube(const char *format, cube_t cube, char *buf); - -/****************************************************************************** -Solvers - -The solutions are returned as a newline-separated list of characters. Moves -are separated by single spaces. - -Unless specified otherwise, all the solutions are not trivially simplifiable. -This means that sequences like U U2 or R L R will not appear in any solution. -Moreover, two consecutive parallel moves are always going to be sorted in -increasing order. For example, L R2 may never appear in a solution, but R2 L -could. -******************************************************************************/ - -int64_t solve( - /* The cube to solve. Must be solvable. */ - cube_t cube, - - /* Supported solvers: - * "optimal" - currently the same as "simple" - * "simple" - a simple, slow solver without tables - */ - const char *solver, - - /* Some solvers accept extra options,like "!filter". */ - const char *options, - - /* Can be "normal", "inverse", "mixed" or "linear". */ - const char *nisstype, - - /* The minimum number of moves. Must be >= 0. */ - int8_t minmoves, - - /* The maximum number of moves. If negative, the maximum length - * is unlimited. - */ - int8_t maxmoves, - - /* The maximum number of solutions. */ - int64_t maxsols, - - /* All solutions at most "optimal" moves from the shortest solution - * (respecting minmoves) are found. If negative, it is ignored. - */ - int8_t optimal, - - /* Some solvers require extra data to function properly (for example, - * pruning tables). This data can be generated with gendata(). - */ - const void *data, - - /* The solutions (return parameter) */ - char *solutions -); - -/* Solving n cubes optimally, one solutions per cube. Options are similar - * to solve(). - */ -void multisolve( - int n, - cube_t *cube, - const char *solver, - const void *data, - char *sols -); - -/* Returns the number of bytes written to data, -1 in case of error. - * TODO: write down how much memory every solver requires. */ -int64_t gendata(const char *solver, const char *options, void *data);