# Building expressions lists - 2

Posted 06-19-2019 at 02:46 PM by dogpatch

So, in mid-March I applied the intelligent sorting logic to lump together, and thus identify algebraically equivalent duplicate expressions, and also those expressions which are not (apparently) algebraic equivalents, but appear to produce the exact same solutions. Of these groups of effectively equivalent expressions, my program then eliminated all but the preferred (briefest and 'best-looking') expression. This produced a list of 1315 algebracic expressions, to which I began to refer as my master list.

I now had to re-work my program to generate solutions using only this master list of expressions. This was a radical departure from my original brute force approach. Rather than generating every conceivable RPN expression and evaluating each one to possibly produce a solution, the re-worked program ran through all 1315 expressions for a given year, or for all years. Each algebraic expression was converted to RPN, then evaluated using fractional-exponential integer math (fx.h).

Proof in the pudding: Using the 1315 expressions, the re-worked program was able to produce all 46980 solutions. This might have been a great stroke to my ego, but fellow LQer astrogeek had already produced a minimal list of 193 expressions, and Beryllos was approaching the same mark. Well, 1315 was a start.

Looking at the master list of 1315 expressions, ranked and sorted as described, only 96 had a denominator value of 1 (one), meaning that, for one or more solutions, the expression shared the solution with NO other expressions. If not for the given expression, the given solution would not be attainable. Which is to say, each of the 96 expressions were absolutely needed. To be precise, 10142 of the full 46980 solutions absolutely needed one of the 96 expressions. Plus, many other of the 46980 solutions, while not requiring them, could be solved by one of the 96. Running my program with just the 96 produced 42822 solutions, only 4158 short of the complete set of 46980 solutions.

Now the challenge could be expressed in a different manner: how to solve these 4158 using a minimum number of the remaining 1219 (1315 - 96) expressions? With the problem now defined at a smaller scale, I came up with the idea of a bitmap disk file. (Was later encouraged upon learning that Beryllos was using a similar bitmap approach.)

The bitmap consisted of 715 records of 101 bytes each (so, it was not properly a bitmap, but a two-dimensional array of bytes). Modified the program to run through all 715 4-digit numbers, using the 96 required expressions to produce 42822 solutions, marking each solution by writing a '1' to the corresponding byte of the file. Modified the program again to make the same run, but using the remaining 1219 expressions, and reading, not writing, the bitmap file. When a solution was already marked '1' by the first run, that solution was ignored. The 2nd run produced only those 4158 solutions not solved by the 1st run, using the 1219 expressions not used by the 1st run. This yielded a new expression list, ranked and sorted: 1219 expressions used to produce 4158 solutions.

Looking at this new list, only 767 of the 1219 expressions were used to solve for any of the 4158. How to further reduce the numbers? Ranking and sorting of expressions was an ingenious technique, but included some supposition and conjecture as to which expressions shared the exact same solution set. Needed at this point was a definitive way to prove which groups of expressions produced the exact same solutions, thereby allowing the elimination of all but one expression in the group.

So I created a 2nd bitmap file. This one contained 767 records (one per expression) of 4158 bits (520 bytes), a bit for each of the 4158 solutions. Modified the program yet again to do an exhaustive run using the 767 expressions, limited by reading the first bitmap file as described above to generate 4158 solutions, and setting the appropriate bit in the 2nd bitmap for each valid solution. Examining (programatically) this bitmap easily identified groups of two or more expressions which had identical bitsets, thus definitively revealing that they shared the exact same solution set. Result: 331 redundancies could be eliminated, leaving 436 expressions to produce the 4158 solutions. This was good progress: added to the first 96, my program coiuld now produce all 46980 solutions using 532 expressions. Much better than 1315, still not even close to astrogeek's 193.

Thus began several days of culling expressions, making new lists, looking for extraneous expressions to eliminate, making slow and uncertain progress. Worse, each new modification of version 4 of the program made it that much more disordered, unreadable, and subject to stupid errors. Failure to synchronize between input lists and the two bitmap files resulted in many false starts and dead ends. By 20 March, I had slowly and painfully reached a point where 107 expressions could generate 43120 solutions, leaving 3860 unsolved. After careful culling, I could solve the remaining 3860 with just 356 expressions. 356 + 107 = 463 expressions to solve all 46980 values: puny progress, and my dizzying heap of input lists, bitmaps, and program modifications were growing by the hour.

Finally decided to semi-automate the process via a separate program (lenny.c). Using the same input list and 2nd bitmap file (coordinated), lenny didn't even try to evaluate equations; it simply used the bitmap representation to recognize those expressions with identical solution sets, and eliminate all but the first (recall that the ranked and sorted list makes it easy to identify the briefest and 'best-looking' expression in a given group). Then I put this logic into a big loop within lenny. Each trip through the loop a certain number of expressions were eliminated from further consideration. As well, a certain number of solutions, being now accounted for, were eliminated from the bitmap for all expressions. The next loop, therefore, could make further eliminations based upon the on-the-fly altered bitmap, with no need for program modifications nor re-worked lists and bitmaps.

On 22 March, using the latest input list of 356, and a bitmap of 3860 solutions, lenny quickly looped through the bitmap, and selected 85 expressions to solve the 3860. Added to the 107 from 20 March, I now had a list of 192 expressions. Re-running the main program with these 192 expressions, it could solve all 46980! One fewer than astrogeek and Beryllos!!

Upon closer inspection, however, I found that some of the equations gave wrong answers. A bug had crept into my fractional-exponential math routines, particularly in the fx_power() routine, which gave an occasional false result. (As I recall, it was sometimes failing to fail in the event of a math overrun / underrun.) So, I had to fix fx.h, and return to a known stable place of several days earlier.

On 24 March, after several more false starts, I started over with 87 expressions which could produce 45641 solutions, confirming that this was stable, and all solutions valid. Culling duplicates, I could solve the remaining 1339 using 426 expressions. Careful to create coordinated input list of 426 expressions and a bitmap of 426 expressions / 1339 solutions, I reconfigured lenny and let her select what she would. She quickly selected 106 of the 426 to solve the 1339. I then re-ran the main program with my new list of 87 + 106 = 193 expressions, and it solved all 46980 values! Matched astrogeek's and Beryllos' minimal number of 193. And all solutions appeared valid; no more bugs in my fx logic.

Watching lenny run, it would select an expression that provided several hitherto unsolved solutions. The next loop, its selected expression would provide fewer new solutions. And so on, until, for the final several loops, each new expression only yielded one new solution. Thinking I could do better than that, I hit upon the idea of representing the bitmap graphically, so as to engage my visual cortex to 'see' more effective combinations of expressions.

So I created an html version of lenny (lenny.php), which displayed the bitmap as a two-dimensional html table, 1339 solutions wide x 426 expressions deep. Added javascript to allow me to select any given expression, and visually see how the table was affected. Then select another and another, employing my visual cortex along with the frontal lobe. It became a sort of game, like playing 'Concentration' on my computer. But no matter what strategy I employed, I inevitably found myself, like the original lenny program, having to select expressions that yielded fewer and fewer new solutions, with many single-solution expressions toward the end of the 'game'. Result: couldn't improve upon 106 expressions to solve 1339 values. The list might vary greatly from game to game, but the final score was always 106. Thus, the mimimal list was always 87 + 106 = 193 expressions.

Started over from new initial lists and ran through the above routine several times before finally concluding that I could not improve upon astrogeek and Beryllos, and that 193 was the minimal list size that we had been seeking. The lists of expressions might vary, but it was going to take a minimum of 193 expressions to produce all 46980 solutions per the LQ 2019 Challenge.

Later, and on my own initiative, I enhanced my fractional-exponential math routines, and my algebraic / RPN conversion to handle decimal points as well. This added to the complexity of the original LQ 2019 Challenge, and also made several new solutions available. Using the above methodology, but without the false starts and almost endless program modifications, I came up with 54534 solutions using a 'concise' list of 1426, or a 'mimimal' list of 485 expressions, accepting decimal points in the expressions. Because my brain was nearing overload at this point, and I didn't agonizingly double- and triple-check my results as before, I have a hunch these numbers might be improvable.

Much of the above exists only on my home computer in the form of poorly organized lists and C code containing much temporary or later abandoned code, patches, and commented out code. Below are what I deemed worthy to upload:

LQpuzzle.php This is my latest (version 6) response to the LQ 2019 Challenge. It contains a little history and other notes, and also offers the C code (LQ2019.c and fx.h) and pre-generated expression lists (exp.dot.lists.h) for download.

lenny.php Above-mentioned two-dimensional html table to provide a graphical 'game' methodology for selecting a minimal list of expressions. Almost no documentation included. The uploaded version is for a later stage of generating expression lists that accept decimal points. At the point where this version was used, there are 158 expressions to solve 95 solutions. Playing this game, I always come up with a minimum of 32 expressions. Try it for yourself. If you can fill all 95 solutions with fewer than 32 expressions, please print the expressions (via 'Print' button), and tell me! Javascript is required.

I now had to re-work my program to generate solutions using only this master list of expressions. This was a radical departure from my original brute force approach. Rather than generating every conceivable RPN expression and evaluating each one to possibly produce a solution, the re-worked program ran through all 1315 expressions for a given year, or for all years. Each algebraic expression was converted to RPN, then evaluated using fractional-exponential integer math (fx.h).

Proof in the pudding: Using the 1315 expressions, the re-worked program was able to produce all 46980 solutions. This might have been a great stroke to my ego, but fellow LQer astrogeek had already produced a minimal list of 193 expressions, and Beryllos was approaching the same mark. Well, 1315 was a start.

Looking at the master list of 1315 expressions, ranked and sorted as described, only 96 had a denominator value of 1 (one), meaning that, for one or more solutions, the expression shared the solution with NO other expressions. If not for the given expression, the given solution would not be attainable. Which is to say, each of the 96 expressions were absolutely needed. To be precise, 10142 of the full 46980 solutions absolutely needed one of the 96 expressions. Plus, many other of the 46980 solutions, while not requiring them, could be solved by one of the 96. Running my program with just the 96 produced 42822 solutions, only 4158 short of the complete set of 46980 solutions.

Now the challenge could be expressed in a different manner: how to solve these 4158 using a minimum number of the remaining 1219 (1315 - 96) expressions? With the problem now defined at a smaller scale, I came up with the idea of a bitmap disk file. (Was later encouraged upon learning that Beryllos was using a similar bitmap approach.)

The bitmap consisted of 715 records of 101 bytes each (so, it was not properly a bitmap, but a two-dimensional array of bytes). Modified the program to run through all 715 4-digit numbers, using the 96 required expressions to produce 42822 solutions, marking each solution by writing a '1' to the corresponding byte of the file. Modified the program again to make the same run, but using the remaining 1219 expressions, and reading, not writing, the bitmap file. When a solution was already marked '1' by the first run, that solution was ignored. The 2nd run produced only those 4158 solutions not solved by the 1st run, using the 1219 expressions not used by the 1st run. This yielded a new expression list, ranked and sorted: 1219 expressions used to produce 4158 solutions.

Looking at this new list, only 767 of the 1219 expressions were used to solve for any of the 4158. How to further reduce the numbers? Ranking and sorting of expressions was an ingenious technique, but included some supposition and conjecture as to which expressions shared the exact same solution set. Needed at this point was a definitive way to prove which groups of expressions produced the exact same solutions, thereby allowing the elimination of all but one expression in the group.

So I created a 2nd bitmap file. This one contained 767 records (one per expression) of 4158 bits (520 bytes), a bit for each of the 4158 solutions. Modified the program yet again to do an exhaustive run using the 767 expressions, limited by reading the first bitmap file as described above to generate 4158 solutions, and setting the appropriate bit in the 2nd bitmap for each valid solution. Examining (programatically) this bitmap easily identified groups of two or more expressions which had identical bitsets, thus definitively revealing that they shared the exact same solution set. Result: 331 redundancies could be eliminated, leaving 436 expressions to produce the 4158 solutions. This was good progress: added to the first 96, my program coiuld now produce all 46980 solutions using 532 expressions. Much better than 1315, still not even close to astrogeek's 193.

Thus began several days of culling expressions, making new lists, looking for extraneous expressions to eliminate, making slow and uncertain progress. Worse, each new modification of version 4 of the program made it that much more disordered, unreadable, and subject to stupid errors. Failure to synchronize between input lists and the two bitmap files resulted in many false starts and dead ends. By 20 March, I had slowly and painfully reached a point where 107 expressions could generate 43120 solutions, leaving 3860 unsolved. After careful culling, I could solve the remaining 3860 with just 356 expressions. 356 + 107 = 463 expressions to solve all 46980 values: puny progress, and my dizzying heap of input lists, bitmaps, and program modifications were growing by the hour.

Finally decided to semi-automate the process via a separate program (lenny.c). Using the same input list and 2nd bitmap file (coordinated), lenny didn't even try to evaluate equations; it simply used the bitmap representation to recognize those expressions with identical solution sets, and eliminate all but the first (recall that the ranked and sorted list makes it easy to identify the briefest and 'best-looking' expression in a given group). Then I put this logic into a big loop within lenny. Each trip through the loop a certain number of expressions were eliminated from further consideration. As well, a certain number of solutions, being now accounted for, were eliminated from the bitmap for all expressions. The next loop, therefore, could make further eliminations based upon the on-the-fly altered bitmap, with no need for program modifications nor re-worked lists and bitmaps.

On 22 March, using the latest input list of 356, and a bitmap of 3860 solutions, lenny quickly looped through the bitmap, and selected 85 expressions to solve the 3860. Added to the 107 from 20 March, I now had a list of 192 expressions. Re-running the main program with these 192 expressions, it could solve all 46980! One fewer than astrogeek and Beryllos!!

Upon closer inspection, however, I found that some of the equations gave wrong answers. A bug had crept into my fractional-exponential math routines, particularly in the fx_power() routine, which gave an occasional false result. (As I recall, it was sometimes failing to fail in the event of a math overrun / underrun.) So, I had to fix fx.h, and return to a known stable place of several days earlier.

On 24 March, after several more false starts, I started over with 87 expressions which could produce 45641 solutions, confirming that this was stable, and all solutions valid. Culling duplicates, I could solve the remaining 1339 using 426 expressions. Careful to create coordinated input list of 426 expressions and a bitmap of 426 expressions / 1339 solutions, I reconfigured lenny and let her select what she would. She quickly selected 106 of the 426 to solve the 1339. I then re-ran the main program with my new list of 87 + 106 = 193 expressions, and it solved all 46980 values! Matched astrogeek's and Beryllos' minimal number of 193. And all solutions appeared valid; no more bugs in my fx logic.

Watching lenny run, it would select an expression that provided several hitherto unsolved solutions. The next loop, its selected expression would provide fewer new solutions. And so on, until, for the final several loops, each new expression only yielded one new solution. Thinking I could do better than that, I hit upon the idea of representing the bitmap graphically, so as to engage my visual cortex to 'see' more effective combinations of expressions.

So I created an html version of lenny (lenny.php), which displayed the bitmap as a two-dimensional html table, 1339 solutions wide x 426 expressions deep. Added javascript to allow me to select any given expression, and visually see how the table was affected. Then select another and another, employing my visual cortex along with the frontal lobe. It became a sort of game, like playing 'Concentration' on my computer. But no matter what strategy I employed, I inevitably found myself, like the original lenny program, having to select expressions that yielded fewer and fewer new solutions, with many single-solution expressions toward the end of the 'game'. Result: couldn't improve upon 106 expressions to solve 1339 values. The list might vary greatly from game to game, but the final score was always 106. Thus, the mimimal list was always 87 + 106 = 193 expressions.

Started over from new initial lists and ran through the above routine several times before finally concluding that I could not improve upon astrogeek and Beryllos, and that 193 was the minimal list size that we had been seeking. The lists of expressions might vary, but it was going to take a minimum of 193 expressions to produce all 46980 solutions per the LQ 2019 Challenge.

Later, and on my own initiative, I enhanced my fractional-exponential math routines, and my algebraic / RPN conversion to handle decimal points as well. This added to the complexity of the original LQ 2019 Challenge, and also made several new solutions available. Using the above methodology, but without the false starts and almost endless program modifications, I came up with 54534 solutions using a 'concise' list of 1426, or a 'mimimal' list of 485 expressions, accepting decimal points in the expressions. Because my brain was nearing overload at this point, and I didn't agonizingly double- and triple-check my results as before, I have a hunch these numbers might be improvable.

Much of the above exists only on my home computer in the form of poorly organized lists and C code containing much temporary or later abandoned code, patches, and commented out code. Below are what I deemed worthy to upload:

LQpuzzle.php This is my latest (version 6) response to the LQ 2019 Challenge. It contains a little history and other notes, and also offers the C code (LQ2019.c and fx.h) and pre-generated expression lists (exp.dot.lists.h) for download.

lenny.php Above-mentioned two-dimensional html table to provide a graphical 'game' methodology for selecting a minimal list of expressions. Almost no documentation included. The uploaded version is for a later stage of generating expression lists that accept decimal points. At the point where this version was used, there are 158 expressions to solve 95 solutions. Playing this game, I always come up with a minimum of 32 expressions. Try it for yourself. If you can fill all 95 solutions with fewer than 32 expressions, please print the expressions (via 'Print' button), and tell me! Javascript is required.

**This and the previous blog post are essentially mirrored at my website**Total Comments 0