Old 03-08-2019, 03:21 PM   #151
Beryllos
Member (Original Poster)

Quote:
Originally Posted by dogpatch View Post
... would be interested in your shorter list(s), and your methodology for creating them.
I had a pretty neat, step-by-step method, until I added the minus signs. Now I am having more difficulty. Here is the method I have so far:
  1. Generate a list of every possible validly-formed (syntactically correct) expression. In this part, I consider negative signs only within exponents and to change the sign of the whole expression. I think that covers all configurations of signs (see 2 posts back). This gives 2374 expressions to begin.
  2. To speed up the rest of the procedure, I generate a lookup table which tells, for each expression, every whole number it can produce in every year. This is a large table, having about 1.7M entries. On my computer, it takes a half hour to generate the lookup table, and I save it to a file so I don't have to do that repeatedly.
  3. I search the lookup table for expressions that have no solutions at all. Among them are some that are obvious, at least in retrospect, like a-bcd, and many more that are not so obvious. This allows us to remove 314 expressions.
  4. I look for sets of expressions that share the same solutions throughout all years. Many of these are algebraically equivalent after permutation of digits, like a*b+c^d and a^b+c*d. Some are not algebraically equivalent but share the same solution set nonetheless. This allows us to remove another 1438 expressions, leaving us with "only" 622 expressions having different solution sets.
  5. Next I search for whole-number solutions in any year that can only be obtained with one expression. Those expressions would necessarily have to be included in any minimal set of expressions. Among the total 46980 solutions, I find 11408 which can only be obtained by one expression.
  6. Then I gather the expressions which lead to those 11408 solutions. I find that 148 expressions cover them all. Those 148 must appear in the minimal set (see the rough sketch after this list).
  7. Then find all the other solutions that can be obtained with those expressions. Those other solutions are solvable by other expressions, but we might as well solve them with the expressions already deemed necessary. I find that the 148 expressions can obtain 46362 solutions in all.
  8. That leaves us still looking for the remaining 618 solutions, and those solutions are associated with a set of 168 remaining expressions. The problem at this stage is too large to solve by brute force. We need more tricks.
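For illustration, here is a rough Python sketch of steps 5 through 7, assuming the lookup table is a dict keyed by (year, value) holding the set of expression indices that can produce it. The names are only placeholders, not my actual code:
Code:
def forced_expressions(lookup):
    """lookup: dict mapping (year, value) -> set of expression indices producing it."""
    forced = set()
    for solution, exprs in lookup.items():
        if len(exprs) == 1:          # step 5: only one expression yields this solution
            forced |= exprs          # step 6: that expression is mandatory
    covered = {s for s, exprs in lookup.items() if exprs & forced}   # step 7
    remaining = set(lookup) - covered                                # input to step 8
    return forced, covered, remaining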
I have two more tricks that I have been working on, but it's a work in progress and I haven't been able to work it to a definitive conclusion. Some work is saved by searching for solutions that can be solved by the same set of expressions. Then some expressions can be added to the minimal set by looking for subsets of expressions that always appear together. I'll describe that in more detail when I get further along.

Last edited by Beryllos; 03-09-2019 at 08:46 AM.
 
Old 03-09-2019, 11:41 PM   #152
astrogeek
Moderator
Quote:
Originally Posted by dogpatch View Post
So, would be interested in your shorter list(s), and your methodology for creating them.
Sorry for being so slow to reply.

My methodology is not nearly as well considered as Beryllos', but here goes...

As my approach has been to evaluate user supplied expressions from the start, I never had a programmatic algorithm for generating expressions, only a methodology for writing them.

As you will note from my early posts, I organized my initial expression set and their evaluation around the number of zero digits in the permutation under test. Very early on, my expressions were written directly into the digit permutation loop, and were grouped according to number of potential zero digits along with tests to block division by zero and other undefined operations.

When I had written the beginnings of my final evaluation application, my rules were removed to a simple text file and decoupled from other tests, but the potential number of zero digits continued to be a useful organizing and "discovery" principle until my list had become longer than about 120 expressions.

The basic idea in pseudo code was like this...

Code:
/* We know nothing about digits, all might be zero, what expressions make sense? Always evaluated... */
        a+b+c+d
        a*b*c*d

if a>0 { /* We know with certainty that one digit is non-zero, what makes sense? */
        a-(b+c+d)
        a+(b*c*d)
        a-(b*c*d)
        a^(b*c*d)
        a^(b+c+d)
        (b*c*d)^a
        (b+c+d)^a
        (b+c+d)/a
        (b*c*d)/a
        ab+c+d
        abc-d
}


if a*b >0 { /* We know with certainty that two digits are non-zero, what makes sense? */
        a-b+(c*d)
        a^b+(c*d)
        a^c+b^d
        ac+bd
        ac-bd
        ...
        (acd)/b
}

... and so on
Note that this evaluation scheme imposed ordering on the variables in the expressions as well, which became irrelevant later.

This approach was useful and productive early on, probably due in part to the fact that new expressions were easier to find! But after my program had matured and the list of expressions had grown much longer, I refactored all expressions to uniform a, b, c, d order and alpha-sorted the list (but retained all those expressions).

From about 120 expressions total, I had the advantage that Beryllos' code was producing a useful set of solution totals which told me where I needed to find new expressions for my list! Once I knew where I was missing a solution it was easier to write more expressions for my own list, or use one from Beryllos' output in some cases! My approach then was to first try to refactor one or more of my existing expressions to cover the new case, and only add a new expression as a second choice.

Later, when you had begun to generate signed expressions, I found a few new solutions from your program as well. When I found a new one produced by a signed expression I was able to write an unsigned expression to get the same result in all cases.

I began to consider my list "complete" at about 180 expressions. Late additions resulting from 100% comparison with both yours and Beryllos' results took it to 194 at most as I recall (perhaps 196?).

I have noted recently that both you and Beryllos have mentioned lists of about 194 expressions resulting from your own efforts. As mentioned, my own peaked at 194 expressions. I can think of no particular reason for that to be so specific, but it is an interesting coincidence.

When we were first in agreement on what the complete solution set looked like, I was able to remove, and in a few cases refactor, expressions in my list until I had 176 expressions which produced the complete solution set.

With the addition of Beryllos' recent finds of signed exponent expressions, my list is now fixed at 178 expressions.

So, my methodology changed over time, and your two exhaustive approaches provided no small advantage to me in filling out my own list, so thanks for that!

My recent thoughts on the problem have continued to center on finding interesting minimal, or optimal sets of expressions. I have not had much time for that in recent days, but I have found a new minimal set of 15 expressions which produce all 49 solutions for 2019, which I will post while here (previous best was 16). Hopefully I will be able to catch up in the coming week.

Code:
+-------------+
| expression  |
+-------------+
| (a+b)*(c-d) |
| (a+b^c)/d   |
| (a-b-c)^d   |
| (abc)/d     |
| a*(bc-d)    |
| ab*(c-d)    |
| ab+cd       |
| ab+c^d      |
| ab-c-d      |
| ab-cd       |
| ab-c^d      |
| ab/(c/d)    |
| ab/c+d      |
| ab/c-d      |
| a^b-c+d     |
+-------------+
15 rows in set (0.00 sec)
Unfortunately, the "cost" in terms of evaluation time for this set is much higher than for the 16 expression set... but it is a very interesting problem to consider!
 
Old 03-11-2019, 09:02 PM   #153
dogpatch
Member
Thank you both for your recent posts. I agree with you, astrogeek, that this is a very interesting problem, every bit as challenging as the original challenge. And i agree with you, Beryllos, that the addition of leading negative signs has added greatly to the complexity.

Meanwhile, i was starting to put some of the things i've learned here to practical use, like learning to parse algebraic expressions, and using fractional integer math. Then it occurred to me that both of these could be used in tackling this second challenge, that of creating a minimal list of expressions. As of this posting, i've got it down to 193 expressions which produce all 46980 solutions. (Formerly had 194 expressions which only produced 46890 solutions, 90 short.) I can't say with any confidence that i will be able to match astrogeek's 178, but will continue to try.

Like Beryllos, my first step was to create a list of expressions that is reasonably complete, while eliminating obvious redundancies and impossibilities. I think i may have to revisit this step, as i instinctively think that in order to produce the smallest minimal set possible, i will ironically want to start with a very inclusive full list.

I hope to study both your notes carefully in the next few days. In the meanwhile, my latest methodology has been to score or rank each expression, and produce a list sorted by that ranking. To do this, my expression structure now contains a fractional number, the denominator of which is the smallest number of expressions shared for any given value for a given year. So, for example, as i do my exhaustive run, if there are 12 expressions that produce a given value for a given year, those 12 expressions will each be assigned a fractional 'score' of 1/12. The next time one of these same expressions is used in conjunction with 11 other expressions, its score will be raised to 2/12 (but NOT reduced to 1/6). If it is later used in conjunction with just 2 other expressions, the numerator is reset to 1 and the denominator to 3, a score of 1/3. If it is ever the only expression used, its score becomes 1/1; if that happens again, it becomes 2/1, etc. To be sure, this is not really fractional math, but the fractional structure i had already in place was handy and does the job nicely. Sorting this list first by denominator (smaller values score higher) then by numerator (larger scores higher) enables me to produce a list with the strongest expressions at the top, and helps me visually recognize which expressions may be eliminated. Obviously, the expressions with a denominator of 1 are essential and cannot be eliminated. This explains my instinct for creating a starting list as complete as possible, so as to start out with very few essential equations.
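To illustrate the idea, here is only a sketch in python with made-up names, not my actual C code; the score update might look something like this:
Code:
# score is (numerator, denominator); 'shared' is how many expressions
# produced the value just found (e.g. 12 expressions -> shared = 12).
def update_score(score, shared):
    num, den = score
    if den == 0 or shared < den:     # first score, or fewer sharers than before:
        return (1, shared)           # reset to 1/shared (e.g. 1/3)
    if shared == den:                # same sharing level seen again:
        return (num + 1, den)        # e.g. 1/12 -> 2/12, NOT 1/6
    return (num, den)                # a weaker (larger) sharing level changes nothing

# sort strongest first: smaller denominator, then larger numerator
# ranked = sorted(scores.items(), key=lambda kv: (kv[1][1], -kv[1][0]))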

Once i have a fairly complete starting list, i may take astrogeek's earlier advice and create a database. Or perhaps 2 databases, one with 46980 solution records, each record of which will point to the expressions used to solve, and the other database having a record for each of the full list expressions, pointing back to the solutions each expression can solve. I still haven't worked out the algorithm for connecting these doubly related databases, so still can't say how feasible or even possible this may prove to be.
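In the meantime, the linkage itself can at least be sketched as two mutually indexed maps; python here just to show the shape, and the names are made up:
Code:
from collections import defaultdict

solution_to_exprs = defaultdict(set)   # (year, value) -> expressions that solve it
expr_to_solutions = defaultdict(set)   # expression index -> (year, value) pairs it solves

def record(expr_id, year, value):      # called once per solution found in the full run
    solution_to_exprs[(year, value)].add(expr_id)
    expr_to_solutions[expr_id].add((year, value))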
 
Old 03-12-2019, 01:57 AM   #154
astrogeek
Moderator
I am hopeful that one or both of you, starting from a different expression set, will be able to arrive at a smaller minimal set than my own. There must be many such sets, and I would have been incredibly lucky to have hit the smallest on the first blind shot! I am neither that lucky nor that good a shot...

I have had some time this evening to play with my minimizer/optimizer queries and have produced a few interesting new observations.

First, looking at the problem as a covering set problem, we find it is one of the original set of problems proved to be NP-Complete. For us, that mostly means that it may potentially take a long time to solve, and takes much longer the larger the set of combinations becomes. If we take the number of solutions, 'r' (101), and the number of possible expressions to produce them, 'n' (178), as a first approximation, then the scale of the combinations we must evaluate, in common notation, is of the order...

Code:
                             (r+n-1)!   (101 + 178 - 1)!
C(n,r), with repetitions = ---------- = ---------------- = ???
                             r!(n-1)!    101!(178 - 1)!
OR
                              n!             178!
C(n,r), no repetitions = ---------- = ---------------- = ???
                          r!(n-r)!    101!(178 - 101)!
I leave you to do the math, but either way that is the scale of the problem...

In reality it is often smaller than that, and depends on the combination we are solving for because we are working backwards from known solutions and known generating expressions. For 2019 (combination 0129) and my own set of expressions and their solutions, r = 49, n = 177, and for 2020 r=13, n=171. But you get the idea!
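If you want to see those numbers without grinding the factorials by hand, a quick Python check will do it; this is just arithmetic, not part of any solver:
Code:
import math

print(math.comb(178, 101))   # C(n,r) without repetitions, the "???" above
print(math.comb(177, 49))    # 2019: n = 177, r = 49
print(math.comb(171, 13))    # 2020: n = 171, r = 13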

Now, because we are working from known solutions we can quickly eliminate some solutions and their generating expressions from further consideration - all expressions which produce a solution not produced by any other expression must be included, and any other solutions they may produce will also be covered by them and may be removed from the set of choices for future consideration.

For 2019 there are 12 solutions produced by a single expression, three pairs of which are produced by the same expressions, meaning we have 9 expressions and can remove 12 solutions from the list up front. We might find one other expression which produces a bunch of other solutions but best case that would be 24 (one per permutation), so in our dreams we might hope for a minimal set with only 11 expressions. But no such luck, it is easily shown that there are no two other expressions which produce enough solutions to fill the matrix... same for 12 expressions... but by the time we begin looking for a 13 expression minimal set the choices become much more abundant and the problem becomes exceedingly more difficult!

The common algorithm for solving these problems, which produces an approximate cover set, is to repeatedly take the remaining expression which produces the most remaining solutions until they are all covered (called a greedy algorithm in many current references, a new term for an old idea). Using a "greedy" algorithm I get 18 expressions to cover all solutions for 2019; implemented in SQL it takes about 8 seconds to complete on my old hardware.
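Outside of SQL, the same greedy idea can be sketched in a few lines of Python. Here solutions_by_expr is assumed to map each expression to the set of solutions it produces; the names are placeholders, my actual implementation is a query:
Code:
def greedy_cover(solutions_by_expr, all_solutions):
    uncovered = set(all_solutions)
    chosen = []
    while uncovered:
        # take the expression covering the most still-uncovered solutions
        best = max(solutions_by_expr, key=lambda e: len(solutions_by_expr[e] & uncovered))
        gained = solutions_by_expr[best] & uncovered
        if not gained:
            break                    # nothing left can be covered
        chosen.append(best)
        uncovered -= gained
    return chosen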

So I tried to optimize the runtime, and the results, by adding a look-ahead aspect to the query, to see what the next set of choices would look like for each choice at the current level. I found that I could get a minimal set of 16 by this method (but not the same improvement for all combinations).

Further refinement of that approach led to a minimal set of 15 (in 3 seconds) for 0129, which improves most result sets and which I have not been able to improve upon for most combinations. However I have found a simple variation which can sometimes reduce the minimal sets by one expression. For example, for 2049 and 1835, the 2019/15 algorithm produces 22 for each (an improvement on previous results of 23), but I can get 21 minimal for each by making a different choice in the look-ahead. This appears to be a dice roll, not deterministic in any way I can see.

But to know that the smallest possibility for 2019, which is not known to actually exist, would be 13 expressions, and to have gotten down to an actual found set of 15 for an exceedingly difficult problem - very satisfying! I would be delighted to see someone find a 13 or 14 minimal set - like Dr. Elizabeth Shaw, I am still searching!

Last edited by astrogeek; 03-12-2019 at 02:23 AM. Reason: typos, added comment
 
Old 03-12-2019, 12:26 PM   #155
dogpatch
Member
Quote:
Originally Posted by astrogeek View Post
...I would have been incredibly lucky to have hit the smallest on the first blind shot! I am neither that lucky nor that good a shot...
I don't buy your 'blind shot' stuff any more than i bought the earlier bit about your poor math skills. Nevertheless this old amateur will continue trying to keep up with you and Beryllos.

I found the 'NP Complete' link quite interesting, even though i still don't fully grasp the concept, nor your explanations about sets, etc. The best i can do is relate to a (self-imposed) challenge i faced some years ago, when i decided to try my hand at programmatically solving Sudoku puzzles. Thinking the logic to do so would be too complex for me to put into code, i tried the simple brute force method of generating all possible Sudoku solutions. This (polynomial?) approach, i soon realized, was going to take thousands of years to complete on my 500 MHz computer. So, i began to make the Sudoku generation logic a bit more intelligent, moving step by step away from simple brute force, and eventually succeeded in producing about 1 million valid Sudoku grids per second, thus reducing the predicted time to generate all valid Sudokus down to just a couple weeks. This was still unacceptably slow, but then i realized that by making my brute force logic more intelligent, i had also created the logic that i originally thought would be too difficult to code. The same logic that could produce one million valid Sudoku grids per second could also solve a single-solution Sudoku in about one millionth of a second. Perhaps (?) this is roughly what the aforementioned link means when it talks about moving away from the polynomial approach, and what Beryllos means when he said "we need more tricks". Am i sort of understanding the concept?
 
Old 03-12-2019, 04:05 PM   #156
astrogeek
Moderator
Quote:
Originally Posted by dogpatch View Post
I hope to study both your notes carefully in the next few days. In the meanwhile, my latest methodology has been to score or rank each expression, and produce a list sorted by that ranking...
Very interesting way to think about ranking expressions! I have included a rank column in my own data model, although it is simply an integer value and I have not made any effective use of it as yet. I like the fractional idea and will give that some thought myself.
Quote:
Originally Posted by dogpatch View Post
Once i have a fairly complete starting list, i may take astrogeek's earlier advice and create a database. Or perhaps 2 databases, one with 46980 solution records, each record of which will point to the expressions used to solve, and the other database having a record for each of the full list expressions, pointing back to the solutions each expression can solve. I still haven't worked out the algorithm for connecting these doubly related databases, so still can't say how feasible or even possible this may prove to be.
By way of encouragement, and to (finally) share my own code and data model, I have cobbled together a new blog post from my still incomplete drafts on the subject, and uploaded my current code. Find it here: New Years Data To DB - The Model and The Data.

As for linking the two data sets, you might get some ideas from my method of staging the solution data, then linking it to the expression data, and from there to the final data relation, or table. I am using an RDBMS, so I am not sure how applicable that may be to a file based database, however.

Quote:
Originally Posted by dogpatch View Post
I don't buy your 'blind shot' stuff any more than i bought the earlier bit about your poor math skills. Nevertheless this old amateur will continue trying to keep up with you and Beryllos.
I appreciate the kind comments and vote of confidence, however undeserved they may be!

Quote:
Originally Posted by dogpatch View Post
I found the 'NP Complete' link quite interesting, even though i still don't fully grasp the concept, nor your explanations about sets, etc. The best i can do is relate to a (self-imposed) challenge i faced some years ago, when i decided to try my hand at programmatically solving Sudoku puzzles. ... Am i sort of understanding the concept?
That is about as well as I understand it, so I hope that is correct!

Basically, in one sense it means that the time required to find a solution is very sensitive to the number of possibilities and can grow quite dramatically as the size of the problem increases. It also means that there is no known way to work around it - no quick solutions. There are however, some useful methods of finding approximate solutions more quickly, such as the greedy algorithm mentioned.

Last edited by astrogeek; 03-12-2019 at 06:31 PM. Reason: typos
 
Old 03-14-2019, 05:14 PM   #157
astrogeek
Moderator
Quote:
Originally Posted by dogpatch View Post
I still don't fully grasp the concept, nor your explanations about sets, etc.
Admittedly, my mentions of some of that have been brief, perhaps even a little cryptic, which I will attribute to the restrictions of time and space available for some of my posts.

It is important to me that I try to remedy that deficiency as much as possible, to which end I have added a new chapter to my blog posts on the topic:

New Years Data To DB - Arithmetic, Graphs and Relations

It is intended to present, for yourself and any others interested, the lay of the land: how graphs, sets and databases fit together and fit into our explorations of the newyears puzzle, at least from my perspective.
I hope it is of some benefit, especially to those like yourself who enjoy the excursions into new areas of math, logic and the foundations of our shared computing knowledge! I hope it might even be a little fun for you, as it is for me!
 
Old 03-23-2019, 01:01 AM   #158
astrogeek
Moderator
I have some progress to report on my search for minimal covering sets for the solution data.

After my last post I found I was struggling a bit with the problem of how to generate an exact minimal covering set for the set of all solutions produced by my minimal expressions set. The "NP-Complete-ness" of the problem means that there is no known exact method other than to evaluate all possible combinations of expressions and solutions, and that is a lot of terms to evaluate - on the scale of 178! times 101!.

The approximate methods produce some good results in reasonable time, but cannot answer the questions of what is the actual limit to minimal sets, and how many of a given size are there...

A little time away from the problem, slowly ruminating on the general case as explained in a favorite text book resulted in a new approach - and a definitive final result!

There are two problems which must be addressed: the conceptual mathematical method itself, in my case expressed in the language of sets and implemented in SQL; and optimizing away the limitations of time and hardware required to execute that method, as much as possible.

I'll defer the gory details of the math for a blog post, but it involves generating a product-of-sums expression from a matrix representation of the relation, multiplying that out to get a sum-of-products expression which is reduced by a set of rules called the "covering algebra". (Prather, Elements of Discrete Mathematics, pg 443, for those interested!)

Thinking about what that actually meant led to a very efficient strategy for attacking the problem, the somewhat direct implementation of which also helps in optimizing around platform limitations.

The successful strategy has been to reduce the data relations to the lowest state possible - a bit-field matrix of expressions and solutions, extended to all of those factorial terms. We don't have to build the whole matrix, we only need to keep track of the sums and products of rows and columns as we generate them, and make their manipulation as efficient as possible.

The rows of the matrix (actually the current product state) are stored in 101-bit wide fields, or vectors if you prefer, each defined by two 64-bit integer column types (remember we are working in a database, and all MySQL BIT operations are performed on 64-bit integers). Each bit, 0-100, corresponds to an integer solution, the columns of our matrix. The names in the queries are more cryptic, but the rows and their meanings are defined here:

Code:
FULL_SOLUTIONS - Bits 0-100 set if solution set for a combination includes the corresponding integer value.
EXPRESSION_SET - Bits 0-100 set if expression generates corresponding solution, one row per expression.
UNIQUE_EX_MASK - Bits 0-100 set to BIT_OR of all unique expression solutions, XORed with FULL_SOLUTIONS.
PRO_COVER_MASK - Current covering product sub-set, bits 0-100 set for _not_ yet covered solutions.
This was my first extensive use of MySQL's family of BIT_* operators which turned out to be simple and right for the task.

Our product terms are the sum of bit manipulations on these bit-fields and masks. To determine membership in a product set, we take the product EXPRESSION_SET AND PRO_COVER_MASK; non-zero means it is a member. When the expression generates a new product term its PRO_COVER_MASK becomes (EXPRESSION_SET AND PRO_COVER_MASK) XOR PRO_COVER_MASK.

When the XORed PRO_COVER_MASK becomes zero the product is a valid covering set, and is removed to the found products table.

When a product does not generate any new terms (i.e. AND produces no non-zero result), no term is generated and I have arranged it so that the dead-end product naturally falls out of the running without further handling.
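The same mask arithmetic can be sketched with plain integers - Python stand-ins for the two 64-bit BIT columns, with names that just mirror the rows described above:
Code:
def try_multiply(expression_set, pro_cover_mask):
    new_bits = expression_set & pro_cover_mask    # membership test
    if new_bits == 0:
        return None                               # dead end: no new solutions, no term
    next_mask = new_bits ^ pro_cover_mask         # clear the newly covered solutions
    return next_mask                              # zero => a valid covering set, move to found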


To begin, we reduce the number of potential terms by pulling out any expressions which produce unique terms, and all other expressions whose solutions are fully covered by those. This reduces our candidate expressions from 178 to 134 for 2019 combination.

To simplify how we find candidate expressions for a product, we seed the product relation, one row per remaining expression in the non-unique expressions relation, with PRO_COVER_MASK for the product initialized to (EXPRESSION_MASK AND UNIQUE_EX_MASK) XOR UNIQUE_EX_MASK.

The product generating query generates all products of the current product set and all members of the non-unique expressions set defined by this set expression...

Code:
 Next product terms = { non-unique exp | exp_id>last_exp_id, EXPRESSION_SET AND PRO_COVER_MASK !=0 }
...a very simple test which means we take the next highest expression in DB order which produces a new solution not yet covered by this product, and multiply it into the product, creating a new, distinct product term.

The query which does that is really almost as simple as the above set expression and is applied only once per product depth (terms), but a single query can produce many millions of new products!

For this reason, we simply cannot produce the complete set of product terms for many (most) combinations (134! times 101!, for example, is still an outrageous number), so we must put some limits on how far we want to go, how many product terms we want to calculate.

The approximate solution algorithm I had originally implemented is very fast and produces valid solution sets which are minimal, meaning small, but not exact - the smallest possible. So it seems a good idea to use that approximate solution as the limit, which will allow us to definitively determine whether there are any smaller minimal sets, without wasting time producing larger sets.
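Putting those pieces together, the expansion itself can be sketched - again with Python integer masks and placeholder names, whereas the real work is one SQL query per term depth - as a breadth-first walk cut off at the depth the approximate result already gives us:
Code:
def find_covers(expr_masks, start_mask, max_terms):
    """expr_masks: per-expression solution bit masks; start_mask: solutions not
    already covered by the unique expressions; max_terms: limit from the greedy set."""
    products = [(-1, start_mask, ())]             # (last expr index, remaining mask, chosen)
    found = []
    for _ in range(max_terms):
        next_products = []
        for last, mask, chosen in products:
            for i in range(last + 1, len(expr_masks)):   # next highest expression in DB order
                new_bits = expr_masks[i] & mask
                if new_bits == 0:
                    continue                             # dead-end product falls out
                nmask = new_bits ^ mask
                if nmask == 0:
                    found.append(chosen + (i,))          # valid covering set
                else:
                    next_products.append((i, nmask, chosen + (i,)))
        products = next_products
    return found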

I have now done this for 2019 and a few smaller sets with very good results!

Unlike my own previous uncertainty, I can now say with certainty, there are no minimal expression sets smaller than 15 which produce the complete solution set!

There are, in fact, 1292 distinct 15-expression sets which cover all solutions, my original 15-member set among them, of course. Finding that result produced the following number of product sets, found and total...

Code:
MariaDB [newyear]> select count(*) from productFound;
+----------+
| count(*) |
+----------+
|     1292 |
+----------+

MariaDB [newyear]> select count(*) from product;
+-----------+
| count(*)  |
+-----------+
| 185365143 |
+-----------+
 
MariaDB [newyear]> select * from productFound where elist like '15,%';
+---------+---------------------+-------+--------+--------+---------+
| prod_id | elist               | terms | exp_id | cover0 | cover64 |
+---------+---------------------+-------+--------+--------+---------+
|     314 | 15,18,44,77,143,169 |     6 |    169 |      0 |       0 |
|     315 | 15,18,44,77,143,172 |     6 |    172 |      0 |       0 |
...
|     331 | 15,44,77,88,143,169 |     6 |    169 |      0 |       0 | <- original minimal set
...
|     332 | 15,44,77,88,143,172 |     6 |    172 |      0 |       0 |
|     333 | 15,44,77,88,143,177 |     6 |    177 |      0 |       0 |
|     334 | 15,44,77,97,143,169 |     6 |    169 |      0 |       0 |
|     335 | 15,44,77,97,143,172 |     6 |    172 |      0 |       0 |
|     336 | 15,44,77,97,143,177 |     6 |    177 |      0 |       0 |
+---------+---------------------+-------+--------+--------+---------+
18 rows in set (0.00 sec)
(Note: in the terms column, 6 is the number of expressions in the product; add the 9 unique expressions removed at the start to get 15 total.)

So, to find a definitive 1292 actual 15-member solutions out of 185 million 15-member products, not counting twice that number discarded in the process, and in less than an hour on my old hardware, makes me a happy man!

It also turns out that a 16 member minimal set I had previously produced is _not_ in the product relation and so could not lead to a minimal covering set. It turns out that it was not "irredundant" in covering set terminology, meaning that all the solutions it produced would have been produced by other members of the set. This gives both a negative and a positive test of what I will now call my "covering algebra" solution, making it all the more satisfying!

Last edited by astrogeek; 03-26-2019 at 01:01 AM. Reason: Clarity, tpoys, touch-up, correctness
 
Old 03-23-2019, 09:23 AM   #159
Beryllos
Member (Original Poster)
Wow! That is an impressive bit of mathematics and programming, with nice results too. I haven't worked through all the logic but I think I am getting a rough understanding of your methodology.

I love the idea of encoding sets and masks as bit-fields, and performing logic operations on them. That has got to be the fastest way (on general-purpose hardware) to screen millions of items.

1292 distinct 15-expression sets sounds like a lot, or maybe it's not. The factorial is a really hard-hitting function.

At my end, I have been scribbling on paper and dabbling in code in my spare time, but not getting very far yet. Still doing it in python too, though C looks tempting now. I know nothing about SQL. Maybe I should give it a closer look.
 
Old 03-24-2019, 01:42 AM   #160
astrogeek
Moderator
Quote:
Originally Posted by Beryllos View Post
Wow! That is an impressive bit of mathematics and programming, with nice results too. I haven't worked through all the logic but I think I am getting a rough understanding of your methodology.

I love the idea of encoding sets and masks as bit-fields, and performing logic operations on them. That has got to be the fastest way (on general-purpose hardware) to screen millions of items.
Thanks for the comments, but the only impressive thing is that my slow old brain got it at all!

Of course it is all just Boole and DeMorgan, but it has been rewarding to see how the properties of the basic laws, especially idempotence and absorption, are used to create a "new" algebra. Then to see it work in the end, and to understand why, is always rewarding!

Quote:
Originally Posted by Beryllos View Post
1292 distinct 15-expression sets sounds like a lot, or maybe it's not. The factorial is a really hard-hitting function.

At my end, I have been scribbling on paper and dabbling in code in my spare time, but not getting very far yet. Still doing it in python too, though C looks tempting now. I know nothing about SQL. Maybe I should give it a closer look.
1292 does seem like a lot and I was surprised to see that at first. Then to realize that is 1292 needles out of a 185+ million subset of a truly astronomical - factorial haystack, quickly puts it back into perspective. Remembering my difficulties and surprise at finding just one, made 1292 seem overwhelming initially. Now I can hardly believe there are not many more.

My first attempts at running to this depth resulted in exceeding some system limits after I had seen 400+ million products evaluated. I changed a few MySQL system parameters and also changed my query to identify and drop products which could not lead to a solution much earlier, hence the 185 million subset hides the actual number considered to this point.

Learning the relational theory/model as a mathematical and logical idea is very rewarding. SQL is actually a kind of poor implementation of it in some ways, but still it does provide the necessary bits to make it work as it should. The shortcomings are mostly that it is easy to use it in non-relational ways, and it takes some care to keep it strictly relational. Kind of like using arithmetic without regard to some basic properties like associativity. You can easily produce results which look OK but are nonsense... but the RDBMSs are very good at what they do overall once you learn how to talk to them!

UPDATE: Continuing optimizations and validation have led to a revised covering schema and methods which have provided great gains in processing time and further reduced the number of products which survive at each term level, which really helps. I can now tell whether all remaining solutions lie in the future of some product path, as opposed to "any" remaining solutions as before. Rather than continue to update this post play-by-play, I will prepare a blog post soon which covers all the relevant parts and summary result sets.

Last edited by astrogeek; 03-26-2019 at 03:02 PM. Reason: tpoys
 
Old 03-26-2019, 10:21 PM   #161
dogpatch
Member
Quote:
Originally Posted by Thomas Edison
I have not failed, I've just found 10,000 ways that won't work.
Perhaps that's a bit of an exaggeration, but i have found more ways that don't work than those that do.

First off, want to say that i agree with Beryllos that astrogeek's mathematical acumen is quite inspiring (and Beryllos' as well). Am envious - and somewhat self-justified also in that i, too, decided to use a bit array representation of the solution sets.

One way in which i have parted company from both of you is that i've kept on including leading negative signs, and not just for exponents or for the whole equation. My current minimal set, at 193, includes these:
Code:
(a-(-b^c))^d
a+(-b/c)^-d
a-(-b/c)^-d
a^b-(-c^d)
In each case, i could theorize that the leading negative could result in a savings of minimum expressions, since the even-ness / odd-ness or the relative magnitude of some digits could affect whether the leading negative were essential. Thus the single expression shown might replace two expressions in which leading negatives are not considered. That was at least my theory. But the proof is in the pudding, and i've not even come close yet to astrogeek's 178. And it sounds like he might soon beat his own mark. (sigh!)

Of greater importance to me (and referring again to the Edison quote), my flailing around with leading negatives has served to flush out a couple nasty little bugs in my fractional-exponential math routines. In fact, i've started spending more time further debugging and developing those routines than on the minimal expression set challenge.

On that note, i may devote a separate post below to an interesting (to me, at least) issue regarding fractional-exponential math.
 
Old 03-26-2019, 11:57 PM   #162
dogpatch
Member
I admit this post is wandering away from the original thread somewhat, but i want to post this to those LQ programmers familiar with fractional-exponential integer math, and those who have proven superior mathematical ability (i.e. to Beryllos and astrogeek, and anyone else who fits the bill).

Am very grateful to Beryllos for introducing me to fractional-exponential integer math programming. I have long been irritated at floating point inaccuracies and their departure from pure mathematics. And, as seen above, pure integer math simply cannot resolve some important algebraic expressions. With regard to the 2019 puzzle challenge, fractional-exponential logic is much better suited to a perfectly accurate resolution of such expressions as
Code:
8^(5/3)
, which resolves, of course, to 32.

It gets even more fun when we seek to resolve a negative number that has a fractional exponent, such as
Code:
-8^(5/3)
which resolves to -32. It is resolvable only because the root, i.e. the exponent's denominator, is an odd number. Were the root an even number, the expression could not be resolved.

Or could it?

Take, for example,
Code:
-1^(1/2)
, the square root of a negative one. Now, here is where you put on your mathematician's hat, and recall with me the mathematical concept of i, the square root of negative one, and to the mathematical domain of imaginary numbers, as opposed to real numbers.

Am now thinking that this capacity ought also be allowed in my fractional-exponential math routines. Pure mathematical principles say that imaginary numbers, those based upon i, can be used in calculations and equations just as real numbers are. And that said equations can often resolve back into the realm of real numbers. Thus, i^2=-1, i^3=-i, i^4=1, i^5=i, etc.

So, i have begun allowing an imaginary fractional-exponential expression such as
Code:
(-7/1)^(3/4)
to stand even though i cannot resolve it, since a further operation might bring it back into the real number domain. For example,
Code:
(-7/1)^(3/4)*(-7/1)^(5/4)
resolves very nicely to
Code:
(-7/1)^(8/4)
which simplifies to (-7)^2 and thence to 49.
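To show the bookkeeping i have in mind, here is a rough python sketch with made-up names; my real routines keep everything in integer fractions and handle the general base/exponent cases:
Code:
from fractions import Fraction

def multiply_same_base(t1, t2):
    """A term is (base, exponent), both Fractions; same-base terms multiply by adding exponents."""
    (base1, exp1), (base2, exp2) = t1, t2
    assert base1 == base2
    return (base1, exp1 + exp2)        # e.g. (-7)^(3/4) * (-7)^(5/4) -> (-7)^2

def try_resolve(term):
    """Return a real value if the term is resolvable, else None (held as 'imaginary')."""
    base, exp = term
    if base >= 0:
        return float(base) ** float(exp)
    if exp.denominator % 2 == 1:       # odd root of a negative base is real
        mag = (-float(base)) ** float(exp)
        return mag if exp.numerator % 2 == 0 else -mag
    return None                        # even root of a negative base: keep the term

t = multiply_same_base((Fraction(-7), Fraction(3, 4)), (Fraction(-7), Fraction(5, 4)))
print(try_resolve(t))                  # 49.0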

As mentioned above, this does not relate directly to the 2019 challenge, but is at least indirectly related. What do you think? Is my thinking sound, or am i guilty of programming voodoo?

Last edited by dogpatch; 03-27-2019 at 12:04 AM. Reason: italicize 'i'
 
Old 03-27-2019, 02:10 PM   #163
Beryllos
Member (Original Poster)
Your thinking is mathematically sound. Not voodoo at all. It is similar to the idea of storing fractions (as numerator and denominator) until a later calculation brings it back to an integer. You have extended it to store fractional powers which could yield an integer power at a later step. It's interesting that your method handles both irrational and imaginary numbers naturally, with just the 4 structure members and basic arithmetic methods. Nice!
 
Old 03-27-2019, 02:48 PM   #164
Beryllos
Member (Original Poster)
After I wrote that, I was trying to work out what the "basic arithmetic" methods or formulas are. I am sure that multiplication and division of these structures can be done, but I don't see a way to do other operations (addition, subtraction, and power) for every case. For example, I don't think we could calculate 2^(1/2)+3^(1/2) and express the result in the (a/b)^(c/d) format.

Nevertheless, it is possible to work with such expressions in a more general kind of structure. Recall how we can accumulate any number of terms on paper and resolve them in subsequent steps. In the same way, we could write a program to keep all those terms in memory, and manipulate them according to the rules of algebra. I'm pretty sure that programs like Mathematica and Matlab have such a system worked out. It's a big task, though.

Last edited by Beryllos; 03-27-2019 at 02:53 PM.
 
Old 03-27-2019, 03:11 PM   #165
Beryllos
Member (Original Poster)
Quote:
Originally Posted by dogpatch View Post
My current minimal set, at 193, ... and i've not even come close yet to astrogeek's 178.
My minimal set is also 193. I think astrogeek was taking the absolute value at the end, which allows a few additional results. (The absolute value is mathematically equivalent to putting an optional negative sign in front of the whole expression.) In astrogeek's program (astrogeek-years-v1.0.2), the absolute value is user-selectable by way of a command-line option.
 
  

