# New Years Data to DB - The Idea

Posted 02-28-2019 at 02:33 AM by astrogeek

The initial efforts of all participants in the newyears thread, and comparison and validation of the results produced by the three different approaches to the problem, have led us to agreement that we have indeed found a complete set of solutions. At least complete within the bounds of our combined ability to imagine and test generating expressions to this point!

Knowing what a complete solution set looks like, we have each shifted emphasis from the generating code, to a more critical analysis of the results themselves, and of the somewhat different sets of expressions which produce it.

Beryllos has refined his efforts to find any elusive additional solutions, partly by testing whether some as yet untried sign inversion might produce something new. Dogpatch has also looked into the effects of signed variables on his results. Together their efforts should decide whether there is more to be found in that region or not.

My own initial efforts have produced a minimal set of expressions which still produces the complete solution set, a set of 176 distinct expressions. Although my generating program allows for signed variables, I decided early to maintain my own expression set in unsigned form, converting signed to unsigned as I encountered them. I suspect that there are no signed variants whose unsigned versions would not have turned up in the previous brute force expression permutations, but I do not know that to be true. So I would be surprised, but delighted if a signed expression produced some unique new result!

When I began to think about what further insights we could gain from the results set, and where we might find clues where to look for additional solutions, I began to think about putting the result data into a relational database. With the move from generation to analysis this makes good sense.

Why put the data into a relational database?

When we begin to ask questions like, "What is the minimal set of expressions which will produce the complete result set?", we have moved solidly into the realm of graphs, combinatorics, relations and sets. With the complete set of algebraic solutions in hand, most other interesting questions we might ask about the expressions and solutions all involve the ideas of

Putting the data into a relational model, a relational database, allows us to begin to ask those questions in the language of sets, using set operators to obtain the answers.

And to be clear, the main idea here is not generation of new solutions, but to better understand the expressions and solutions we have in new and interesting ways.

I have always loved math in its many forms, but am really not very good at it... I often miss the obvious and must take everything in small steps with text books at hand and still make frequent stumbles down dead ends. On the other hand, I have glimpsed its supreme beauty at times, and have gained full confidence in the reliability of its concepts and methods when correctly understood, by hard won experience. I also know a little of the thrill of

So, in that spirit of continuing discovery let's follow where the data leads!

I had intended to post the details of my data model with code for importing the data by now, but time and space have conspired to repeatedly delay that effort. So I will defer that for yet another day and instead post an interesting result of my own early explorations of our solution set.

I will ask a more specific version of the question above:

This is pretty much the definition of a covering set problem, and the related idea of reachability.

If we think of the problem as a graph with expressions as vertices on the left, and the solutions (integers 0-100) as vertices on the right with permutations being the edges, or paths from an expression to a solution, we are asking for the minimal set of edges which will make every possible solution reachable from at least one expression.

It turns out that there are multiple ways of asking, and answering the above question. For example, we can think of it as a graph with permutations as the vertices on the left, and the expressions as the connecting edges. Doing so produces a different "minimal" set.

In fact, there may be multiple such sets depending on how we phrase the question, so it seems better to refer to them as "optimal" sets, according to whatever criteria has been selected, rather than minimal.

I have arrived at a a set of SQL queries which produces results for these two formulations of the problem and have generated results for all 715 combinations of four digit years.

First, taking expressions as vertices and permutations as edges, I get the following optimal set of expressions and permutation for the year 2019 (combination 0129) which minimizes the number of expressions:

If I rephrase it to minimize the number of permutations, at the expense of a few more expressions, we get this result:

If we think of the problem in terms of

But now we might ask another question:

I do not have answers for these yet, but I suspect that neither of these solutions is the optimal in terms of cost for all solution sets. I have not discovered how to phrase that question.

So this is my introduction to application of relational mehtods to the newyear puzzle solution set!

Are we having fun yet? Definitely!

I am hopeful that I will be able to post my code, data model and a few more results within a few days. In the mean time, think of the questions you want to ask of the data!

Knowing what a complete solution set looks like, we have each shifted emphasis from the generating code, to a more critical analysis of the results themselves, and of the somewhat different sets of expressions which produce it.

Beryllos has refined his efforts to find any elusive additional solutions, partly by testing whether some as yet untried sign inversion might produce something new. Dogpatch has also looked into the effects of signed variables on his results. Together their efforts should decide whether there is more to be found in that region or not.

My own initial efforts have produced a minimal set of expressions which still produces the complete solution set, a set of 176 distinct expressions. Although my generating program allows for signed variables, I decided early to maintain my own expression set in unsigned form, converting signed to unsigned as I encountered them. I suspect that there are no signed variants whose unsigned versions would not have turned up in the previous brute force expression permutations, but I do not know that to be true. So I would be surprised, but delighted if a signed expression produced some unique new result!

**NOTE: I am delighted to say I was wrong!**

Before press time Beryllos has now found not one, but two new expressions producing two new solutions! Very nice!Before press time Beryllos has now found not one, but two new expressions producing two new solutions! Very nice!

When I began to think about what further insights we could gain from the results set, and where we might find clues where to look for additional solutions, I began to think about putting the result data into a relational database. With the move from generation to analysis this makes good sense.

Why put the data into a relational database?

When we begin to ask questions like, "What is the minimal set of expressions which will produce the complete result set?", we have moved solidly into the realm of graphs, combinatorics, relations and sets. With the complete set of algebraic solutions in hand, most other interesting questions we might ask about the expressions and solutions all involve the ideas of

*sets*and*graphs*.Putting the data into a relational model, a relational database, allows us to begin to ask those questions in the language of sets, using set operators to obtain the answers.

And to be clear, the main idea here is not generation of new solutions, but to better understand the expressions and solutions we have in new and interesting ways.

I have always loved math in its many forms, but am really not very good at it... I often miss the obvious and must take everything in small steps with text books at hand and still make frequent stumbles down dead ends. On the other hand, I have glimpsed its supreme beauty at times, and have gained full confidence in the reliability of its concepts and methods when correctly understood, by hard won experience. I also know a little of the thrill of

*mathematical discovery*(*as does Beryllos!*), even when it is only a "discovery" for myself, very satisfying! It is that, which makes these things interesting in the first place, and makes the effort worthwhile!So, in that spirit of continuing discovery let's follow where the data leads!

I had intended to post the details of my data model with code for importing the data by now, but time and space have conspired to repeatedly delay that effort. So I will defer that for yet another day and instead post an interesting result of my own early explorations of our solution set.

I will ask a more specific version of the question above:

Code:

For a given combination, what is the minimal set of expressions and permutations which will produce the complete set of solutions?

If we think of the problem as a graph with expressions as vertices on the left, and the solutions (integers 0-100) as vertices on the right with permutations being the edges, or paths from an expression to a solution, we are asking for the minimal set of edges which will make every possible solution reachable from at least one expression.

It turns out that there are multiple ways of asking, and answering the above question. For example, we can think of it as a graph with permutations as the vertices on the left, and the expressions as the connecting edges. Doing so produces a different "minimal" set.

In fact, there may be multiple such sets depending on how we phrase the question, so it seems better to refer to them as "optimal" sets, according to whatever criteria has been selected, rather than minimal.

I have arrived at a a set of SQL queries which produces results for these two formulations of the problem and have generated results for all 715 combinations of four digit years.

First, taking expressions as vertices and permutations as edges, I get the following optimal set of expressions and permutation for the year 2019 (combination 0129) which minimizes the number of expressions:

Code:

2019 Minimal expressions to produce full result set... +-------------+ | expression | +-------------+ Matrix of found solutions | (a+b)*(c-d) | +-------------+ 0: 0 1 2 3 4 5 6 7 8 9 | (a+b)/(c+d) | | permutation | 10: 10 11 12 13 14 . 16 17 18 19 | (a-b-c)^d | +-------------+ 20: 20 21 22 . . . . 27 28 29 | (ab/c)^d | | 0192 | 30: 30 . . . . . . . 38 39 | (abc)/d | | 1029 | 40: . . . . 44 45 46 . . . | a*(b+c^d) | X | 1092 | = 50: . . . . . . . . . . | a*(bc-d) | | 1290 | 60: . . . . 64 . . . . 69 | ab*(c-d) | | 1902 | 70: 70 71 72 . . . . . 78 . | ab+cd | | 2019 | 80: 80 81 82 . . . . 87 88 89 | ab+c^d | | 2190 | 90: 90 91 92 93 . 95 . . . . | ab-c-d | | 2910 | 100: 100 | ab-cd | | 9021 | Total solutions 2019 = 49 | ab-c^d | | 9102 | | ab/c+d | | 9210 | | ab/c-d | +-------------+ | a^b-c+d | (11 rows) +-------------+ (16 rows)

Code:

2019 Minimal permutations to produce full result set... +-------------+ | expression | +-------------+ | (a+b)*(c-d) | | (a+b)/(c+d) | | (a+b)^(c+d) | | (a+b)^c-d | Matrix of found solutions +-------------+ | (a-b)^(c+d) | 0: 0 1 2 3 4 5 6 7 8 9 | permutation | | (ab/c)^d | 10: 10 11 12 13 14 . 16 17 18 19 +-------------+ | (abc)/d | 20: 20 21 22 . . . . 27 28 29 | 1029 | | a*(b+c^d) | 30: 30 . . . . . . . 38 39 | 1092 | | a*(bc-d) | 40: . . . . 44 45 46 . . . | 1290 | X | ab*(c-d) | = 50: . . . . . . . . . . | 1902 | | ab+(c*d) | 60: . . . . 64 . . . . 69 | 2019 | | ab+c+d | 70: 70 71 72 . . . . . 78 . | 2190 | | ab+cd | 80: 80 81 82 . . . . 87 88 89 | 9021 | | ab+c^d | 90: 90 91 92 93 . 95 . . . . | 9102 | | ab-c-d | 100: 100 +-------------+ | ab-cd | Total solutions 2019 = 49 (8 rows) | ab-c^d | | ab/c+d | | ab/c-d | | a^b-c+d | +-------------+ (20 rows)

*cost to generate*the solution set, we may define the cost as the number evaluations, or the product of permutations and expressions. In this example the cost for minimal permutaitons, 160, is lower than that for minimal expressions, 176. This turns out to be true for all combinations, at least by my current methods (which are subject to refinement!).But now we might ask another question:

Code:

Is either of these solutions the absolute optimal solution in terms of minimal cost? If not, how might we find that absolute optimal solution? More generally how might we find all solution subsets for a given combination?

So this is my introduction to application of relational mehtods to the newyear puzzle solution set!

Are we having fun yet? Definitely!

I am hopeful that I will be able to post my code, data model and a few more results within a few days. In the mean time, think of the questions you want to ask of the data!

Total Comments 0