[SOLVED] How to skip a number even if it's a random form of it?
This is already solved, but essentially you should use a hash here: a hash that gives the same result regardless of order.
That is not correct. By design, hash values typically depend entirely on the data order in order to be useful: a different order results in a different hash value.
You could of course design a hash algorithm which did not depend on the input data order, and would return the same hash for the same values. But I suspect it could probably be proved that calculating such a hash would require significantly more processing than simply sorting (i.e. reordering) the values for the task at hand. I say that because such a hash would necessarily have to sort the values in some manner to obtain a canonical data chunk on which to base the hash value, then it would still have to calculate the value.
Alternatively, try to imagine a "simplest" hash algorithm which does not internally reorder the values, such as taking the sum of values as the hash value. It could not distinguish between 4-5-0, 3-4-2 and 7-1-1, and an infinity of other cases, making it useless for the task at hand.
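As a minimal sketch of the point above (Python is used only for illustration, and the name `sum_hash` is hypothetical), a sum-as-hash function gives the same value for all three lines even though they hold different numbers:

```python
def sum_hash(values):
    """Order-independent but very weak 'hash': the plain sum of the values."""
    return sum(values)


# Three lines with different values, plus one genuine reordering of the first.
lines = [(4, 5, 0), (3, 4, 2), (7, 1, 1), (0, 5, 4)]

for line in lines:
    # Every line here sums to 9, so the hash cannot tell them apart.
    print(line, "->", sum_hash(line))
```

Only the last line is a real reordering of the first, but the sum cannot separate it from the other two.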
MD5 sums would seem to be completely inappropriate for this use case. If you would care to suggest an example using MD5 which would produce the same hash value for 4-5-7 as for 7-4-5, without reordering, please do so!
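To illustrate the order dependence, here is a small sketch using Python's standard `hashlib` module. The helper `md5_of` and the choice to join values with "-" are assumptions made for this example only:

```python
import hashlib


def md5_of(values):
    """MD5 hex digest of the values joined with '-' (format assumed for illustration)."""
    text = "-".join(str(v) for v in values)
    return hashlib.md5(text.encode("ascii")).hexdigest()


# Same numbers, different order: the digests differ.
print(md5_of([4, 5, 7]) == md5_of([7, 4, 5]))  # False

# The digests only agree once the values are reordered (sorted) first.
print(md5_of(sorted([4, 5, 7])) == md5_of(sorted([7, 4, 5])))  # True
```

The two digests match only after sorting, at which point the sorted values alone would already have answered the question.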
In future, if you are going to suggest solutions to a problem, please include at least a simple example. Simply saying "Use hash or MD5" is not helpful.
Last edited by astrogeek; 11-15-2018 at 12:45 PM.
Reason: typo
I can put it this way: if this is about learning, then the OP would benefit from knowing something about hashes, or about tricks where the data is divided into small parts that are easier to handle. Even a simple hash such as 1+3+4 is useful. But in this case a good hash is the total sum together with the number of odd entries: 1 2 3 has sum 6 and 2 odd entries, while 0 2 4 also has sum 6 but 0 odd entries. So these sequences fall into different groups. The procedure is then to take each sequence, compute its hash and put the sequence in the corresponding file, then process all these files one after another. If the files are small enough, go straight ahead and sort each sequence in the file. If not, try to find an additional trick to divide even more.
Even a simple hash such as 1+3+4 is useful. But in this case a good hash is the total sum together with the number of odd entries: 1 2 3 has sum 6 and 2 odd entries, while 0 2 4 also has sum 6 but 0 odd entries.
Such approaches have their uses, but I can't see where this is one of them.
How would you generalize to values with more than one digit, or to lines with more or fewer than three values?
Quote:
Originally Posted by igadoter
So these sequences fall into different groups. The procedure is then to take each sequence, compute its hash and put the sequence in the corresponding file, then process all these files one after another. If the files are small enough, go straight ahead and sort each sequence in the file. If not, try to find an additional trick to divide even more.
Which only proves my point - after all that summing and hashing, it must still fall back to a sort test, or unspecified "additional trick to divide even more". So it will always be more complex than the simple sort on which it ultimately depends, whereas the simple sort is both sufficient and complete and extends to other numbers of values and to values with more than one digit.
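A rough sketch of the quoted procedure makes the dependence on sorting visible. It assumes the task is to keep only the first line of each set of reorderings, uses in-memory buckets instead of files, and the names `sum_odd_hash` and `unique_lines` are hypothetical:

```python
from collections import defaultdict


def sum_odd_hash(values):
    """The proposed hash: (total sum, number of odd entries)."""
    return (sum(values), sum(1 for v in values if v % 2))


def unique_lines(lines):
    """Bucket lines by hash, then fall back to sorting inside each bucket."""
    buckets = defaultdict(list)
    for values in lines:
        buckets[sum_odd_hash(values)].append(values)

    result = []
    for group in buckets.values():
        seen = set()
        for values in group:
            # The hash alone cannot decide; the sorted tuple does the real work.
            key = tuple(sorted(values))
            if key not in seen:
                seen.add(key)
                result.append(values)
    return result


lines = [(1, 2, 3), (0, 2, 4), (3, 2, 1), (0, 3, 3), (4, 0, 2)]
print(unique_lines(lines))
```

Note that 1 2 3 and 0 3 3 both hash to (6, 2), so the bucket still has to be resolved by the sort.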
If you are thinking in terms of hashes, then think of it like this:
Quote:
The hash algorithm is a numerical sort, and the sorted values are the hash.
You can make it more difficult than that if you want, but the essence of such programming is to find the simplest solution, and sort-as-hash is difficult to improve on in this case.
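A minimal sketch of sort-as-hash, assuming each line holds whitespace-separated integers and that later reorderings of an already seen line should be skipped (the function name `skip_reordered` is made up for this example):

```python
def skip_reordered(lines):
    """Keep the first occurrence of each set of values, ignoring their order."""
    seen = set()
    kept = []
    for line in lines:
        # The numerically sorted values are the canonical key, i.e. the "hash".
        key = tuple(sorted(int(token) for token in line.split()))
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


lines = ["4 5 7", "7 4 5", "10 2", "2 10", "1 2 3 4"]
print(skip_reordered(lines))  # ['4 5 7', '10 2', '1 2 3 4']
```

The same code handles multi-digit values and lines of any length, because nothing in it depends on there being exactly three single-digit numbers.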
Last edited by astrogeek; 11-15-2018 at 03:53 PM.
Reason: typos
Tricks are things to be invented, like Edison inventing the light bulb. There are no recipes here. One possible worst case is when all lines contain the same numbers, differing only in order:
1 2 3
2 3 1
...
3 2 1
Hashes don't help here, but they do give a clue that this is the worst case.
For me a hash is just a function hash(data) -> key, which for example allows data to be represented as a table of lists, where the index into the table is the key given by the hash:
key1: data -> data -> data -> ... EOL
key2: data -> data -> ...EOL
.
.
Data with the same hash (key) is grouped together.
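The table of lists can be sketched with a dictionary that maps each key to a list. This is only an illustration: `group_by_hash` is a hypothetical helper, and the plain sum stands in for the hash function:

```python
from collections import defaultdict
from itertools import permutations


def group_by_hash(items, hash_func):
    """Build a table of lists: key -> all items whose hash equals that key."""
    table = defaultdict(list)
    for item in items:
        table[hash_func(item)].append(item)
    return dict(table)


# Worst case from above: every ordering of 1 2 3, plus two unrelated lines.
items = list(permutations((1, 2, 3))) + [(0, 2, 4), (1, 1, 1)]
table = group_by_hash(items, sum)

for key, group in table.items():
    print(key, ":", " -> ".join(str(item) for item in group))
```

With the sum as the key, all six orderings of 1 2 3 and the unrelated 0 2 4 share key 6, so that one list still has to be examined item by item.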