I have a file of words and want to encode them in a numeric form, based on position. This is best explained by example:
PEOPLE => 123152
Reading left to right:
P was first encountered at position 1 so it is encoded as 1.
E was first encountered at position 2 so it is encoded as 2.
O was first encountered at position 3 so it is encoded as 3.
P (again) was first encountered at position 1 so it is encoded as 1.
L was first encountered at position 5 so it is encoded as 5.
E (again) was first encountered at position 2 so it is encoded as 2.
SENSE => 12312
COMMITTEE => 123356688
POSITION => 12345428
I have done this encoding in REXX with the TRANSLATE function, but cannot figure out how to do it with a Linux command (or string of commands).
The desired solution uses standard commands, but not awk or Perl.
Thank you all for the thoughts and suggestions. I've worked on this problem and made progress.
I want to avoid, if possible, a solution with explicit loops. I want to use, if possible, the tr command because it seems so similar to the REXX TRANSLATE built-in function. This is what I've got at present.
Both examples generate the desired encoding. I'd like to generalize this solution to work for input words of any length. I've barely begun to learn about Regular Expressions, and think REs may be the key to a general solution. Ideas?
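One way the two-step tr idea discussed in this thread might be generalized is sketched below. It is only a sketch under some assumptions: the label string symbols (1-9, then a-z) and the helper reverse_string are made up for illustration; it relies on tr using the last mapping when a character is repeated in the first set, as GNU tr does; and the word is assumed to contain only plain letters (no -, \ or [, which tr treats specially). No explicit loop is used, and words longer than the label string are rejected rather than encoded.

```shell
# Position labels: 1-9, then a-z. This extends the thread's '123456789abcdef'
# and is an assumption, not something the original poster specified.
symbols='123456789abcdefghijklmnopqrstuvwxyz'

# reverse_string is a hypothetical helper: split into one character per line,
# reverse the line order, then join the lines again.
reverse_string() {
    printf '%s\n' "$1" | grep -o . | tac | tr -d '\n'
}

# encode WORD: print the first-occurrence position code for WORD.
# Runs in a subshell (parentheses) so its variables stay local.
encode() (
    word=$1
    n=${#word}
    if [ "$n" -lt 1 ] || [ "$n" -gt "${#symbols}" ]; then
        echo "encode: word length must be 1..${#symbols}" >&2
        exit 1
    fi
    forward=$(printf '%s' "$symbols" | cut -c "1-$n")   # e.g. 123456
    backward=$(reverse_string "$forward")               # e.g. 654321
    reversed=$(reverse_string "$word")                  # e.g. ELPOEP
    # Step 1: a repeated letter in the first set keeps its last mapping, which
    #         corresponds to the letter's first position in the original word.
    # Step 2: flip the labels back into left-to-right numbering.
    printf '%s\n' "$word" | tr "$reversed" "$forward" | tr "$backward" "$forward"
)
```

With these definitions loaded, encode PEOPLE should print 123152 and encode COMMITTEE should print 123356688. Positions 10 and above come out as letters (a for 10, b for 11, and so on), so each position is still a single character.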
Well, I don't have a solution, but I can see an issue with extending this.
If we break down the last two parts, as echo speaks for itself:
1. tr 'ELPOEP' '123456789abcdef' - the reversing is not too much of an issue, but what happens with words, albeit rare, that are longer than the 15 characters in the second string?
2. tr '654321' '123456789abcdef' - firstly, this has the same word-length issue as above. Secondly, the first string here depends on the length of the initial word, so I believe (but could well be wrong) that you will need some kind of loop to create it. And for lengths greater than 9, that loop will also need to start bringing in letters of the alphabet.
Whilst this is expedient for the current small scenarios, and fine if you can guarantee the word length won't be an issue, I believe some of the earlier offerings may be more prudent. That said, I can see a problem there too: the indexes, in mine for example, simply continue the numbering, so it would be hard to tell whether 11 means two lots of position 1 or a single position 11.
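The ambiguity around 11 could be avoided by putting a separator between positions. The following is a rough sketch only, not a reconstruction of any earlier offering in this thread: the function name encode_delimited and the choice of - as separator are assumptions, and it does use an explicit loop, which the original poster was hoping to avoid.

```shell
# encode_delimited WORD: print first-occurrence positions joined by '-'.
# Runs in a subshell (parentheses) so its variables stay local.
encode_delimited() (
    word=$1
    out=
    i=1
    while [ "$i" -le "${#word}" ]; do
        ch=$(printf '%s' "$word" | cut -c "$i")   # i-th character of the word
        prefix=${word%%"$ch"*}                    # text before the first occurrence of ch
        pos=$(( ${#prefix} + 1 ))                 # 1-based position of that first occurrence
        out="$out${out:+-}$pos"                   # add '-' only between entries
        i=$(( i + 1 ))
    done
    printf '%s\n' "$out"
)
```

For example, encode_delimited PEOPLE should print 1-2-3-1-5-2, and a twelve-letter word such as ABCDEFGHIJKA should print 1-2-3-4-5-6-7-8-9-10-11-1, where 11 and the final 1 can no longer be confused.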