[SOLVED] Text processing -- UPPER CASE doubled letters in second word of each line
@firstfire - I found a small glitch in the sed solution. Try it with something that starts with the consecutive letters, like 'define'
Again, to Daniel: the new question is ambiguous on the same point, i.e. are we assuming English spoken words, or could we receive 'abcdef'? And in that case, should all 6 characters be upper-cased?
As to the current solution, where we upper case the first 3 characters found to be consecutive:
Code:
ruby -ane 'a="";$F[0].chomp.each_char{|c| if ! a.empty? && c == a[-1].next; a<<c; else a = c;end; break if a.size == 3};puts $F[0].sub(a,a.upcase)' file
EDIT: Just saw the new requirement that there are multiple words on each line rather than single entries. The above works for a file with one word per line.
Again, to Daniel: the new question is ambiguous on the same point, i.e. are we assuming English spoken words, or could we receive 'abcdef'? And in that case, should all 6 characters be upper-cased?
You are right... ambiguous. Let's make the problem statement elastic: solve it any way you like, so long as you declare what your solution intends to do.
OK, I believe this solution produces the output expected in post #16, with one caveat: the expected results show the word 'struck' in the second line as part of the solution, but 'str' is not three consecutive letters of the alphabet. The solution presented here looks for 3 characters in English alphabet order as they appear on the line and replaces only the first such run in each word, i.e. abcdef becomes ABCdef:
Code:
ruby -ape '$F.each{|x| a="";x.each_char{|c| if ! a.empty? && c == a[-1].next; a<<c; else a = c;end; if a.size == 3; $_.sub!(x,x.sub(a,a.upcase)); break; end}}' file
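For anyone who finds the one-liner hard to read, here is a rough spelled-out sketch of the same idea. The method names upcase_first_run and upcase_line are made up for illustration and are not part of the posted solution. It also splits and rejoins each line on whitespace, which the one-liner does not do, so treat it as an approximation rather than an exact equivalent.

```ruby
# Hypothetical helper (not from the thread): upper-case the first run of
# run_length letters that are consecutive in the alphabet, e.g. "rst".
def upcase_first_run(word, run_length = 3)
  run = ""
  word.each_char do |c|
    # Extend the run when c directly follows the previous character;
    # otherwise start a new run at c.
    if !run.empty? && c == run[-1, 1].next
      run << c
    else
      run = c.dup
    end
    # Stop at the first complete run and upper-case it inside the word.
    return word.sub(run, run.upcase) if run.size == run_length
  end
  word
end

# Apply the helper to every whitespace-separated word on a line.
# Assumption: collapsing runs of whitespace to single spaces is acceptable.
def upcase_line(line)
  line.split(" ").map { |w| upcase_first_run(w) }.join(" ")
end

puts upcase_line("firstfire is a skillful programmer")
puts upcase_line("abcdef")
```

Only the first run in a word is changed, so a longer run such as 'abcdef' keeps its tail in lower case.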
Yes, there are a couple of bugs in my solution. Here is v2:
Code:
$ cat infile
firstfire is an excellent programmer
the baseball player struck out
ghibli is a desert wind
excessive heat burst my wurst
now is the time
abcdef
$ sed -rn 's/$/\nabcdefghijklmnopqrstuvwxyz/; :a; s/(...)(.*\n.*\1.*)/\U\1\E\2/; ta; s/\n.*//p' < /tmp/infile
fiRSTfire is an excellent programmer
the baseball player struck out
GHIbli is a desert wind
excessive heat buRST my wuRST
now is the time
ABCDEF
I marked with bold what was added (also one closing parenthesis has been moved).
This script will upper-case all occurrences of 3-character substrings of the English alphabet, so 'abcdef' becomes 'ABCDEF'.
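To make the "3-character substrings of the alphabet" idea concrete, here is a small Ruby sketch of the same behaviour. It is not a translation of the sed script, only an illustration that should behave similarly on lower-case input. The names ALPHABET, WINDOW_RE and upcase_all_windows are invented for this example.

```ruby
# All 24 three-letter windows of the alphabet: "abc", "bcd", ..., "xyz".
ALPHABET  = ("a".."z").to_a.join
WINDOW_RE = Regexp.union(ALPHABET.chars.each_cons(3).map(&:join))

# Hypothetical helper: upper-case every non-overlapping window found,
# scanning left to right.
def upcase_all_windows(line)
  line.gsub(WINDOW_RE) { |match| match.upcase }
end

puts upcase_all_windows("excessive heat burst my wurst")
puts upcase_all_windows("abcdef")
```

Because matches do not overlap, a 4-letter run like 'abcd' would come out as 'ABCd' in this sketch: only whole 3-letter windows are converted.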
I have checked back to version 1.8.7 and all commands I have used are available. After adding 'abcdef' to the end of the file, I get the following output:
Code:
[grail@pilgrim]$ ruby -ape '$F.each{|x| a="";x.each_char{|c| if ! a.empty? && c == a[-1].next; a<<c; else a = c;end; if a.size == 3; $_.sub!(x,x.sub(a,a.upcase)); break; end}}' infile
fiRSTfire is a skillful programmer
the truck was STUck in the mud
GHIbli is a desert wind
excessive heat buRST my wuRST
now is the time
ABCdef
firstfire is a skillful programmer
the truck was stuck in the mud
ghibli is a desert wind
excessive heat burst my wurst
now is the time
abcdef
... this code ...
Code:
echo; echo "Method of LQ Member firstfire, using SED."
sed -rn 's/$/\nabcdefghijklmnopqrstuvwxyz/;
:a; s/(...)(.*\n.*\1.*)/\U\1\E\2/; ta;
s/\n.*//p' $InFile >$OutFile
echo "OutFile ..."; cat $OutFile; echo "End Of File"
echo; echo "Method of LQ Guru grail, using RUBY."
ruby -ape '$F.each{|x| a="";x.each_char{|c| if ! a.empty? && c == a[-1].next; a<<c; else a = c;end; if a.size == 3; $_.sub!(x,x.sub(a,a.upcase)); break; end}}' $InFile >$OutFile
echo "OutFile ..."; cat $OutFile; echo "End Of File"
echo; echo "Method of LQ member danielbmartin, using SED."
sed -f $PatFile $InFile >$OutFile
echo "OutFile ..."; cat $OutFile; echo "End Of File"
echo; echo "Normal end of job.";echo; exit
... produced this result ...
Code:
daniel@daniel-desktop:~$ bash /home/daniel/Desktop/LQfiles/dbm1249.bin
Method of LQ Member firstfire, using SED.
OutFile ...
fiRSTfire is a skillful programmer
the truck was STUck in the mud
GHIbli is a desert wind
excessive heat buRST my wuRST
now is the time
ABCDEF
End Of File
Method of LQ Guru grail, using RUBY.
OutFile ...
firstfire is a skillful programmer
the truck was stuck in the mud
ghibli is a desert wind
excessive heat burst my wurst
now is the time
abcdef
End Of File
Method of LQ member danielbmartin, using SED.
OutFile ...
fiRSTfire is a skillful programmer
the truck was STUck in the mud
GHIbli is a desert wind
excessive heat buRST my wurst
now is the time
ABCDEF
End Of File
Normal end of job.
Perhaps, due to my lack of ruby knowledge, I invoked your code improperly.
Well, I didn't know what PatFile was supposed to be set to, so I got an error there, but I copied your script and got the following output:
Code:
Method of LQ Member firstfire, using SED.
OutFile ...
fiRSTfire is a skillful programmer
the truck was STUck in the mud
GHIbli is a desert wind
excessive heat buRST my wuRST
now is the time
ABCDEF
End Of File
Method of LQ Guru grail, using RUBY.
OutFile ...
fiRSTfire is a skillful programmer
the truck was STUck in the mud
GHIbli is a desert wind
excessive heat buRST my wuRST
now is the time
ABCdef
End Of File
Method of LQ member danielbmartin, using SED.
sed: file f1 line 1: unknown command: `f'
OutFile ...
End Of File
Normal end of job.
So you have run it correctly, but unfortunately I do not currently have access to a box running 1.8.7 to see why it is failing.
OK, I did some poking around and found the flaw. It was quite tricky, but the command below should work with 1.8.7 (on that note, is there any reason you cannot move to a later version? This one is quite dated):
Code:
ruby -ape '$F.each{|x| a="";x.each_char{|c| if ! a.empty? && c == a[-1,1].next; a<<c; else a = c;end; if a.size == 3; $_.sub!(x,x.sub(a,a.upcase)); break; end}}'
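The only change from the earlier one-liner is a[-1,1] in place of a[-1]. As far as I can tell the reason is that in Ruby 1.8, indexing a String with a single integer returns the character code (an Integer), so a[-1].next could never equal the one-character string c, while the two-argument slice returns a String on both old and new versions. A tiny sketch, with last_char as a made-up helper name:

```ruby
# Hypothetical helper: return the last character of s as a String.
# s[-1] alone gives an Integer character code on Ruby 1.8 and a
# one-character String on 1.9+; s[-1, 1] gives a String on both.
def last_char(s)
  s[-1, 1]
end

run = "rs"
puts last_char(run).next          # the letter that follows "s"
puts last_char(run).next == "t"   # the comparison the one-liner relies on
```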
... is there any reason you cannot move to a later version?
I've tried to install later versions of Ubuntu and (so far) my computer pukes so I continue to limp along with 10.04. One of these days I will try again!
Code:
echo; echo "Method of LQ Member firstfire, using SED."
sed -rn 's/$/\nabcdefghijklmnopqrstuvwxyz/;
:a; s/(...)(.*\n.*\1.*)/\U\1\E\2/; ta;
s/\n.*//p' $InFile >$OutFile
echo "OutFile ..."; cat $OutFile; echo "End Of File"
echo; echo "Method of LQ Guru grail, using RUBY."
ruby -ape '$F.each{|x| a="";x.each_char{|c|
if ! a.empty? && c == a[-1,1].next; a<<c;
else a = c;end;
if a.size == 3; $_.sub!(x,x.sub(a,a.upcase)); break; end}}' \
$InFile >$OutFile
echo "OutFile ..."; cat $OutFile; echo "End Of File"
echo; echo "Method of LQ member danielbmartin, using SED."
sed -f $CmdFile $InFile >$OutFile
echo "OutFile ..."; cat $OutFile; echo "End Of File"
echo; echo "Normal end of job.";echo; exit
... produced this result ...
Code:
daniel@daniel-desktop:~$ bash /home/daniel/Desktop/LQfiles/dbm1249.bin
Method of LQ Member firstfire, using SED.
OutFile ...
fiRSTfire is a skillful programmer
the truck was STUck in the mud
GHIbli is a desert wind
excessive heat buRST my wuRST
now is the time
ABCDEF
End Of File
Method of LQ Guru grail, using RUBY.
OutFile ...
fiRSTfire is a skillful programmer
the truck was STUck in the mud
GHIbli is a desert wind
excessive heat buRST my wuRST
now is the time
ABCdef
End Of File
Method of LQ Member danielbmartin, using SED.
OutFile ...
fiRSTfire is a skillful programmer
the truck was STUck in the mud
GHIbli is a desert wind
excessive heat buRST my wuRST
now is the time
ABCDEF
End Of File
Normal end of job.
There is one remaining nitpick: abcdef should be transformed to ABCDEF.
Well, that is an additional requirement which, as I mentioned previously, was not part of this solution.
I'll see what needs to be changed to handle it. I would add that your patfile solution is rather limited: if the requirement changed to, say, 5 consecutive letters, you would have to completely rewrite the file, whereas the Ruby and first sed solutions need only a minor change.
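As a sketch of what that minor change could look like, here is one way to make the run length a parameter and upper-case the whole run, so that 'abcdef' becomes 'ABCDEF'. This is an assumption about how the requirement might be met, not the code anyone in the thread posted, and LETTERS and upcase_runs are names invented for the example.

```ruby
LETTERS = ("a".."z").to_a.join

# Hypothetical helper: upper-case every maximal run of at least min_length
# letters that are consecutive in the alphabet; copy everything else as is.
def upcase_runs(text, min_length = 3)
  out = String.new
  run = String.new
  text.each_char do |c|
    # A run stays alive only while run + c is still a slice of the alphabet.
    unless run.empty? || LETTERS.include?(run + c)
      out << (run.size >= min_length ? run.upcase : run)
      run = String.new
    end
    run << c
  end
  # Flush whatever run is left at the end of the text.
  out << (run.size >= min_length ? run.upcase : run)
end

puts upcase_runs("excessive heat burst my wurst")
puts upcase_runs("abcdef")
puts upcase_runs("abcdef burst", 5)
```

Moving from 3 to 5 consecutive letters is then just a different second argument.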