Edit words to upper case without deleting anything from the source file
Hi all,
I need some help, please.
I have a text file that I want to edit, changing only some words from lower case to upper case when they match a condition. The condition is:
convert to upper case every word that sits directly below the word SESSIONRECORD in column 1.
I've managed to convert the words below "SESSIONRECORD" to upper case with
is nearly what you want; you only forgot to print the other lines to the new file.
Code:
awk '/SESSIONRECORD/ { print ($0); #print the SESSIONRECORD line
getline; #get the next line
print toupper($0)} #print this line uppercase
else { print ($0)} #print all the other lines
' infile.txt > outfile.txt
$0 prints the whole line; $1 prints only the first field.
Last edited by Asy; 01-16-2009 at 10:48 AM.
Reason: change $1 to $0
The error is due to the fact you cannot use the else statement outside a code block. To negate a regular expression you have to put an exclamation mark before it. Anyway, here is another solution using a slightly different logic and without need of getline:
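A minimal sketch of such a getline-free approach, not necessarily the code originally posted: a flag records whether the previous line matched SESSIONRECORD, and every line is printed so nothing is lost from the file. The function name upper_after_marker and the variable names flag and out are made up for this illustration.

```shell
# Hypothetical helper: reads stdin, writes stdout.
upper_after_marker() {
    awk '
    {
        # Uppercase this line only if the previous line matched the marker.
        out = flag ? toupper($0) : $0
        # Test the original line, so an uppercased line cannot re-arm the flag.
        flag = ($0 ~ /SESSIONRECORD/)
        print out
    }'
}

printf 'SESSIONRECORD\nstudent\nJhon Kenett\n' | upper_after_marker
```

With that three-line input the output is SESSIONRECORD, STUDENT, Jhon Kenett. To work on files, redirect as in the earlier post: upper_after_marker < infile.txt > outfile.txt.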
You did not say what programming language you are using. Here are some common programming practices to do what you want:
A. Subtract hex 20 from each lower case letter that you wish to make upper case.
B. Exclusive-OR (XOR) hex 20 against each lower case letter that you wish to make upper case.
C. Create a 256-byte translate table which maps hex 61 through 7A to hex 41 through 5A. Then issue a translate instruction against the table for each lower case letter. This assumes that your language supports a translate instruction.
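The three techniques can be tried from a POSIX shell; this is only a sketch, assuming ASCII input, with tr standing in for the "translate instruction". The variable names are invented for the example. Note that A and B are only correct for the letters a to z, and that XOR toggles the case bit, so applying it to an upper case letter would make it lower case.

```shell
# Numeric code of "a" (hex 61 = 97). A leading quote makes printf
# emit the character code of what follows.
code=$(printf '%d' "'a")

# A. Subtract hex 20.
sub=$(( code - 0x20 ))

# B. Exclusive-OR with hex 20.
xor=$(( code ^ 0x20 ))

# C. Translate table: tr maps each byte in the first set to the
#    byte at the same position in the second set.
upper=$(printf 'student' | tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')

printf '%s %s %s\n' "$sub" "$xor" "$upper"
```

This prints 65 65 STUDENT; 65 is hex 41, the code for "A".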
It is not clear if you want this feature in the same awk program that changes lower to uppercase words, anyway it should not be difficult to integrate the two pieces of code in one program.
As far as I know there is no way to tell getline to read ahead N records. But you can always use it in a while loop like this:
Code:
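/SESSIONRECORD/ {
    while ( i < 7 ) {   # read ahead seven records
        getline
        i++
    }
    print               # print the 7th record after the match
    i = 0               # reset the counter for the next match
}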
This method implies you want to process the 7th line after SESSIONRECORD immediately: lines 1 to 6 after SESSIONRECORD are skipped entirely, and if another SESSIONRECORD occurs before the 7th line, it is lost.
Instead, the basic idea can be: every time you encounter SESSIONRECORD, remember which line you want to process later. That is, every time SESSIONRECORD occurs you store the Record Number + 7 and leave the program free to process the following lines. When you reach a Record Number previously stored, do some processing. Better to show than to tell:
Code:
/SESSIONRECORD/ { to_print[NR + 7] = 1 }
{ if ( NR in to_print ) {
print
}
}
The code above stores the NR + 7 value as an index in the array "to_print". Then if the current record number is an index of the array, print something. Hope it is clear.
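To see how the array handles a SESSIONRECORD that reappears before the 7th line is reached, here is a small self-contained run. It is a sketch only: the function name seventh_after and the sample lines are invented, and the delete statement is an addition that drops each index once it has been used.

```shell
# Hypothetical helper: reads stdin, prints the 7th record after each marker.
seventh_after() {
    awk '
    /SESSIONRECORD/ { to_print[NR + 7] = 1 }          # remember the target record number
    (NR in to_print) { print; delete to_print[NR] }   # print it, then forget the index
    '
}

# Markers on lines 1 and 3, so the targets are lines 8 and 10.
printf '%s\n' SESSIONRECORD 'line 2' SESSIONRECORD 'line 4' 'line 5' 'line 6' \
    'line 7' 'line 8' 'line 9' 'line 10' 'line 11' 'line 12' | seventh_after
```

Both targets are printed ("line 8" and "line 10"), because the second SESSIONRECORD adds a new index instead of overwriting the first one.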
I ran Telemachos' solution and it appears to execute normally: the backup file is generated, but in the original file no word is changed to capital letters.
Huh, weird. When I do it over here on a file that looks (I thought) just like yours, it works fine. Original file:
Code:
telemachus ~ $ cat records
SESSIONRECORD
student
Jhon Kenett
SESSIONRECORD
lawyer
Billy Bob
SESSIONRECORD
astronaut
Julie Sims
SESSIONRECORD
medician
Michael Barns
Edit with Perl:
Code:
telemachus ~ $ perl -i.bak -ple '$_ = uc $_ if $previous eq "SESSIONRECORD"; $previous = $_' records
Output:
Code:
telemachus ~ $ cat records
SESSIONRECORD
STUDENT
Jhon Kenett
SESSIONRECORD
LAWYER
Billy Bob
SESSIONRECORD
ASTRONAUT
Julie Sims
SESSIONRECORD
MEDICIAN
Michael Barns
Only the "job" line goes to uppercase. It doesn't matter much, but it's surprising to me that the awk solution works when this doesn't since they use the same logic.
As for your second problem, this line will print the seventh line below SESSIONRECORD, but it won't work on a file that looks like the first one you posted. Let me show a file where it will work and then explain:
Code:
telemachus ~ $ cat records
SESSIONRECORD
first line below
second line below
third line below
fourth line below
fifth line below
sixth line below
seventh line below
sessionrecord
MEDICIAN
Michael Barns
telemachus ~ $ perl -nle '$line = $. if m/SESSIONRECORD/; print if $. == $line + 7' records
seventh line below
This works here, but if the file looks like your original did, then the word SESSIONRECORD reappears too often. It reappears before we get 7 lines on in the file, so the number that the program is looking for gets reset. Worse still, it reappears so often that we never get an example of "a line 7 lines below the last instance of SESSIONRECORD".
So before I go any further: what does the file look like, and do you want to get only the 7th line after the very first SESSIONRECORD? Or are there multiple lines you want to get after multiple instances of SESSIONRECORD?
I should add that the $. built-in variable in Perl records the current line number. (If you have multiple files you run this over, there is a trick to reset the line number to 1 at the start of each new file, but I'm not going to get into that until I see more about the problem.)
Last edited by Telemachos; 01-17-2009 at 06:24 AM.