LinuxQuestions.org
Old 01-16-2009, 01:18 AM   #1
cgcamal
Member
 
Registered: Nov 2008
Location: Tegucigalpa
Posts: 78

Rep: Reputation: 16
Convert words to upper case without deleting anything from the source file


Hi all,

I need some help, please.

I have a text file in which I want to change only some words from lower case to upper case when they match a condition. The condition is:
convert to upper case every column 1 word on the line directly below the word SESSIONRECORD.

I have managed to convert the words below "SESSIONRECORD" to upper case with

Code:
awk '/SESSIONRECORD/ { getline; print toupper($1)}' infile.txt > outfile.txt
but this command, which uses getline, prints only the upper-cased words and not the complete file.

I'd like to preserve all the original data in infile.txt, modifying only the words mentioned above to upper case.

Example:
Original text file:
Code:
SESSIONRECORD
student
Jhon Kenett
.
.
.
SESSIONRECORD
medician
Michael Barns
.
.
I'd like to convert it as follows:
Code:
SESSIONRECORD
STUDENT
Jhon Kenett
.
.
.
SESSIONRECORD
MEDICIAN
Michael Barns
.
.

Any suggestions on what I have to modify in the command I'm using would be much appreciated.

Best regards
 
Old 01-16-2009, 02:34 AM   #2
chakka.lokesh
Member
 
Registered: Mar 2008
Distribution: Ubuntu
Posts: 270

Rep: Reputation: 33
You can use the following command:

Quote:
sed -e 's/student/STUDENT/;s/medician/MEDICIAN/' filename

Last edited by chakka.lokesh; 01-16-2009 at 02:38 AM.
 
Old 01-16-2009, 10:38 AM   #3
Asy
LQ Newbie
 
Registered: May 2008
Location: The Netherlands
Distribution: Ubuntu
Posts: 25

Rep: Reputation: 16

@chakka.lokesh
Quote:
sed -e 's/student/STUDENT/;s/medician/MEDICIAN/' filename
That does not answer the question, because any word can follow SESSIONRECORD.

@cgcamal
Quote:
awk '/SESSIONRECORD/ { getline; print toupper($1)}' infile.txt > outfile.txt
is nearly what you want; you only forgot to print the other lines to the new file.


Code:
awk '/SESSIONRECORD/ { print ($0);         #print the SESSIONRECORD line
                       getline;            #get the next line
                       print toupper($0)}  #print this line uppercase
                else { print ($0)}         #print all the other lines
    ' infile.txt > outfile.txt
$0 prints the whole line; $1 prints only the first field.

Last edited by Asy; 01-16-2009 at 10:48 AM. Reason: change $1 to $0
 
Old 01-16-2009, 04:57 PM   #4
cgcamal
Member
 
Registered: Nov 2008
Location: Tegucigalpa
Posts: 78

Original Poster
Rep: Reputation: 16
Hi chakka.lokesh and Asy,

Thanks for your reply.

I've tested Asy's answer, because it is closer to my question, but
I get an error when I run it, and I'm not sure why.

Code:
awk: syntax error at source line 1
 context is
        /SESSIONRECORD/ { getline; print toupper($0)} >>>  else <<<
  { print ($0)}
awk: bailing out at source line 1
$ line; print toupper($1)} else { print ($0)}' input3.txt > outfile.txt     <
awk: syntax error at source line 1
 context is
        /SESSIONRECORD/ { print ($0); getline; print toupper($1)} >
>>  else <<<  { print ($0)}
awk: bailing out at source line 1
$

One more thing, a related question.

It concerns the same input file.

Sometimes I need to print the word that is 5, 7 or 10 lines below "SESSIONRECORD"
(this word is always the reference point for any analysis I want to do).

For that I use getline several times, depending on how many lines below the
wanted word is. My code so far is as follows:


Code:
awk '/SESSIONRECORD/ { getline; getline; getline; ... getline;   # one getline per line skipped:
                                                                  #   student
                                                                  #   Jhon Kenett
                                                                  #   .
                                                                  #   .
                                                                  #   .
            DurationSession = substr($0,20,2) ":" substr($0,23,2) ":" substr($0,26,2)

            printf("%s %s\n", "SESSIONRECORD", DurationSession) }' infile.txt
Does anybody know whether there is a better way to do that?

Something like stating as a number how many lines below SESSIONRECORD the word I need is.

Code:
awk '/SESSIONRECORD/ { getline 7; print $1}' # Wished-for syntax, not valid awk; the 7 is an example.
 
Old 01-16-2009, 05:17 PM   #5
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
The error occurs because awk does not allow an else statement outside an action block; else must follow an if inside the braces. To run an action on the lines that do not match a regular expression, negate the expression by putting an exclamation mark before it. Anyway, here is another solution that uses slightly different logic and does not need getline:
Code:
awk '{if ( prev == "SESSIONRECORD" ) print toupper($1); else print; prev = $1}' infile.txt > outfile.txt
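For completeness, the getline version from post #3 can also be repaired. This is only a minimal sketch, assuming the same input layout as in post #1; the two function names are hypothetical and exist only to keep the variants apart. The first moves the else inside the braces as part of an if statement. The second uses two rules with next rather than a negated pattern, because after getline replaces $0 a rule such as !/SESSIONRECORD/ would match the freshly read line and print it a second time.

```shell
# Variant 1: the else lives inside the action block, attached to an if.
upper_after_if() {
    awk '{ if ($0 ~ /SESSIONRECORD/) {
               print                                  # print the SESSIONRECORD line
               if ((getline) > 0) print toupper($0)   # upper-case the next line, if there is one
           } else {
               print                                  # print all the other lines
           } }' "$1"
}

# Variant 2: two rules; next stops the second rule from reprinting the line read by getline.
upper_after_next() {
    awk '/SESSIONRECORD/ { print; if ((getline) > 0) print toupper($0); next }
                         { print }' "$1"
}
```

Either function can be called as upper_after_if infile.txt > outfile.txt.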
 
Old 01-16-2009, 06:22 PM   #6
Telemachos
Member
 
Registered: May 2007
Distribution: Debian
Posts: 754

Rep: Reputation: 60
Here's a Perl version. (This will also edit the file in-place and save the old version as file.bak.)
Code:
perl -i.bak -ple '$_ = uc $_ if $previous eq "SESSIONRECORD"; $previous = $_'
Run it with the name(s) of the file or files you want to do this to.
 
Old 01-16-2009, 07:01 PM   #7
jailbait
LQ Guru
 
Registered: Feb 2003
Location: Virginia, USA
Distribution: Debian 12
Posts: 8,349

Rep: Reputation: 552
You did not say what programming language you are using. Here are some common programming practices to do what you want:

A. Subtract hex 20 from each lower case letter that you wish to make upper case.

B. Exclusively OR hex 20 against each lower case letter that you wish to make upper case.

C. Create a 256-byte translate table which maps hex 61 through 7A to hex 41 through 5A. Then issue a translate instruction against the table for each lower case letter. This assumes that your language supports a translate instruction.
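As a rough illustration of the three practices in shell, here is a sketch under stated assumptions: the input is a single ASCII lower-case letter for A and B, the helper names are made up for this example, and tr is used as a stand-in for a translate instruction.

```shell
# Character -> numeric code (POSIX printf: a leading quote in the argument
# yields the code of the character that follows it).
char_code() { printf '%d' "'$1"; }

# Numeric code -> character, via an octal escape in the printf format.
code_char() { printf "\\$(printf '%03o' "$1")"; }

# A. Subtract hex 20 from a lower-case letter.
upper_subtract() { code_char $(( $(char_code "$1") - 0x20 )); }

# B. Exclusive-OR hex 20 against a lower-case letter (flips the case bit).
upper_xor() { code_char $(( $(char_code "$1") ^ 0x20 )); }

# C. Table-driven translation of a whole string; tr plays the translate instruction.
upper_table() { printf '%s' "$1" | tr 'a-z' 'A-Z'; }
```

Note that A and B give wrong results for anything that is not an ASCII lower-case letter, so the caller has to check the range first; C leaves other characters unchanged.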

-------------------
Steve Stites
 
Old 01-17-2009, 12:39 AM   #8
cgcamal
Member
 
Registered: Nov 2008
Location: Tegucigalpa
Posts: 78

Original Poster
Rep: Reputation: 16
Hi all, colucix, Telemachos, jailbait,

I've tried colucix's solution and it works perfectly, just as I wanted.

I ran Telemachos's solution and it appears to execute normally; the backup file
is generated, but in the original file no word changes to capital letters.

jailbait's suggestion looks good, but it is really advanced for me; I'm very new to Unix and these
programming languages. Thanks anyway.

Now, the last question, as I said before:

How can I print a word that is in column 1, N lines below "SESSIONRECORD", without using "getline" N times?

If the wanted word is 7 lines below "SESSIONRECORD", I think there must be a better way than repeating getline 7 times as follows:

awk '/SESSIONRECORD/ { getline; getline; getline;
getline; getline; getline; getline; print $1}' infile.txt

Thanks to all again for your valuable help.

Best regards
 
Old 01-17-2009, 04:07 AM   #9
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
It is not clear whether you want this feature in the same awk program that changes lower-case words to upper case; in any case, it should not be difficult to integrate the two pieces of code into one program.

As far as I know there is no way to tell getline to read ahead N records. But you can always use it in a while loop like this:
Code:
/SESSIONRECORD/{ while ( i < 7 ) {
    getline
    i++
  }
  print
  i = 0
}
This method assumes you want to process the 7th line after SESSIONRECORD immediately: the 1st to 6th lines after SESSIONRECORD are skipped entirely, and if another SESSIONRECORD occurs before the 7th line, it is lost.

Instead, the basic idea can be: every time you encounter SESSIONRECORD, remember which line you want to process later. That is, every time SESSIONRECORD occurs you store the record number + 7 and let the program process the following lines freely. When you reach a record number previously stored, do some processing. Better to show than to tell:
Code:
/SESSIONRECORD/ { to_print[NR + 7] = 1 }

{ if ( NR in to_print ) {
     print
  }
}
The code above stores the value NR + 7 as an index in the array "to_print". Then, if the current record number is an index of the array, it prints the record. I hope that is clear.
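As a complete command, the array approach might look like the sketch below. The -v variable n, the print $1 (to get the column 1 word asked for), the delete, and the function name nth_below are additions for illustration, not part of the snippet above.

```shell
# Print column 1 of the line that is n lines below each SESSIONRECORD.
# Usage: nth_below N FILE   (nth_below is a hypothetical helper name)
nth_below() {
    awk -v n="$1" '
        /SESSIONRECORD/   { to_print[NR + n] = 1 }   # remember the target record number
        (NR in to_print)  { print $1                 # column 1 of the wanted line
                            delete to_print[NR] }    # this entry is no longer needed
    ' "$2"
}
```

For example, nth_below 7 infile.txt prints the first word of every line located 7 lines below a SESSIONRECORD line, and overlapping records are handled because each target is stored separately.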
 
Old 01-17-2009, 06:06 AM   #10
Telemachos
Member
 
Registered: May 2007
Distribution: Debian
Posts: 754

Rep: Reputation: 60
Quote:
Originally Posted by cgcamal View Post
I ran Telemachos's solution and it appears to execute normally; the backup file is generated, but in the original file no word changes to capital letters.
Huh, weird. When I do it over here on a file that looks (I thought) just like yours, it works fine. Original file:
Code:
telemachus ~ $ cat records
SESSIONRECORD
student
Jhon Kenett

SESSIONRECORD
lawyer
Billy Bob

SESSIONRECORD
astronaut
Julie Sims

SESSIONRECORD
medician
Michael Barns
Edit with Perl:
Code:
telemachus ~ $ perl -i.bak -ple '$_ = uc $_ if $previous eq "SESSIONRECORD"; $previous = $_' records
Output:
Code:
telemachus ~ $ cat records
SESSIONRECORD
STUDENT
Jhon Kenett

SESSIONRECORD
LAWYER
Billy Bob

SESSIONRECORD
ASTRONAUT
Julie Sims

SESSIONRECORD
MEDICIAN
Michael Barns
Only the "job" line goes to uppercase. It doesn't matter much, but it's surprising to me that the awk solution works when this doesn't since they use the same logic.
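One difference between the two commands might explain it, though this is only a guess because the real input file was never posted: the awk version compares the first field ($1), which ignores leading and trailing blanks, while the Perl version compares the whole line ($_). The sketch below reproduces both comparisons in awk, assuming the SESSIONRECORD line carries trailing spaces; the sample data and function names are hypothetical.

```shell
# Whole-line comparison, analogous to the Perl test on $previous.
match_whole_line() {
    awk '{ if (prev == "SESSIONRECORD") print toupper($0); else print; prev = $0 }' "$1"
}

# First-field comparison, as in the awk one-liner from post #5.
match_first_field() {
    awk '{ if (prev == "SESSIONRECORD") print toupper($0); else print; prev = $1 }' "$1"
}

# Hypothetical sample: two trailing spaces follow SESSIONRECORD.
sample=$(mktemp)
printf 'SESSIONRECORD  \nstudent\nJhon Kenett\n' > "$sample"

match_whole_line  "$sample"   # "student" stays lower case
match_first_field "$sample"   # "student" becomes "STUDENT"
rm -f "$sample"
```

If that is the cause, comparing against a trimmed copy of the line in the Perl version should make the two behave alike.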

As for your second problem, this line will print the seventh line below SESSIONRECORD, but it won't work on a file that looks like the first one you posted. Let me show a file where it will work and then explain:
Code:
telemachus ~ $ cat records
SESSIONRECORD
first line below
second line below
third line below
fourth line below
fifth line below
sixth line below
seventh line below

sessionrecord
MEDICIAN
Michael Barns

telemachus ~ $ perl -nle '$line = $. if m/SESSIONRECORD/; print if $. == $line + 7' records
seventh line below
This works here, but if the file looks like your original did, then the word SESSIONRECORD reappears too often. It reappears before we get 7 lines on in the file, so the number that the program is looking for gets reset. Worse still, it reappears so often that we never get an example of "a line 7 lines below the last instance of SESSIONRECORD". So before I go any further what does the file look like, and do you want to get only the 7th line after the very first SESSIONRECORD? Or are there multiple lines you want to get after multiple instances of SESSIONRECORD? I should add that the $. built-in variable in Perl records the current line number. (If you have multiple files you run this over, there is a trick to reset the line number to 1 at the start of each new file, but I'm not going to get into that until I see more about the problem.)
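The same bookkeeping can be sketched in awk, where FNR is a line number that restarts at 1 for every input file. This is an illustration under assumptions, not a statement of what the final solution should be: it assumes a line is wanted after every instance of SESSIONRECORD, and the function name and the variable n are made up for the example.

```shell
# Print the line n lines below each SESSIONRECORD, counting lines per file.
# Usage: nth_below_per_file N FILE...   (hypothetical helper name)
nth_below_per_file() {
    n=$1
    shift
    awk -v n="$n" '
        FNR == 1          { split("", wanted) }     # new file: forget pending targets
        /SESSIONRECORD/   { wanted[FNR + n] = 1 }   # FNR restarts at 1 for each file
        (FNR in wanted)   { print }
    ' "$@"
}
```

Because every target line number is stored separately, a SESSIONRECORD that reappears before the 7th line does not reset the earlier one, and clearing the array at FNR == 1 keeps a target from leaking into the next file.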

Last edited by Telemachos; 01-17-2009 at 06:24 AM.
 
  

