LinuxQuestions.org


emmalg 05-28-2010 11:05 AM

Need to edit last word on each line in file
 
Hi All

I have a file which has a number at the end of each line. I need to change this number to be correct, i.e. each time the number is 9 it needs to be 1, each time it is 233 it needs to be 2, etc. There is no pattern to the numbers currently in the list, other than that the same number appears only in a single line/group of lines, not throughout the file. The replacements need to be sequential (but each can be repeated an arbitrary number of times).

For example:
... 9
... 233
... 233
... 233
... 234
... 234
... 7
... 7
... 7
... 7

Becomes:
... 1
... 2
... 2
... 2
... 3
... 3
... 4
... 4
... 4
... 4


What I propose doing in pseudocode is:

Code:

first_val = 9
new_val = 1

read file

foreach line do
  if last_variable = first_val
    last_variable = new_val
  else
    first_val = last_variable
    new_val = new_val + 1
    last_variable = new_val
  endif
endfor

I've looked at both sed and awk, but I'm coming really unstuck with the regular expressions. I've used both in the past for simple things, but in this case what I want to change is a variable, so I don't think I can use a regexp, and it is at the end of the line, which doesn't seem easy with sed.

If you can offer any advice I would be really grateful!

Cheers
Emma

rikijpn 05-28-2010 12:16 PM

sed
 
What's wrong with sed?
Just do
Code:

sed -e 's/string1/newstring1/g' -e 's/string2/newstring2/g'
and add "-e 's/string_to_replace/new_string/g'" for each string you want to replace. If you want to match only the number at the end of the line, do
Code:

sed 's/string1$/newstring1/g'
("$" means end of line).

http://www.grymoire.com/Unix/Sed.html
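
For reference, here is a minimal sketch of the end-of-line substitution with shell variables in it. The function name replace_last and its two arguments are made up for illustration, and it assumes the last word is preceded by a single space. The expression is in double quotes so the shell expands the variables, and the "$" anchor is written as \$ so the shell leaves it alone.

```shell
# Hypothetical helper: replace the trailing number $1 with $2 on each input line.
replace_last() {
  # The leading space stops 233 from matching the tail of e.g. 1233;
  # \$ reaches sed as a literal $ (end of line).
  sed "s/ $1\$/ $2/"
}

printf 'a b 233\nc 233 9\n' | replace_last 233 2
# prints:
# a b 2
# c 233 9
```

Note that this needs one substitution per distinct number, so it does not produce the sequential renumbering by itself.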

emmalg 06-01-2010 03:54 AM

Hi rikijpn

Sorry for the delay in replying, it was a bank holiday weekend here! I did not know if I could use a variable in the sed expression, as all the examples I have seen seemed to avoid the topic.

I'm going to have a good look through the grymoire so that I fully understand your suggestion.

I'll let you know how I get on.

Many thanks,
Emma

syg00 06-01-2010 05:41 AM

Dunno about sed - try something like this
Code:

awk '{ if( $NF != saveit) { saveit = $NF ; i += 1} ; $NF = i ; print}' infile
Normally I'd initialize the variables, but this should show the idea.

This works on gawk - if you're thinking of Solaris, you might have to massage this a bit.
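
As a worked example, here is the same idea with the variables initialized, run on data shaped like the sample in the first post. The wrapper function renumber and the first column of letters are made up for illustration. One thing to be aware of: assigning to $NF makes awk rebuild the line with single spaces between fields, so any original column alignment is lost.

```shell
# Hypothetical wrapper around the one-liner above, with explicit initial values.
renumber() {
  awk 'BEGIN { saveit = ""; i = 0 }
       {
         # Last field differs from the previous line: start a new group.
         if ($NF != saveit) { saveit = $NF; i += 1 }
         $NF = i
         print
       }' "$@"
}

printf 'a 9\nb 233\nb 233\nc 234\nd 7\nd 7\n' | renumber
# prints:
# a 1
# b 2
# b 2
# c 3
# d 4
# d 4
```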

emmalg 06-01-2010 06:04 AM

I really appreciate that - I'm working on the sed suggestion at the moment, and I've been trying a few examples which only change the last occurrence of a pattern in an echoed line, but have yet to try variables in it!

I am actually a little more familiar with awk myself (though rarely for more than printing columns of data) and understand your suggestion better.

I'll probably implement something like the awk suggestion, but will pursue the sed one as well to fill in some gaps in my knowledge. It's proving more versatile than I realised!

syg00 06-01-2010 06:31 AM

You can do all sorts of arcane things with sed - doesn't make it the best tool for the job. I skipped awk - went from sed to perl.
I'm trying to get up to speed on awk, so I can understand you wanting to investigate sed for your own edification.

MTK358 06-01-2010 07:57 AM

Code:

NR == 1 {
        prev = $NF
        count = 1
        $NF = count
        print
}

NR > 1 {
        if ($NF != prev) {
                count++
                prev = $NF
        }
        $NF = count
        print
}


emmalg 06-01-2010 08:30 AM

Hi All

syg00's solution seems to be the quickest to implement and works amazingly "out of the box". I've declared the two variables needed before the if, and all the tests I've been running so far have been great.

I'm going to incorporate some other bits as I need my output formatted specifically, but this solution was really useful.

Thanks everyone for your tips - you've helped me to understand my own ignorance a bit better ;-)

Thanks very much!

linuxgurusa 06-01-2010 08:44 AM

Quote:

Originally Posted by emmalg (Post 3984387)

Hi there, I might be a little bit confused by what you said, but what I normally do to change one character to another globally within a file is open the file in vi and type the following command:

:1,$s/9/1/g (and press Enter)

This command will change all 9's to 1's in the file ...

Useful?

emmalg 06-01-2010 10:20 AM

Unfortunately it isn't as simple as replacing all "9" with "1" because the entire file is 7,000,000 lines of 14 columns of numbers!

Anyway, this is rather inelegant, but it is only a short-term solution to re-ingest data which was lost from a couple of databases due to some buggy scripts I was given. This is what I ended up with: first I drop all the lines starting with the string "PulseId" (i.e. the headers from my original MySQL queries), then for every other line I alter the number and print the output in the required format!

Code:

awk '{ if ($1=="PulseId") {$0=""}
else {last_var=233; new_val=1;
  if ($NF != last_var) { last_var=$NF ; new_val+=1 };
    $NF = new_val;
    print "insert into pulse (PulseId,RowNum,MaxP1,MaxP2,MaxP3,AveP1,AveP2,AveP3,AveP1a,PhaseP1,PhaseP1a,PhaseP2,PhaseP3,CalId) values ('\'' '\'','\''"$2"'\'','\''"$3"'\'','\''"$4"'\'','\''"$5"'\'','\''"$6"'\'','\''"$7"'\'','\''"$8"'\'','\''"$9"'\'','\''"$10"'\'','\''"$11"'\'','\''"$12"'\'','\''"$13"'\'','\''"$14"'\'');"}
}' testfile.txt > testoutfile.txt


emmalg 06-01-2010 10:36 AM

Ah - before you all point out the silly bug, I've seen it! I didn't test it on enough lines!

[Edit]
Problem solved - setting up the initial values of the variables inside the block was a bad idea, as the block runs on every line in the file, so they were reset each time! To solve the problem I've simply removed last_var=233 and new_val=1. I've tested this several times and it seems to work.
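
For anyone following along, a sketch of what the command might look like with the per-line initial values removed. This is a reconstruction, not the exact command that was run: the function name renumber_to_sql, the q variable holding a single quote, and the loop that builds the value list are all changes made here for readability. The extra new_val == 0 test is also an addition, to cover a first data line whose last field is 0.

```shell
# Hypothetical reconstruction of the corrected command.
renumber_to_sql() {
  awk -v q="'" '
    $1 == "PulseId" { next }              # skip the header lines
    new_val == 0 || $NF != last_var {     # first data line, or last field changed
      last_var = $NF
      new_val++
    }
    {
      $NF = new_val
      vals = q " " q                      # PulseId placeholder, as in the command above
      for (f = 2; f <= NF; f++)           # quote the remaining fields
        vals = vals "," q $f q
      print "insert into pulse (PulseId,RowNum,MaxP1,MaxP2,MaxP3,AveP1,AveP2,AveP3,AveP1a,PhaseP1,PhaseP1a,PhaseP2,PhaseP3,CalId) values (" vals ");"
    }
  ' "$@"
}

renumber_to_sql testfile.txt > testoutfile.txt
```

The variables are never set inside the per-line block, so they keep their values from one line to the next.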

Thanks all!


All times are GMT -5.