LinuxQuestions.org


briana.paige 06-25-2009 10:59 AM

Perl: Match part of a line and replace with another line from the same file
 
Hi,

I need to match and manipulate particular lines of data files. You can see the shortened sample file below. The actual file contains data for every day of all 12 months.

The weather data that was not measured for whatever reason is input as 9999 or 999 so that the user knows that data is missing. Unfortunately, these data points cause the simulation software I'm using to crash.

What I would like to do is write a Perl script that opens any *.cw2.a file, reads it in, finds the lines that contain 9999, and replaces each one with the nearest line above it that does not contain 9999 and is not a day/month demarcation line, a blank line, etc.

I've done a little bit of matching and batch processing with Perl and awk, but I'm unsure how to proceed with the line-by-line I/O. Any help is greatly appreciated.

Code:

*CLIMATE
# ascii climate file for binary db _OttawaNRC.iwec ,
# defined in: /cygdrive/c/documents and settings/briana/desktop/ottawanrc_2001.cw2.a
# col 1: Diffuse solar on the horizontal (W/m**2)
# col 2: External dry bulb temperature  (Tenths DEG.C)
# col 3: Direct normal solar intensity  (W/m**2)
# col 4: Prevailing wind speed          (Tenths m/s)
# col 5: Wind direction    (clockwise deg from north)
# col 6: Relative humidity              (Percent)
OttawaNRC                  # site name
  2001  45.32  -0.67  0      # year, latitude, long diff, rad flag
    1  365                  # period (julian days)
      0      0      0      0      0    100
* day 18 month 01
      0  -171      0      0      0    79
      0  9999      0  9999    999    100
      0  -194      0    17    50    77
      0  -186      0    17    80    79
      0  -196      0    19    80    83
    0  9999      0  9999    999    78
      0  9999      0  9999    999    79

      0  -193      0    36    80    78
    39  -179    130    36    70    79
    68  -167    167    36    70    82


Tinkster 06-25-2009 04:07 PM

Code:

awk '{if( NF==6 && $0 !~ /9999/){hold=$0};if(NF==6 && $0 ~ /9999/){$0=hold};print}' climate

Ugly awk hack ... works on your sample data; will fail if the first line of
actual data has 9999 in it.
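
One way to cover that leading-9999 case is to buffer lines until the first good data line turns up and then back-fill from it. This is only a sketch under assumptions the thread does not confirm: that back-filling from the next good line is acceptable to the simulation software, that a data line is exactly six integer columns, and that any data line containing 9999 counts as missing. The names `fill_missing`, `is_data` and `emit_buffer` are made up for illustration.

```shell
# fill_missing FILE: hypothetical helper, not from the thread.
# Prints the repaired file to standard output; the input is left untouched.
fill_missing() {
    awk '
    # Assumption: a data line is exactly six integer columns.
    function is_data(   i) {
        if (NF != 6) return 0
        for (i = 1; i <= NF; i++)
            if ($i !~ /^-?[0-9]+$/) return 0
        return 1
    }
    # Print the buffered lines, substituting "fill" for buffered missing lines.
    function emit_buffer(fill,   i) {
        for (i = 1; i <= n; i++)
            print ((bad[i] && fill != "") ? fill : buf[i])
        n = 0
    }
    {
        if (!is_data()) {                 # header, day marker, blank line
            if (n > 0) { buf[++n] = $0; bad[n] = 0 } else print
        } else if ($0 !~ /9999/) {        # good data: remember it
            hold = $0
            emit_buffer(hold)
            print
        } else if (hold != "") {          # missing, good line seen earlier
            print hold
        } else {                          # missing, nothing to copy from yet
            buf[++n] = $0; bad[n] = 1
        }
    }
    END { emit_buffer("") }               # no good line ever appeared
    ' "$1"
}
```

Called as `fill_missing climate`, it behaves like the one-liner once a good line has been seen. The stricter six-integer test also keeps header lines that happen to have six fields, such as the "period (julian days)" line, from being used as replacement data.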



Cheers,
Tink

bigearsbilly 06-25-2009 04:11 PM

I don't quite get what the replacement is supposed to be. Can you illustrate using your example?

Oops: simultaneous post with Tinkster.

Kenhelm 06-25-2009 11:04 PM

Using GNU sed:
This fails if the first data line contains '9999'.
(It turns an initial run of '9999' lines into empty lines).
Code:

data='\([[:blank:]]\+[-0-9]\+\)\{6\}'
sed -i.bak ":a /$data/{/9999/g;h};n;ba" *.cw2.a

This first puts a default line into the sed hold space.
The default line is used instead of an empty line to replace any initial '9999' lines.
Code:

default='    39  -179    130    36    70    79'
data='\([[:blank:]]\+[-0-9]\+\)\{6\}'
sed -i.bak "x;s/.*/$default/;x; :a /$data/{/9999/g;h};n;ba" *.cw2.a

'sed -i.bak' edits the files in place and keeps a backup of each original file with a '.bak' extension.
Normally sed treats multiple input files as one continuous stream, but with '-i' each file is processed separately.
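
For readers without GNU sed, roughly the same per-file behaviour can be had from a plain shell loop. The sketch below reuses Tinkster's awk filter from earlier in the thread; the function name `fix_in_place` is hypothetical. Like that filter, it still turns a leading run of 9999 lines into empty lines.

```shell
# fix_in_place FILE...: hypothetical helper that mimics 'sed -i.bak' using awk.
fix_in_place() {
    for f in "$@"; do
        [ -f "$f" ] || continue              # skip a glob that matched nothing
        cp -p "$f" "$f.bak" || return 1      # keep the original as FILE.bak
        # One awk run per file, so 'hold' never carries over between files.
        awk 'NF==6 && $0 !~ /9999/ {hold=$0}
             NF==6 && $0 ~ /9999/  {$0=hold}
             {print}' "$f.bak" > "$f" || return 1
    done
}
# Usage: fix_in_place *.cw2.a
```

Reading from the backup and writing to the original name avoids truncating a file while awk is still reading it.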

ghostdog74 06-25-2009 11:30 PM

Code:

awk '$2!~/999/&&!/day|month/{h=$0}$2 ~/999/{ print h;next}1' file

Perl:
Code:

while (<>) {
    chomp;     
    @f = split(/\s+/, $_);
    if ($f[2] !~ /999/ && !/day|month/) {
        $h = $_;
    }
    if ($f[2] =~ /999/) {
        print $h."\n";
        next ;
    }
    print $_ ."\n";
}


briana.paige 06-26-2009 12:44 PM

Quote:

Originally Posted by ghostdog74 (Post 3586658)
Code:

awk '$2!~/999/&&!/day|month/{h=$0}$2 ~/999/{ print h;next}1' file
Perl:
Code:

while (<>) {
    chomp;     
    @f = split(/\s+/, $_);
    if ($f[2] !~ /999/ && !/day|month/) {
        $h = $_;
    }
    if ($f[2] =~ /999/) {
        print $h."\n";
        next ;
    }
    print $_ ."\n";
}


Thanks for your reply. I still can't seem to get this working quite right. I can get the awk and Perl code to run, and it prints the file to my command window, but it's not actually editing the file. Even after I've run the awk or Perl commands, grep still finds all the bad data lines I'm trying to replace with good data, as shown below.

Code:

$ grep -Hn "9999" ottawanrc_2001.cw2.a

ottawanrc_2001.cw2.a:164    0  9999    0  999  100
ottawanrc_2001.cw2.a:2572  168  9999  93  999  100
ottawanrc_2001.cw2.a:2923  123  9999  775  999  100

I used the following from my bash:
Code:

awk '$2!~/999/&&!/day|month/{h=$0}$2 ~/999/{ print h;next}1' ottawanrc_2001.cw2.a

and this Perl code in a script:

Code:

#!/usr/bin/perl
open CW2, ">>ottawanrc_2001.cw2.a" or die "cannot open file!";
while (<CW2>) {
    chomp;     
    @f = split(/\s+/, $_);
    if ($f[2] !~ /999/ && !/day|month/) {
        $h = $_;
    }
    if ($f[2] =~ /999/) {
        print $h."\n";
        next ;
    }
    print CW2 $_ ."\n";
    close CW2;
 }

I had the same problem with Tinkster's code so I don't know if I'm misunderstanding something?

Tinkster 06-27-2009 02:27 AM

For the awk parts just wrap it up like so.
Code:

awk '{if( NF==6 && $0 !~ /9999/){hold=$0};if(NF==6 && $0 ~ /9999/){$0=hold};print}' climate > tmp && mv tmp climate

ghostdog74 06-27-2009 02:37 AM

Quote:

Originally Posted by briana.paige (Post 3587331)
but its not actually editing the file

The code only prints to standard output, so of course it won't edit the file. You have to redirect the output to a new file and then rename it, as Tinkster did.

briana.paige 06-27-2009 06:35 AM

Okay, thanks again everyone for all your help. I thought that it would edit the file in place but now I understand.
Cheers!


All times are GMT -5.