LinuxQuestions.org - sed: delete lines after last occurrence of a pattern in a file

Hello,

I'm having trouble finding how to delete all lines after the last occurrence of a pattern.

I know this deletes all lines after the first occurrence of PATTERN:
sed '/PATTERN/q' file.in > file.out

but if I have a file like:
qwe
PATTERN
rty
PATTERN
uiop

the result is:
qwe
PATTERN

when in fact I want:
qwe
PATTERN
rty
PATTERN

Can anyone tell me how to achieve this? I found lots of references on Google to replacing the last occurrence of one word with another within a line, but nothing that covers my case.

Thanks!

I'm not entirely sure that sed is the best tool for this job. The problem is that sed is so line-oriented, and you need to maintain state (at least at some level) to do this.

A quick stab at how I might do this (probably in Perl): read through the file line by line; every time I see PATTERN, note the line number (a single variable, $last_seen, is enough); then rewind to the top of the file and print lines only up to the line number I was left with.

The general problem is that when a tool sees PATTERN, it has no obvious way of knowing whether another occurrence follows later or whether this was the last one before EOF.

An example, not very fancy:

Code:

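#!/usr/bin/env perl
use strict;
use warnings;

open my $fh, '<', 'file.txt'
        or die "Can't open 'file.txt' for reading: $!";

# First pass: remember the number of the last line that matches.
my $last_seen;
while (<$fh>) {
        $last_seen = $. if /PATTERN/;
}

open my $out, '>', 'new_file.txt'
        or die "Can't open 'new_file.txt' for writing: $!";

# Rewind and reset the line counter for the second pass.
seek($fh, 0, 0);
$. = 0;

# Second pass: copy lines up to and including the last match.
# If PATTERN never matched, the whole file is copied.
while (<$fh>) {
        print $out $_;
        last if defined $last_seen && $. == $last_seen;
}

close $out or die "Can't close 'new_file.txt': $!";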
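
The same two-pass idea can also be sketched in plain shell, for anyone who would rather stay with awk and sed. This is only a minimal sketch: the function name keep_through_last is made up for illustration, it assumes a regular file that can be read twice, and copying the file unchanged when PATTERN never matches is an assumption, not something stated in this thread.

```shell
# Hypothetical helper: keep_through_last PATTERN FILE
# Prints FILE up to and including the last line matching PATTERN.
keep_through_last() {
    pattern=$1
    file=$2
    # Pass 1: remember the number of the last matching line (0 if none).
    last=$(awk -v pat="$pattern" '$0 ~ pat { n = NR } END { print n + 0 }' "$file")
    if [ "$last" -eq 0 ]; then
        # Assumption: with no match, pass the file through unchanged.
        cat "$file"
    else
        # Pass 2: print lines 1..last, then quit.
        sed "${last}q" "$file"
    fi
}
```

Usage would look like: keep_through_last PATTERN file.in > file.out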

Hi Telemachos,

I suppose the difficulty of doing this with sed is why I was unable to find an appropriate sed-based solution on Google.

Thank you for your kind help and for demonstrating a Perl-based solution. In the end, taking note of what you said about "rewind", I wrote a Fortran program to selectively read in the data, using the "backspace" command to go back through the file when needed.

Thank you again!

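@OP, you should learn how to use gawk instead. With RS set to PATTERN, each record is the text between two occurrences, and RT holds the separator that was matched. Note that this one-liner prints everything before the last PATTERN, not the last PATTERN itself.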

Code:

$ more file
qwe
PATTERN
rty
PATTERN
uiop
blah blah PATTERN
lksf
lasd
PATTERN
end

$ gawk -vRS="PATTERN" 'NR>1{print s RT} {s=$0}' ORS="" file
qwe
PATTERN
rty
PATTERN
uiop
blah blah PATTERN
lksf
lasd

$ more file
qwe
PATTERN
rty
PATTERN
uiop
blah blah PATTERN
lksf
lasd
end

$ gawk -vRS="PATTERN" 'NR>1{print s RT} {s=$0}' ORS="" file
qwe
PATTERN
rty
PATTERN
uiop
blah blah
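
A line-based variant is also possible in a single pass with any POSIX awk, by holding back lines until the next match proves they are safe to print. The following is a rough sketch under a few assumptions: keep_through_last_stream is a made-up name, PATTERN is treated as a regular expression matched against whole lines (so the matching line is kept intact), and input with no match at all produces no output.

```shell
# Hypothetical helper: reads stdin, prints everything up to and
# including the last line that matches the regex given as $1.
keep_through_last_stream() {
    awk -v pat="$1" '
        # Hold every line seen since the previous match.
        { buf = buf $0 "\n" }
        # A match means the held lines come before the last match so far,
        # so they can be printed and the buffer emptied.
        $0 ~ pat { printf "%s", buf; buf = "" }
        # Whatever is still held at EOF came after the last match and is dropped.
    '
}
```

Only the lines since the most recent match are kept in memory, so this should also work on a pipe, where rewinding is not an option.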

tac can simplify the problem for sed by temporarily reversing the order of the lines.
The '0' address is a GNU sed extension and is needed here in case PATTERN is on line 1 of the reversed file.

Code:

echo 'qwe
PATTERN
rty
PATTERN
uiop' | tac | sed '0,/PATTERN/{/PATTERN/!d}' | tac

qwe
PATTERN
rty
PATTERN
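
If the GNU-only '0' address is a concern, the reversed stream can instead be printed from the first match through to its end, which should need nothing beyond POSIX sed (tac itself still comes from GNU coreutils). This is a sketch, not a drop-in replacement: keep_through_last_tac is a hypothetical name, the pattern is pasted straight into the sed script so it must not contain '/', and input without any match yields no output.

```shell
# Hypothetical helper: reads stdin, prints everything up to and
# including the last line matching the basic regex given as $1.
keep_through_last_tac() {
    # In the reversed stream the last match comes first, so print from
    # the first matching line to the end ($), then restore the order.
    tac | sed -n "/$1/,\$p" | tac
}
```

A range that starts with /PATTERN/ can begin on line 1 of the reversed input, so the case where PATTERN is the very last line of the file is covered as well.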

This uses a loop ':a N;$!ba' to pull all the lines into the sed pattern space, then an 's' command to delete everything after the last PATTERN. The leading '.*' is greedy, so the PATTERN it stops at is the last one in the file.

Code:

echo 'qwe
PATTERN
rty
PATTERN
uiop' | sed ':a N;$!ba; s/\(.*PATTERN\).*/\1/'

qwe
PATTERN
rty
PATTERN
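
The slurp-and-substitute approach can be wrapped up with each sed command in its own -e, and with N guarded by '$!' so that one-line input still reaches the 's' command. This is a sketch only: slurp_trim is a made-up name, the pattern is inserted into the script verbatim (so no '/' in it), and it relies on '.' matching newlines inside the pattern space, which I have only seen confirmed for GNU sed.

```shell
# Hypothetical helper: reads stdin, removes everything after the last
# occurrence of the basic regex given as $1.
slurp_trim() {
    # :a ... ba loops, appending the next line (N) until the last line ($).
    # The greedy leading .* makes the captured match end at the last occurrence.
    sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' \
        -e "s/\\(.*$1\\).*/\\1/"
}
```

Two things to keep in mind: any text following PATTERN on its own line is cut off too, and input with no match passes through unchanged. The whole file is held in memory, so this suits small inputs best.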