LinuxQuestions.org
Old 07-06-2019, 12:51 AM   #1
blason
Member
 
Registered: Feb 2016
Posts: 122

Rep: Reputation: Disabled
How do I grep out the lines above a match until a certain pattern is found


I need to grep out the lines matching a certain pattern from a file, plus the lines above the match, up to the point where a "#" is found at the beginning of a line.

For example, here is the file; I need to filter out the "00:21:44" line and the lines above it until a # is found.

Any clue?

# CSV: Date,Score,URL,IP #
######################################################################################
"06-07-19 00:23:46","4.5","www.origina-l-diploms.com/wp-includes/fonts/mtbonline/update.htm","198.252.101.174"
"06-07-19 00:23:16","4.5","www.deehhayus.com/sqL/img/home/amazon","202.182.120.120"
"06-07-19 00:22:44","5.9","www.bricktechindia.in/fonts/www.bancoestado.cl/imagenes/comun2008/banca-en-linea-personas.html","43.255.154.40"
"06-07-19 00:22:14","3.7","www.bekenjekleurinstijl.nl/wp-admin/Alibaba/vqcr8bp0gud&lc=1033&id=64855&mkt=en-us&cbcxt=mai&snsc.php?email=nobody@mycraftmail.com","37.46.194.80"
"06-07-19 00:21:44","7.7","www.search-5.com/apx26e/verification/N76C72ED2B98CM9A99BC/qes.php","162.241.130.152"
 
Old 07-06-2019, 01:05 AM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,125

Rep: Reputation: 4120
You can't buffer backwards; you have to buffer the lines as you read them until they match your criteria. Grep won't do it; maybe awk, perl, ...
And what on earth does "filter out" mean? Include or exclude?
 
Old 07-06-2019, 01:08 AM   #3
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,307
Blog Entries: 3

Rep: Reputation: 3721
I'm not sure of the direction or what is to be removed, but either way it is likely that sed can do the job.

One guess:

Code:
tac file.csv \
| sed -nr -e '/00:21:44/,/^##/p' \
| tac
Can you clarify your requirements a little more?
 
Old 07-06-2019, 05:32 AM   #4
individual
Member
 
Registered: Jul 2018
Posts: 315
Blog Entries: 1

Rep: Reputation: 233
Just for fun, here's an AWK one-liner.
Code:
<file.csv awk '{ a[NR]=$0; /^##/ && m=NR+1; /00:21:44/ && n=NR } END { while (m <= n) print a[m++] }'
It prints all lines after ###[...] and stops printing after matching the line with your pattern.

EDIT: If you want to stop altogether after matching your pattern, this one-liner will do that.
Code:
<file.csv awk '{ a[NR]=$0; /^##/ && m=NR+1; if (/00:21:44/) { n=NR; exit } } END { while (m <= n) print a[m++] }'
EDIT2: This is as short as I can get it. An AWK guru could probably come up with a better solution, though.
Code:
<file.csv awk '{a[NR]=$0; /^##/ && m=NR+1; if (/00:21:44/) exit} END {while (m <= NR) print a[m++]}'
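For readability, the same idea as the last one-liner can be written out with comments. This is only a sketch: the helper name lines_up_to_match and the file sample.csv are made up for illustration, and the rows are shortened stand-ins for the data in post #1 (the example.com URLs and the last row are not from the original file).
Code:
```shell
#!/bin/sh
# Hypothetical sample data modelled on post #1 (URLs shortened, last row invented).
cat > sample.csv <<'EOF'
# CSV: Date,Score,URL,IP #
##########################
"06-07-19 00:22:44","5.9","www.example.com/a","43.255.154.40"
"06-07-19 00:22:14","3.7","www.example.com/b","37.46.194.80"
"06-07-19 00:21:44","7.7","www.example.com/c","162.241.130.152"
"06-07-19 00:21:14","2.1","www.example.com/d","10.0.0.1"
EOF

# lines_up_to_match PATTERN FILE   (hypothetical helper name)
# Prints the lines after the last "##" line seen, up to and including
# the first line that matches PATTERN.
lines_up_to_match() {
    awk -v pat="$1" '
        { a[NR] = $0 }          # remember every line by its line number
        /^##/ { m = NR + 1 }    # first line after the ## separator
        $0 ~ pat { exit }       # stop reading at the first match
        END { while (m <= NR) print a[m++] }   # exit still runs END
    ' "$2"
}

lines_up_to_match '00:21:44' sample.csv
```
With this sample it should print the three data rows from 00:22:44 down to 00:21:44 and skip the older row below the match. It assumes the file contains a ## line before the match.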

Last edited by individual; 07-06-2019 at 05:57 AM. Reason: Added two alternatives.
 
Old 07-06-2019, 04:48 PM   #5
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,263
Blog Entries: 24

Rep: Reputation: 4194
@blason: Please place your code snippets inside [CODE]...[/CODE] tags for better readability. You may type those yourself or click the "#" button in the edit controls.

As others have indicated, a little more clarity about what you are trying to do would be helpful.

In particular, does "filter out" mean delete or extract?

Do you want the # line to be deleted/extracted as well, and which one exactly?

Please review the Site FAQ for guidance in posting your questions and general forum usage. Especially, read the link in that page, How To Ask Questions The Smart Way. The more effort you put into understanding your problem and framing your questions, the better others can help!
 
Old 07-07-2019, 11:35 AM   #6
blason
Member
 
Registered: Feb 2016
Posts: 122

Original Poster
Rep: Reputation: Disabled
Hi Guys,

Thanks for the input; let me correct and rephrase. By "filter out" I mean that I need to include only those lines: start at the matching line and count backwards towards the top of the file, up to where the lines start with "#".

Ultimately I need to find the recent lines, since new entries get added at the top, above the older ones. I was not sure how to include only the recent changes, i.e. the lines that were added yesterday.
 
Old 07-15-2019, 05:03 AM   #7
blason
Member
 
Registered: Feb 2016
Posts: 122

Original Poster
Rep: Reputation: Disabled
Hey all,

The post by @individual has solved my problem; many thanks for it.
 
Old 07-15-2019, 06:08 PM   #8
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,263
Blog Entries: 24

Rep: Reputation: 4194
Good to hear!

If your problem is solved, please use the Thread Tools above the top post to mark the thread as solved.

Last edited by astrogeek; 07-15-2019 at 06:08 PM. Reason: tpoys
 
Old 07-16-2019, 03:29 PM   #9
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,790

Rep: Reputation: 1201
The a[NR]=$0 in the previous solution stores the whole input file in memory.
The following is an attempt at an improvement: it resets the buffer at every # line, so only the lines since the last # line are kept.
Code:
<file.csv awk '{ a[++n]=$0 } /^#/ { n=0; next } /00:21:44/ { for (m=1; m<=n; m++) print a[m] }'
In sed one can use the hold buffer:
Code:
<file.csv sed -n -e 'H; /^#/{s/.*//;x;d;}' -e '/00:21:44/{x;p;}'
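Spelled out with comments, the hold-buffer approach might look like the sketch below. The helper name lines_since_header and the file hold_demo.csv are made up for illustration, and the rows are shortened stand-ins for the real data. Two things differ from the one-liner above and are my own additions: the s/^\n// drops the newline that H puts in front of the first buffered line, and q stops reading after the first match.
Code:
```shell
#!/bin/sh
# Hypothetical sample data modelled on post #1 (URLs shortened, last row invented).
cat > hold_demo.csv <<'EOF'
# CSV: Date,Score,URL,IP #
##########################
"06-07-19 00:22:44","5.9","www.example.com/a","43.255.154.40"
"06-07-19 00:22:14","3.7","www.example.com/b","37.46.194.80"
"06-07-19 00:21:44","7.7","www.example.com/c","162.241.130.152"
"06-07-19 00:21:14","2.1","www.example.com/d","10.0.0.1"
EOF

# lines_since_header FILE   (hypothetical helper name)
lines_since_header() {
    sed -n '
        # Append every input line to the hold buffer.
        H
        # On a "#" line: swap an empty pattern space into the hold
        # buffer, which throws away everything collected so far.
        /^#/{
            s/.*//
            x
            d
        }
        # On the match: fetch the buffer, strip the leading newline
        # that H added, print it, and quit.
        /00:21:44/{
            x
            s/^\n//
            p
            q
        }
    ' "$1"
}

lines_since_header hold_demo.csv
```
Only the lines since the last # line are ever held, so memory use stays bounded by the size of one block rather than the whole file.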

Last edited by MadeInGermany; 07-17-2019 at 03:42 AM. Reason: fixes after testing
 
Old 07-16-2019, 08:35 PM   #10
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (15 current), Slack15, Ubuntu studio, MX Linux, FreeBSD 13.1, WIn10
Posts: 10,342

Rep: Reputation: 2242
Quote:
Originally Posted by blason
I need to grep out lines matching certain parameter from the file and then above that pattern til the time "#" is found at the beginning of the line.

For example here is the file and I need filter out "00:21:44" and lines above that till # is found.

Any clue?

# CSV: Date,Score,URL,IP #
######################################################################################
"06-07-19 00:23:46","4.5","www.origina-l-diploms.com/wp-includes/fonts/mtbonline/update.htm","198.252.101.174"
"06-07-19 00:23:16","4.5","www.deehhayus.com/sqL/img/home/amazon","202.182.120.120"
"06-07-19 00:22:44","5.9","www.bricktechindia.in/fonts/www.bancoestado.cl/imagenes/comun2008/banca-en-linea-personas.html","43.255.154.40"
"06-07-19 00:22:14","3.7","www.bekenjekleurinstijl.nl/wp-admin/Alibaba/vqcr8bp0gud&lc=1033&id=64855&mkt=en-us&cbcxt=mai&snsc.php?email=nobody@mycraftmail.com","37.46.194.80"
"06-07-19 00:21:44","7.7","www.search-5.com/apx26e/verification/N76C72ED2B98CM9A99BC/qes.php","162.241.130.152"
Why can't it be read this way: start at the # and include (or get) the lines until "00:21:44" is reached, by any method that works?
Code:
$ sed -n '/#/,/00:22:44/p' testfile
# CSV: Date,Score,URL,IP #
######################################################################################
"06-07-19 00:23:46","4.5","www.origina-l-diploms.com/wp-includes/fonts/mtbonline/update.htm","198.252.101.174"
"06-07-19 00:23:16","4.5","www.deehhayus.com/sqL/img/home/amazon","202.182.120.120"
"06-07-19 00:22:44","5.9","www.bricktechindia.in/fonts/www.bancoestado.cl/imagenes/comun2008/banca-en-linea-personas.html","43.255.154.40"

Last edited by BW-userx; 07-16-2019 at 08:39 PM.
 
Old 07-17-2019, 03:49 AM   #11
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,790

Rep: Reputation: 1201
After testing, I have corrected my post#9.
I think the O/P wants to print from the last # line before the matched line, up to the matched line.
In contrast, /^#/,/00:22:44/ is from the first # line until the matched line.
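To make the difference concrete, here is a small sketch. The file demo.csv and its shortened rows are stand-ins, not the O/P's real data, and tac is the GNU coreutils tool already used in post #3.
Code:
```shell
#!/bin/sh
# Hypothetical, shortened sample with the two header lines from post #1.
cat > demo.csv <<'EOF'
# CSV: Date,Score,URL,IP #
##########################
"06-07-19 00:22:44","5.9"
"06-07-19 00:21:44","7.7"
EOF

# Forward range: starts at the FIRST "#" line, so both header lines
# are printed ahead of the data rows.
from_first=$(sed -n '/^#/,/00:21:44/p' demo.csv)

# Reversed range: tac flips the file, the range runs from the match to
# the nearest "#" line above it, and the second tac restores the order.
from_last=$(tac demo.csv | sed -n '/00:21:44/,/^#/p' | tac)

printf 'from first # line:\n%s\n\n' "$from_first"
printf 'from last # line:\n%s\n' "$from_last"
```
The forward range yields four lines, starting with the "# CSV" line; the reversed one yields three, starting with the row of # characters.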
 
  


Tags
awk, bash, grep


