LinuxQuestions.org
11-26-2010, 12:31 PM   #1
zoeplankton
LQ Newbie
 
Registered: Nov 2010
Posts: 3

Rep: Reputation: 0
Problem using grep -f


Hi all,

I'm trying to manipulate a large text file full of records (metadata - one complete record per line). I need to delete every line on which certain words appear - there are five different words, all pretty simple all-caps strings with occasional whitespace (like ADVISORY and EDITOR'S NOTE).

I tried using grep -v, which worked a treat, but only string-by-string. Ideally I'd like to run this as grep -v -f, where the file targeted by the -f contains the strings I need to match in order to delete the lines they're in.

i.e. grep -v -f filecontainingSTRINGS.txt targetfile.txt > outputfile.txt

When I try this, however, I don't get any matches - or more specifically, no changes are made in the output file.

It works fine if there's only one string in filecontainingSTRINGS, but it doesn't work if there's more than one (I'm using newline as the delimiter).

(Also my machine doesn't recognise /usr/xpg4/bin/grep - no idea what that's all about!)

Any ideas? All help much appreciated!
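
For reference, one way to avoid running grep once per string is to pass several -e options in a single invocation. This is a minimal sketch, assuming a POSIX or GNU grep; the sample records are made up, and only the two strings named in the post are used because the full list of five is not given:

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample records, one per line.
printf '%s\n' \
  "rec1 ADVISORY do not publish" \
  "rec2 ordinary record" \
  "rec3 EDITOR'S NOTE see attachment" > targetfile.txt

# -F treats each string literally; each -e adds one more string to exclude.
grep -v -F -e "ADVISORY" -e "EDITOR'S NOTE" targetfile.txt > outputfile.txt

cat outputfile.txt
```

This gets unwieldy as the list grows, which is where a pattern file read with -f becomes the more convenient form.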
 
11-26-2010, 12:57 PM   #2
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,606

Rep: Reputation: 448
Hi,

Are you sure that every string is on its own line? Also, I see a Windows logo on the left side. Are you trying to edit Windows files in Linux? If so, try converting the files you are using to Unix format first:
dos2unix file.txt
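
The effect of DOS line endings on a pattern file can be reproduced in a small sketch. It assumes the pattern file was saved with CRLF endings while the records use plain LF; the file names follow the first post, the sample records are made up, and `tr -d '\r'` stands in for dos2unix in case that tool is not installed:

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Pattern file with DOS (CRLF) line endings, as a Windows editor might save it.
printf 'ADVISORY\r\nEDITOR'\''S NOTE\r\n' > filecontainingSTRINGS.txt

# Hypothetical target file with Unix (LF) line endings.
printf '%s\n' \
  "rec1 ADVISORY do not publish" \
  "rec2 ordinary record" \
  "rec3 EDITOR'S NOTE see attachment" > targetfile.txt

# Each pattern now ends in a carriage return, so nothing matches
# and all three records are kept.
grep -v -f filecontainingSTRINGS.txt targetfile.txt > before.txt

# Strip the carriage returns (roughly what dos2unix does) and retry.
tr -d '\r' < filecontainingSTRINGS.txt > strings_unix.txt
grep -v -f strings_unix.txt targetfile.txt > after.txt

wc -l < before.txt
cat after.txt
```

To check a file before converting it, `od -c filecontainingSTRINGS.txt` prints `\r \n` pairs wherever DOS line endings are present.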
 
1 members found this post helpful.
11-26-2010, 04:20 PM   #3
markush
Senior Member
 
Registered: Apr 2007
Location: Germany
Distribution: Slackware
Posts: 3,979

Rep: Reputation: 850
Hi zoeplankton and welcome to LQ,

You can search for lines containing uppercase letters:
Code:
grep -e '[A-Z]' file
In your case you want the lines that do not contain such characters:
Code:
grep -v -e '[A-Z]' file
and to create a new file without the matching lines:
Code:
grep -v -e '[A-Z]' file > newfile
If this doesn't meet your requirements, I'd recommend using sed, the stream editor; please look at the man page:
Code:
man sed
Markus

Last edited by markush; 11-26-2010 at 04:24 PM.
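
As a rough illustration of the sed route, each string can be given its own delete command. This is a sketch under assumptions: the two strings are the examples from the first post, and the sample records are invented:

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample records, one per line.
printf '%s\n' \
  "rec1 ADVISORY do not publish" \
  "rec2 ordinary record" \
  "rec3 EDITOR'S NOTE see attachment" > targetfile.txt

# One /pattern/d command per string; d deletes any line the pattern matches.
sed -e '/ADVISORY/d' -e "/EDITOR'S NOTE/d" targetfile.txt > outputfile.txt

cat outputfile.txt
```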
 
1 members found this post helpful.
11-26-2010, 04:42 PM   #4
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,066
Blog Entries: 11

Rep: Reputation: 910
The "problem" here is that you're saying you don't want to
see any lines that have ALL words from your file in one line.


If your Solaris has a decent egrep you could try
Code:
egrep -v "$(paste -s -d '|' filecontainingSTRINGS)" targetfile.txt > outputfile.txt
Untested.



Cheers,
Tink
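
Worth noting alongside this: POSIX grep treats each line of a -f file as a separate pattern and selects lines that match any one of them, so with a clean Unix-format pattern file the -f form and an explicit alternation should give the same result. A small sketch to compare the two, assuming made-up records and using grep -E, the current spelling of egrep:

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Pattern file with Unix line endings, one string per line.
printf '%s\n' "ADVISORY" "EDITOR'S NOTE" > filecontainingSTRINGS

# Hypothetical sample records, one per line.
printf '%s\n' \
  "rec1 ADVISORY do not publish" \
  "rec2 ordinary record" \
  "rec3 EDITOR'S NOTE see attachment" > targetfile.txt

# Pattern file: a line is dropped if it matches ANY pattern in the file.
grep -v -f filecontainingSTRINGS targetfile.txt > with_f.txt

# Same patterns joined with | into one extended regular expression.
pattern=$(paste -s -d '|' filecontainingSTRINGS)
grep -E -v "$pattern" targetfile.txt > with_alternation.txt

# cmp is silent and returns 0 when the two files are identical.
cmp with_f.txt with_alternation.txt && cat with_f.txt
```

If the two outputs differ on real data, hidden characters in the pattern file (such as carriage returns or a blank line) are a likely place to look.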
 
11-26-2010, 06:59 PM   #5
zoeplankton
LQ Newbie
 
Registered: Nov 2010
Posts: 3

Original Poster
Rep: Reputation: 0
Aha! Yes - my text editor is TextPad (very good regular expressions engine), which only runs on PC - and yes, I'm on a (work) PC. I have the option to save as Unix in TextPad; I'll try that Monday morning. Thanks!

**************************

Update - yup, that was the problem. Fixed now, thanks!

Last edited by zoeplankton; 11-30-2010 at 10:31 AM.
 
11-26-2010, 07:03 PM   #6
zoeplankton
LQ Newbie
 
Registered: Nov 2010
Posts: 3

Original Poster
Rep: Reputation: 0
Yup, I've had egrep recommended by a couple of other folks - am looking into it now & will try implementing on Monday when back at work. Thanks!
 
  

