LinuxQuestions.org
Old 07-22-2008, 05:12 PM   #1
DX398
LQ Newbie
 
Registered: Jul 2008
Posts: 7

Rep: Reputation: 0
Sed: how to delete all lines except those matching........


Hello Everyone,

I'm hoping I can get some assistance using sed to do the following:
Remove all lines that do NOT contain a period followed by 3 characters (numbers, letters or spaces). Basically, I'm trying to clean up a Windows directory listing so that only files ending with .*** are left.

I have been able to successfully delete all lines that do NOT contain
a period with the following syntax,

sed '/\./!d'

but I'm unable to insert the necessary wildcards to keep only lines that
contain a period immediately followed by three characters.


Thanks for any help you can provide,

Regards,

Last edited by DX398; 07-22-2008 at 05:38 PM.
 
Old 07-22-2008, 05:20 PM   #2
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 63
You can use the -n option to suppress printing unless explicitly done with the 'p' command, and then make a rule which matches the desired line and upon matching calls 'p'.
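A minimal sketch of that approach, using a made-up three-line input and a placeholder pattern (`keep`). Neither the input nor the pattern comes from this thread; they only illustrate how -n and 'p' work together.

```shell
# Hypothetical sample input; "keep" is only a placeholder pattern.
sample='keep this line
drop this line
keep this one too'

# -n turns off automatic printing; /keep/p prints only the lines matching "keep".
kept=$(printf '%s\n' "$sample" | sed -n '/keep/p')
printf '%s\n' "$kept"
```

For a single pattern this selects the same lines as deleting everything that does not match, i.e. sed '/keep/!d'.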
 
Old 07-22-2008, 05:36 PM   #3
DX398
LQ Newbie
 
Registered: Jul 2008
Posts: 7

Original Poster
Rep: Reputation: 0
Thanks for the quick response, Matthew. Yes, that is another option, but what I'm confused about is what the rule should look like. The only known character is the period. The 3 trailing characters can be anything.

For example:

\directory\subdir\file1.txt - Want to keep this line
\directory\subdir\file1.xls - Want to keep this line

\directory\subdir\subdir.1 - Want to delete this line.
\directory\subdir\subdir - Want to delete this line.

Last edited by DX398; 07-22-2008 at 05:44 PM.
 
Old 07-22-2008, 05:58 PM   #4
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
Following the suggestion by matthewg42, you are close to the solution. To build the regular expression you are looking for, keep in mind the special meaning of . (dot): it matches any single character. Three consecutive dots match any sequence of three characters.
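A small sketch of the difference between an unescaped and an escaped dot. The four test strings are invented for illustration and are not part of the original question.

```shell
# Hypothetical test strings, one per line.
words='abc
a-c
a.c
ac'

# Unescaped dot: any single character may sit between "a" and "c".
any_char=$(printf '%s\n' "$words" | sed -n '/a.c/p')

# Escaped dot: only a literal "." may sit between "a" and "c".
literal=$(printf '%s\n' "$words" | sed -n '/a\.c/p')

printf 'unescaped dot matches:\n%s\n' "$any_char"
printf 'escaped dot matches:\n%s\n' "$literal"
```

The line "ac" is never printed, because a dot always has to consume exactly one character.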
 
Old 07-22-2008, 06:09 PM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,488

Rep: Reputation: 1077
What about "\directory\subdir\file1.his.txtt" - gotta watch out for "corner" cases.
Make sure the pattern is only tested where it matters - say, only at the end of each line.
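To make the corner case concrete, here is a sketch that runs two patterns against that one path. The patterns are my own illustration of "a literal dot plus three characters", with and without an end-of-line anchor.

```shell
line='\directory\subdir\file1.his.txtt'

# Unanchored: ".his" in the middle of the line already satisfies
# "dot followed by 3 characters", so the line is printed.
unanchored=$(printf '%s\n' "$line" | sed -n '/\..../p')

# Anchored with $: the dot and its 3 characters must be the last 4
# characters of the line, so this prints nothing for ".txtt".
anchored=$(printf '%s\n' "$line" | sed -n '/\....$/p')

printf 'unanchored: [%s]\n' "$unanchored"
printf 'anchored:   [%s]\n' "$anchored"
```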
 
Old 07-22-2008, 11:27 PM   #6
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 729
sed -n '/\....$/p' filename > newfilename

literal "." + any 3 characters--all at the end of the line
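As a quick check, the same expression can be run against the four sample paths from post #3. The variable names below are only for illustration, and the input is fed through a pipe instead of a file.

```shell
# The four example lines from post #3 (without the "Want to ..." notes).
listing='\directory\subdir\file1.txt
\directory\subdir\file1.xls
\directory\subdir\subdir.1
\directory\subdir\subdir'

# Print only lines ending in a literal dot followed by exactly 3 characters.
result=$(printf '%s\n' "$listing" | sed -n '/\....$/p')
printf '%s\n' "$result"
```

Only the .txt and .xls lines are printed. One assumption worth checking on a real listing: if the file was saved on Windows, each line may end with a carriage return, which counts as a character in front of $.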
 
Old 07-22-2008, 11:31 PM   #7
DX398
LQ Newbie
 
Registered: Jul 2008
Posts: 7

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by colucix
Following the suggestion by matthewg42, you are close to the solution. To build the regular expression you are looking for, keep in mind the special meaning of . (dot): it matches any single character. Three consecutive dots match any sequence of three characters.
Hi Colucix. I didn't know that about the dot. Thanks, but in my case
I had to escape it with a backslash to get the literal dot. How do you "unescape" and add the remaining dots?


Regards,
 
Old 07-22-2008, 11:34 PM   #8
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,488

Rep: Reputation: 1077
Bad, bad pix- we were trying to get the OP to answer his own q.
As pix showed, the escape is for only the one character.
 
Old 07-22-2008, 11:44 PM   #9
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 729
Quote:
Originally Posted by syg00
Bad, bad pix- we were trying to get the OP to answer his own q.
As pix showed, the escape is for only the one character.
Well, I'm sorry. My homework detector was silent, and it seemed that OP was a real person--one who might go on to learn greater things.

I'll be better next time......unless I'm not..
 
Old 07-23-2008, 03:39 AM   #10
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
Quote:
Originally Posted by DX398
Hi Colucix. I didn't know that about the dot. Thanks, but in my case I had to escape it with a backslash to get the literal dot. How do you "unescape" and add the remaining dots?
Just leave them unescaped! You want to match a sequence of 4 characters: a literal dot followed by three other characters and by an end of line (as syg00 pointed out). You have to build a regular expression with 4 + 1 items: a literal (escaped) dot immediately followed by three (unescaped) dots plus the special character which means "end-of-line". Pixellany already posted the solution (just a final twist of the knife... ).

Also note that another method to match literal characters is to embed them in a "character list" using square brackets. So the regular expression could be written
Code:
/[.]...$/
A character list matches any one of the listed characters. For example, [agk] matches any single occurrence of a, g or k. You can also use ranges like [0-5] to match any single digit from 0 to 5, or even [a-zA-Z] to match one single alphabetic character, be it upper- or lower-case. Hope this helps.
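Character lists can also tighten the match. The sketch below is a variation, not the expression from this thread: it assumes the three trailing characters should be letters or digits only, and the file names are invented for the example. LC_ALL=C is set so that the ranges mean plain ASCII letters and digits.

```shell
# Hypothetical file names, one per line.
names='report.txt
archive.tar.gz
notes.1
data.x9z'

# [.] is a literal dot; each [a-zA-Z0-9] is exactly one letter or digit.
strict=$(printf '%s\n' "$names" |
    LC_ALL=C sed -n '/[.][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]$/p')
printf '%s\n' "$strict"
```

Here report.txt and data.x9z are kept, while archive.tar.gz and notes.1 are dropped because their last dot is followed by fewer than three characters.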
 
  

