LinuxQuestions.org
Old 01-07-2008, 01:55 PM   #1
kcorkran
LQ Newbie
 
Registered: Jan 2008
Location: Austin, TX
Distribution: Suse
Posts: 6

Rep: Reputation: 0
using sed to parse dir output


Hello Linux Professionals:

I am trying to parse the output of a Windows dir command so that it looks like the 'After' example below. I just want to remove the extra stuff, including the headers for the recursive directories.

Before:
Volume in drive \\Scandocs_vs\scandocs is SCANDOCS
Volume Serial Number is C0A8-579C

Directory of \\Scandocs_vs\scandocs\archives_webfiles\arcmaps\pdfs

03/04/2004 12:39p <DIR> .
03/04/2004 12:39p <DIR> ..
03/19/2004 01:15p 24,364,073 10315.pdf

After:
03/19/2004 01:15p 24,364,073 10315.pdf

Any help appreciated!
Keith
 
Old 01-07-2008, 02:09 PM   #2
cjcox
Member
 
Registered: Jun 2004
Posts: 306

Rep: Reputation: 42
Remove the first 7 lines (?).

dir | sed '1,7d'

Just a stab in the dark...
 
Old 01-07-2008, 03:42 PM   #3
kcorkran
LQ Newbie
 
Registered: Jan 2008
Location: Austin, TX
Distribution: Suse
Posts: 6

Original Poster
Rep: Reputation: 0
I think that would work if I did not have to dir recursively. [dir /s]
I was thinking that if I could remove all lines that did not match '.pdf' in the string it would work.
-Keith

********************
03/19/2004 01:15p 24,364,073 10315.pdf (keep)

Directory of \\Scandocs_vs\scandocs\archives_webfiles\arcmaps\pdfs (discard)
********************
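
A sed take on that idea, as a minimal sketch rather than a definitive answer: print only the lines that end in a literal ".pdf" and drop everything else. The sample listing and the helper name keep_pdf_lines are made up for illustration; the backslash before the dot makes sed treat it as a literal dot.

```shell
# Hypothetical sample of `dir /s` output, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
 Volume in drive \\Scandocs_vs\scandocs is SCANDOCS
 Directory of \\Scandocs_vs\scandocs\archives_webfiles\arcmaps\pdfs

03/04/2004 12:39p <DIR> .
03/04/2004 12:39p <DIR> ..
03/19/2004 01:15p 24,364,073 10315.pdf
EOF

# keep_pdf_lines FILE (illustrative helper, not from the thread):
# -n turns off sed's default printing; /.../p prints only lines ending in ".pdf".
keep_pdf_lines() {
    sed -n '/\.pdf$/p' "$1"
}

keep_pdf_lines "$sample"   # prints 03/19/2004 01:15p 24,364,073 10315.pdf
```

If the listing was saved on Windows, its lines may end in a carriage return, in which case the $ anchor will not match until the file is converted (for example with tr -d '\r').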
 
Old 01-07-2008, 03:44 PM   #4
Poetics
Senior Member
 
Registered: Jun 2003
Location: California
Distribution: Slackware
Posts: 1,181

Rep: Reputation: 49
Why don't you just use grep? You can search for ".pdf" and only include those lines that have a .pdf file on them (if so named). There are a variety of ways to do this, all equally valid, but I leave their discovery as an exercise for the reader.
 
Old 01-08-2008, 10:10 AM   #5
kcorkran
LQ Newbie
 
Registered: Jan 2008
Location: Austin, TX
Distribution: Suse
Posts: 6

Original Poster
Rep: Reputation: 0
grep did it.
I was using it like this: (cygwin by the way)
c:> grep -i '.pdf' dir_pdfs
But the result was not returning what I expected (directories were still listed) so I was luckily able to modify the statement to:
c:> grep -i '[0-9].pdf' dir_pdfs
and it returned the results I wanted. So problem solved!

As a matter of curiosity, do you know why it did not seem to apply the '.' in the string example?
Thanks,
Keith
 
Old 01-08-2008, 02:35 PM   #6
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
The dot '.' has a special meaning in a regular expression: it matches any single character, not just the dot itself. When you use a dot or any other special character in the pattern, grep interprets it as a regular expression and you can obtain an unexpected result.

On the other hand, to match a dot literally you can enclose it in square brackets, and quote the pattern so the shell leaves it alone, e.g.
Code:
grep '[.]pdf' dir_pdfs

Last edited by colucix; 01-08-2008 at 02:37 PM.
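
To see the difference on the kind of listing from the first post, here is a small sketch. The two sample lines are taken from the thread; the temporary file is just a stand-in for dir_pdfs. The unescaped dot also matches the backslash in "\pdfs", which is why the directory lines showed up.

```shell
# Two lines from the dir listing, saved to a temporary file.
listing=$(mktemp)
cat > "$listing" <<'EOF'
Directory of \\Scandocs_vs\scandocs\archives_webfiles\arcmaps\pdfs
03/19/2004 01:15p 24,364,073 10315.pdf
EOF

# Unescaped dot: "\pdf" in the directory path matches too, so both lines count.
grep -c '.pdf' "$listing"      # prints 2

# Literal dot, written two equivalent ways: only the file line counts.
grep -c '[.]pdf' "$listing"    # prints 1
grep -c '\.pdf' "$listing"     # prints 1
```

The '[0-9].pdf' pattern happened to work because the character before "pdf" in the directory path is a backslash preceded by "s", not a digit.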
 
Old 01-09-2008, 12:32 AM   #7
kcorkran
LQ Newbie
 
Registered: Jan 2008
Location: Austin, TX
Distribution: Suse
Posts: 6

Original Poster
Rep: Reputation: 0
That explains it. Thanks much!
 
Old 01-09-2008, 12:45 AM   #8
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 738
Quote:
Originally Posted by colucix View Post
The dot '.' has a special meaning in a regular expression: it matches any single character, not just the dot itself. When you use a dot or any other special character in the pattern, grep interprets it as a regular expression and you can obtain an unexpected result.

On the other hand, to match a dot literally you can enclose it in square brackets, e.g.
Code:
grep '[.]pdf' dir_pdfs
Every day I learn that you learn something new every day!!

I had learned that the "normal" way to change the meaning of certain characters was the "escape"---as in:
grep "\." filename
The square bracket I never saw before---does it also work in SED?

Yes...
 
Old 01-09-2008, 01:09 AM   #9
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by pixellany View Post
The square bracket I never saw before---does it also work in SED?

Yes...
It's used in regexps to specify a range or a set of single characters, e.g. [a-z], [abc].
From wiki
Quote:
[ ] A bracket expression. Matches a single character that is contained within the brackets. For example, [abc] matches "a", "b", or "c". [a-z] specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed: [abcx-z] matches "a", "b", "c", "x", "y", and "z", as does [a-cx-z].

The - character is treated as a literal character if it is the last or the first character within the brackets: [abc-] or [-abc]. In POSIX bracket expressions, as used by grep and sed, a backslash inside the brackets is itself literal, so [a\-bc] does not escape the hyphen there; some other regex flavors, such as Perl's, do accept it.
If the sed you are using supports this syntax, then yes, it can be used in sed.
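
A quick way to check this against your own sed, as a sketch (the function names are made up for illustration): both the bracket form and the backslash form should turn only the dot into an underscore, while the bare dot replaces every character.

```shell
# Illustrative helpers: each reads stdin and rewrites it with sed.
dot_bracket() { sed 's/[.]/_/g'; }   # literal dot via a bracket expression
dot_escaped() { sed 's/\./_/g'; }    # literal dot via a backslash escape
dot_bare()    { sed 's/./_/g'; }     # unescaped dot: matches any character

printf '%s\n' '10315.pdf' | dot_bracket   # prints 10315_pdf
printf '%s\n' '10315.pdf' | dot_escaped   # prints 10315_pdf
printf '%s\n' '10315.pdf' | dot_bare      # prints _________
```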
 
Old 01-09-2008, 07:58 AM   #10
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 738
Light goes on....
I knew bracket expressions, but had never considered that a "special" character would cease to be special inside one. The books typically don't talk about the use of brackets in lieu of escaping---but it obviously works.

So, is there a way to pass in as a variable the string to go inside [ ]?
 
Old 01-09-2008, 08:49 AM   #11
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
Quote:
Originally Posted by pixellany View Post
The books typically don't talk about the use of brackets in lieu of escaping---but it obviously works.
Yes... not really used as an escape, but as a way to match single characters, as ghostdog reported. Anyway, very useful for "escaping" in some cases!
Quote:
So, is there a way to pass in as a variable the string to go inside [ ]?
I think this can be done in the usual way, with ordinary shell variable expansion. For example, consider a text file with these two lines
Code:
$ cat testfile
line with a dot . inside
line with a dot at the end.
You can do
Code:
$ my_var='.$'
$ grep "[$my_var]" testfile
line with a dot . inside
line with a dot at the end.
whereas if you want to retain the special meaning of $ you have to add it outside the brackets.
Code:
$ grep "[$my_var]$" testfile
line with a dot at the end.
Cheers!
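
The same idea wrapped in two small functions, as a sketch only. The helper names lines_with_any and lines_ending_with_any are hypothetical, and the character list is assumed to be trusted input that contains no "]" and does not start with "^". The double quotes let the shell expand the variable while stopping it from treating the brackets as a filename glob.

```shell
# Sample file: the two lines from the post plus one line without a dot.
testfile=$(mktemp)
printf '%s\n' 'line with a dot . inside' \
              'line with a dot at the end.' \
              'no special characters here' > "$testfile"

# lines_with_any CHARS FILE: lines containing at least one character from CHARS.
lines_with_any() {
    grep "[$1]" "$2"
}

# lines_ending_with_any CHARS FILE: lines whose last character is in CHARS.
# The \$ reaches grep as a plain $, where it anchors the match to end of line.
lines_ending_with_any() {
    grep "[$1]\$" "$2"
}

lines_with_any '.$' "$testfile"          # prints the two lines that contain a dot
lines_ending_with_any '.$' "$testfile"   # prints line with a dot at the end.
```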
 
Old 01-10-2008, 08:37 AM   #12
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 738
OK---special meaning as you use it means "at the end of the line". But inside the [ ], the "$" clearly still has its shell meaning, i.e. "the value of", because the shell expands the variable before grep ever sees the pattern. So in an unquoted or double-quoted pattern you would have to use [\$] to look for a literal "$".

What other characters are special by default inside of [ ]? E.g. "r[^ab]" means "r followed by any single character other than a or b".
 
Old 01-10-2008, 02:17 PM   #13
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
Correct. The exception is when you put $ at the end of the character list: if it is not followed by a character that could start a variable name, the shell has nothing to expand and passes the $ through literally. How many nuances the shell has!!
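
On the open question of which characters are special inside [ ]: in POSIX bracket expressions it comes down to position, and the following sketch shows the three main cases plus the trailing-$ behaviour of the shell. The sample strings are made up for illustration.

```shell
# Inside a bracket expression only a few characters are special, by position:
#   ^  negates the set when it comes first
#   -  forms a range when it sits between two characters
#   ]  closes the set unless it comes first

# r followed by one character that is neither a nor b; a bare "r" does not match.
printf '%s\n' 'ra' 'rb' 'rc' 'r' | grep 'r[^ab]'             # prints rc

# ] first, ^ not first, - last: all three are literal members of the set.
printf '%s\n' 'a-b' 'a^b' 'a]b' 'axb' | grep -c 'a[]^-]b'    # prints 3

# Shell side: $my_var is expanded, but a $ followed by ] has no name to expand.
my_var='.'
echo "[$my_var$]"    # prints [.$]
echo "[$]"           # prints [$]
```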
 
  


All times are GMT -5. The time now is 08:41 AM.
