LinuxQuestions.org
10-21-2012, 09:52 AM   #1
Weapon S
Member
 
Registered: May 2011
Location: Netherlands
Distribution: Debian, Archlinux
Posts: 250
Blog Entries: 2

Rep: Reputation: 48
How to use grep or sed to extract pattern?


I have a zip file whose top-level directory is reported as smaller than the original directory (but bigger than the archive itself, duh). So I wanted to see if zip skipped some files.
But I can't diff the output of unzip -l, because of the extra information:
Code:
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2010-01-06 13:26   src/
      466  2008-09-12 22:49   src/active_map.cpp
        0  2010-01-29 11:24   src/win/
     3573  2010-01-06 13:35   src/win/deps.mak
---------                     -------
   229760                     62 files
The above is not exactly the output (and not from the archive I wanted to compare); unzip -ql gives about the same output. But you get my point.
I thought this would give me only the filenames:
Code:
unzip -l My/Gam/ttt/src.zip | grep -oP "(?:[0-9]{2}:[0-9]{2}\s+).*"
Instead it returns something like:
Code:
13:26   src/
[etc...]
I could imagine this is a result of the "experimental" behaviour of Perl expressions, but options other than -P don't give me anything, and this comes very close to what I wanted.
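A note on why the time still shows up: a (?:...) group is non-capturing, but whatever it matches is still part of the overall match, and -o prints the whole match. A minimal sketch of one way around that, assuming a GNU grep built with PCRE support (needed for -P): the \K escape throws away everything matched so far. The `listing` variable below is only the sample from this post standing in for real unzip -l output.

```shell
# Sample listing copied from the post above (stands in for `unzip -l` output).
listing='  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2010-01-06 13:26   src/
      466  2008-09-12 22:49   src/active_map.cpp
        0  2010-01-29 11:24   src/win/
     3573  2010-01-06 13:35   src/win/deps.mak
---------                     -------
   229760                     62 files'

# \K resets the start of the reported match, so -o prints only what
# follows the HH:MM field and its trailing whitespace.
names=$(printf '%s\n' "$listing" | grep -oP '[0-9]{2}:[0-9]{2}\s+\K.*')
printf '%s\n' "$names"
```

Only lines containing an HH:MM field match at all, so the header, separator, and summary lines drop out without a second filter.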
I've experimented with sed too. This page explains how to remove parts, but it seems to work on whole lines.
I don't understand the basic syntax on the command line. Do I have to put the pattern in quotes, or do I need to escape every character that is special to the shell? And what exactly does a slash (or other character) at the beginning of the pattern mean? I'm sure I could figure out a regex if I knew those things.
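On the syntax questions, a rough sketch rather than a full reference: wrapping the sed script in single quotes stops the shell from interpreting anything inside it; in s/regex/replacement/ the character right after the s is just the delimiter and can be swapped for another one; and a /regex/ placed before a command is an address that restricts the command to matching lines. The sample line and the variable names are made up for illustration.

```shell
line='      466  2008-09-12 22:49   src/active_map.cpp'

# s/regex/replacement/ : the character right after "s" is the delimiter.
# Single quotes keep the shell from touching \, $, * and friends.
by_slash=$(printf '%s\n' "$line" | sed 's/^.*[0-9]:[0-9][0-9] *//')

# Any other delimiter works too; "|" avoids escaping slashes in paths.
by_pipe=$(printf '%s\n' "$line" | sed 's|^.*/||')

# A leading /regex/ is an address: the command runs only on matching lines.
# -n turns off automatic printing; the p flag prints where s succeeded.
only_cpp=$(printf '%s\n%s\n' "$line" 'Archive:  x.zip' |
  sed -n '/\.cpp$/ s/^.*[0-9]:[0-9][0-9] *//p')

printf '%s\n' "$by_slash" "$by_pipe" "$only_cpp"
```

So s works on part of a line; it only looks like whole-line editing when the regex happens to match the whole line.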
 
10-21-2012, 10:10 AM   #2
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,819
Blog Entries: 1

Rep: Reputation: 1209
It will be easier to help you if you provide the input text and the desired output. Are you trying to get the file names only?
 
10-21-2012, 11:02 AM   #3
Weapon S
Member
 
Registered: May 2011
Location: Netherlands
Distribution: Debian, Archlinux
Posts: 250
Blog Entries: 2

Original Poster
Rep: Reputation: 48
Yes, that's what I'm trying to do: to only get path + filename.
The first code-block has a sample of the input. The desired output would be:
Code:
src/
src/active_map.cpp
src/win/
src/win/deps.mak
Now I have this, which works:
Code:
unzip -l My/Gam/ttt/src.zip | grep "[0-9]:" | sed -e 's/^.\{30\}//'
IIUC sed should be able to do this by itself, but I can't make it work.
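For what it's worth, a sketch of how the grep step could be folded into sed: use the same pattern as an address, suppress default output with -n, and print only the lines where the substitution happened. This keeps the fixed 30-column assumption from the command above, and the `listing` text is the sample from the first post, not real output.

```shell
# Shortened sample from the first post (stands in for `unzip -l` output).
listing='  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2010-01-06 13:26   src/
      466  2008-09-12 22:49   src/active_map.cpp
---------                     -------
   229760                     62 files'

# /[0-9]:/ selects the entry lines (what grep did before);
# s/^.\{30\}// drops the first 30 characters; p prints the result.
names=$(printf '%s\n' "$listing" | sed -n '/[0-9]:/ s/^.\{30\}//p')
printf '%s\n' "$names"
```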
 
10-21-2012, 11:24 AM   #4
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,819
Blog Entries: 1

Rep: Reputation: 1209
Yes, it can be done with sed. Awk, however, would be easier here. See if that works for you:

Code:
unzip -l My/Gam/ttt/src.zip | awk '/\// { print $NF }'
 
10-22-2012, 10:18 AM   #5
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1959
This is probably easier:

Code:
zipinfo -1 archivename.zip
That's a one, not an ell.
 
10-26-2012, 01:26 PM   #6
Weapon S
Member
 
Registered: May 2011
Location: Netherlands
Distribution: Debian, Archlinux
Posts: 250
Blog Entries: 2

Original Poster
Rep: Reputation: 48
My method "worked". I had to turn that 30 into a 28 for some reason on my other system.
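A possible explanation for 30 versus 28, offered as a guess: some unzip versions print the date with a two-digit year, which makes the prefix two columns narrower. A sketch that avoids counting columns by matching the length, date, and time fields instead; the `old_style` line is an assumed example of the narrower layout, not output taken from this thread, and `strip_prefix` is a made-up name.

```shell
new_style='     5410  2009-04-21 12:04   02.bmp'
old_style='     5410  04-21-09 12:04   02.bmp'   # assumed older, narrower layout

strip_prefix() {
  # Match: optional spaces, length, date (digits and dashes), HH:MM,
  # then the three-space gap before the name. Print only on success.
  sed -n 's/^ *[0-9][0-9]*  *[0-9-][0-9-]*  *[0-9][0-9]:[0-9][0-9]   //p'
}

printf '%s\n' "$new_style" | strip_prefix
printf '%s\n' "$old_style" | strip_prefix
```

Both calls print the bare name regardless of how wide the date column is, and the summary line is skipped because it has no time field.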

Quote:
unzip -l My/Gam/ttt/src.zip | awk '/\// { print $NF }'
Close, but no cigar.
This output:
Code:
$ unzip -l /home/neo/progs/A5/examples/data/ex_physfs.zip
Archive:  /home/neo/progs/A5/examples/data/ex_physfs.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     5410  2009-04-21 12:04   02.bmp
---------                     -------
     5410                     1 file
Is filtered as:
Code:
/home/neo/progs/A5/examples/data/ex_physfs.zip
Not exactly the example I posted ;D So thanks for sharing.
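The slash test goes wrong here because the only line containing a slash is the Archive: line, while the entry itself has none. One way the awk approach could be adjusted, as a sketch only: select lines whose third field looks like HH:MM, then strip the first three fields so that names containing spaces survive. The `listing` variable holds the output shown above.

```shell
listing='Archive:  /home/neo/progs/A5/examples/data/ex_physfs.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     5410  2009-04-21 12:04   02.bmp
---------                     -------
     5410                     1 file'

names=$(printf '%s\n' "$listing" | awk '
  # Entry lines are the only ones whose third field is a time.
  $3 ~ /^[0-9][0-9]:[0-9][0-9]$/ {
    # Remove length, date, time and the spaces around them from $0.
    sub(/^ *[^ ]+ +[^ ]+ +[^ ]+ +/, "")
    print
  }')
printf '%s\n' "$names"
```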

Quote:
zipinfo -1 archivename.zip
Spot on. The package seems to be installed too. Thanks.
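For the original goal (checking whether zip skipped files), a rough sketch of how the zipinfo -1 list could be compared against the directory. The function name `missing_from_zip` is made up, and it assumes the archive was created from the directory's parent, so that entry names start with the directory's base name as in src/active_map.cpp.

```shell
# Hypothetical helper: print files under DIR that are absent from ZIP.
# Usage: missing_from_zip ZIP DIR
missing_from_zip() {
  zip_file=$1
  dir=$2
  in_zip=$(mktemp)
  on_disk=$(mktemp)

  # Archive side: one name per line, directory entries (trailing /) dropped.
  zipinfo -1 "$zip_file" | grep -v '/$' | sort > "$in_zip"

  # Disk side: regular files, named relative to the parent directory.
  (cd "$(dirname "$dir")" && find "$(basename "$dir")" -type f) | sort > "$on_disk"

  # comm -13 keeps lines that appear only in the second file (on disk).
  comm -13 "$in_zip" "$on_disk"
  rm -f "$in_zip" "$on_disk"
}
```

Something like `missing_from_zip My/Gam/ttt/src.zip My/Gam/ttt/src` would then list whatever is on disk but not in the archive; the directory path here is a guess based on the archive path used earlier in the thread.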
 
10-28-2012, 05:08 PM   #7
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1959
zipinfo is supplied along with unzip. You can, in fact, use unzip -Z instead as the command name.
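As a small illustration, assuming Info-ZIP's unzip, where -Z has to be the first option for the rest to be read as zipinfo options. The wrapper name `list_names` is made up.

```shell
# Hypothetical wrapper: prefer zipinfo, fall back to unzip's zipinfo mode.
list_names() {
  if command -v zipinfo >/dev/null 2>&1; then
    zipinfo -1 "$1"
  else
    unzip -Z -1 "$1"   # -Z must come first; -1 is then a zipinfo option
  fi
}
```

Calling `list_names archivename.zip` should print the same one-name-per-line listing either way.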
 
  


All times are GMT -5.
