LinuxQuestions.org
Old 09-30-2004, 12:07 PM   #1
tonyfreeman
Member
 
Registered: Sep 2003
Location: Fort worth, TX
Distribution: Debian testing 64bit at home, EL5 32/64bit at work.
Posts: 187

Rep: Reputation: 30
sed or awk help requested


I'm making a script that will transfer some files from a CD to the hard drive. The files on the CD are named in this format:
Code:
/mnt/cdrom/10Sep2004xxxx.tar.gz
/mnt/cdrom/10Sep2004xxxxxxxx.tar.gz
/mnt/cdrom/10Sep2004xx.tar.gz
/mnt/cdrom/16Sep2004xxxxxxxxxxx.tar.gz
/mnt/cdrom/16Sep2004xxxxxxx.tar.gz
/mnt/cdrom/16Sep2004xxx.tar.gz
I would like to end up with a file in the /tmp/ directory called _1.txt with these two entries:
Code:
10Sep2004
16Sep2004
This file will be used as a user-selectable wildcard list.

Here's what I have so far:
Code:
ls -1 /mnt/cdrom/[0-9][0-9]???200[0-9]* | tr -d "/mnt/cdrom/" | <unknown> | sort -u > /tmp/_1.txt
What would be the correct syntax for awk or sed to match this: [0-9][0-9]???200[0-9] and just pipe it to "sort -u" so that it can be written to the /tmp/_1.txt file?

--Tony
 
Old 09-30-2004, 12:21 PM   #2
christhom
Member
 
Registered: Sep 2004
Distribution: Debian sarge/sid
Posts: 41

Rep: Reputation: 15
if the file format is so regular, why not just use cut?

ls -1 /mnt/cdrom/[0-9][0-9]???200[0-9]* | cut -c 12-20 | sort | uniq > /tmp/_1.txt

otherwise...my sed is not so great, but something like this should strip the leading path

ls -1 /mnt/cdrom/[0-9][0-9]???200[0-9]* | sed 's/.*\///g'

but I'd still be inclined to use cut for the postfix

HTH
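
For the sed route asked about in the first post, a minimal sketch could look like the following. The function name `date_prefixes` is made up for illustration, and the pattern assumes the month is always three letters and the year is 200x, as in the sample listing. The first expression strips the directory part; the second keeps only the date prefix and prints the line only when it matches.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read paths on stdin and
# print the unique ddMonyyyy prefixes of their file names.
date_prefixes() {
    sed -n -e 's|.*/||' \
           -e 's|^\([0-9][0-9][A-Za-z]\{3\}200[0-9]\).*|\1|p' | sort -u
}

# Sample input modelled on the listing in the first post.
printf '%s\n' \
    /mnt/cdrom/10Sep2004xxxx.tar.gz \
    /mnt/cdrom/10Sep2004xx.tar.gz \
    /mnt/cdrom/16Sep2004xxx.tar.gz | date_prefixes
```

In the original pipeline this would take the place of the `tr`, `<unknown>` and `sort -u` stages, i.e. `ls ... | date_prefixes > /tmp/_1.txt`. Names that do not start with the date pattern produce no output, so no extra grep is needed.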
 
Old 09-30-2004, 12:43 PM   #3
tonyfreeman
Member
 
Registered: Sep 2003
Location: Fort worth, TX
Distribution: Debian testing 64bit at home, EL5 32/64bit at work.
Posts: 187

Original Poster
Rep: Reputation: 30
Thanks man!

Here's what I ended up doing:

Code:
ls -1 /mnt/cdrom/[0-9][0-9]200[0-9]* | cut -c  16-24 | uniq | grep -e "^[0-9]" > /tmp/_1.txt
-- Tony
 
Old 09-30-2004, 01:36 PM   #4
christhom
Member
 
Registered: Sep 2004
Distribution: Debian sarge/sid
Posts: 41

Rep: Reputation: 15
OK. Now that I think about it some more, I don't know why you need the -l flag to ls, nor do I get why you need the grep at the end.

Oh, btw - be careful of uniq on its own - it's always best to use sort | uniq (i.e. don't trust ls to sort properly for you)

cheers
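
The reason behind the sort | uniq advice is that uniq only collapses duplicate lines that are adjacent. A small sketch with made-up, deliberately unsorted input (the variable names are just for illustration):

```shell
#!/bin/sh
# Deliberately unsorted sample input: the two 16Sep2004 lines are not adjacent.
adjacent_only=$(printf '%s\n' 16Sep2004 10Sep2004 16Sep2004 | uniq)
deduped=$(printf '%s\n' 16Sep2004 10Sep2004 16Sep2004 | sort -u)

echo "uniq alone:"
echo "$adjacent_only"      # all three lines survive
echo "sort -u:"
echo "$deduped"            # duplicates removed after sorting
```

`sort -u` is equivalent to `sort | uniq` for this purpose and saves one process.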
 
Old 10-01-2004, 07:00 PM   #5
tonyfreeman
Member
 
Registered: Sep 2003
Location: Fort worth, TX
Distribution: Debian testing 64bit at home, EL5 32/64bit at work.
Posts: 187

Original Poster
Rep: Reputation: 30
I put the "1" flag (that's a number one ... not the letter L) just to make sure there is one entry per line. The grep is to make sure that what I get actually does start with a number ... during my testing I got one entry that was an empty space and another that was a strange character.

Thanks for your help

--Tony
 
Old 10-01-2004, 08:33 PM   #6
jlliagre
Moderator
 
Registered: Feb 2004
Location: Outside Paris
Distribution: Solaris10, Solaris 11, Mint, OL
Posts: 9,500

Rep: Reputation: 355
The "-1" (one file per line option) is unnecessary, as when ls output is not a terminal, but redirected to a file or piped to another command, this is the default behaviour.

Also, I don't understand why you use 16-24 rather than 12-20, which is the range that matches the path you gave.
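
To make the column arithmetic concrete, here is a small sketch using one of the sample paths from the first post. "/mnt/cdrom/" is 11 characters long, so the 9-character date sits in columns 12-20. Stripping the directory first, as the second variant does, is one way to stop the offset depending on the mount point; the variable names are only for illustration.

```shell
#!/bin/sh
path=/mnt/cdrom/10Sep2004xxxx.tar.gz   # sample path from the first post

# Fixed columns: valid only while the directory prefix is exactly 11 characters.
by_column=$(printf '%s\n' "$path" | cut -c 12-20)

# Strip everything up to the last slash, then take the first 9 characters.
name=${path##*/}
by_name=$(printf '%s\n' "$name" | cut -c 1-9)

echo "$by_column $by_name"
```

If the real mount point were four characters longer than /mnt/cdrom, the date would indeed start at column 16, which may be where 16-24 came from; that is a guess, not something the thread confirms.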

Finally, the 'grep -e "^[0-9]"' still seems redundant to me, as the file names already start with a number anyway. I would be curious to see how you got an empty space or "strange characters"; perhaps you have directories matching the ls pattern?
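
The directory hypothesis can be checked with a short sketch. It uses a scratch directory in place of /mnt/cdrom, and the directory name 16Sep2004extra is invented for the demo. When ls is handed a directory as an operand it lists the directory's contents, separated from the other entries by a blank line and a "name:" header; `ls -d` lists the directory itself instead.

```shell
#!/bin/sh
# Scratch directory standing in for /mnt/cdrom (assumption for the demo).
demo=$(mktemp -d)
touch "$demo/10Sep2004xxxx.tar.gz"
mkdir "$demo/16Sep2004extra"           # hypothetical directory matching the glob
touch "$demo/16Sep2004extra/notes"

# Plain ls descends into the matching directory: a blank line, a
# "directory:" header and the directory's contents join the output.
plain=$(ls "$demo"/[0-9][0-9]???200[0-9]*)

# ls -d lists the directory itself, one path per line.
flat=$(ls -d "$demo"/[0-9][0-9]???200[0-9]*)

printf '%s\n' "--- ls ---" "$plain" "--- ls -d ---" "$flat"
rm -rf "$demo"
```

Run through cut -c, the blank line and the short content lines of the plain listing would come out as empty or odd-looking entries, which would be consistent with what was reported.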
 
Old 10-03-2004, 12:20 AM   #7
the_tflk
Member
 
Registered: Jul 2003
Location: MA
Distribution: Ubuntu
Posts: 35

Rep: Reputation: 16
ummm code or whatever:

#!/bin/sh
# written in notepad with no way of testing this - user beware!

# ooooo... command line options!
# I suppose you could put the destination dir in as a command line option too...
sourceDir=$1

# list every .tar.gz file under the source dir
find "$sourceDir" -type f -iname '*.tar.gz' > /destination/dir/fileList.lst

# copy each file and record the first 9 characters of its name (the date)
while IFS= read -r each; do
    cp "$each" /destination/dir/
    basename "$each" | cut -c 1-9 >> /destination/dir/baseList.lst
done < /destination/dir/fileList.lst

# keep one line per date
sort -u /destination/dir/baseList.lst > /destination/dir/_1.txt

rm /destination/dir/fileList.lst /destination/dir/baseList.lst

printf 'done!\n'
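
A more compact variant of the same idea, as a sketch only: it loops over a shell glob instead of find, so it looks at the top level of the source directory only (the script above searches recursively), and it needs no intermediate list files. The function name `copy_and_index` and the scratch directories in the demo are invented for illustration.

```shell
#!/bin/sh
# Hypothetical function: copy the dated archives from $1 to $2 and write
# the unique 9-character date prefixes to $2/_1.txt.
copy_and_index() {
    src=$1 dest=$2
    for f in "$src"/[0-9][0-9]???200[0-9]*.tar.gz; do
        [ -f "$f" ] || continue        # skips directories and an unmatched glob
        cp "$f" "$dest/"
        name=${f##*/}                  # file name without the directory
        printf '%s\n' "$name" | cut -c 1-9
    done | sort -u > "$dest/_1.txt"
}

# Demo on scratch directories standing in for the CD and the destination.
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/10Sep2004xxxx.tar.gz" "$src/10Sep2004xx.tar.gz" "$src/16Sep2004xxx.tar.gz"
copy_and_index "$src" "$dest"
cat "$dest/_1.txt"
rm -rf "$src" "$dest"
```

For the real case the call would be something like `copy_and_index /mnt/cdrom /tmp`, assuming those are the intended source and destination.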
 
Old 10-03-2004, 12:23 AM   #8
the_tflk
Member
 
Registered: Jul 2003
Location: MA
Distribution: Ubuntu
Posts: 35

Rep: Reputation: 16
that might not be what you were looking for but there it is... as much for my own amusement as your use....
 
  


All times are GMT -5. The time now is 11:58 PM.
