LinuxQuestions.org
Old 01-04-2009, 09:39 AM   #1
ifeatu
Member
 
Registered: Sep 2008
Distribution: Fedora 9
Posts: 68

Rep: Reputation: 15
Another Sed challenge


I'm trying to create a script that takes a file listing file names (MP3s, actually) in the following format:

Mon_Da_YEAR_Sub_ject_Sub_ject_Location_side_1_or_2.mp3

...and makes folders from those file names in the following syntax:


Location-YEAR-MN-DY- Subject -BLH

The "BLH" part needs to be indiscriminately attached to every folder name.

Basically it'll take some Regex to parse the data from the file name and reorganize it into the folder name. I think it will also need a loop, something like do-while. Can anyone help me?
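
For the loop part, here is a minimal sketch in POSIX shell. It assumes the list file already holds one finished folder name per line, without the "BLH" suffix. The function name `make_folders` and both of its arguments are made up for illustration.

```shell
# Hypothetical helper: create one folder per line of a list file.
#   $1 = file with one folder name per line (assumed layout)
#   $2 = directory in which to create the folders (assumed)
make_folders() {
  list=$1
  dest=$2
  # IFS= and -r keep each line intact (no trimming, no backslash handling).
  while IFS= read -r name; do
    # Skip empty lines instead of creating a folder called "-BLH".
    [ -n "$name" ] || continue
    # "BLH" is appended to every name; no regex is needed for that step.
    mkdir -p -- "$dest/$name-BLH"
  done < "$list"
}
```

Usage would be along the lines of `make_folders names.txt ./folders`. A `while read` loop processes the file a line at a time, so names are never split on whitespace.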
 
Old 01-04-2009, 04:01 PM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 27,118
Blog Entries: 54

Rep: Reputation: 2787
Quote:
Originally Posted by ifeatu
I'm trying to create a script
Unless reinventing the wheel is something you've just got to do, did you check LQ, Freshmeat or SourceForge for apps that can already do that? Besides, filenames like "Mon_Da_YEAR_Sub_ject_Sub_ject_Location_side_1_or_2.mp3" are neither reliable nor easy to work with (wrt IFS). If the MP3s are tagged "right", then getting the mp3info and working on that would be *way* "safer" IMHO. The "BLH" part isn't something you would even need sed for. Maybe it's best to post whatever (pseudo) script you've got right now?


Quote:
Originally Posted by ifeatu
Basically it'll take some Regex to parse the data from the file name and reorganize it into the folder name...
No challenge for you as it seems you recently bought ISBN 9780596528126 :-]
 
Old 01-05-2009, 09:00 PM   #3
ifeatu
Member
 
Registered: Sep 2008
Distribution: Fedora 9
Posts: 68

Original Poster
Rep: Reputation: 15
Okay, so I took your advice and took a stab at it. I ended up having to use a batch file in DOS for the folder creation part, then I installed GNU sed for DOS and took a stab at the regex. Here is where I got stuck:

First, here is some of my raw data:

April_14_1991_Bread_of_life_Vallejo_side_1.mp3
April_14_1991_Bread_of_life_Vallejo_side_2.mp3
April_21_1991_Ministry_zadok_priesthood_Vallejo_side_1.mp3
April_21_1991_Ministry_zadok_priesthood_Vallejo_side_2.mp3
Apr_05_1992_Matthew_13_Sower_Berkeley_side_1.mp3
Apr_05_1992_Matthew_13_Sower_Berkeley_side_2.mp3
Aug_04_1991B_Moving_out_alive_soul_Vallejo_side_1.mp3
Aug_04_1991B_Moving_out_alive_soul_Vallejo_side_2.mp3
Aug_04_1992_Daniel_7_Berkeley_side_1.mp3
Aug_04_1992_Daniel_7_Berkeley_side_2.mp3

So you have the month (.*), followed by the day ([0-9]{1,2}), the year ([0-9]{4}), a subject (.*), a location (Vallejo, vallejo, Berkeley, berkeley, UC, Union City), the word "side", and then a 1 or 2.

sed -r 's/(.*)([0-9]{2})\s*_\s*([0-9]{4})\s*_\s*(.*)\s*([vallejo][.*])\s*(side)\s*_\s*([1,2])(.*)/\3-\1-\2-\4/' test2.txt > test3.txt


I think it's pretty obvious where I'm running into my problem: I can't seem to get the syntax right for the location. Can anyone help?

The line above changes nothing; it returns the data exactly as it was read.
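
A note on why nothing changes: `[vallejo]` is a bracket expression that matches a single character out of v, a, l, e, j, o, and `[.*]` matches a literal dot or asterisk. Likewise `[1,2]` matches "1", "2" or a comma. No file name has a dot or asterisk in front of "side", so the pattern never matches and sed prints every line untouched. Below is a hedged sketch of one way to write it, using alternation for the location. It assumes "Union City" appears as `Union_City` in the file names, leaves the month as written, and does not handle names with a letter after the year (such as `1991B`), which pass through unchanged. The function name `to_folder_name` is made up.

```shell
# Hypothetical helper: read file names on stdin, one per line, and print
# Location-YEAR-Month-Day-Subject-BLH for every line that matches.
# -E selects extended regular expressions (same as -r in GNU sed).
to_folder_name() {
  sed -E 's/^([A-Za-z]+)_([0-9]{1,2})_([0-9]{4})_(.*)_([Vv]allejo|[Bb]erkeley|UC|[Uu]nion_[Cc]ity)_side_[12]\.mp3$/\5-\3-\1-\2-\4-BLH/'
}

# Groups: 1 = month, 2 = day, 3 = year, 4 = subject, 5 = location.
# Example: to_folder_name < test2.txt > test3.txt
```

Anchoring with `^` and `$` and spelling out each underscore keeps the greedy `(.*)` for the subject from swallowing the location.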
 
Old 01-06-2009, 03:17 AM   #4
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,261

Rep: Reputation: 2028
That's a bit advanced for me, but I notice a couple of things:
1. in your data you have Vallejo, your sed uses vallejo (case is different)
2. Aug_04_1991B_Moving_out_alive_soul_Vallejo_side_1.mp3 & Aug_04_1991B_Moving_out_alive_soul_Vallejo_side_2.mp3 do not have an underscore after the year component.
 
Old 01-06-2009, 05:00 AM   #5
ifeatu
Member
 
Registered: Sep 2008
Distribution: Fedora 9
Posts: 68

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by chrism01
That's a bit advanced for me, but I notice a couple of things:
1. in your data you have Vallejo, your sed uses vallejo (case is different)
2. Aug_04_1991B_Moving_out_alive_soul_Vallejo_side_1.mp3 & Aug_04_1991B_Moving_out_alive_soul_Vallejo_side_2.mp3 do not have an underscore after the year component.
Therein lies a portion of the challenge. The regex needs to make provisions for the B or A that sometimes follows the year. I understand that the data isn't at all pretty, but in this circumstance, reinventing the wheel is an absolute necessity.
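
One possible way to allow for that letter is an optional capture group right after the four year digits. The sketch below rewrites only the leading date of each name so the effect is easy to see. Whether the letter should be kept or dropped is not stated in the thread, so keeping it next to the year is an assumption, and `normalize_date` is a made-up name.

```shell
# Hypothetical helper: rewrite only the leading Month_Day_Year part.
# Group 4 holds the optional letter after the year and is empty when
# the year is followed directly by an underscore.
normalize_date() {
  sed -E 's/^([A-Za-z]+)_([0-9]{1,2})_([0-9]{4})([A-Za-z]?)_/\3\4-\1-\2_/'
}
```

To drop the letter instead, leave `\4` out of the replacement.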
 
Old 01-06-2009, 05:13 AM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,206

Rep: Reputation: 1015
sed is best (only ?) for well-formed data.
Use something like perl - that way you can use some real programming techniques.
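
Short of switching to Perl, plain shell can take over the steps that are awkward in a single sed substitution. As an illustration, here is a sketch that turns a month name into a number. It assumes the "MN" in the target folder format means a two-digit month number, which the thread never states, and that month names are capitalized as in the sample data. `month_number` is a made-up name.

```shell
# Hypothetical helper: map a month name or abbreviation to a two-digit number.
# Matching on the first three letters accepts both "Apr" and "April".
month_number() {
  case "$1" in
    Jan*) echo 01 ;;
    Feb*) echo 02 ;;
    Mar*) echo 03 ;;
    Apr*) echo 04 ;;
    May*) echo 05 ;;
    Jun*) echo 06 ;;
    Jul*) echo 07 ;;
    Aug*) echo 08 ;;
    Sep*) echo 09 ;;
    Oct*) echo 10 ;;
    Nov*) echo 11 ;;
    Dec*) echo 12 ;;
    # Anything else is reported on stderr and signalled with a non-zero status.
    *) echo "unknown month: $1" >&2; return 1 ;;
  esac
}
```

The non-zero return for unrecognized input lets a calling script skip or log malformed names instead of creating a wrongly named folder.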
 
  


Tags
perl, regex, scripting, sed


All times are GMT -5.
