LinuxQuestions.org
Old 04-01-2007, 04:49 PM   #1
neville310
LQ Newbie
 
Registered: Oct 2005
Posts: 17

Rep: Reputation: 0
Isolate lines in a text file and perform replacements


I developed a shell script for renaming my mp3 files. The script uses Perl for full regular expression support (so I don’t have to escape the patterns as in sed).

The script uses Perl regular expressions to normalize the file names. Basically, it removes junk from the file name, handles spaces, and converts to title case. The patterns reside in a preset file, since several replacements are necessary; each preset file handles a slightly different renaming profile. My current development effort is applying these replacement patterns to the text inside playlist and xml files, so that the pointers in the playlist files stay aligned with the renamed files. Eventually, the script should help me rename my mp3s and update my iTunes database (without importing all my audio files again), though it has several uses beyond mp3 renaming.

The script performs a recursive search for files with the extensions m3u, sfv, and xml. It iterates (while loop) through an external file of regular expression patterns, combines the patterns into a variable, and passes that variable to Perl. The script below performs this task (but has not been thoroughly tested). Its major shortcoming is that the Perl line works on the entire file when it should only replace the lines with mp3 pointers. Here’s the call for assistance, since I am having coder's block. Maybe someone could help me with different logic or a traditional grep solution. I just need fresh ideas for my shell script.
Code:
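# Build the Perl substitution program once from the preset file.
# PRESET may contain over twenty-five regex patterns for a complex renaming task.
CODE=""
while read -r REGEX REPLACE line
do
	CODE="$CODE; s/$REGEX/$REPLACE/g"
done < "$PRESET"

# Apply it to every m3u, sfv and xml file found recursively.
find ./ -regex ".*\(m3u\|sfv\|xml\)$" -type f -print | while read -r FILE
do
	# Plain double quotes (not typographic ones) and a quoted file name.
	perl -pi.bak -e "$CODE" "$FILE"
done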
btw I am doing this script in Bash, because Perl is foreign territory.

Last edited by neville310; 04-01-2007 at 04:53 PM.
 
Old 04-02-2007, 04:09 PM   #2
indienick
Senior Member
 
Registered: Dec 2005
Location: London, ON, Canada
Distribution: Arch, Ubuntu, Slackware, OpenBSD, FreeBSD
Posts: 1,853

Rep: Reputation: 65
I've stopped using Perl for any kind of scripting, mainly because I've gotten into Common Lisp recently, and much prefer it for...well, anything.

I'm stuck on helping you with anything purely code-wise, but I could provide an algorithm for you, to help you achieve your result (I use it all the time):
1. Declare a buffer array/list.
2. Read in each line from the file, assigning each line to a new buffer index.
3. Scan through each array index until you find the expression you want to match.
3a. If a match is found, modify the line accordingly; skip to step 4.
3b. If a match is not found, move on to the next array index.
4. Close any input streams from the file, and open up an output stream to the file.
5. Using a loop, iterate through the array, and print each array value to its own line.
6. End Of Program.

Here's some pseudo-code:
Code:
' Declarations.
DECLARE buf AS ARRAY
DECLARE dat AS FILE
DECLARE exp as REGEX
DECLARE idx as INTEGER

' Initializations.
dat = "/path/to/file"
exp = "expression to match"
idx = 0

' Opening the file stream for input.
OPEN dat FOR INPUT AS #1

' Collect all the lines in the file, and store them in a buffer.
DO
  LINE INPUT #1, buf[idx]
  idx = idx + 1
LOOP WHILE NOT EOF(1)

' Close the input stream (it's not needed anymore).
CLOSE #1

' Parse through the buffer for the line to edit.
FOR i = 0 TO idx - 1
  IF buf[i] ~= exp THEN GOTO ModLine
  ELSE 
   CONTINUE
  END IF
NEXT i

' If no matches are found, program flows here.
PRINT "Error: No matches found."
END

' If a match is found, program flows here.
ModLine:
 ' Modify line
 ' as you need to here,
 ' then write it to file.
 OPEN dat FOR OVERWRITE AS #1
 
 FOR i = 0 to idx - 1
   PRINT #1, buf[i]
 NEXT i

 CLOSE #1
However, rather than working on a single file, you would want to make use of Perl's glob() function, which takes a shell-style wildcard pattern (not a regular expression), finds all matching files in the current directory, and returns them as a list you can store in an array:
Code:
@files = glob("*.{m3u,sfv,xml}");
You can then iterate through the array like so:
Code:
foreach $item (@files) {
  # Do
  # Whatever
  # Here
}
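
For readers who want something runnable, the read/match/modify/write-back steps above can be sketched in plain shell. This is only a sketch under assumptions: the function name edit_matching_lines is hypothetical, a temporary file stands in for the in-memory buffer, and sed does the per-line substitution. None of that comes from the posts in this thread.

```shell
#!/bin/sh
# Hypothetical helper: rewrite FILE in place, applying the sed command SUBST
# only to lines that match the extended regex MATCH. Other lines are copied as-is.
edit_matching_lines() {
    file=$1
    match=$2
    subst=$3
    tmp=$(mktemp) || return 1
    # Read one line at a time; the "|| [ -n ... ]" part keeps a final
    # line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if printf '%s\n' "$line" | grep -Eq -e "$match"; then
            printf '%s\n' "$line" | sed -E -e "$subst"
        else
            printf '%s\n' "$line"
        fi
    done < "$file" > "$tmp"
    # Copy back over the original (keeps its permissions), then clean up.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

A call such as edit_matching_lines playlist.m3u '\.mp3$' 's/_/ /g' would touch only the lines ending in .mp3. Starting grep and sed for every line is slow on large files, so treat this as an illustration of the logic rather than a production tool.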

Last edited by indienick; 04-02-2007 at 04:12 PM.
 
Old 04-02-2007, 05:15 PM   #3
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 234
The beauty of sed & awk is that they process a file 1 line at a time.

If the only reason you abandoned sed is escaping patterns, then perhaps you are not aware of 2 of its really wonderful features:
  • the -r option (extended regular expressions, so (, ), |, + & friends need no backslashes),
  • using ',' (or any character of your choice) in place of '/'.
  • (Bonus) ';' works to string multiple sed commands together w/o pipelining.
RTM sed. However, the part about using ',' is not in the man page I just read, it's in Info. If you have Konqueror & KDE, info:/sed/The "s" Command will get you there. Otherwise:
Quote:
The `/' characters may be uniformly
replaced by any other single character within any given `s' command.
The `/' character (or whatever other character is used in its stead)
can appear in the REGEXP or REPLACEMENT only if it is preceded by a `\'
character.
This will only work in Konqueror, no other browser.
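
As a rough illustration of the sed features listed above, combined with an address that limits the work to mp3 lines, here is a minimal sketch. The function name fix_mp3_lines, the /old/music/ and /new/music/ paths, and the substitutions themselves are made up for the example. -E is used as the more portable spelling of GNU sed's -r.

```shell
#!/bin/sh
# Hypothetical filter: clean up only the lines that point at an mp3 file.
#   /\.mp3$/ { ... }  address: the commands in braces run only on matching lines
#   s,...,...,        ',' as delimiter, so '/' in paths needs no escaping
#   ;                 chains two commands on one line of the sed script
fix_mp3_lines() {
    sed -E '
/\.mp3$/ {
    s,^/old/music/,/new/music/,
    s,%20, ,g; s,_+, ,g
}'
}
```

It reads from standard input, so something like fix_mp3_lines < playlist.m3u > playlist.new would leave comment lines and non-mp3 entries untouched.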

Perhaps a small sample of your file names, profiles, & regexen would be helpful.

BTW, I just discovered http://pastebin.ca -- a really cool place to post whole files of examples & samples.

1 last Q/comment, wouldn't:
Code:
for F in *.m3u *.sfv *.xml
work as well as your find ... command, at least when all the files are in 1 directory?
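
One difference to keep in mind: a for loop over shell globs only sees the current directory, while find also descends into subdirectories. A minimal sketch of the two approaches, with hypothetical function names list_here and list_recursive:

```shell
#!/bin/sh
# Hypothetical helpers contrasting the two ways of listing the target files.

# Shell globs: current directory only. The -e test skips a pattern that
# matched nothing (sh and bash leave such a pattern unexpanded by default).
list_here() {
    for f in *.m3u *.sfv *.xml; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# find: current directory and everything below it.
list_recursive() {
    find . -type f \( -name '*.m3u' -o -name '*.sfv' -o -name '*.xml' \) -print
}
```

If the playlists live in nested album folders, the find version (or a pipeline built on it, as in the first post) is the one that reaches them.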
 
Old 06-19-2007, 09:20 AM   #4
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 234
Quote:
Originally Posted by archtoad6
... If you have Konqueror & KDE, info:/sed/The "s" Command will get you there. Otherwise: This will only work in Konqueror, no other browser.
Clarification: That is, that URI will only work in Konqueror.

info:/ is a "pseudo protocol", a special feature of Konqueror; it is not really a protocol, but a "kioslave". See: http://en.wikipedia.org/wiki/Kioslave

For a list of Kioslaves, use the help: Kioslave -- "help:kioslave".
 
  


All times are GMT -5. The time now is 01:09 AM.
