LinuxQuestions.org


scilec 11-24-2004 01:18 PM

Need help reading text file in bash script
 
Hi,

I'm working on a bash script that collects some data from text files, manipulates it, and saves it to a new file. My problem is that my script is not behaving the way I would expect it to. Here's a sample from the data file:

TTITLE15=Major Sample - Dutty Behaviour
TTITLE16=Frassman - Bling Bling
TTITLE17=Alley Cat, Capleton, Hawkeye, Madd Anju, Mad Cobra, Degree, E
TTITLE17=lephant Man, Lexxus, Red Rat, Kiprich, Frassman, Delon, Mr. V
TTITLE17=egas
TTITLE18=Gangster Fun With The Steady Ernest Horns/Shook Me All Night
TTITLE18=Long [live]

Note that while title 15 and title 16 each occupy one line, title 17 is split into 3 lines and title 18 is split into two lines.
My goal is to print each title on a single line. So, the above file should be processed as the following:

Major Sample - Dutty Behaviour
Frassman - Bling Bling
Alley Cat, Capleton, Hawkeye, Madd Anju, Mad Cobra, Degree, Elephant Man, Lexxus, Red Rat, Kiprich, Frassman, Delon, Mr. Vegas
Gangster Fun With The Steady Ernest Horns/Shook Me All Night Long [live]

The chunk of bash script I'm using to do this looks like:

#!/bin/bash
#
# $1 is the name of the file being parsed.
titlenum=0

# parse the lines of file $1 containing track title information.

while [ -n "$(grep "TTITLE$titlenum=" $1)" ]
do

# tan holds every line that starts with TTITLE0=, TTITLE1=, TTITLE2=, etc.
tan=$(grep "TTITLE$titlenum=" $1)

# Get track name and concatenate if it spills onto more than one
# line in the source file.
#

tn1=$(echo $tan | awk -F"TTITLE$titlenum=" '{print $2$3$4}')

echo $tn1 >> trackfile.txt

let titlenum=titlenum+1
done
exit 0


My problem is that this script will output the following to trackfile.txt:

Major Sample - Dutty Behaviour
Frassman - Bling Bling
Alley Cat, Capleton, Hawkeye, Madd Anju, Mad Cobra, Degree, E lephant Man, Lexxus, Red Rat, Kiprich, Frassman, Delon, Mr. V egas
Gangster Fun With The Steady Ernest Horns/Shook Me All Night Long [live]

Note the extra space in "Elephant" and "Vegas". This is happening because these words are split between two different lines in the source file. However, if the last character of a line in the source file is a space, such as in "Night ", the lines are combined properly without adding an extra space.

QUESTION:
How can I parse a source file like this so that titles spanning multiple lines are properly combined regardless of whether the last character in each line is a space or a letter?

Thanks,
Steve

wapcaplet 11-24-2004 08:53 PM

I don't know if this will help removing the extra space, but 'sed' would probably be better for removing the TTITLEXX stuff. Try this instead of awk:

Code:

sed -e "s/TTITLE$titlenum=//g"

sed is useful for doing all sorts of regular expression search-and-replace. Also, using the '-n' option to 'echo' in that same line may help suppress the extra space (-n tells echo not to print a newline at the end).
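
For reference, here is a minimal sketch of how the sed suggestion might be combined with a newline filter. Note that 'echo -n' only drops the final newline, so it would not by itself remove the space between "E" and "lephant". The function name join_title and the temporary sample file are made up for illustration, and the trailing space after "Night" is assumed from the description in the question.

```shell
#!/bin/bash
# Hypothetical helper: print title number $2 from file $1 as one string.
join_title() {
    local file=$1 num=$2
    # -n plus the p flag prints only the lines where the key was stripped.
    # The ^ anchor keeps the pattern from matching in the middle of a title.
    # tr then deletes the newlines, so the pieces join with nothing added.
    sed -n "s/^TTITLE$num=//p" "$file" | tr -d '\n'
}

# Sample data taken from the question (title 18 is assumed to end in a space).
sample=$(mktemp)
printf '%s\n' \
    'TTITLE17=Alley Cat, Capleton, Hawkeye, Madd Anju, Mad Cobra, Degree, E' \
    'TTITLE17=lephant Man, Lexxus, Red Rat, Kiprich, Frassman, Delon, Mr. V' \
    'TTITLE17=egas' \
    'TTITLE18=Gangster Fun With The Steady Ernest Horns/Shook Me All Night ' \
    'TTITLE18=Long [live]' > "$sample"

title17=$(join_title "$sample" 17)
title18=$(join_title "$sample" 18)
printf '%s\n' "$title17" "$title18"
rm -f "$sample"
```

Because the newlines are gone before the text reaches a variable, it no longer matters whether the variable is later echoed with or without quotes.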

LasseW 11-25-2004 04:14 PM

The extra spaces come from expanding the variable without quotes: in "echo $tan", the shell splits the value at the linefeeds and echo joins the pieces with single spaces. So you need to remove the linefeeds before the string is expanded that way. Here's one solution:

tn1=$(grep "TTITLE$titlenum=" $1 | cut -d\= -f2 | tr -d "\n")
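
One caveat with the line above: 'cut -f2' keeps only the text between the first and second "=", so a title that itself contains "=" would be cut short ('-f2-' would keep the rest). As an alternative, here is a rough sketch that reads the file once and joins consecutive lines sharing the same key. The function name join_titles is made up, the sample data is abbreviated from the question, and the TTITLE19 line is invented purely to show that a later "=" survives.

```shell
#!/bin/bash
# Hypothetical helper: print each TTITLEn value from file $1 on one line,
# in the order the keys appear in the file.
join_titles() {
    local line key value prev="" out=""
    # IFS= and -r keep leading/trailing spaces and backslashes intact.
    while IFS= read -r line || [[ -n $line ]]; do
        [[ $line == TTITLE*=* ]] || continue   # skip unrelated lines
        key=${line%%=*}     # text before the first "="
        value=${line#*=}    # everything after the first "="
        # A new key means the previous title is complete, so print it.
        if [[ -n $prev && $key != "$prev" ]]; then
            printf '%s\n' "$out"
            out=""
        fi
        out+=$value         # append with no separator
        prev=$key
    done < "$1"
    # Print the last title, if any line matched at all.
    if [[ -n $prev ]]; then
        printf '%s\n' "$out"
    fi
}

sample=$(mktemp)
printf '%s\n' \
    'TTITLE15=Major Sample - Dutty Behaviour' \
    'TTITLE16=Frassman - Bling Bling' \
    'TTITLE17=Delon, Mr. V' \
    'TTITLE17=egas' \
    'TTITLE19=1+1=2' > "$sample"

result=$(join_titles "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

This avoids running grep once per title number, and it does not stop early if a title number happens to be missing from the file.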

krock923 11-25-2004 06:44 PM

Hello, that looks a lot like the result from a CDDB query. Are you making an encoding script? If so, I've been working on one for a few months and I'd be honored if you wanted to try it out.


All times are GMT -5.