LinuxQuestions.org


scilec 11-29-2004 04:25 PM

Need help parsing text file
 
Hi,

I'm writing a script that parses text files and, in some cases, combines several lines into one. For example, the lines:

TTITLE1=There has go
TTITLE1=t to be a better way.

would be output as "There has got to be a better way."

This is fine. The problem I have is if the last character is a space. For example, if I have the lines:

TTITLE2=You shook me all night
TTITLE2=long.

Even though there is a space after "night" in the file, my output comes out as:

You shook me all nightlong.

QUESTION:
How do I parse a text file such that spaces at the end of lines are recognized and retained???

Thanks in advance!

-Steve

acid_kewpie 11-29-2004 04:28 PM

probably just a case of wrapping your variable in quote marks, but as you've not even given us the code you're working on...

david_ross 11-29-2004 04:29 PM

Putting quotes around the text should work:
title="a string ending with a space "
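To illustrate the quoting point, here is a minimal sketch. The variable names and the sample string are made up for the demonstration, and tr is used only to make spaces visible as "*". It assumes bash with the default IFS:

```shell
#!/bin/bash
# Hypothetical value that ends with a space.
title="You shook me all night "

# Unquoted expansion: the shell splits the value on whitespace before
# echo sees it, so the trailing space disappears.
unquoted=$(echo $title | tr ' ' '*')

# Quoted expansion: the value reaches echo unchanged.
quoted=$(echo "$title" | tr ' ' '*')

echo "$unquoted"   # You*shook*me*all*night
echo "$quoted"     # You*shook*me*all*night*
```

The same applies to any later use of the variable: every unquoted $title is split again.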

scilec 12-01-2004 12:13 PM

My apologies for not being more clear in my initial question. In my first example, imagine a text file containing:

TTITLE1=There has go
TTITLE1=t to be a better way.

I want my script to parse this as "There has got to be a better way."

In my second example, imagine my text file contains:

TTITLE2=You shook me all night
TTITLE2=long.

Here, I'd want to parse it as "You shook me all night long."

The problem is that, depending on how I parse the file, I either end up with:

There has go t to be a better way. #unwanted space in "got"
You shook me all night long.

or

There has got to be a better way.
You shook me allnight long. # No space between "all" and "night"

Regardless of whether I use cut, sed, or awk, lines that end with a space aren't being handled correctly. Here's the script I'm using, where $1 is the file being read:


#!/bin/bash
#
titlenum=0

while [ -n "$(grep "TTITLE$titlenum=" $1)" ]
do
tan=$(grep "TTITLE$titlenum=" $1)
title=$(echo ${tan//"TTITLE$titlenum="/} | awk '{print $0}')
echo $title >> trackfile.txt
let titlenum=titlenum+1
done

In this case, I get the following:

There has got to be a better way.
You shook me allnight long. # No space between "all" and "night"

How do I modify my script so that it properly wraps text into a single line regardless of whether or not there's a space at the end of a line?
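One likely place where trailing spaces get lost is when a line is read into a shell variable. The following is only a sketch with a made-up sample line, not the script above; it assumes bash with the default IFS. A plain read trims leading and trailing whitespace, while IFS= read -r keeps the line as it is:

```shell
#!/bin/bash
# Hypothetical sample line with one trailing space, fed through a pipe.
# The brackets make the end of the captured value visible.

# Plain read: leading and trailing whitespace is stripped.
stripped=$(printf 'all night \n' | { read line; printf '[%s]' "$line"; })

# IFS= read -r: the line is kept exactly as it was read.
kept=$(printf 'all night \n' | { IFS= read -r line; printf '[%s]' "$line"; })

echo "$stripped"   # [all night]
echo "$kept"       # [all night ]
```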

dustu76 12-02-2004 03:08 AM

Code:


/ibrc/users/tba17/daily17/soumen/tmp>cat abc |tr " " "*"
TTITLE1=There*has*go
TTITLE1=t*to*be*a*better*way.

TTITLE2=You*shook*me*all*night*
TTITLE2=long.

TTITLE3=hello,
TTITLE3=*
TTITLE3=how*are*you?*

/ibrc/users/tba17/daily17/soumen/tmp>cat b
#!/usr/bin/bash

fname=$1
tstr=TTITLE
ofile=trackfile.txt

> $ofile
for srl in $(cut -d"=" -f1 $fname | grep -v "^$" | sed -e 's/'$tstr'//g' |sort -u -n) ; do
        curstr=${tstr}${srl}
        echo "Processing $curstr ..."

        grep "^${curstr}=" $fname | \
        awk -F"=" '{arr[$1]=arr[$1]""$2} END{ for (i in arr) {printf("%s\n",arr[i])}}' | \
        grep -v "^$" >> $ofile
done
/ibrc/users/tba17/daily17/soumen/tmp>b abc
Processing TTITLE1 ...
Processing TTITLE2 ...
Processing TTITLE3 ...
/ibrc/users/tba17/daily17/soumen/tmp>cat trackfile.txt
There has got to be a better way.
You shook me all night long.
hello, how are you?
/ibrc/users/tba17/daily17/soumen/tmp>

The spaces in the input file (abc here) are shown as "*". The script works even if a line:

1. Ends or starts with a space
2. Contains only space(s)
3. Has nothing after "=", e.g. TTITLE1=
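For comparison, the same joining can be sketched as a single awk pass. This is an alternative sketch, not the script above. join_titles is a hypothetical helper name and the sample input is made up. Unlike the script above, it splits on the first "=" only, so a title that itself contains "=" stays whole, and it does not filter out titles that end up empty:

```shell
#!/bin/bash
# join_titles (hypothetical name) reads TTITLEn=... lines on stdin and
# prints one joined line per key, in first-seen order.
join_titles() {
    awk '
        /^TTITLE[0-9]+=/ {
            eq  = index($0, "=")                 # position of the first "="
            key = substr($0, 1, eq - 1)          # e.g. TTITLE2
            if (!(key in text)) order[++n] = key # remember first-seen order
            # Append everything after the first "=", spaces included.
            text[key] = text[key] substr($0, eq + 1)
        }
        END {
            for (i = 1; i <= n; i++) print text[order[i]]
        }
    '
}

# Made-up sample input; note the trailing space after "night".
joined=$(printf '%s\n' \
    'TTITLE1=There has go' \
    'TTITLE1=t to be a better way.' \
    'TTITLE2=You shook me all night ' \
    'TTITLE2=long.' | join_titles)

printf '%s\n' "$joined"
```

With a real file, the call would be join_titles < "$1" > trackfile.txt, keeping "$1" quoted so file names with spaces survive.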

HTH.

scilec 12-02-2004 01:00 PM

Perfect
 
Just what I needed. Thank you so much!

-Steve


All times are GMT -5.