Data Processing

joelhop · 07-28-2005, 01:21 PM

Hey Gang,

I have a pretty quick question. I have a file where blank lines are being used as place markers, so I cannot remove them:

ex:

train

school" "business
truck

plant

wagon

I would like to put a switch in front of only the non-blank lines, while leaving the blank lines intact:

ex:

-a train

-a school" "business
-a truck

-a plant

-a wagon

I'm looking for the simplest solution; I don't want to complicate the script too much, as efficiency is a prerequisite.

Thanks, Karl

joelhop · 07-28-2005, 02:03 PM

Additionally, all non-blank lines contain at least one period.

demian · 07-28-2005, 02:28 PM

awk '{if(NF!=0)print "-a",$0; else print $0}' infile
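
For anyone new to awk, here is a small sketch of what that one-liner does. NF is awk's count of whitespace-separated fields on the current line, so it is 0 for an empty (or whitespace-only) line, and only lines with at least one field get the "-a" prefix. The function name prefix_nonblank and the sample input are made up for this illustration; they are not from the original post.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk one-liner above.
# Reads the named files, or standard input when no file is given.
prefix_nonblank() {
    # NF != 0 means the line has at least one field, so it is not blank.
    awk '{ if (NF != 0) print "-a", $0; else print $0 }' "$@"
}

# Made-up sample input with blank placeholder lines.
printf 'train.a\n\nschool.b\ntruck.c\n\nplant.d\n' | prefix_nonblank
```

Blank lines pass through unchanged, so their positions in the file are preserved.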

sundialsvcs · 07-28-2005, 05:02 PM

For almost any sort of task involving line-oriented data, awk is your best friend. Definitely an important tool to know.

joelhop · 08-01-2005, 01:45 PM

Hey Gang,

This is the solution I ended up going with:
#########################################

file=data.txt #sets variable file to data.txt

numlines=`wc -l $file | awk '{ print $1 }'` #sets $numlines to the number of lines in $file

r=0 #sets $r to 0

#########################################

while [ $r -le $numlines ]; do # loops while $r is at most $numlines

line=`cat $file | head -$r | tail -1` # $line holds the text of line number $r

if [[ $line = *.* ]] # true if $line contains a period anywhere
then
echo "-a" $line >> goodfile.txt # writes -a followed by the text of that line
else
echo >> goodfile.txt # no period, so write a blank line to goodfile.txt
fi
fi

r=`expr $r + 1` #adds 1 to $r
done

###########################################

If you have any questions, feel free to mail me.

Thanks for all the help,

-Karl

sundialsvcs · 08-01-2005, 01:52 PM

It will work, of course... but you could have done it all with awk. I would encourage you to read up on it and see how the tool could have been applied to the problem.

demian · 08-02-2005, 10:17 AM

*hmmms*

Code:

luna:~/tmp demian$ wc -l data.txt
    3200 data.txt
luna:~/tmp demian$ time ./yourscript.sh
head: illegal line count -- 0

real    3m50.525s
user    0m22.550s
sys     1m30.890s

luna:~/tmp demian$ time awk '{if(NF!=0)print "-a",$0; else print $0}' data.txt > goodfile2.txt

real    0m0.098s
user    0m0.014s
sys     0m0.012s

luna:~/tmp demian$ diff goodfile.txt goodfile2.txt
luna:~/tmp demian$
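
The "illegal line count -- 0" message comes from the counter starting at 0, so the first pass runs head -0. The long run time is most likely because every pass starts cat, head, tail and expr, and head re-reads the file from the top each time. If staying in the shell is preferred over awk, a rough sketch that reads the input only once might look like the following. The function name prefix_lines is hypothetical, and the test here is "line is not empty" rather than "line contains a period", which is an assumption about what is wanted.

```shell
#!/bin/sh
# Sketch: prefix every non-empty line with "-a " in a single pass.
# prefix_lines is a hypothetical name; it reads stdin and writes stdout.
prefix_lines() {
    # IFS= and -r keep leading whitespace and backslashes intact;
    # the extra -n test picks up a last line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if [ -n "$line" ]; then
            printf '%s\n' "-a $line"
        else
            printf '\n'
        fi
    done
}

# Made-up sample input with one blank placeholder line.
printf 'train.a\n\nschool.b\n' | prefix_lines
```

For the file in the thread this would be run as prefix_lines < data.txt > goodfile.txt. Unlike the awk version, a line holding only spaces counts as non-empty here.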

joelhop · 01-01-2006, 07:19 PM

I did use awk where needed: numlines=`wc -l $file | awk '{ print $1 }'`
That was all it was needed for; the rest was processed through a while loop. ;p

btmiller · 01-01-2006, 08:08 PM

Another possibility would have been to use sed -- something like:

Code:

sed 's/^\(.*\..*\)/-a \1/g'

ought to work if each line you want prefixed has a period in it.
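
To see how that behaves, and what happens if a line ever lacks a period, here is a small sketch comparing it with a variant that keys on "the line has at least one character". The function names and the sample input are made up for the comparison, and the second form is only one possible alternative.

```shell
#!/bin/sh
# The substitution from the post above: prefix lines that contain a period.
prefix_if_period() {
    sed 's/^\(.*\..*\)/-a \1/'
}

# Hypothetical alternative: the address /./ selects lines with at least
# one character, and s/^/-a / inserts the switch at the start of the line.
prefix_if_nonempty() {
    sed '/./s/^/-a /'
}

# Made-up input: one line with a period, one blank, one without a period.
printf 'train.a\n\nno period here\n' | prefix_if_period
printf 'train.a\n\nno period here\n' | prefix_if_nonempty
```

The first form leaves "no period here" untouched, while the second prefixes it; both leave the blank line alone.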