LinuxQuestions.org

scott_audio 08-21-2009 03:01 PM

Well done is better than well sed
 
First, what I have works, so I'm not asking anyone to do a bunch of work 'for me'. I'm just looking to improve what I have and learn something from seeing it done right.

I volunteered to help a group find a solution. The goal was to extract all Bible references, which are consistently enclosed in parentheses, from a given text document, fetch each referenced verse or range of verses from a database, and generate a list that contains the actual verse text and not just the reference.

To grab the scripture, I used Chip Chapin's bible-kjv, which does not like Roman numerals and prefers queries without any spaces (2Corinthians2:1 instead of II Corinthians 2:1).

Code:

bible -f -l80 2Corinthians2:1
The above example generates formatted (-f) text at 80 characters per line (-l80).

As you can see in my newbie script, I hack away at the text: it extracts everything enclosed in parentheses, but only if it contains a colon preceded by a number, then removes any semicolons, removes the spaces, and so on, finally generating a double-spaced list of scriptures, each with a place to put a check mark or whatever. bible-kjv is OK with commas.

I just can't seem to figure out how to avoid redirecting to multiple files and using so many seds. I am a newbie, and what I have works fine; I'm just looking to do it better and learn. Any ideas?

Code:

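# Usage: pass the text document as the first argument.
# Step 1: pull out "(...)" references and normalize them for bible-kjv.
# The Roman-numeral rules only fire before a capital letter, so that
# "IJohn" becomes "1John" while "Isaiah" is left alone.
cat "$1" | grep '(' \
        | cut -d ')' -f 1 - \
        | cut -d '(' -f 2 - \
        | grep '[0-9]:' \
        | sed 's/cf\./,/g' \
        | tr ',' '\n' \
        | sed 's/and//g' \
        | sed 's/ //g' \
        | sed 's/;//g' \
        | sed '/^$/d' \
        | sed 's/^III\([[:upper:]]\)/3\1/' \
        | sed 's/^II\([[:upper:]]\)/2\1/' \
        | sed 's/^I\([[:upper:]]\)/1\1/' \
        | sed 's/Philemon/Philemon1:/' \
        | sed 's/Obadiah/Obadiah1:/' \
        | sed 's/^2John/2John1:/' \
        | sed 's/^3John/3John1:/' \
        | sed 's/^Jude/Jude1:/' \
        > ref.1
# Step 2: look up each reference; empty ref.2 first so reruns don't pile up.
: > ref.2
cat ref.1 | while read -r line; do
        bible -f -l80 "$line" >> ref.2
done
# Step 3: prefix a check box to every line and double-space the output.
cat ref.2 | sed 's/^/[_____] /' | sed G > ref.rtf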

Thanks for looking.
-Scott

acid_kewpie 08-21-2009 03:06 PM

In terms of general improvement, there are a million ways to do the same thing; I'd probably do all of that as an awk script, for example. But for what you're specifically asking, you can combine multiple seds into one easily enough: "sed -e 's/a/b/g' -e 's/c/d/g'" and so on. As for the files, why are you using that ref.1 file at all? After all those pipes, why not just use another one? Likewise for ref.2: you can put a pipe after the "done" to take the loop's entire output as a single stream.
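
For illustration, here is a rough sketch of how those two suggestions might look when applied to the script above. It is only one possible arrangement, not a tested replacement: the function names extract_refs, lookup_verse and checklist are made up for this example, as is the input file name notes.txt, and the rule that drops blank lines and the capital-letter check on the Roman numerals are additions that assume references look like "II Corinthians 2:1".

```shell
#!/bin/sh
# Sketch only: extract_refs, lookup_verse and checklist are hypothetical names.

# Read text on stdin and print one normalized reference per line.
# The tr in the middle splits lines, so the "cf." rule stays in its own sed.
extract_refs() {
    grep '(' \
        | cut -d ')' -f 1 \
        | cut -d '(' -f 2 \
        | grep '[0-9]:' \
        | sed -e 's/cf\./,/g' \
        | tr ',' '\n' \
        | sed -e 's/and//g' \
              -e 's/ //g' \
              -e 's/;//g' \
              -e '/^$/d' \
              -e 's/^III\([[:upper:]]\)/3\1/' \
              -e 's/^II\([[:upper:]]\)/2\1/' \
              -e 's/^I\([[:upper:]]\)/1\1/' \
              -e 's/Philemon/Philemon1:/' \
              -e 's/Obadiah/Obadiah1:/' \
              -e 's/^2John/2John1:/' \
              -e 's/^3John/3John1:/' \
              -e 's/^Jude/Jude1:/'
}

# Thin wrapper around the bible command from the original post.
lookup_verse() {
    bible -f -l80 "$1"
}

# Read references on stdin, look each one up, and pipe the whole loop
# into a single sed that adds the check box and double-spaces the output.
checklist() {
    while IFS= read -r ref; do
        lookup_verse "$ref"
    done | sed -e 's/^/[_____] /' -e G
}

# Example (notes.txt is a placeholder file name):
#   extract_refs < notes.txt | checklist > ref.rtf
```

If you still want to inspect the intermediate list while debugging, putting "tee ref.1" between the two functions keeps a copy on disk without breaking the stream.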

scott_audio 08-21-2009 03:17 PM

I guess I wanted to be able to look at the individual files to check the formatting. "I have no idea what I'm doing" is the best answer :) I'll try a single sed like you demonstrated and lose the pipes, and read up more on awk as well. Thanks, Chris.

