LinuxQuestions.org
Old 12-05-2007, 01:04 PM   #1
onacorpuscle
LQ Newbie
 
Registered: Mar 2006
Location: Barcelona
Distribution: SUSE LINUX 9.3
Posts: 10

Rep: Reputation: 0
URGENT: Help for Find and Replace with AWK


Dear colleagues,

I have a big problem replacing strings with AWK. I'm integrating an XML converter into a legacy application and I need to parse the XML files it receives.

Because the received XML files are not well formatted, I need to strip out blank space (no problem there), put each XML tag on its own line, and finally parse the XML blocks so that a C module can convert them to specific flat files.

cat $FILE | awk '( NF > 0 ) { print $0 }' > $FILE_TMP1
if [ -f "$FILE_TMP1" ] ; then
    cat $FILE_TMP2 | awk -v RS='><' -v OFS='><' '{print $0}' > $FILE_TMP2
else
    echo "-69258"
    exit -1
fi

for i in $(ls -1rt *.$FILE_TYPE)
do
    while read -r LINE
    do
        PARAM=`echo $LINE | cut -f 2 -d "<" | cut -f 1 -d ">"`
        if [[ "$PARAM" = "$INICI_BLOC_ORD" ]] ; then
            SUFIX=`date +%Y%m%d%H%M%S`
            FILE="ORD_"$SUFIX
            FILE_FI=$FILE".PROC"
            touch ./$FILE
        elif [ "$PARAM" = "$FI_BLOC_ORD" ] ; then
            echo "$FI_FILE_ORD" >> ./$FILE
            mv ./$FILE $DIR_PROC"/"$FILE_FI
        else
            if [[ -f "$FILE" ]] ; then
                echo "$LINE" >> ./$FILE
            fi
        fi
    done < $i
done
....

But the command:
$ cat $FILE_TMP2 | awk -v RS='><' -v OFS='><' '{print $0}' > $FILE_TMP2
doesn't work well, because this is what appears in $FILE_TMP2:
...
<NamespaceR:SrvcId
STR</NamespaceR:SrvcId
<NamespaceR:PlanningCode
T</NamespaceR:PlanningCode
<NamespaceR:ProtocolType
COM</NamespaceR:ProtocolType
<NamespaceR:FileRef
NRXS505987257RR6</NamespaceR:FileRef
</NamespaceR:RoutingTable
...
when I was expecting this:
...
<NamespaceR:SrvcId>STR</NamespaceR:SrvcId>
<NamespaceR:PlanningCode>T</NamespaceR:PlanningCode>
<NamespaceR:ProtocolType>COM</NamespaceR:ProtocolType>
<NamespaceR:FileRef>NRXS505987257RR6</NamespaceR:FileRef>
</NamespaceR:RoutingTable>
....

The ">" characters are missing and the result is not the desired layout, because I need to replace
"><"
by
">
<"

Any suggestions?

Best Regards!

Xavier
 
Old 12-05-2007, 01:17 PM   #2
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
Have you looked at SED? It is well-suited to search and replace.

My favorite SED and AWK tutorials---and more: http://www.grymoire.com/Unix/

Also, it's better to put your code in [code] tags---preserves formatting and is easier to read.
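
For what it's worth, here is a minimal sketch of the sed approach for the "><" case in the first post. The function name `split_tags` and the sample input are made up for illustration. It assumes a POSIX sed, where a newline in the replacement is written as a backslash followed by a real line break (writing `\n` in the replacement is a GNU extension and may not work elsewhere).

```shell
# split_tags: hypothetical helper; reads XML on stdin, writes one tag per line.
split_tags() {
    # The replacement is ">", an escaped newline, then "<".
    # The second line must start in column 0, or the indentation
    # would end up in the output.
    sed 's/></>\
</g'
}

# Sample input invented for this sketch.
printf '%s\n' '<a><b>STR</b><c>T</c></a>' | split_tags
```

With the sample line above, each tag pair that was glued together with "><" should end up on its own line.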
 
Old 12-05-2007, 05:32 PM   #3
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,344

Rep: Reputation: 2746
That could get hairy. You might want to look at Perl, which has several XML parsing modules.
 
Old 12-07-2007, 11:22 AM   #4
onacorpuscle
LQ Newbie
 
Registered: Mar 2006
Location: Barcelona
Distribution: SUSE LINUX 9.3
Posts: 10

Original Poster
Rep: Reputation: 0
Tabs, Blanks, Tags and Trips!

I'm sorry!

Here is the corrected command line from my first post:
But the command:
$ cat $FILE_TMP2 | awk -v RS='><' -v OFS='>\n<' '{print $0}' > $FILE_TMP2
doesn't work well, because in $FILE_TMP2 this ...
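
A note on why that awk line probably misbehaves, with a hedged sketch of an alternative. In awk, OFS is only used between fields (for example in `print $1, $2`), so it has no effect on `print $0`; the separator written after each record is ORS. A multi-character RS is also not portable: POSIX leaves it unspecified, and some awks use only the first character, which would split on ">" alone and match the output shown in the first post. Finally, reading from and writing to the same $FILE_TMP2 in one pipeline is unsafe, because the shell truncates the file when it sets up the redirection. The sketch below sidesteps RS entirely by rewriting "><" inside each line with gsub(). The name `split_tags_awk` and the sample data are assumptions for illustration.

```shell
# split_tags_awk: hypothetical helper name, not from the original post.
# Reads the named files (or stdin) and prints one tag per line.
split_tags_awk() {
    # gsub() replaces every "><" in the current line with ">", newline, "<".
    awk '{ gsub(/></, ">\n<"); print }' "$@"
}

# Intended use: read one file, write a different one, e.g.
#   split_tags_awk "$FILE_TMP1" > "$FILE_TMP2"
# Sample input invented for this sketch:
printf '%s\n' '<x:Id>STR</x:Id><x:Code>T</x:Code>' | split_tags_awk
```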

Great colleagues! Thanks a lot!

I'm now using the following command to remove blank lines and tabs and to put each XML tag on its own line (ksh, AIX 5.3):

cat $FILE_TMP2 | awk '( NF > 0 ) { print $0 }' | sed -e 's/ //g' | sed -e 's/></>***</g' | tr -s '*' '\012' > $TMP_FILE

This is the correct order; if I change it, it doesn't work: the last line is lost!

The sequence is as follows:

To remove blank lines...
cat $FILE | awk '( NF > 0 ) { print $0 }'

To remove tabs...
| sed -e 's/ //g'

and finally, to avoid more than one XML tag per line...
| sed -e 's/></>***</g' | tr -s '*' '\012' > $TMP_FILE
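
One caveat with the `***` placeholder: `tr` turns every asterisk into a newline, so any `*` in the data itself would also break a line. As a rough sketch only, the three steps could be folded into a single awk call that needs no placeholder. The function name `clean_xml` is invented here, and the sketch assumes the second step is meant to delete tab characters, as the description says.

```shell
# clean_xml: hypothetical helper; reads the named files (or stdin).
clean_xml() {
    awk '
        { gsub(/\t/, "") }          # drop tab characters
        NF > 0 {                    # skip blank lines
            gsub(/></, ">\n<")      # put each tag on its own line
            print
        }
    ' "$@"
}

# Sample input invented for this sketch: a tab, a blank line, glued tags.
printf '<a>\t<b>1</b></a>\n\n<c/>\n' | clean_xml
```

Usage would then be something like `clean_xml "$FILE" > "$TMP_FILE"`, with the input and output being different files.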

Cheers!

One more time, thanks a lot!

Xavier
 
  

