LinuxQuestions.org
Old 10-27-2010, 07:28 AM   #16
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551

I don't see how that is possible.

Can you please show me 5-10 lines of the input file?

EDIT: I have just tested using the input data you gave in the first post, and the code above works perfectly for me:
Code:
root@reactor: cat junko
C, 0109390,sfs,sfsf,B,blah,blah
C,B000004,sfs,sfsf,B,blah,b
C,dfsdf,sfs,sfsf,C,blah,b
C,dfsdf,sfs,sfsf,BB ,blah,b
C,dfsdf,sfs,sfsf,D,blah,b
C,dfsdf,sfs,sfsf,B,blah,b
root@reactor: awk 'BEGIN{FS=",";OFS=","; count=1}{gsub(" ","",$5); if($5=="B"){$2=sprintf("BACS%06u",count); count++}; print}' junko
C,BACS000001,sfs,sfsf,B,blah,blah
C,BACS000002,sfs,sfsf,B,blah,b
C,dfsdf,sfs,sfsf,C,blah,b
C,dfsdf,sfs,sfsf,BB,blah,b
C,dfsdf,sfs,sfsf,D,blah,b
C,BACS000003,sfs,sfsf,B,blah,b
root@reactor:

Last edited by GrapefruiTgirl; 10-27-2010 at 07:34 AM. Reason: Added test output
 
Old 10-27-2010, 08:34 AM   #17
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
Input file: as mentioned, it is unformatted output from an SQL query. I rename it to .csv for use.

Would the header/footer cause an issue as there are fewer columns?

Code:
A, 09090901, 154548
C,    8989898, 0000 , date ,B    , soem more stuff , RE
C,    8989898, 0000 , date ,this    , soem more stuff , RE
C,    8989898, 0000 , date ,B    , soem more stuff , RE
C,    8989898, 0000 , date ,this    , soem more stuff , RE
T, 2602111
As mentioned, command-line execution works perfectly.

Current output looks like:
Code:
A, B0001, 154548,,B
C,    B0002, 0000 , date ,B    , soem more stuff , RE
C,    B0003, 0000 , date ,B    , soem more stuff , RE
C,    B0004, 0000 , date ,B    , soem more stuff , RE
C,    B0005, 0000 , date ,B    , soem more stuff , RE
T, B0005,,,B
I can't exactly copy the file but that format is very similar.

Last edited by redhatuser1; 10-27-2010 at 08:41 AM.
 
Old 10-27-2010, 08:40 AM   #18
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
No, the header and footer will make no difference. They are only affected if they happen to contain 5 or more fields (at least 4 commas in the line) and the 5th field is a "B".

I ran the above code again, using exactly the data you just gave, and it still works fine for me:
Code:
root@reactor: cat junko
Header here!
A, 09090901, 154548
C,    8989898, 0000 , date ,B    , soem more stuff , RE
C,    8989898, 0000 , date ,this    , soem more stuff , RE
C,    8989898, 0000 , date ,B    , soem more stuff , RE
C,    8989898, 0000 , date ,this    , soem more stuff , RE
T, 2602111
Footer goes here, with lots of fields.

# Output below:

root@reactor: awk 'BEGIN{FS=",";OFS=","; count=1}{gsub(" ","",$5); if($5=="B"){$2=sprintf("BACS%06u",count); count++}; print}' junko
Header here!
A, 09090901, 154548
C,BACS000001, 0000 , date ,B, soem more stuff , RE
C,    8989898, 0000 , date ,this, soem more stuff , RE
C,BACS000002, 0000 , date ,B, soem more stuff , RE
C,    8989898, 0000 , date ,this, soem more stuff , RE
T, 2602111
Footer goes here, with lots of fields.
So, we can see that two lines have had replacements made. Based on the requirements, this is correct operation. Is there something not working for you the same as it is for me?
 
Old 10-27-2010, 08:46 AM   #19
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
I re-edited my quoted sample output file.

I realised the header and footer were being modified: if you take a look, they are being padded out to 5 fields, and amendments are being forced on every line. I assume something in the IF is causing this?

I appreciate your time, and apologise for my stupidity.

Record A is the header
Record T is the trailer

Code:
A, B0001, 154548,,B
C,    B0002, 0000 , date ,B    , soem more stuff , RE
C,    B0003, 0000 , date ,B    , soem more stuff , RE
C,    B0004, 0000 , date ,B    , soem more stuff , RE
C,    B0005, 0000 , date ,B    , soem more stuff , RE
T, B0005,,,B

Last edited by redhatuser1; 10-27-2010 at 08:49 AM.
 
Old 10-27-2010, 08:58 AM   #20
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
No need to apologize, I appreciate your patience too. There's no stupidity involved, but one of us is still missing something here.

However, to be honest, I do not see anything wrong with the file you pasted above in post #19. But: Is that the ORIGINAL input file, or is that the OUTPUT file? If it's the INPUT file, then every line will be edited, because field $5 is a "B" in every line. If that's an OUTPUT file, then it was not processed with the code we were last working with, such as I demonstrated in post #18.

If you're trying to say that in the above file you pasted, the first line is the header, and the last line is a footer, then yes, of course they are being edited, because $5 == "B". If you don't want those two lines altered, then we need to know what distinguishes them from lines that ARE to be edited. For example, we could say, if the line does not begin with "C", do not edit it.

If none of this addresses the problem, please post (again) the input file, and the output file, and use some bold or red or something, to illustrate exactly what is wrong with the output. EDIT: Please also clarify what exactly is a "header" and a "footer".

Last edited by GrapefruiTgirl; 10-27-2010 at 09:08 AM.
 
Old 10-27-2010, 09:10 AM   #21
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
OK, I see your edit on post #19 about the header and footer. Do the NON-header and NON-footer lines always begin with "C"? Or is it better to say that headers and footers NEVER begin with "C"? Which logic would be better for distinguishing the header/footer lines?
 
Old 10-27-2010, 09:14 AM   #22
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
Data records will always begin with C.

I ensure this is the case by using 3 sed statements earlier in the script: 1 to change all records to C, followed by another 2 to assign a header record and a calculated trailer record.
 
Old 10-27-2010, 09:18 AM   #23
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
OK, so I've added a test to see if the record begins with "C", and if it does, check for the "B" situation; if it does not begin with "C", just print the line. This leaves header and footer unchanged.

Note that I'm still using the prefix "BACS" rather than your "B" here, so adjust the code accordingly:

Code:
#!/bin/bash

awk 'BEGIN{FS=","; OFS=","; count=1}

{
if ($0 ~ /^C/) {
        gsub(" ","",$5)
        if($5=="B"){
                $2 = sprintf("BACS%06u",count); count++
        }
}
print
}' junko
Note that it's the same code basically, only I have put it into a script here on my end to make it easier to work on. The data filename I am using is "junko" so replace that with yours.
 
Old 10-27-2010, 09:36 AM   #24
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
Thanks,

I can confirm the code works fine. I can run it separately against the input file and I get the correct output. Unfortunately it is still not working when integrated into my larger script.

I will figure this one out - it is annoying but it is going to be something simple - it normally is.

I love Linux - very interesting in comparison to Windows, I just don't have enough need to use it daily.
 
Old 10-27-2010, 09:41 AM   #25
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
If it isn't working in the larger script, and the problem is the $count not being retained, remember to remove the code that says:
Code:
count=1;
at the start of the awk code - otherwise the counter is reset to 1 every time awk runs. And of course, remember to pass the $count variable into awk on the command line:
Code:
-v count=$count
If this isn't the problem, or if you remain stuck, don't be afraid to ask for further help - I don't mind, and if I'm not here, someone else won't mind. And remember, it will (probably) help if you get any error messages, to post them for anyone helping you. If it runs without errors, but doesn't do what you expect, describe what's not happening.
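
To make the counter idea concrete, here is a rough sketch of one way to carry it in and back out again. It assumes count is an ordinary shell variable in the larger script; the starting value 7, the sample records and the temporary files are all made up for illustration, and I've used "%06d" and the "B" prefix rather than anything from your real script.

```shell
# Hypothetical starting value; in the real script this would be whatever
# the counter had reached earlier on.
count=7

# Hypothetical sample data, modelled on the records posted in this thread.
infile=$(mktemp)
outfile=$(mktemp)
cat > "$infile" <<'EOF'
A, 09090901, 154548
C,    8989898, 0000 , date ,B    , some more stuff , RE
C,    8989898, 0000 , date ,this    , some more stuff , RE
C,    8989898, 0000 , date ,B    , some more stuff , RE
T, 2602111
EOF

# Records are written to $outfile; the only thing awk sends to stdout is the
# final counter value, which the command substitution stores back in $count.
# Note there is no count=1 in the BEGIN block.
count=$(awk -v count="$count" -v out="$outfile" 'BEGIN { FS = ","; OFS = "," }
/^C/ {
    gsub(" ", "", $5)
    if ($5 == "B") { $2 = sprintf("B%06d", count); count++ }
}
{ print > out }
END { print count }' "$infile")

result=$(cat "$outfile")
rm -f "$infile" "$outfile"
echo "next count: $count"
```

With two "B" records in the sample, the shell variable moves on by two, so a later awk call given the same -v count="$count" would carry on numbering from there.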

Cheers!

Last edited by GrapefruiTgirl; 10-27-2010 at 09:43 AM.
 
Old 10-27-2010, 09:50 AM   #26
rn_
Member
 
Registered: Jun 2009
Location: Orlando, FL, USA
Distribution: Suse, Redhat
Posts: 127
Blog Entries: 1

Rep: Reputation: 25
-----

Last edited by rn_; 10-27-2010 at 09:52 AM.
 
Old 10-27-2010, 09:53 AM   #27
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
Thanks - I have yet to implement the retained count whilst I work around getting the general awk to work in the large script.

Basically, the awk code which we have confirmed works outside of my script (I also created a 2nd script which successfully executes your code) does not work when integrated into my main script. I literally copied the code from one file to another, but it is still iterating through every line in the file (now just those records beginning with C), changing $5 to B and $2 to an incrementing B-prefixed number.

My script is quite long-winded; it has been used to help me learn basic scripting, so I have separated a few commands instead of joining them. I can't copy the entire script as it is quite long, but here is a little:

I wondered if I need to terminate something or other - I dunno. As you can see, it creates lots of files for me to view the changes. I delete these at the end (since all of them currently work).
Code:
#----------------------------- removes 1st character from each line ------------------------------------
sed 's/^./C,/' < step3.txt > step4.txt
echo "Editing extract, step 4, removes 1st character from each record and replaces with C" >> $LOGFILE

#--------------------------- replace all '|' with ','s for .csv ----------------------------------------
sed 's/|/,/g' < step4.txt > step5.csv
echo "Editing extract, step 5, replaces all | with , and creates csv" >> $LOGFILE

#------------------------- replace the 6th comma with a space ------------------------------------------
sed 's/,/ /6' < step5.csv > step6.csv
echo "Editing extract, step 6, replaces 6th comma with a space" >> $LOGFILE

#---- add trailer record including file total, awk creates line to add - sed removes '.' from total ----
TRAILER=$(awk '{s+=$3}END {print "T, "s}' step6.csv)
echo "File Total is = $TRAILER" >> $LOGFILE
echo "$TRAILER" >> step6.csv
echo "Editing extract, step 7, adds trailer to file" >> $LOGFILE

#------------------------ removes "." from the file -----------------------------------------------------
sed 's/\.//g' < step6.csv > step7.csv
echo "Editing extract, step 8, removing all the dots" >> $LOGFILE

#------------------------ add header to file ------------------------------------------------------------
sed '1s/.*/A, 71038494, 404720/' < step7.csv > step8.csv
echo "Editing extract, step 9, adds header to file" >> $LOGFILE
echo "" >> $LOGFILE

# ----------------------- replacing reference for B records ------------------------------------------
awk 'BEGIN{FS=",";OFS=",";count=1}
{
if ($0 ~ /^C/){
	gsub(" ","",$5)
	if($5="B"){
		$2=sprintf("B%06u",count); count++
	}
}
print
}' < step8.csv > pay2icon.csv
echo "Editing extract, step 10, selecting Brecords and amending Ref Number" >> $LOGFILE
echo "" >> $LOGFILE

#----------------------------- amending file name--------------------------------------------------------
mv pay2icon.csv ACA"$SDATE".ap
FYI - I also intend to cut the script down once I have everything working.
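
As a rough idea of what that trimming could look like for steps 4 to 6: the three sed passes can be chained in one sed call with -e, since sed applies the expressions to each line in order. This is only a sketch; the sample record below is made up (it assumes each line of step3.txt starts with a '|'), and it skips the intermediate files.

```shell
# Hypothetical pipe-delimited record standing in for one line of step3.txt.
sample='|8989898|0000|date|B|some|more|stuff'

# Steps 4-6 as three separate sed passes, as in the script above.
separate=$(printf '%s\n' "$sample" | sed 's/^./C,/' | sed 's/|/,/g' | sed 's/,/ /6')

# The same three substitutions in a single sed invocation.
combined=$(printf '%s\n' "$sample" | sed -e 's/^./C,/' -e 's/|/,/g' -e 's/,/ /6')

printf '%s\n%s\n' "$separate" "$combined"
```

Both forms should print the same line. The downside of chaining is that the per-step files are no longer there to inspect, so it only makes sense once each step is known to work.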

Last edited by redhatuser1; 10-27-2010 at 09:55 AM.
 
Old 10-27-2010, 09:55 AM   #28
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
rn_ is correct in post #26:
Code:
	if($5="B"){
You're missing an "=" sign. It should read:
Code:
        if($5=="B"){
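
This would also explain the padded header and trailer in the earlier output. In awk a single "=" is an assignment: it sets $5 to "B" (creating an empty $4 on the way if the line is shorter), and the assigned value "B" is a non-empty string, so the test is always true. A small sketch, using a header line modelled on the sample in post #17 and a made-up replacement value:

```shell
# Three-field header line, modelled on the sample data in post #17.
line='A, 09090901, 154548'

# Single "=": assigns "B" to $5, which extends the record to five fields,
# and the condition is true for every line.
assigned=$(printf '%s\n' "$line" | awk 'BEGIN { FS = ","; OFS = "," } { if ($5 = "B") $2 = " B0001"; print }')

# Double "==": only compares. $5 does not exist here, so nothing is changed.
compared=$(printf '%s\n' "$line" | awk 'BEGIN { FS = ","; OFS = "," } { if ($5 == "B") $2 = " B0001"; print }')

printf '%s\n%s\n' "$assigned" "$compared"
```

The first awk call turns the header into "A, B0001, 154548,,B", the same shape as the bad output in post #19, while the second leaves the line alone.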
 
1 members found this post helpful.
Old 10-27-2010, 09:59 AM   #29
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
Didn't I tell you it was going to be something simple and stupid...

Right, I'll get that count working now.

I'll start a new thread for the count as I can't see myself sorting it out alone.

Last edited by redhatuser1; 10-27-2010 at 10:58 AM.
 
Old 10-27-2010, 01:16 PM   #30
rn_
Member
 
Registered: Jun 2009
Location: Orlando, FL, USA
Distribution: Suse, Redhat
Posts: 127
Blog Entries: 1

Rep: Reputation: 25
Quote:
Originally Posted by GrapefruiTgirl View Post
rn_ is correct in post #26:
Code:
	if($5="B"){
you're missing an "=" sign. Should read:
Code:
        if($5=="B"){
I was?!! Ha ha. I had overlooked a few posts and thought maybe you guys were already past that problem, so I scratched my post.
 
  

