LinuxQuestions.org
Old 10-26-2010, 09:56 AM   #1
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Rep: Reputation: 0
if and sed - how to check character and then replace depending on result


Hi,

I have a CSV file with 8 columns.

I want to check the 5th column, which will contain a single capitalised letter. If that letter is, say, "B", I would then like to replace the 2nd column in the CSV with an incremental number starting at 0 (basically a count), prefixed with B (e.g. B000001).

Sample row would be:
Code:
C, 0109390,sfs,sfsf,B,blah,blah
Amended row would be:
Code:
C, B000001,sfs,sfsf,B,blah,blah
Hope that makes sense
 
Old 10-26-2010, 10:05 AM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
I don't know how to do it in sed, especially the incremental part (since sed can't do arithmetic). On the other hand, in awk you can try something as simple as:
Code:
awk -F, '$5 == "B" {$2 = sprintf(" B%06d", c++)}1' file
Is this what you're looking for?
 
Old 10-26-2010, 10:14 AM   #3
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
Code:
awk 'BEGIN{FS=",";OFS=",";count=1}{if($5=="B"){$2=sprintf($5"%06u",count);count++}; print}'
This is the same idea as colucix's, except that it sets OFS so the rewritten lines keep their commas. Both versions pad the number in field $2 with leading zeroes.

Last edited by GrapefruiTgirl; 10-26-2010 at 10:24 AM. Reason: Started the counter at zero, not 1 -- then changed back to 0
 
Old 10-26-2010, 10:16 AM   #4
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
No worries, AWK will be fine - I was just curious that's all.

If you could break it down for a newbie, I would appreciate it.
 
Old 10-26-2010, 10:20 AM   #5
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
What's happening in both these examples is that the fifth field ($5) is checked to see if it's a "B". If it is, a "B" is placed into field $2, with an incrementing counter value appended to it.
We both set the incoming field separator to a comma (colucix with -F, and I with FS=","), and I also set the output separator (OFS) to a comma.

Awk does the "B" test on every line of the file. Whether or not $2 was changed, the whole record (each line) is printed out, altered or not.
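For a newbie-friendly view, here is the same one-liner spread over several lines with comments. It is only a sketch: the second and third sample rows are made up for illustration, and it uses %06d where the post above uses %06u.

```shell
# Hypothetical sample data; only the first row comes from the thread.
sample='C, 0109390,sfs,sfsf,B,blah,blah
C, 0109391,sfs,sfsf,A,blah,blah
C, 0109392,sfs,sfsf,B,blah,blah'

result=$(printf '%s\n' "$sample" | awk '
BEGIN {
    FS = ","      # split each incoming line on commas
    OFS = ","     # join the fields with commas again when a line is rebuilt
    count = 1     # first number to hand out
}
{
    if ($5 == "B") {                     # field 5 is exactly "B"
        $2 = sprintf("B%06d", count)     # e.g. B000001
        count++                          # next B row gets the next number
    }
    print                                # print every line, changed or not
}')

printf '%s\n' "$result"
```

Only the rows that were changed are rebuilt with OFS; the untouched row is printed exactly as it came in, leading space included.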
 
Old 10-26-2010, 10:30 AM   #6
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
I am experiencing another problem. I save the file later in my script as a .csv file. However, it seems the 5th column has unnecessary white space (spaces, I guess). How can I remove it, just from this column?
 
Old 10-26-2010, 10:36 AM   #7
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
Code:
awk 'BEGIN{FS=",";OFS=",";count=1}{if($5=="B"){$2=sprintf($5"%06u",count);count++};gsub(" ","",$5);print}'
The gsub(" ","",$5) call added here replaces any " " (space) with nothing, in field $5.

Does that help? If not, please show us the problem or explain again as I've not understood.

EDIT:
On further thought, I think you would do better to move the gsub statement to before the check for "B", so that a field like " B" is already stripped when it is compared, as below:
Code:
awk 'BEGIN{FS=",";OFS=",";count=1}{gsub(" ","",$5);if($5=="B"){$2=sprintf($5"%06u",count);count++};print}'
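To see why the order matters, here is a small side-by-side check. The row is hypothetical (a B with a space on either side in field 5), and %06d is used in place of %06u.

```shell
# Hypothetical row with a stray space around the B in field 5.
row='C, 0109390,sfs,sfsf, B ,blah,blah'

# gsub after the test: " B " is not equal to "B", so field 2 is left alone.
after=$(printf '%s\n' "$row" |
    awk 'BEGIN{FS=OFS=","; count=1}
         {if($5=="B"){$2=sprintf("B%06d",count); count++}; gsub(" ","",$5); print}')

# gsub before the test: field 5 is already "B" when it is compared.
before=$(printf '%s\n' "$row" |
    awk 'BEGIN{FS=OFS=","; count=1}
         {gsub(" ","",$5); if($5=="B"){$2=sprintf("B%06d",count); count++}; print}')

printf 'gsub after test:  %s\n' "$after"
printf 'gsub before test: %s\n' "$before"
```

In the first variant the spaces are cleaned up but the row is never renumbered; in the second it is.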

Last edited by GrapefruiTgirl; 10-26-2010 at 11:08 AM.
 
Old 10-26-2010, 11:12 AM   #8
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
Thanks - one final thing that would benefit me. I will use this in a script I have been building. I can take a guess (replacing parts of your code with a variable), but is there an easy way for me to retain the last value used in the 2nd field for B records?

I.e. the script is run and the final record shows B000019. It would be ideal if the script could take that value and start there next time.

I am pretty inexperienced with awk - my apologies, this will be only the 3rd statement I have used.
 
Old 10-26-2010, 11:26 AM   #9
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
Passing the variable back into the AWK on the next run, as a starting point, is easy:
Code:
awk -v count="$SOMEVAR" '{ blah blah }' filename
If you do that, remove the "count=1" from the start of my code, otherwise the BEGIN block overwrites the value passed in with -v.

Passing the variable back out at the end is a different story, as far as I know. One way is to print the "count" variable as the very last thing printed when the file is done being processed, so you'd need some checking to grab that value and save it.
Alternately, you could incrementally store the value in a temporary file, and grab it each time when needed.

I do not know of a simple way, comparable to sending the variable INTO the awk, of getting the variable OUT of the awk, other than that. If there's a way, I almost guarantee someone around here knows it, so you may yet get a better answer.
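The first idea above (print the counter as the very last line, then peel it off again) could look roughly like this. It is a sketch under assumptions: the three input rows and the starting value 5 are made up, and tail and sed are used to split the output.

```shell
# Hypothetical three-row input; rows 1 and 3 have B in field 5.
input='C, 1,a,b,B,x,y
C, 2,a,b,A,x,y
C, 3,a,b,B,x,y'

count=5   # hypothetical starting value carried over from earlier

# END runs once after the last input line, so the counter is the final line of output.
out=$(printf '%s\n' "$input" | awk -v count="$count" '
BEGIN { FS = OFS = "," }
$5 == "B" { $2 = sprintf("B%06d", count); count++ }
{ print }
END { print count }')

count=$(printf '%s\n' "$out" | tail -n 1)   # last line: the next unused number
data=$(printf '%s\n' "$out" | sed '$d')     # everything except that last line

echo "next count: $count"
printf '%s\n' "$data"
```

The obvious drawback is that the data and the counter travel in the same stream, so anything else that reads the output has to know about the extra last line.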
 
Old 10-26-2010, 01:18 PM   #10
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
So did you come up with anything for saving the $count variable? I have, though I don't think it's the *best* way to do it. It works though.

1) Your program sets $count to whatever starting value you like.

2) awk code that dumps its variables to a temp file after execution (set your own temp file/path; note that --dump-variables is a GNU awk option):
Code:
awk -v count="$count" --dump-variables=some_temp_file \
'BEGIN{FS=",";OFS=","}{ if($5=="B"){$2=sprintf($5"%06u",count);count++}; print}' input_file
3) Now, grab $count value from awk's dumped variable file:
Code:
count=$(awk '/^count:/{gsub("[[:punct:]]","",$3);print $3}' some_temp_file)
Now you can repeat steps 2 and 3 indefinitely, and $count will not be lost and will keep increasing.
 
Old 10-27-2010, 05:26 AM   #11
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
Getting there - I am just trying to tweak it a bit first.

Currently the awk statement is editing 2 lines I don't want it to (header/footer). I am assuming a simple "else" can ignore the lines which don't have a "B" in the 5th column?

In regard to the count being carried over - I am struggling to pick it up when the script is run the 2nd time. I can see it dumps the file, I am about to break the code down a bit to see which bit is failing.

Can you help with the else?
 
Old 10-27-2010, 06:20 AM   #12
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
Well, the test for $5=="B" is a test to see if field #5 contains only a "B" - so for example, if field $5 contains "Big Balloon", the line will not be edited. So the only reason a wrong line would be edited is that the line fits the criterion ($5=="B").
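A quick way to convince yourself of this is to print the line numbers that each kind of test selects. The rows below are hypothetical, and the regex match is shown only for contrast; it is not used anywhere in the thread.

```shell
# Hypothetical rows: a header line, an exact "B", and a longer field 5.
rows='H,id,c3,c4,Big Balloon,c6,c7
C, 1,a,b,B,x,y
C, 2,a,b,Big Balloon,x,y'

# Exact comparison: only the row whose field 5 is precisely "B" matches.
exact=$(printf '%s\n' "$rows" | awk -F, '$5 == "B" { print NR }')

# Regex match, shown for contrast: any field 5 containing a B matches.
loose=$(printf '%s\n' "$rows" | awk -F, '$5 ~ /B/ { print NR }' | tr '\n' ' ')

echo "exact: $exact"
echo "loose: $loose"
```

So if a header or footer line is being rewritten by the == version, its fifth comma-separated field really is a lone B (after any space stripping).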

As for an else statement, sure I can help with that, but I haven't a clue what you're trying to use the else statement for until you show me the code you're working on so I can see how an else statement fits into it, if at all.

Now, as for the $count variable: like ANY variable, it is lost if the script terminates, so when you run the script next time, your variables are all initialized fresh. My code above is intended to work in something of a loop situation; i.e. the program runs and examines a number of these files containing "B", and with each successive file, the $count value is retained. But again, if the script finishes and exits, that $count value is no longer around.

If you wish to preserve it across multiple runs of the program, you will need slightly different code. I would make your program check for the existence of the "some_temp_file" where you saved the variables, and if present, grab $count value. If the temp file does not exist, set $count = 1.
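Roughly, that check could be sketched as below. Note the assumptions: the file names and the renumber function are made up for illustration, and instead of the gawk variable dump this version has awk write the counter itself from an END block, which should work in any awk.

```shell
# Hypothetical paths; adjust to taste.
workdir=$(mktemp -d)
state="$workdir/count.state"
printf '%s\n' 'C, 1,a,b,B,x,y' 'C, 2,a,b,B,x,y' > "$workdir/input.csv"

renumber() {
    # Pick up the saved counter if the state file exists, else start at 1.
    if [ -f "$state" ]; then
        count=$(cat "$state")
    else
        count=1
    fi
    awk -v count="$count" -v statefile="$state" '
        BEGIN { FS = OFS = "," }
        $5 == "B" { $2 = sprintf("B%06d", count); count++ }
        { print }
        END { print count > statefile }
    ' "$1"
}

first_run=$(renumber "$workdir/input.csv")    # no state file yet, starts at 1
second_run=$(renumber "$workdir/input.csv")   # continues where the first run stopped
printf 'first run:\n%s\nsecond run:\n%s\n' "$first_run" "$second_run"
rm -rf "$workdir"
```

Because the counter lives in a file rather than a shell variable, it survives the script exiting; delete the state file to start again from 1.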

If you are having a problem of $count not being kept properly, but you are NOT exiting your program, you've got something else wrong.

So, I'm (we're) happy to help but you'll need to tell what's exactly going on, and show me/us your code if we're to be able to help you fix it.

Last edited by GrapefruiTgirl; 10-27-2010 at 06:21 AM.
 
Old 10-27-2010, 07:09 AM   #13
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
Hi,

I have got the awk statement working using the following code - slightly different to yours:

Code:
awk 'BEGIN{FS=",";OFS=","; count=1}{gsub(" ","",$5); if($5=="B"){$2=sprintf("BCS%06u",count++)}; print}'
However, when I put this into my script as follows, it fails with a syntax error next to the count++)} (the curly bracket is highlighted):

Code:
awk 'BEGIN{FS=",";OFS=","; count=1}{gsub(" ","",$5); if($5=="B"){$2=sprintf("BCS%06u",count++)}; print}' < file > file 2
Ignore the "else" requirement for now. As far as the count goes, I will implement your code once I have the first part working. Considering the problem, at the end of the code I will dump the variables, and at the beginning I will import them and assign to count.

Last edited by redhatuser1; 10-27-2010 at 07:17 AM.
 
Old 10-27-2010, 07:16 AM   #14
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
I never tried colucix's code from near the top of the thread, but you are using the count++ inside the `sprintf` as he did. Maybe that doesn't work? I don't know, but I don't see any other reason for a syntax error at that location.

However, I question why you are using a "<" to direct "file" into the awk. Also, why do you have "file" and "2" coming out of the awk? If I understand that statement (and its intent) correctly, I would use:
Code:
awk 'BEGIN{FS=",";OFS=","; count=1}{gsub(" ","",$5); if($5=="B"){$2=sprintf("BACS%06u",count); count++}; print}' file > file_2
That should work, processing file and dumping the output into file_2 (if this is the intent). The items I changed are: count++ is now its own statement after the sprintf, the "<" is gone, and the output name is a single word, file_2.
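For what it is worth, post-increment inside a function call is ordinary awk, so both spellings should number the rows identically. A quick check, using a hypothetical row fed in twice and %06d in place of %06u:

```shell
# Hypothetical row, fed in twice so the counter has to advance.
row='C, 1,a,b,B,x,y'

# count++ inside sprintf: the old value is formatted, then count goes up.
inside=$(printf '%s\n%s\n' "$row" "$row" |
    awk 'BEGIN{FS=OFS=","; count=1}{if($5=="B"){$2=sprintf("BCS%06d",count++)}; print}')

# count++ as its own statement after the sprintf.
separate=$(printf '%s\n%s\n' "$row" "$row" |
    awk 'BEGIN{FS=OFS=","; count=1}{if($5=="B"){$2=sprintf("BCS%06d",count); count++}; print}')

printf '%s\n' "$inside"
[ "$inside" = "$separate" ] && echo "both forms give the same output"
```

If the two agree on your system as well, the syntax error in the script is more likely to come from how the line was pasted or quoted there than from the count++ itself.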
 
Old 10-27-2010, 07:24 AM   #15
redhatuser1
Member
 
Registered: Sep 2009
Posts: 55

Original Poster
Rep: Reputation: 0
That code runs without any errors but with an undesirable result.

It replaces the contents of $5 with "B" in every record, even those where it was something else. As a result it also replaces the contents of $2 for the records where $5 was different.

I was using 2 files for testing, so I can track the changes.
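One common cause of exactly this symptom is a single = in the test instead of ==. This is only a guess, since the thread does not show the line that was actually run. The sketch below uses two hypothetical rows and %06d in place of %06u to show the difference.

```shell
# Hypothetical rows: one B record and one A record.
rows='C, 1,a,b,B,x,y
C, 2,a,b,A,x,y'

# Single = assigns "B" to field 5; the result is a non-empty string, so the test is always true.
assign=$(printf '%s\n' "$rows" |
    awk 'BEGIN{FS=OFS=","; count=1}{if($5="B"){$2=sprintf("BCS%06d",count); count++}; print}')

# Double == only compares, so the A record is left alone.
compare=$(printf '%s\n' "$rows" |
    awk 'BEGIN{FS=OFS=","; count=1}{if($5=="B"){$2=sprintf("BCS%06d",count); count++}; print}')

printf 'with = :\n%s\n' "$assign"
printf 'with ==:\n%s\n' "$compare"
```

With the single =, every record ends up with B in field 5 and a fresh number in field 2, which matches the description above.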
 
  

