LinuxQuestions.org
10-24-2013, 03:32 AM   #1
dhiru_b25@rediffmail.com
LQ Newbie
 
Registered: Oct 2013
Posts: 1

Rep: Reputation: Disabled
Replace 2nd occurrence of a special character after nth occurrence of a delimiter from


My question is:

How do I replace the 2nd (or every) occurrence of a special character after the nth occurrence of a delimiter in a string, in unix/linux?

or

How do I remove the "Text Qualifier" character from a data field in unix?

I have the string below, in which each '"' (double quote) inside the data should be replaced with a space.

String:- "123"~"23"~"abc"~24.50~"descr :- nut size 12" & bolt size 12"1/2, Quantity=20"~"2013-03-13"

From the above string, I want the output below:- "123"~"23"~"abc"~24.50~"descr :- nut size 12 & bolt size 12 1/2, Quantity=20"~"2013-03-13"

That is, the double quote characters inside the data have been replaced with a space character:

Before: "descr :- nut size 12" & bolt size 12"1/2, Quantity=20"
After: "descr :- nut size 12 & bolt size 12 1/2, Quantity=20"

I want to identify such rows in a file and replace the text qualifier character wherever it appears inside the data, in unix/linux.

Please share your inputs. Thanks in advance.
 
10-24-2013, 04:25 AM   #2
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,849

Rep: Reputation: 7309
I would split the line into 3 parts (before, interesting, after). You can probably split on the markers descr and Quantity. Then replace all the " chars in the interesting part and finally reassemble the line. I do not know how you can identify such rows.
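
A minimal sketch of that before/interesting/after split, assuming every affected row contains the literal markers descr and Quantity (in that order) and that any awk is available. The function name clean_between_markers is made up for illustration.

```shell
# Sketch only: replaces every double quote found between the first "descr"
# and the first "Quantity" on each line; all other lines pass through as-is.
clean_between_markers() {
  awk '{
    s = index($0, "descr")       # start of the interesting part
    e = index($0, "Quantity")    # start of the trailing part
    if (s > 0 && e > s) {
      before = substr($0, 1, s - 1)
      middle = substr($0, s, e - s)
      after  = substr($0, e)
      gsub(/"/, " ", middle)     # each embedded quote becomes one space
      $0 = before middle after   # reassemble the line
    }
    print
  }'
}
```

Run it as clean_between_markers < InFile > OutFile. On the OP's sample line the first replacement leaves two spaces after "12", because a space already followed the quote that was removed.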
 
10-24-2013, 11:50 AM   #3
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Replace the SECOND comma after the FOURTH tilde with a #.

With this InFile ...
Code:
aaaa~bb~cc,cccc~d,,d,,d,,~ee,ee,ee,ee,~fff,f,fff~gg
... this cut-and-paste code ...
Code:
cut -d~ -f5-  $InFile >$Work1
cut -d\, -f1-2 $Work1 >$Work2
cut -d\, -f3-  $Work1 >$Work3
cut -d~ -f1-4 $InFile |paste -d'~#' - $Work2 $Work3 >$OutFile
... produced this OutFile ...
Code:
aaaa~bb~cc,cccc~d,,d,,d,,~ee,ee#ee,ee,~fff,f,fff~gg
Daniel B. Martin
 
10-24-2013, 11:52 AM   #4
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Replace the SECOND comma after the FOURTH tilde with a #.

With this InFile ...
Code:
aaaa~bb~cc,cccc~d,,d,,d,,~ee,ee,ee,ee,~fff,f,fff~gg
... this awk ...
Code:
awk -F "" '{tc=0; cc=0; hit=0;         # tc = tilde count;  cc = comma count
  for (j=1;j<=NF;j++)                  # examine each character, left-to-right
  {if ($j=="~") {tc++; cc=0};          # at each tilde, reset the comma count
   if ($j==",") cc++;                  # increment comma count
   if (tc==4 && cc==2) {hit=1; break}} # when criteria are met, bail out!
  if (hit) print substr($0,1,j-1)"#"substr($0,j+1)  # replace the matched comma
  else print}' "$InFile" >"$OutFile"   # no match: pass the line through unchanged
... produced this OutFile ...
Code:
aaaa~bb~cc,cccc~d,,d,,d,,~ee,ee#ee,ee,~fff,f,fff~gg
Daniel B. Martin

Last edited by danielbmartin; 10-24-2013 at 12:01 PM. Reason: Improve code and comments
 
10-24-2013, 12:58 PM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Or maybe:
Code:
awk 'BEGIN{OFS=FS="~"}$5 = gensub(/,/,"#",2,$5)' file
This example is based on Daniel's example input.
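
Since gensub is a gawk extension, here is a rough equivalent for other awks. It is a sketch under the same assumptions as above (tilde delimiter, fifth field, second comma); the names replace_second_comma and replace_nth are invented for illustration.

```shell
# Sketch: replace the 2nd comma in field 5 of "~"-delimited input
# using only POSIX awk features (no gensub).
replace_second_comma() {
  awk -F'~' -v OFS='~' '
    # Return s with the nth occurrence of the single character c replaced by r.
    function replace_nth(s, c, r, n,    i, seen) {
      for (i = 1; i <= length(s); i++)
        if (substr(s, i, 1) == c && ++seen == n)
          return substr(s, 1, i - 1) r substr(s, i + 1)
      return s    # fewer than n occurrences: leave the string unchanged
    }
    NF >= 5 { $5 = replace_nth($5, ",", "#", 2) }   # skip rows without a 5th field
    { print }
  '
}
```

Usage would be replace_second_comma < InFile > OutFile. Unlike the one-liner, this prints every row, including rows whose fifth field ends up empty or has fewer than two commas.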
 
10-24-2013, 01:28 PM   #6
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by grail
Code:
awk 'BEGIN{OFS=FS="~"}$5 = gensub(/,/,"#",2,$5)' file
Once again, grail comes through with a solution which is concise and elegant. Bravo!

Daniel B. Martin

Last edited by danielbmartin; 10-24-2013 at 01:34 PM.
 
11-01-2013, 11:27 PM   #7
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,187

Rep: Reputation: 354
Since the OP has not marked this thread as "Solved," and the proposed solutions do not seem to address the OP's original question, here's another stab:

The O.P. phrased the question like this:
Quote:
Replace "Text Qualifier" character from data field in unix.

I have below string where '"'(Double Quote) should get replaced with space.

String:- "123"~"23"~"abc"~24.50~"descr :- nut size 12" & bolt size 12"1/2, Quantity=20"~"2013-03-13"

From above string, i want below output:- "123"~"23"~"abc"~24.50~"descr :- nut size 12 & bolt size 12 1/2, Quantity=20"~"2013-03-13"

I have replaced " double quote character with space character.

"descr :- nut size 12" & bolt size 12"1/2, Quantity=20" & "descr :- nut size 12 & bolt size 12 1/2, Quantity=20"

I want to identify such rows from file & would like to replace such text qualifier character from data in unix/linux.
The first part of the question is "I want to identify such rows [in the] file." Phrased that way, there is no reasonable way to answer it, since "such rows" has nowhere been defined.

Some possibilities:
  1. Rows containing "size <number>"{<number>/<number>}
  2. Rows containing 5 or more ~ delimited fields, and at least 1 "<number>{<number>{/<number>} in the fifth field.
  3. Rows containing <ordinal number> or more <symbol> delimited fields, and at least 1 "<number>{<number>{/<number>} in the <ordinal number> field.
  4. Rows containing <ordinal number> or more <symbol> delimited fields, and at least 1 standard linear SAE measurement in the <ordinal number> field.
Since that last interpretation seems most likely, here is an expansion of grail's solution:
Code:
$field ~ /[[:digit:]][\047\042]/ {
  OFS=FS
  $field=gensub(/([[:digit:]]+)[\047\042](.)/,"\\1 \\2","G",$field)
}
{print}
Warning: This code uses gawk extensions, and may not work for other AWK programs.

Using the (single) line provided by the OP, I get:
Code:
$ echo ' "123"~"23"~"abc"~24.50~"descr :- nut size 12" & bolt size 12"1/2, Quantity=20"~"2013-03-13"' | gawk -f ./replace.gawk -- FS=\~ field=5 -
 "123"~"23"~"abc"~24.50~"descr :- nut size 12  & bolt size 12 1/2, Quantity=20"~"2013-03-13"
Notes:
  1. The gawk invocation is the part of the command line after the pipe. The final dash, at the end, tells gawk to read standard input.
  2. The program was saved in a file called replace.gawk for testing.
  3. The last line, {print} is there so that non-SAE lines will be printed as well as the ones converted.
  4. The single-quote (feet) and quote (inches) characters were entered as \047 and \042 to avoid problems when the code is parsed. (The number of backslashes needed is too dependent on the source of the code - command line, program, inside other quotes, etc.)
  5. The program could be made into a command like this:
    Code:
    #!/bin/gawk -f
    $field ~ /[[:digit:]][\047\042]/ {
      OFS=FS
      $field=gensub(/([[:digit:]]+)[\047\042](.)/,"\\1 \\2","G",$field)
    }
    {print}
    Save the file and make it executable. If that code were saved as replace, this would result:
    Code:
    $ echo ' "123"~"23"~"abc"~24.50~"descr :- nut size 12" & bolt size 12"1/2, Quantity=20"~"2013-03-13"' | ./replace FS=\~ field=5 -
     "123"~"23"~"abc"~24.50~"descr :- nut size 12  & bolt size 12 1/2, Quantity=20"~"2013-03-13"
  6. Since FS= and field= are passed as command-line assignments rather than hard-coded in the program, you can change those values before each input file if you need to.
  7. As written, all output goes to /dev/stdout. If the final line, {print}, were changed to {print > out}, then placing an out=name assignment before an input file would write the output to a file called name.
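
As for the "identify such rows" half of the question, here is one possible sketch. It assumes a definition the OP never gave: a row qualifies when any ~ delimited field has a double quote somewhere other than its first or last character. The function name find_embedded_quotes is invented for illustration.

```shell
# Sketch: print "line number: line" for rows where some "~"-delimited field
# has a double quote anywhere except its first or last character.
find_embedded_quotes() {
  awk -F'~' '{
    for (i = 1; i <= NF; i++) {
      inner = substr($i, 2, length($i) - 2)   # field minus its outer characters
      if (index(inner, "\"") > 0) { print NR ": " $0; next }
    }
  }'
}
```

Run as find_embedded_quotes < InFile. It only reports rows and changes nothing, so the list can be reviewed before any of the replacements above are applied.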

Last edited by PTrenholme; 11-01-2013 at 11:38 PM. Reason: The final quote "$field" was (incorrectly) removed
 
  

