Where should I use a backslash? I want awk to ignore a pipe that appears as part of a column's value, so that the value is treated as a single column when I print a column with awk.
GNU awk (gawk) version 4 provides a way to handle such situations. With the built-in variable FPAT you define fields by a regular expression that describes their content. This means you don't set a field separator; instead you decide what a field is. In your example, a field is either a run of characters containing no pipe or anything inside double quotes. Here we go:
Code:
echo 'xx|yy|"xyz|zzz"|zzz|12' | awk 'BEGIN{ FPAT = "([^|]+)|(\"[^\"]+\")" }{ for ( i = 1; i <= NF; i++ ) print $i }'
xx
yy
"xyz|zzz"
zzz
12
I will try the one above, but what I actually need is this: I have a |-separated file in which I need to look for characters in the 3rd column and replace them with null. Only the 3rd field should be replaced, and only where it contains a character.
For example:
Code:
file1.txt
xx|yy|xx|12
output file:
xx|yy||12
I achieved the above with the code below:
awk 'BEGIN {FS=OFS="|" } $3 ~ /[[:alnum:]]/ { $3="" }1' file
The problem I ran into is that a column containing a pipe should be treated as a single column:
xx|yy|"xyz|xx"|AAA|12...
Now the result I need is this:
xx|yy|"xyz|xx"||12
Here AAA should be replaced with null, with AAA counted as the 4th column if use
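To see why the one-liner above miscounts, here is a small sketch (the function name show_split is made up for illustration) that prints how a plain FS="|" split sees the quoted line:

```shell
# show_split is a hypothetical helper, not from the thread.
# It prints the field count, then $3 and $5, using a plain pipe separator.
show_split() {
    awk -F'|' '{ print NF; print $3; print $5 }'
}

echo 'xx|yy|"xyz|xx"|AAA|12' | show_split
# prints:
# 6
# "xyz
# AAA
```

The quoted value is cut in two, so AAA lands in $5 rather than $4.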
I want to write this post tactfully and respectfully. I realize that English is not your first language. Your post (quoted above) is confusing. Reword it carefully -- get help from a friend if necessary. Strive for clarity. Give more than two examples of input strings and the corresponding desired output strings.
Daniel B. Martin
Last edited by danielbmartin; 08-09-2013 at 06:03 AM.
I want to replace the fourth column of file1.txt with a space wherever the 4th column holds an alphanumeric value, and to treat the zz value as the fourth column instead of the 5th.
I have managed to replace the column with a space using the code below, but it does not treat the zz value as the 4th column. Instead it replaces xyz with a space, because the third column, i.e. "abc|xyz", is split on the pipe delimiter.
Can you tell us what version of awk you're using? Most of the solutions provided here already give you what you want. Only some minor modifications are needed.
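As a hedged illustration of what such a minor modification might look like, the FPAT approach from earlier in the thread can be combined with the field-blanking rule. This sketch assumes GNU awk 4 or later is installed as gawk; the wrapper name blank_fourth is invented here.

```shell
# blank_fourth is a hypothetical wrapper; it assumes GNU awk 4+ is available as "gawk".
blank_fourth() {
    gawk 'BEGIN {
              FPAT = "([^|]+)|(\"[^\"]+\")"   # a field is a run of non-pipes or a quoted string
              OFS  = "|"
          }
          $4 ~ /[[:alnum:]]/ { $4 = "" }      # blank the 4th field if it holds a letter or digit
          1'
}

# Example: echo 'xx|yy|"xyz|xx"|AAA|12' | blank_fourth
# should print: xx|yy|"xyz|xx"||12
```

Note that with this FPAT an empty field (two adjacent pipes) is not counted as a field, so lines that already contain empty columns would need a different pattern.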
For future reference, you should mention that you are working on Solaris, as it is quite a different beast from Linux and often has a smaller or different set of applications.
I have not tested konsolebox's solution, but the one from colucix will not work in nawk.
You could also look at Perl or Ruby if they are options.
Mine won't work with it either. The array-filling form of match() is a GNU extension.
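For context, the portable form of match() takes two arguments and reports its result through the built-in variables RSTART and RLENGTH; only the three-argument form that fills an array is gawk-specific. A minimal sketch (first_field is a made-up name) that pulls the first field off a line without the extension:

```shell
# first_field is a hypothetical helper. It uses only the two-argument match(),
# which sets RSTART and RLENGTH in POSIX awk.
first_field() {
    awk '{
        # Try a quoted field first, then fall back to a run of non-pipe characters.
        if (match($0, /^"[^"]+"/) || match($0, /^[^|]+/))
            print substr($0, RSTART, RLENGTH)
    }'
}

echo '"xyz|xx"|AAA|12' | first_field    # prints "xyz|xx"
echo 'xx|yy|12' | first_field           # prints xx
```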
I tried to give a solution with this. It works, but it runs slower in other awks than in GNU awk, since they regenerate $0 right away whenever $x or NF is altered.
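The regeneration mentioned here is easy to observe: assigning to any field makes awk rebuild $0 from the fields, joined with OFS. A small sketch (rebuild_demo is an invented name):

```shell
# rebuild_demo is a hypothetical helper that prints a record before and after
# a field assignment.
rebuild_demo() {
    awk 'BEGIN { OFS = "|" }
         {
             print            # $0 as read, still space-separated
             $2 = "X"         # assigning a field rebuilds $0 using OFS
             print
         }'
}

echo 'a b c' | rebuild_demo
# prints:
# a b c
# a|X|c
```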
Code:
#!/usr/bin/awk -f

BEGIN {
    OFS = "|"
}

# Remove field i by shifting the later fields left (j and k are locals).
function delete_column(i,    j, k) {
    j = 0
    for (k = 1; k <= NF; ++k) {
        if (k == i) {
            ++j
        } else if (j) {
            $(k - j) = $k
        }
    }
    NF -= j
}

{
    string = $0
    NF = 0
    if (l = length(string)) {
        for (;;) {
            # Peel one field off the front: a quoted field first, else a plain one.
            # A quoted field at the very end of the line has no trailing pipe,
            # so it is not matched by the first pattern.
            if (match(string, /^"[^"]+"\|/)) {
                next_string = string
                sub(/^"[^"]+"\|/, "", next_string)
            }
            else if (match(string, /^[^|]*\|/)) {
                next_string = string
                sub(/^[^|]*\|/, "", next_string)
            }
            else {
                break
            }
            $(++NF) = substr(string, 1, l - length(next_string) - 1)
            string = next_string
            l = length(string)
        }
        $(++NF) = string
    }
    #
    # Do anything with $<any> here, e.g. $3 = "", or delete_column(3),
    # which deletes the field rather than just setting it to a null value.
    #
    print
}
Last edited by konsolebox; 08-10-2013 at 01:19 AM.
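For awks without FPAT, one alternative is to walk the line character by character and track whether the scan is inside double quotes. The following is a sketch under those assumptions, not something tested on Solaris; blank_field and its argument are names introduced here, and [A-Za-z0-9] stands in for [[:alnum:]] in case the local awk lacks POSIX character classes.

```shell
# blank_field N: hypothetical helper that blanks field N if it contains a letter or digit.
# Pipes inside double quotes are not treated as separators.
blank_field() {
    awk -v target="$1" '{
        n = 0; field = ""; inq = 0
        len = length($0)
        for (i = 1; i <= len; i++) {
            c = substr($0, i, 1)
            if (c == "\"")
                inq = !inq                  # toggle the inside-quotes flag
            if (c == "|" && !inq) {
                f[++n] = field              # unquoted pipe ends the current field
                field = ""
            } else {
                field = field c
            }
        }
        f[++n] = field                      # last field has no trailing pipe

        if (target <= n && f[target] ~ /[A-Za-z0-9]/)
            f[target] = ""

        line = f[1]
        for (i = 2; i <= n; i++)
            line = line "|" f[i]
        print line
    }'
}

echo 'xx|yy|"xyz|xx"|AAA|12' | blank_field 4    # prints xx|yy|"xyz|xx"||12
echo 'xx|yy|xx|12' | blank_field 3              # prints xx|yy||12
```

Because the scan keeps its own quote state, a quoted value in the last column is handled the same way as one in the middle of the line.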