LinuxQuestions.org
Old 11-08-2011, 12:21 PM   #1
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Rep: Reputation: Disabled
AWK: gsub ' or set ' as field separator


Dear Experts,

I have a file with only one line like:
Code:
Borel log 'J 35' Lombardia
What I need is to extract the value J 35, so I tried two things.
First:
Code:
awk 'BEGIN {FS ="\'"}{print $2}' THELINEFILE
It returns:
Code:
>
which is a prompt asking me for more input. It seems that something is wrong with the \' and that the awk command is still waiting for another ' to finish it.

Second:
Code:
awk '{gsub(/\'/, "");print $3, $4}' THELINEFILE
It returns:
Code:
bash: syntax error near unexpected token `)'
I don't understand why this error occurs.

Could anyone tell me how to do these two things the right way, and why I got those errors?

I would greatly appreciate your help!

Last edited by cristalp; 11-08-2011 at 12:22 PM.
 
Old 11-08-2011, 12:35 PM   #2
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
not awk's fault

Hi,

this is actually a bash "problem". Try it this way:
Code:
awk 'BEGIN {FS ="'\''"}{print $2}'
This is the proper way to *escape* a single quote inside single quotes. Once bash encounters an opening single quote, it does not interpret any escape sequences until it encounters the closing single quote. So before your script is passed to awk, bash sees the first single quote and reads everything that follows until it reaches your second quote, which it interprets as the closing quote. The backslash in front of it is taken literally and does not escape anything. The double quote that follows is therefore interpreted as an opening double quote that never gets closed, which is why bash prompts for more input.

In order to escape a single quote inside single-quotes you first have to close the quote to enable bash's escape mechanism again. Then it can interpret an escaped single quote and after that you need to reopen the single-quote.
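Editor's sketch of the close, escape, reopen sequence described above. It assumes a POSIX shell and any awk; the sample line is the one from the first post, fed through a pipe instead of a file. Using printf to display the argument is an illustration added here, not part of crts's post.

```bash
# printf '%s\n' prints every argument it receives on its own line, so it
# reveals what a command gets after the shell has removed the quotes.
script=$(printf '%s\n' 'BEGIN {FS ="'\''"}{print $2}')
printf '%s\n' "$script"
# prints: BEGIN {FS ="'"}{print $2}

# The same quoting applied to the sample line from the first post.
result=$(printf '%s\n' "Borel log 'J 35' Lombardia" | awk 'BEGIN {FS ="'\''"}{print $2}')
printf '%s\n' "$result"
# prints: J 35
```

The three pieces 'BEGIN {FS ="' and \' and '"}{print $2}' are glued into one argument, so awk only ever sees a single literal quote between the double quotes.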

Last edited by crts; 11-10-2011 at 10:07 AM.
 
1 members found this post helpful.
Old 11-08-2011, 02:03 PM   #3
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
Alternatively, you can try the -F option:
Code:
awk -F"'" '{print $2}' file
or the cut command:
Code:
cut -d"'" -f2 file
Both work by enclosing the single quote within double quotes.
 
Old 11-08-2011, 02:57 PM   #4
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
And yet another two ways, using the hexadecimal (\x27) and octal (\047) escape codes for the single quote:
Code:
awk 'BEGIN{FS="\x27"}{print $2}' cristalp 
J 35
awk 'BEGIN{FS="\047"}{print $2}' cristalp 
J 35
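Editor's sketch applying the same octal escape to the gsub attempt from the first post, which the thread does not answer directly. The original command fails for the same reason as the FS attempt: bash ends the single-quoted string at the quote after the backslash, so the later `)` is unquoted and bash reports a syntax error before awk ever runs. Writing the quote as \047 inside the regular expression keeps the whole program within one pair of single quotes. The sample line is piped in rather than read from a file.

```bash
line="Borel log 'J 35' Lombardia"

# \047 is the octal code for the single quote, so no literal quote
# character appears inside the single-quoted awk program.
cleaned=$(printf '%s\n' "$line" | awk '{gsub(/\047/, ""); print $3, $4}')
printf '%s\n' "$cleaned"
# prints: J 35
```

After gsub removes both quotes, awk re-splits the line on whitespace, so J and 35 become fields 3 and 4.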


Cheers,
Tink

Last edited by Tinkster; 11-08-2011 at 03:00 PM. Reason: added octal version
 
1 members found this post helpful.
Old 11-09-2011, 01:55 AM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Take 3:
Code:
awk '{print $2}' FS="'" file
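Editor's sketch of one caveat with this form. An assignment written as an operand after the program is processed when awk reaches it in the argument list, which is before the file that follows it is read but after any BEGIN block has run. The temporary file and the variable names main_rule and begin_rule are made up for the illustration.

```bash
tmp=$(mktemp)
printf '%s\n' "Borel log 'J 35' Lombardia" > "$tmp"

# The main rule sees the new separator, because the assignment is
# processed before awk opens the file named after it.
main_rule=$(awk '{print $2}' FS="'" "$tmp")
printf '%s\n' "$main_rule"
# prints: J 35

# A BEGIN block runs before operand assignments, so FS still holds
# its default value (a single space) at that point.
begin_rule=$(awk 'BEGIN {printf "[%s]", FS}' FS="'" "$tmp")
printf '%s\n' "$begin_rule"
# prints: [ ]

rm -f "$tmp"
```

If the separator is needed inside BEGIN, the -F option or a BEGIN assignment, as in the earlier posts, is the safer choice.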
 
1 members found this post helpful.
Old 11-09-2011, 01:50 PM   #6
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Since you say there's only one line in the file:

Code:
IFS="'" read -r -a line <file && echo "${line[1]}"
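Editor's sketch of a variant of the same idea that avoids arrays, for shells without read -a. It assumes the line contains exactly one quoted value; the variable names before, value and after and the temporary file are invented for the example.

```bash
tmp=$(mktemp)
printf '%s\n' "Borel log 'J 35' Lombardia" > "$tmp"

# IFS is changed only for this one read command. The line is split at
# each single quote: text before, the quoted value, and the remainder.
IFS="'" read -r before value after < "$tmp"
printf '%s\n' "$value"
# prints: J 35

rm -f "$tmp"
```

Because the IFS assignment is a prefix to read, the shell's normal word splitting is unchanged for the commands that follow.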

Last edited by David the H.; 11-09-2011 at 01:56 PM.
 
1 members found this post helpful.
Old 11-10-2011, 04:25 AM   #7
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by crts
Hi,

this is actually a bash "problem". Try it this way:
Code:
awk 'BEGIN {FS ="'\''"}{print $2}'
This is the proper way to *escape* a single quote inside single quotes. If bash encounters an opening single quote then it does not do any interpretation of escape sequences until it encounters another closing single quote. So before your script is passed to awk, bash sees the first single quote and reads everything following until it encounters your second quote, which is interpreted as closing quote. The escaping backslash is ignored. So the last single quote is then again interpreted as opening quote. In order to escape a single quote inside single quotes you first have to close the quote to enable bash's escape mechanism again. Then it can interpret an escaped single quote, and after that you need to reopen the single quote.
Thanks a lot for your detailed explanation, crts! I tried your code and it works fine. But I am not sure I understand your explanation correctly. Please allow me to ask a small question here.

As I understand it, you mean that I need to close every quote that appears in the code. To do this, I thought I needed to give every single quote a partner so that they form a pair. If a single quote is left alone in the code, bash will think the command is not finished yet, so the code fails. Am I understanding this right?

If so, the number of quotes appearing in the code can only be an even number, right?

But in your code:
Code:
awk 'BEGIN {FS ="'\''"}{print $2}'
It has 5 single quotes, which is an odd number. It seems that there is a conflict between your code and your explanation. I don't know if I understand you correctly. Could you please explain further? I would appreciate your kind help!
 
Old 11-10-2011, 08:48 AM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
You need to look up escaping (backslash prior to quote) in your bash reference.
 
Old 11-10-2011, 10:03 AM   #9
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Quote:
Originally Posted by cristalp
If so, the number of quotes appearing in the code can only be an even number, right?
No, this is not correct. However, there was in fact an error in my previous explanation, which I have now corrected. The character responsible for the unclosed quote is the last double quote, not the last single quote as I initially stated. Single quotes are not treated specially inside double quotes. Outside double quotes, as soon as bash encounters an opening single quote, it deactivates its escape mechanism. So let us review your initial example:
Code:
awk 'BEGIN {FS ="\'"}{print $2}'
    ^OQ -> deactivate escape mechanism
                 ^ this backslash now loses its meaning as escape character
As you see, once the first opening quote (OQ) is encountered, everything following it is taken literally as part of the argument for awk. Even the backslash loses its meaning; normally it would indicate that the character following it should not be interpreted as special. So the text from the opening quote up to and including the backslash, BEGIN {FS ="\, is passed as is.
Code:
awk 'BEGIN {FS ="\'"}{print $2}'
                  ^CQ -> activate escape mechanism
                   ^ this double quote now regains its special meaning
Now the quote following the backslash is interpreted as closing quote (CQ). At this point bash is capable of interpreting other special characters like " and \ again. So the " is interpreted as an opening double-quote that never gets closed.
Now let us examine the working version:
Code:
awk 'BEGIN {FS ="'\''"}{print $2}'
    ^OQ          ^CQ
                  ^ the backslash now has special meaning
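Here the first single-quoted part, BEGIN {FS =", is passed as is to awk because it is enclosed in single quotes. As you see, the backslash now follows the closing quote (CQ). Bash interprets it as an escape character because it is no longer inside an open quote. Therefore the next (third) single quote is escaped and not interpreted as an opening quote. The fourth single quote then opens a new quoted section, which the fifth one closes. So of the five single quotes, four form two quoted pairs and one is a literal character, which is why the count is odd.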

To sum it up:
Code:
$ echo single-quote \'  # Just prepend a \
single-quote '
$ echo 'single-quote '\'' inside single-quotes'  # Escape sequence: '\''
single-quote ' inside single-quotes
$ echo "single-quote ' inside double-quotes"  # no need to escape it at all
single-quote ' inside double-quotes
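Editor's sketch of one more option that follows from the third case in the summary: since a single quote needs no escaping inside double quotes, it can be handed to awk through the -v option and kept out of the single-quoted program entirely. The variable name q is an arbitrary choice for this example, and the sample line is piped in rather than read from a file.

```bash
line="Borel log 'J 35' Lombardia"

# The quote character is passed in as the awk variable q, so the
# single-quoted program itself contains no quote to escape.
value=$(printf '%s\n' "$line" | awk -v q="'" 'BEGIN {FS = q} {print $2}')
printf '%s\n' "$value"
# prints: J 35
```

Unlike an assignment placed after the program, a -v assignment is made before BEGIN runs, so FS can be set from it there.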

Last edited by crts; 11-10-2011 at 10:12 AM.
 
1 members found this post helpful.
  

