LinuxQuestions.org
Old 08-27-2012, 10:12 AM   #1
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Rep: Reputation: 918
weird awk behavior


hi, i am trying to run this command on this sample input. i am also expecting the chun-li lines that end in n (marked in red in the original post) to be part of the output, but for some reason they are not being printed:
Code:
[schneidz@hyper ~]$ cat test.lst
L  180 11000000   :     chun-li                            :     y 
L  180 11000000   :     chun-li                            :     n 
L  180 11000000   :     akuma                              :     y 
L  180 11000000   :     l33t                               :     y 
L  180 11000000   :     h4x0rz                             :     n 
L  180 11000000   :     hello                              :     y 
L  180 11000000   :     world                              :     n 
L  180 11000000   :     chun-li                            :     n 
[schneidz@hyper ~]$ awk 'index($0,"n") == 66 {print $0}' test.lst
L  180 11000000   :     h4x0rz                             :     n 
L  180 11000000   :     world                              :     n
i think it's because there is an n in chun-li but that shouldn't matter since it is not the 66th byte in the record ?

thanks,

Last edited by schneidz; 08-27-2012 at 10:17 AM.
 
Old 08-27-2012, 10:39 AM   #2
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,780

Rep: Reputation: 2214
Some of that spacing might have been done with tabs. You can't distinguish that on the screen, and it will affect the character count. See what this reveals:
Code:
tr '\t' '@' <test.lst
Relying on precise character positions in formatted output is often unreliable. Couldn't you just do this instead?
Code:
awk '$7 == "n" {print}' test.lst
 
Old 08-27-2012, 10:41 AM   #3
firstfire
Member
 
Registered: Mar 2006
Location: Ekaterinburg, Russia
Distribution: Debian, Ubuntu
Posts: 709

Rep: Reputation: 428
Hi.
Quote:
i think it's because there is an n in chun-li
Apparently, yes:
Code:
$ echo banana|  awk '{print index($0,"n")}'
3
Quote:
but that shouldn't matter since it is not the 66th byte in the record ?
No. index() returns the position of the first (leftmost) occurrence of the string. For "chun-li" index() returns something like 28, so the condition is not met and the line is not printed.

Maybe try this:
Code:
$ awk '$7~/n/' in2 
L  180 11000000   :     chun-li                            :     n 
L  180 11000000   :     h4x0rz                             :     n 
L  180 11000000   :     world                              :     n 
L  180 11000000   :     chun-li                            :     n
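
To see the difference between "where is the first n" and "what sits in column 66", something like the following can be run. It is only a sketch: the two records are rebuilt with printf on the assumption that the name field is padded to 35 columns, and make_sample and report are names made up for this example.

```shell
# Rebuild two records in the thread's layout. Assumption: the name field is
# padded to 35 columns, which puts the y/n flag in column 66.
make_sample() {
    printf 'L  180 11000000   :     %-35s:     %s \n' chun-li n
    printf 'L  180 11000000   :     %-35s:     %s \n' world n
}

# For each record print: name, position of the first "n", character in column 66.
report() {
    awk '{ print $5, index($0, "n"), substr($0, 66, 1) }'
}

make_sample | report
# chun-li 28 n
# world 66 n
```

Both records have an n in column 66, but only the second one has its first n there, which is why index($0,"n") == 66 skips chun-li.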

Last edited by firstfire; 08-27-2012 at 10:45 AM.
 
1 member found this post helpful.
Old 08-27-2012, 10:46 AM   #4
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by schneidz View Post
... for some reason they are not getting outputted ...
Your awk works correctly on my machine. Perhaps your actual data contains tab characters which look like blanks and that causes confusion. There might be a simpler way to extract the "n" lines. Is that character always the right-most character in each line? Or the right-most non-blank character?

Daniel B. Martin
 
Old 08-27-2012, 10:58 AM   #5
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Original Poster
Rep: Reputation: 918
Quote:
Originally Posted by rknichols View Post
Some of that spacing might have been done with tabs. You can't distinguish that on the screen, and it will affect the character count. See what this reveals:
Code:
tr '\t' '@' <test.lst
Code:
[schneidz@hyper ~]$ tr '\t' '@' <test.lst
L  180 11000000   :     chun-li                            :     y 
L  180 11000000   :     chun-li                            :     n 
L  180 11000000   :     akuma                              :     y 
L  180 11000000   :     l33t                               :     y 
L  180 11000000   :     h4x0rz                             :     n 
L  180 11000000   :     hello                              :     y 
L  180 11000000   :     world                              :     n 
L  180 11000000   :     chun-li                            :     n
Quote:
Originally Posted by rknichols View Post
Relying on precise character positions in formatted output is often unreliable. Couldn't you just do this instead?
Code:
awk '$7 == "n" {print}' test.lst
that would be ideal but some records would have chun li instead of chun-li.
 
Old 08-27-2012, 10:59 AM   #6
firstfire
Member
 
Registered: Mar 2006
Location: Ekaterinburg, Russia
Distribution: Debian, Ubuntu
Posts: 709

Rep: Reputation: 428
Quote:
Originally Posted by schneidz View Post
that would be ideal but some records would have chun li instead of chun-li.
Code:
$ awk -F ":"  '$3~/n/' in2 
L  180 11000000   :     chun-li                            :     n 
L  180 11000000   :     h4x0rz                             :     n 
L  180 11000000   :     world                              :     n 
L  180 11000000   :     chun-li                            :     n
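
A slightly stricter variant of the same idea, offered as a sketch: strip the blanks around the third colon-separated field and compare it to "n" exactly, so a longer flag value that merely contains an n would not match. flag_is_n is a hypothetical helper name, and the two input records are made up for illustration.

```shell
# Hypothetical helper: print records whose third colon-separated field,
# with surrounding blanks and tabs removed, is exactly "n".
flag_is_n() {
    awk -F: '{ flag = $3; gsub(/^[ \t]+|[ \t]+$/, "", flag) } flag == "n"' "$@"
}

# Made-up records: a name with a space in it, and a name that contains n's.
printf '%s\n' \
    'L  180 11000000   :     chun li                            :     n ' \
    'L  180 11000000   :     neon                               :     y ' |
    flag_is_n
```

Only the chun li record should come out. This still assumes the name itself never contains a colon; if it can, the field number would shift.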
 
1 member found this post helpful.
Old 08-27-2012, 11:07 AM   #7
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Original Poster
Rep: Reputation: 918
Quote:
Originally Posted by danielbmartin View Post
Your awk works correctly on my machine. Perhaps your actual data contains tab characters which look like blanks and that causes confusion. There might be a simpler way to extract the "n" lines. Is that character always the right-most character in each line? Or the right-most non-blank character?

Daniel B. Martin
Code:
[schneidz@hyper ~]$ awk --version
GNU Awk 3.1.8
Copyright (C) 1989, 1991-2010 Free Software Foundation.
...
yes, this is from db2-sql output. i added : around column 1 to help with post-processing, so maybe i can add another string in the sql export, or use awk -F : '{print $3}', or maybe n$...

thanks,
 
Old 08-27-2012, 11:22 AM   #8
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,913

Rep: Reputation: 7318
or
Code:
awk '$NF == "n"' file
Quote:
Originally Posted by schneidz
i think it's because there is an n in chun-li but that shouldn't matter since it is not the 66th byte in the record ?
index($0, "n") will return the position of the first n,
so index($0, "n") == 66 does not test whether the 66th char in the line is an n; it tests whether the first n in the line is at position 66.
 
2 members found this post helpful.
Old 08-27-2012, 01:38 PM   #9
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Original Poster
Rep: Reputation: 918
fyi, i can hack it into this:
Code:
[schneidz@hyper ~]$ awk 'index($0," n") == 65 {print $0}' test.lst
L  180 11000000   :     chun-li                            :     n 
L  180 11000000   :     h4x0rz                             :     n 
L  180 11000000   :     world                              :     n 
L  180 11000000   :     chun li                            :     n
but who knows when i hit a record that reads like chu n-li...
 
Old 08-27-2012, 02:39 PM   #10
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,780

Rep: Reputation: 2214
The way to ask specifically about the 66th character is:
Code:
awk 'substr($0,66,1) == "n" {print}' test.lst
That extracts a string 1 character in length beginning at position 66 and compares it to "n".
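
Before relying on a fixed column, it may be worth checking that the flag really starts in the same column in every record. A minimal sketch, assuming the flag is a single y or n followed only by blanks; flag_columns is a hypothetical helper name and the two records are made up, one of them with a tab in the padding.

```shell
# Hypothetical helper: list every distinct column in which the trailing
# y/n flag starts (0 means the record has no trailing flag at all).
flag_columns() {
    awk '{ print match($0, /[yn][ \t]*$/) }' "$@" | sort -u
}

# Two made-up records: the first padded with spaces only, the second with
# a single tab where the first has three spaces.
{
    printf 'L  180 11000000   :     %-35s:     %s \n' 'chun li' n
    printf 'L  180 11000000\t:     %-35s:     %s \n' world n
} | flag_columns
# 64
# 66
```

More than one number in the output means substr($0,66,1) would miss some records.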
 
2 members found this post helpful.
Old 08-27-2012, 03:52 PM   #11
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Quote:
Originally Posted by rknichols View Post
See what this reveals:
Code:
tr '\t' '@' <test.lst
An easier way is to just use cat -A to view all non-printing characters. Any tab characters will be displayed with the caret notation "^I".
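
On systems where cat has no -A option, the l command of sed should give a similar view: tabs are written as \t and the end of each line is marked with $. A small sketch with a made-up record; show_invisibles is a hypothetical helper name.

```shell
# Hypothetical helper: make tabs and trailing blanks visible with sed's "l" command.
show_invisibles() {
    sed -n l "$@"
}

# One made-up record with a literal tab hidden in the padding.
printf 'L  180\t11000000 :     n \n' | show_invisibles
# L  180\t11000000 :     n $
```

Note that sed folds long lines in this mode and marks the fold with a trailing backslash, so wide records may be split over two output lines.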
 
1 member found this post helpful.
Old 08-27-2012, 04:25 PM   #12
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by rknichols View Post
The way to ask specifically about the 66th character is:
Code:
awk 'substr($0,66,1) == "n" {print}' test.lst
That extracts a string 1 character in length beginning at position 66 and compares it to "n".
A variation on the same theme...
Code:
awk -F "" '$66~/n/' $InFile
Daniel B. Martin
 
1 member found this post helpful.
Old 08-28-2012, 07:39 AM   #13
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Original Poster
Rep: Reputation: 918
Quote:
Originally Posted by danielbmartin View Post
A variation on the same theme...
Code:
awk -F "" '$66~/n/' $InFile
Daniel B. Martin
fyi, this worx on fedora but not on aix.
 
  

