Old 08-31-2011, 01:41 AM   #1
lolmon
LQ Newbie
 
Registered: Aug 2011
Posts: 2

Rep: Reputation: Disabled
awk with pipe delimited file (specific column matching and multiple pattern matching)


Hi all, so I am working on a bash script, and am currently stuck trying to figure out how to deal with the following file:

Code:
467487|field1|more fields|first pattern|stuff|470061|more text|text with spaces|None|Red|another_field|8/30/2011 5:12:30 PM|9/6/2011 5:12:30 PM|one more text field with spaces|651463|
468751|field1|more fields|first pattern|stff|470061|more text|text with spaces|None|Red|another_field|8/30/2011 1:12:30 AM|9/2/2011 5:12:30 PM|one more text field with spaces|651463|
450104|field1|more fields|Legend|text|4700621|more text|text with spaces|None|Red|another_field|8/30/2011 5:44:30 PM|9/1/2011 5:12:30 PM|one more text field with spaces|651463|
465318|field1|more fields|Legend|text|442061|more text|text with spaces|None|Red|another_field|8/21/2011 2:12:30 PM|9/3/2011 5:12:30 PM|one more text field with spaces|651463|
So, what I need to do is match not only 'first pattern' in the 4th column but also the date in the 12th column. (The date that needs to be matched in this example is 8/30/2011, which is stored in $today.)
Assuming both of these patterns match in those specific columns (they also exist in other columns I do not care about), I also need to increment a counter by one.

Now, assuming I have been learning right, I need something like this to start with:

awk -F'|' '$4 == "first pattern" { print $0 >> "count.txt" }' file

Then I did a line count on count.txt to find out how many lines matched. I know awk can do the counting itself, so I don't need to waste clock cycles: something like adding { ++x } END { print x } at the end of my expressions. However, I do not know how to combine all this so that the counter is incremented only if both $4=="first pattern" and $12==$date are true.

Any help that could steer me in the right direction would be truly fantastic.

-lolmon
 
Old 08-31-2011, 01:56 AM   #2
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,005
Blog Entries: 11

Rep: Reputation: 903
Hi, welcome to LQ!

Almost there!

Code:
awk -F'|' -v date="$date" '$4=="first pattern" && $12==date {x++}END{print x}' file
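
One detail about the one-liner above that may be worth knowing: if no line matches, x is never set and END{print x} prints an empty line. Printing x+0 instead forces a numeric 0. The following is only a sketch; the temp file and its two shortened rows are made-up sample data, not the real input file.

```shell
#!/bin/bash
# Hypothetical sample data: two rows cut down to the first 12 fields
# of the layout shown in the thread, with a bare date in column 12.
sample=$(mktemp)
cat > "$sample" <<'EOF'
467487|field1|more fields|first pattern|stuff|470061|more text|text with spaces|None|Red|another_field|8/30/2011|
450104|field1|more fields|Legend|text|4700621|more text|text with spaces|None|Red|another_field|8/30/2011|
EOF

date="8/30/2011"

# x+0 forces a numeric result, so "no matches" prints 0 instead of a blank line.
matches=$(awk -F'|' -v date="$date" '$4=="first pattern" && $12==date {x++} END{print x+0}' "$sample")

# Same test with a date that does not occur in the sample.
none=$(awk -F'|' -v date="9/9/2011" '$4=="first pattern" && $12==date {x++} END{print x+0}' "$sample")

echo "matches=$matches none=$none"
rm -f "$sample"
```

This matters mostly when the count is fed into a later numeric test in the surrounding bash script, where an empty string would cause an error.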

Cheers,
Tink
 
Old 08-31-2011, 02:20 AM   #3
lolmon
LQ Newbie
 
Registered: Aug 2011
Posts: 2

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Tinkster
Hi, welcome to LQ!

Almost there!

Code:
awk -F'|' -v date="$date" '$4=="first pattern" && $12==date {x++}END{print x}' file

Cheers,
Tink
Thank you so much! I cannot believe it was as simple as a &&. However, I am still having trouble with the date section: if I remove it, it does pick up everything in column 4, but if I keep it in there, it finds nothing. Is awk making sure that column 12 contains ONLY the date, so that the hours, minutes and seconds of the time keep it from matching? And if so, what is the best way to fix it?

I tried doing ^date, but it does not like that format, which I assume has to do with the ==. I am looking that up right now, but I figured I should post anyway and thank you for the first part.

EDIT: It is clearly that. I removed the extra time info and the script works just as planned, so all I need to do is figure out how to tell it that the column only has to start with the date...
EDIT2: It works if I use the ~ match operator instead of ==, and that should be fine considering the data file will always look like my example anyway.

thanks again!

Last edited by lolmon; 08-31-2011 at 02:30 AM.
 
Old 08-31-2011, 03:56 AM   #4
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,005
Blog Entries: 11

Rep: Reputation: 903
Well ... the problem is that with '==' you need to populate the shell variable $date
with the exact date and time you're looking for. If it's just a date then,
yes, the '~' for the date part is adequate :}
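
A small caveat on '~', in case the dates ever vary more than in the example: it does an unanchored regex match, so a date like 1/3/2011 is also found inside 11/3/2011. A literal prefix test with index() avoids that. This is just a sketch; the rows below are invented sample data (shortened to 12 fields), and the assumption is that the date in column 12 is always followed by a space and the time.

```shell
#!/bin/bash
# Hypothetical sample rows, shortened to the first 12 fields of the thread's layout.
sample=$(mktemp)
cat > "$sample" <<'EOF'
467487|field1|more fields|first pattern|stuff|470061|more text|text with spaces|None|Red|another_field|1/3/2011 5:12:30 PM|
468751|field1|more fields|first pattern|stff|470061|more text|text with spaces|None|Red|another_field|11/3/2011 1:12:30 AM|
450104|field1|more fields|Legend|text|4700621|more text|text with spaces|None|Red|another_field|1/3/2011 5:44:30 PM|
EOF

date="1/3/2011"

# Unanchored regex match: "1/3/2011" is also found inside "11/3/2011".
loose=$(awk -F'|' -v date="$date" '$4=="first pattern" && $12 ~ date {x++} END{print x+0}' "$sample")

# Literal prefix match: column 12 must begin with the date followed by a space.
strict=$(awk -F'|' -v date="$date" '$4=="first pattern" && index($12, date " ")==1 {x++} END{print x+0}' "$sample")

echo "loose=$loose strict=$strict"
rm -f "$sample"
```

With this sample the loose version counts two lines and the strict one counts only the line that really starts with 1/3/2011.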


Cheers,
Tink
 
Old 08-31-2011, 01:17 PM   #5
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
Incidentally, you could also do the same thing entirely in bash, using an array.

Code:
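#!/bin/bash

file="inputfile.txt"
pattern="first pattern"
today="8/30/2011"
count=0

# Split each line on "|" into the array "line". IFS is set only for read,
# and -r keeps any backslashes in the data literal.
while IFS='|' read -r -a line ; do

     # Arrays are zero-indexed: line[3] is column 4, line[11] is column 12.
     # A quoted right-hand side is matched literally; the trailing * is a glob.
     if [[ "${line[3]}" == "$pattern" && "${line[11]}" == "$today"* ]]; then
          (( count++ ))
     fi

done <"$file"

echo "$count"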
Bash's [[ test will let you use either globbing or regex in the tests, whichever suits your purposes better.
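
To illustrate the two options side by side, here is a minimal sketch using one hard-coded field value taken from the example data; the variable names glob_hit and regex_hit are made up for the demonstration.

```shell
#!/bin/bash
field="8/30/2011 5:12:30 PM"
today="8/30/2011"

glob_hit=no
regex_hit=no

# Glob: the quoted variable is matched literally, the unquoted * matches the rest.
if [[ $field == "$today"* ]]; then
     glob_hit=yes
fi

# Regex: ^ anchors at the start, the quoted "$today" is literal,
# and [[:space:]] requires whitespace right after the date.
if [[ $field =~ ^"$today"[[:space:]] ]]; then
     regex_hit=yes
fi

echo "glob=$glob_hit regex=$regex_hit"
```

The regex form is a little stricter here, since it also insists on whitespace after the date, while the glob accepts anything that merely starts with it.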

I imagine awk would still be more efficient for a large file, however.
 
  

