10-27-2011, 10:52 AM
#1
LQ Newbie
Registered: Oct 2011
Posts: 9
awk extract different parts per line
Hi
I'm trying to extract from these lines
Code:
2011-06-26 23:59:56.746#11.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#advId=103613446#lang=de
2011-06-26 23:59:56.888#11.203.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#a=default#advId=103659208
2011-06-26 23:59:57.202#11.203.11.174#gGwcTHrdwrxKPS56v7TLxwnN0HsKSpHmGvJc1Vw1t7NyBJJMBvFw#advId=103562066#lang=fr
2011-06-26 23:59:57.908#11.20.11.174#dw4TTHrdQ2M5Y8ypkSvPnFVjVQpKLhJfGQpVD7NyScJPsKqvtWR1#advId=103661409#lang=de
2011-06-26 23:59:57.950#11.203.11.174#WtDDTHrdmmP4SGB2c6d06qXlYf41cTXk0Q2p4VBL5nhDvzjT5NpK#a=default#advId=103613809
2011-06-26 23:59:56.745#111.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#advId=103613446#lang=de
2011-06-26 23:59:58.141#111.203.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#a=default#advId=103659208
2011-06-26 23:59:58.270#11.203.11.174#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#a=default#advId=103655548
2011-06-26 23:59:58.549#11.21.11.174#gmHnTHrpBNQLWBq94rT0QH5LGXtJ9hGqvrGb3yN0drFdP9vc0Qgj#a=default#advId=103613004
2011-06-26 23:59:59.251#125.3.11.174#NqvFTHrfnXFtdYvT3sMyBG3wjhHnyGHJp4rpNBSRjQwzXn65jVhH#advId=103660045#lang=de
2011-06-26 23:59:59.686#11.23.11.4#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#a=default#advId=103655548
the date, the IP address, the session ID and the number from the "advId=" field, which can be anywhere in the line after the session ID.
The result should look like this:
Code:
2011-06-26 23:59:59.686#11.203.11.174#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548
Any help would be appreciated.
Kind regards.
---------------
Thanks for reminding me that the sample record was wrong; I've corrected the last line.
I can guarantee the layout of the string up to the session ID, like this:
Code:
2011-06-26 23:59:59.686#11.203.11.174#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#
2011-06-26 23:59:58.549#11.21.11.174#gmHnTHrpBNQLWBq94rT0QH5LGXtJ9hGqvrGb3yN0drFdP9vc0Qgj#
2011-06-26 23:59:59.251#125.3.11.174#NqvFTHrfnXFtdYvT3sMyBG3wjhHnyGHJp4rpNBSRjQwzXn65jVhH#
2011-06-26 23:59:59.686#11.23.11.4#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#
Last edited by swissmac; 10-28-2011 at 02:54 AM .
10-27-2011, 11:07 AM
#2
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005
Well I would say the first column returned is an unusual looking date??
What field positions can you guarantee?
10-28-2011, 01:21 AM
#3
LQ Newbie
Registered: Oct 2011
Posts: 9
Original Poster
Thanks for reminding me that the sample record was wrong; I've corrected the last line.
I can guarantee that the string up to the session ID, like this
Code:
2011-06-26 23:59:59.686#11.203.11.174#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#
2011-06-26 23:59:58.549#11.21.11.174#gmHnTHrpBNQLWBq94rT0QH5LGXtJ9hGqvrGb3yN0drFdP9vc0Qgj#
2011-06-26 23:59:59.251#125.3.11.174#NqvFTHrfnXFtdYvT3sMyBG3wjhHnyGHJp4rpNBSRjQwzXn65jVhH#
2011-06-26 23:59:59.686#11.23.11.4#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#
is always the same.
Last edited by swissmac; 10-28-2011 at 02:54 AM .
10-28-2011, 01:55 AM
#4
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852
Could you please enclose the text in
[code][/code] tags, to preserve formatting and to keep the screen from side-scrolling? Thanks.
10-28-2011, 02:37 AM
#5
LQ Newbie
Registered: Oct 2011
Posts: 9
Original Poster
Quote:
Originally Posted by
David the H.
Could you please enclose the text in
[code][/code] tags, to preserve formatting and to keep the screen from side-scrolling? Thanks.
Sorry, I've just changed it.
10-28-2011, 03:02 AM
#6
Moderator
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Code:
awk -F"#" '{adv=gensub(/.*#advId=([^#]+).*/,"\\1",1,$0); print $1"#"$2"#"$3"#"adv }' swissmac
2011-06-26 23:59:56.746#11.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#103613446
2011-06-26 23:59:56.888#11.203.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#103659208
2011-06-26 23:59:57.202#11.203.11.174#gGwcTHrdwrxKPS56v7TLxwnN0HsKSpHmGvJc1Vw1t7NyBJJMBvFw#103562066
2011-06-26 23:59:57.908#11.203.11.174#dw4TTHrdQ2M5Y8ypkSvPnFVjVQpKLhJfGQpVD7NyScJPsKqvtWR1#103661409
2011-06-26 23:59:57.950#11.203.11.174#WtDDTHrdmmP4SGB2c6d06qXlYf41cTXk0Q2p4VBL5nhDvzjT5NpK#103613809
2011-06-26 23:59:56.745#11.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#103613446
2011-06-26 23:59:58.141#11.203.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#103659208
2011-06-26 23:59:58.270#11.203.11.174#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548
2011-06-26 23:59:58.549#11.203.11.174#gmHnTHrpBNQLWBq94rT0QH5LGXtJ9hGqvrGb3yN0drFdP9vc0Qgj#103613004
2011-06-26 23:59:59.251#11.203.11.174#NqvFTHrfnXFtdYvT3sMyBG3wjhHnyGHJp4rpNBSRjQwzXn65jVhH#103660045
2011-06-26 23:59:59.686#11.203.11.174#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548
Cheers,
Tink
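A note for anyone not running GNU awk: gensub() is a gawk extension and is not part of POSIX awk, so the one-liner above will fail in mawk, for example. A minimal portable sketch, assuming the advId value never contains a "#", could use match() and substr() instead. The function name extract_adv is made up for illustration. Unlike the gensub version, it takes the first advId in the line rather than the last, and it skips lines that have no advId at all.

```shell
# extract_adv is a hypothetical helper name, not something from the thread.
# It reads log lines on stdin and prints timestamp#ip#session-id#advId
# using only POSIX awk features (no gensub).
extract_adv() {
    awk -F'#' '
    {
        # Find the first "#advId=" followed by at least one non-# character
        if (match($0, /#advId=[^#]+/)) {
            # RSTART is the position of the "#"; skip the 7 characters of "#advId="
            adv = substr($0, RSTART + 7, RLENGTH - 7)
            print $1 "#" $2 "#" $3 "#" adv
        }
    }'
}

# Usage with two of the sample lines from the first post
printf '%s\n' \
    '2011-06-26 23:59:56.746#11.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#advId=103613446#lang=de' \
    '2011-06-26 23:59:56.888#11.203.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#a=default#advId=103659208' |
    extract_adv
```

With a real log file the call would simply be `extract_adv < logfile`, where logfile stands for whatever the input file is called.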
10-28-2011, 03:03 AM
#7
LQ Guru
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509
Extracting fields with awk is a trivial task:
Code:
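BEGIN {
    FS = "#"    # fields are separated by '#'
}
{
    # Print timestamp, IP address and session ID, without a trailing newline
    printf "%s#%s#%s#", $1, $2, $3
    # Scan the remaining fields for the one holding advId
    for ( i = 4; i <= NF; i++ )
        if ( $i ~ /advId=/ ) {
            sub(/advId=/,"",$i)    # strip the "advId=" prefix
            print $i               # print the number and end the line
        }
}
Edit: beaten by Tinkster with a more concise solution.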
Last edited by colucix; 10-28-2011 at 03:04 AM .
10-28-2011, 03:18 AM
#8
LQ Newbie
Registered: Oct 2011
Posts: 9
Original Poster
Many thanks to both of you! great solutions.
One more thing: afterwards I'll have to sort the records by the fields in the following order:
adv-id, session-id, ip-address, timestamp
Code:
timestamp #ip-address #session-id #adv-id
2011-06-26 23:59:56.746#11.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#103613446
2011-06-26 23:59:56.888#11.03.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#103659208
2011-06-26 23:59:57.202#11.203.11.174#gGwcTHrdwrxKPS56v7TLxwnN0HsKSpHmGvJc1Vw1t7NyBJJMBvFw#103562066
2011-06-26 23:59:57.908#11.203.11.174#dw4TTHrdQ2M5Y8ypkSvPnFVjVQpKLhJfGQpVD7NyScJPsKqvtWR1#103661409
2011-06-26 23:59:57.950#11.203.11.174#WtDDTHrdmmP4SGB2c6d06qXlYf41cTXk0Q2p4VBL5nhDvzjT5NpK#103613809
2011-06-26 23:59:56.745#11.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#103613446
2011-06-26 23:59:58.141#11.203.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#103659208
2011-06-26 23:59:58.270#11.203.11.174#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548
2011-06-26 23:59:58.549#11.203.11.4#gmHnTHrpBNQLWBq94rT0QH5LGXtJ9hGqvrGb3yN0drFdP9vc0Qgj#103613004
2011-06-26 23:59:59.251#11.263.11.26#NqvFTHrfnXFtdYvT3sMyBG3wjhHnyGHJp4rpNBSRjQwzXn65jVhH#103660045
2011-06-26 23:59:59.686#11.203.11.122#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548
Thanks for your appreciated help!
Last edited by swissmac; 10-28-2011 at 04:04 AM .
Reason: changed the length of ip-address
10-28-2011, 03:21 AM
#9
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005
Well you could probably use awk with a field separator of # and then loop through the other fields till you find the one you need and get the necessary data.
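As a rough sketch of that idea (the helper name find_adv is hypothetical, and it assumes the first three fields are always timestamp, IP address and session ID as stated earlier in the thread), the loop can stop at the first field that starts with "advId=" and print the record only if one was found:

```shell
# find_adv is a hypothetical helper name used only for this sketch.
# It loops over fields 4..NF and stops at the first one starting with "advId=".
find_adv() {
    awk -F'#' '
    {
        adv = ""
        for (i = 4; i <= NF; i++) {
            if ($i ~ /^advId=/) {
                adv = substr($i, 7)    # drop the 6 characters of "advId="
                break                  # stop at the first match
            }
        }
        # Emit the record only when an advId field was found
        if (adv != "") print $1 "#" $2 "#" $3 "#" adv
    }'
}

# Usage with one of the sample lines from the first post
printf '%s\n' \
    '2011-06-26 23:59:57.202#11.203.11.174#gGwcTHrdwrxKPS56v7TLxwnN0HsKSpHmGvJc1Vw1t7NyBJJMBvFw#advId=103562066#lang=fr' |
    find_adv
```

Anchoring the test with ^ keeps a field that merely contains the text "advId=" somewhere in the middle from being picked up by accident.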
10-28-2011, 12:53 PM
#10
Moderator
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Quote:
Originally Posted by
swissmac
Many thanks to both of you! great solutions.
One more thing: afterwards I'll have to sort the records by the fields in the following order:
adv-id, session-id, ip-address, timestamp
Code:
timestamp #ip-address #session-id #adv-id
2011-06-26 23:59:56.746#11.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#103613446
2011-06-26 23:59:56.888#11.03.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#103659208
2011-06-26 23:59:57.202#11.203.11.174#gGwcTHrdwrxKPS56v7TLxwnN0HsKSpHmGvJc1Vw1t7NyBJJMBvFw#103562066
2011-06-26 23:59:57.908#11.203.11.174#dw4TTHrdQ2M5Y8ypkSvPnFVjVQpKLhJfGQpVD7NyScJPsKqvtWR1#103661409
2011-06-26 23:59:57.950#11.203.11.174#WtDDTHrdmmP4SGB2c6d06qXlYf41cTXk0Q2p4VBL5nhDvzjT5NpK#103613809
2011-06-26 23:59:56.745#11.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#103613446
2011-06-26 23:59:58.141#11.203.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#103659208
2011-06-26 23:59:58.270#11.203.11.174#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548
2011-06-26 23:59:58.549#11.203.11.4#gmHnTHrpBNQLWBq94rT0QH5LGXtJ9hGqvrGb3yN0drFdP9vc0Qgj#103613004
2011-06-26 23:59:59.251#11.263.11.26#NqvFTHrfnXFtdYvT3sMyBG3wjhHnyGHJp4rpNBSRjQwzXn65jVhH#103660045
2011-06-26 23:59:59.686#11.203.11.122#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548
Thanks for your appreciated help!
For the current working set this would do:
Code:
awk -F"#" '{adv=gensub(/.*#advId=([^#]+).*/,"\\1",1,$0); print $1"#"$2"#"$3"#"adv }' swissmac | sort -t# -k4,4n -k3,3 -k2,2 -k1,1
2011-06-26 23:59:57.202#11.203.11.174#gGwcTHrdwrxKPS56v7TLxwnN0HsKSpHmGvJc1Vw1t7NyBJJMBvFw#103562066
2011-06-26 23:59:58.549#11.203.11.174#gmHnTHrpBNQLWBq94rT0QH5LGXtJ9hGqvrGb3yN0drFdP9vc0Qgj#103613004
2011-06-26 23:59:56.745#11.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#103613446
2011-06-26 23:59:56.746#11.203.11.174#9CshTHrcK1jvNCjbpX2kx1SK2SW2phsCm041N2yr4hSLFPJWPdM9#103613446
2011-06-26 23:59:57.950#11.203.11.174#WtDDTHrdmmP4SGB2c6d06qXlYf41cTXk0Q2p4VBL5nhDvzjT5NpK#103613809
2011-06-26 23:59:58.270#11.203.11.174#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548
2011-06-26 23:59:59.686#11.203.11.174#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548
2011-06-26 23:59:56.888#11.203.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#103659208
2011-06-26 23:59:58.141#11.203.11.174#2QtTTHycfL1rcy1msP2S1NkbLsvrlpTthm6yKbmnswgLgLHjwbNp#103659208
2011-06-26 23:59:59.251#11.203.11.174#NqvFTHrfnXFtdYvT3sMyBG3wjhHnyGHJp4rpNBSRjQwzXn65jVhH#103660045
2011-06-26 23:59:57.908#11.203.11.174#dw4TTHrdQ2M5Y8ypkSvPnFVjVQpKLhJfGQpVD7NyScJPsKqvtWR1#103661409
Of course, the IP address will be a problem ;} since sort compares it as text, not octet by octet as numbers.
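One possible way around the IP issue, sketched here under the assumption that field 2 is always a dotted IPv4 address, is to decorate each record with a zero-padded copy of the IP, sort on that, and strip it again. The helper name sort_records is made up, and it expects the already extracted four-field records (timestamp#ip#session-id#adv-id) on stdin:

```shell
# sort_records is a hypothetical helper name used only for this sketch.
# Input: timestamp#ip#session-id#adv-id records on stdin.
# Output: the same records sorted by adv-id (numeric), session-id,
# IP address (numerically per octet) and timestamp.
sort_records() {
    awk -F'#' '
    {
        # Split the dotted IP into octets and zero-pad each to three digits
        split($2, o, ".")
        key = sprintf("%03d.%03d.%03d.%03d", o[1], o[2], o[3], o[4])
        # Prepend the padded key as a temporary first field
        print key "#" $0
    }' |
    # Fields are now: 1=key 2=timestamp 3=ip 4=session-id 5=adv-id
    LC_ALL=C sort -t'#' -k5,5n -k4,4 -k1,1 -k2,2 |
    # Remove the temporary key again
    cut -d'#' -f2-
}

# Usage: two records that differ only in IP address and timestamp
printf '%s\n' \
    '2011-06-26 23:59:59.686#11.203.11.122#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548' \
    '2011-06-26 23:59:58.270#11.203.11.4#wt8LTHpTQRv6MTwVLSG9WpNT7hLhChj3Kf1DxTMHR2bmTN4Jp1tM#103655548' |
    sort_records
```

A plain text comparison would put 11.203.11.122 before 11.203.11.4; with the padded key the .4 address sorts first. LC_ALL=C is there only to make the string comparisons independent of the locale.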
11-11-2011, 06:56 AM
#11
LQ Newbie
Registered: Oct 2011
Posts: 9
Original Poster
You guys are great. Thanks a lot, it worked perfectly for me.
All times are GMT -5.