08-29-2017, 12:33 PM
|
#1
|
Member
Registered: May 2011
Posts: 85
|
issue with awk
Hi Team,
I am new to Unix/Linux.
I am trying to use awk to pull certain columns but am having trouble.
Below is the data in my file.
Code:
IDNU      KEY_NAM    SIZE
--------------------------------
DB901     TAB_A      9.8 GB
DB890     TAB _A_1   1.1 GB
DB797     T _A _1    0.1 GB
cat file.log| awk '{ print $1";"$2}'
I am only getting TAB_A, TAB, and T; the rest of column 2 is missing.
I tried using sed 's/\s\s * / /g', but it only works for TAB _A_1 and not the others.
Is there a better way to pull the whole of column 2?
|
|
|
08-29-2017, 12:36 PM
|
#2
|
LQ Guru
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,518
|
Are the columns separated by tabs?
Code:
awk '{ print $1,$2; }' FS="\t" OFS=";" file.log
|
|
|
08-29-2017, 12:44 PM
|
#3
|
Member
Registered: May 2011
Posts: 85
Original Poster
|
Nope they are not tab separated.
|
|
|
08-29-2017, 01:15 PM
|
#4
|
Senior Member
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: Fedora
Posts: 4,209
|
I think you will just have to count columns.
Code:
awk '{c2=substr($0,11,10); print $1 ";" c2}' foo
IDNU;KEY_NAM
--------------------------------;----------
DB901;TAB_A
DB890;TAB _A_1
DB797;T _A _1
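Note that substr() keeps the padding spaces at the end of column 2. If that matters, a small variation can trim them. This is only a sketch: it assumes the same layout as the sample (column 2 in characters 11-20), and the function name extract_cols is made up for illustration.

```shell
# Print column 1 and the fixed-width column 2 (characters 11-20),
# with trailing padding removed. The layout is assumed from the sample data.
extract_cols() {
    awk '{
        c2 = substr($0, 11, 10)   # fixed-width slice for column 2
        sub(/ +$/, "", c2)        # drop trailing padding spaces
        print $1 ";" c2
    }' "$@"
}

# Example: feed one line built with the assumed column widths.
printf '%-10s%-11s%s\n' DB890 'TAB _A_1' '1.1 GB' | extract_cols
```

It reads standard input when no file name is given, so `extract_cols foo` and `extract_cols < foo` should behave the same.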
|
|
|
08-29-2017, 01:15 PM
|
#5
|
LQ Guru
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,518
|
You'll have to split or adjust the columns based on their absolute position then.
If you use sed first, then it could be like this:
Code:
sed -E -e 's/^(.{9}) (.{10}) /\1\t\2\t/; s/ +\t/\t/g;' file.txt | awk . . . FS="\t"
Or you could do it all in awk, somewhat differently. The substr() function would probably be of use there. Or, if you have GNU awk (gawk), you can use the FIELDWIDTHS variable.
Which version of awk do you have?
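For the sed route, a complete pipeline might look roughly like the following. It is a sketch under the same assumptions as the sed command above (9 characters, a space, 10 characters, a space). It uses a literal tab held in a shell variable instead of \t in the replacement, since not every sed understands \t there. The function name fixed_to_fields is hypothetical.

```shell
# Turn the two fixed-width column boundaries into tabs, strip the padding
# in front of each tab, then let awk split on tabs and join with ";".
tab=$(printf '\t')

fixed_to_fields() {
    sed -E -e "s/^(.{9}) (.{10}) /\1${tab}\2${tab}/" \
           -e "s/ +${tab}/${tab}/g" "$@" |
        awk -F '\t' -v OFS=';' '{ print $1, $2 }'
}

# Example: one line built with the assumed column widths.
printf '%-10s%-11s%s\n' DB890 'TAB _A_1' '1.1 GB' | fixed_to_fields
```

Lines that do not fit the pattern, such as the row of dashes, pass through sed unchanged and end up entirely in the first field.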
|
|
|
08-29-2017, 02:03 PM
|
#6
|
LQ Guru
Registered: Oct 2003
Location: Bonaire, Leeuwarden
Distribution: Debian /Jessie/Stretch/Sid, Linux Mint DE
Posts: 5,195
|
If you are using Gawk (and you should), there is a solution for exactly that:
https://www.gnu.org/software/gawk/ma...tant-Size.html
jlinkels
|
|
|
08-29-2017, 03:16 PM
|
#7
|
Member
Registered: May 2011
Posts: 85
Original Poster
|
Thanks, all. sed worked out for me.
|
|
|
08-29-2017, 11:34 PM
|
#8
|
LQ Guru
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,518
|
No problem.
Can we still ask which version of awk you have? There are also one or two all-awk solutions, but the best approach depends on which version you have.
|
|
|
08-30-2017, 03:19 AM
|
#9
|
LQ 5k Club
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,449
|
To just get column 2, why not 'cut -b11-20'?
|
|
|
08-30-2017, 05:30 AM
|
#10
|
LQ Guru
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,518
|
Quote:
Originally Posted by allend
To just get column 2, why not 'cut -b11-20'?
|
That would work well unless there is some constraint to use only awk. If gawk is available, then the following would be a solution:
Code:
gawk '{print $1, $2, $3;}' OFS=";" FIELDWIDTHS="10 11 10" inputfile.txt
However, FIELDWIDTHS is a gawk extension, so other versions of awk generally lack it.
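Where gawk is not installed, the same fixed-width split can be approximated with substr() in any POSIX awk. A minimal sketch, assuming the widths 10, 11 and 10 from the gawk command above. The function name fixed_width_split is made up for illustration, and unlike FIELDWIDTHS this version also trims the trailing padding from each field.

```shell
# Split each line into fixed-width fields without relying on FIELDWIDTHS.
# The widths are passed in as an awk variable; "10 11 10" is an assumption
# taken from the sample layout.
fixed_width_split() {
    awk -v widths="10 11 10" -v OFS=';' '
    BEGIN { n = split(widths, w, " ") }
    {
        pos = 1
        out = ""
        for (i = 1; i <= n; i++) {
            f = substr($0, pos, w[i])        # slice field i by position
            sub(/ +$/, "", f)                # trim trailing padding
            out = out (i > 1 ? OFS : "") f   # join fields with OFS
            pos += w[i]
        }
        print out
    }' "$@"
}

# Example: one line built with the assumed column widths.
printf '%-10s%-11s%s\n' DB901 'TAB_A' '9.8 GB' | fixed_width_split
```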
|
|
|
08-30-2017, 06:52 AM
|
#11
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,927
|
Also consider unexpand to convert to tab-separated fields.
|
|
|
08-30-2017, 12:29 PM
|
#12
|
Member
Registered: May 2011
Posts: 85
Original Poster
|
awk - 3.1.7
|
|
|
08-30-2017, 12:43 PM
|
#13
|
LQ Guru
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,518
|
Thanks. But is it mawk, gawk, nawk, or old awk? Only gawk has the FIELDWIDTHS built-in variable which can split columns based on width.
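One way to find out is to ask the binary itself. The following is a rough sketch, not a definitive test: it assumes that gawk and several other implementations answer --version, that mawk answers -W version, and that anything answering neither is an older or minimal awk. The function name awk_flavor is hypothetical. A version number such as 3.1.7 looks like a gawk release, but that is only a guess until the banner confirms it.

```shell
# Report which awk implementation the "awk" command resolves to.
# stdin is redirected from /dev/null so awk never waits for input.
awk_flavor() {
    v=$(awk --version 2>/dev/null </dev/null | head -n 1)
    [ -n "$v" ] || v=$(awk -W version 2>/dev/null </dev/null | head -n 1)
    [ -n "$v" ] || v="unknown (possibly a traditional or minimal awk)"
    printf '%s\n' "$v"
}

awk_flavor
```

If the first line of the banner mentions GNU Awk, FIELDWIDTHS should be available.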
|
|
|
09-06-2017, 11:04 AM
|
#14
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,927
|
With unexpand and the input from post #1, you can convert spaces to tabs at specific positions,
then pipe the result to awk with the field separator set to tab:
Code:
unexpand -t 10,21 file.log | awk -F"\t" '{print $1,$2}'
IDNU KEY_NAM
--------------------------------
DB901 TAB_A
DB890 TAB _A_1
DB797 T _A _1
|
|
|
All times are GMT -5.