Location: a warm beach, cool ocean breeze, nice waves, and a Margaritta
Distribution: RHEL 5.5 Tikanga
Posts: 63
how do i keep the header
hi guys,
I wrote an awk script that does the stuff below (filtering a file row by row based on criteria in each column), except it does not include the original file's header. Can someone please show me how to keep the header in the output file for the columns of data that are kept?
Code:
BEGIN {
FS = ' '
}
{
if ($3=="42" && $5=="the answer to the universe")
printf("%f %f %d %f %f %s\n", $1, $2, $3, $11, $12, $5)
}
END{}
I often want to know how to do that too. But it seems that these editing tools treat all lines equally, applying the same rules or commands to every selected line.
That leaves a two-pass procedure as the most universal solution.
Get the header into one temp file and the filtered rows into another, then cat them back together.
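The two-pass procedure described above can be sketched roughly as follows. The sample data, the column names, and the `$3 == "42"` filter are all made up for illustration, since the thread does not show the real file.

```shell
# Hypothetical sample data: the first line is the header.
workdir=$(mktemp -d)
cat > "$workdir/data.txt" <<'EOF'
id score group
1 3.5 42
2 7.25 17
3 9.0 42
EOF

# Pass 1: copy the header line to one temp file.
head -n 1 "$workdir/data.txt" > "$workdir/header.tmp"

# Pass 2: filter the data rows (everything after line 1) into another.
awk 'NR > 1 && $3 == "42"' "$workdir/data.txt" > "$workdir/body.tmp"

# Stitch the two pieces back together.
two_pass_out=$(cat "$workdir/header.tmp" "$workdir/body.tmp")
printf '%s\n' "$two_pass_out"
rm -r "$workdir"
```

Note that this always emits the header, even when no data row matches, and the header keeps the original column order.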
Save the fields of interest from the first record in a variable. If you eventually find anything to print, print the saved variable once (set a flag), then the record(s) that follow. Use print rather than printf for the header, since it will be simple strings; there is no need for cat or any other external command.
There are a couple of problems with the two-pass approach.
The first is the field order: head -n1 is not much use since it won't reorder the fields.
Another is highlighted by syg00: do we want a header if there would be no data?
To get round both, use a single awk script:
have the BEGIN block capture the header fields in some variable,
then test each record;
when the condition is true, check the header variable: if it is set, print it and then unset it (or set it to null, e.g. Header=""), then print the data line; repeat for all records.
You should only get the header once, and only when there was actual output data.
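A minimal sketch of that single-script idea is below. It is an assumption-laden illustration: the sample rows and the `$3 == "42"` test are hypothetical, the function name `filter_with_header` is invented here, and the header is captured with an `NR == 1` rule instead of in BEGIN, which is a simpler swap for the same effect.

```shell
# Hypothetical helper: reads rows on stdin, prints reordered columns.
filter_with_header() {
  awk '
    # Save the header fields in output order; do not print them yet.
    NR == 1 { header = $2 " " $1; next }

    $3 == "42" {
      # First matching row: emit the saved header once, then clear it.
      if (header != "") { print header; header = "" }
      print $2, $1
    }
  '
}

# Made-up input with two matching rows.
with_match=$(printf '%s\n' 'id score group' '1 3.5 42' '2 7.25 17' '3 9.0 42' |
  filter_with_header)

# Made-up input with no matching rows: nothing at all should be printed.
no_match=$(printf '%s\n' 'id score group' '2 7.25 17' | filter_with_header)

printf '%s\n' "$with_match"
```

Because the header string is built from `$2` and `$1`, it follows the same reordering as the data lines, and clearing it after the first print keeps it from repeating.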
I am not sure exactly which header we are talking about, i.e. where it appears in the data, perhaps because no example data was provided (hint).
However, if we can assume that the header is the first row of the file, simply adding this criterion to the existing one would do the trick.
I would add that the current setting of FS is also not required, as whitespace is the default.
So it could just be:
Code:
NR == 1 || ($3=="42" && $5=="the answer to the universe"){printf("%f %f %d %f %f %s\n", $1, $2, $3, $11, $12, $5)}
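To see the `NR == 1 ||` pattern on its own, here is a small sketch with invented input. It relies on the default field separator and uses plain print, so the header strings pass through untouched; the rows and the `$3 == "42"` test are assumptions for illustration only.

```shell
# No FS assignment: awk splits on runs of whitespace by default.
nr1_out=$(printf '%s\n' 'id score group' '1 3.5 42' '2 7.25 17' '3 9.0 42' |
  awk 'NR == 1 || $3 == "42" { print $1, $3 }')

printf '%s\n' "$nr1_out"
```

The first line is kept because NR == 1 is true for it, and every later line is kept only when the filter matches.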
Tabby, I see an issue prior to the solution: you have more columns of data than you have of header. This means that once you pass the VERSION column, the header and data become out of sync. Not sure if your current formatting has allowed for this?
Also, looking at your data, the %f and %d conversions in your printf format will not match most of the data presented.
So I will leave these two issues to you, but the sort of thing I would look at doing is:
Code:
BEGIN{ fmt[1] = "%s %s %s %s %s\n" # header
fmt[2] = "<the format for other lines>"
}
NR == 1 || ( $3 == "SM" && $7 == 5 ){
printf(fmt[NR==1?1:2],<choose your columns here>)
}
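With the placeholders filled in, the two-format idea might look like the sketch below. The three columns, the numeric conversions, and the `$3 == "42"` filter are hypothetical stand-ins, not the original poster's real layout.

```shell
fmt_out=$(printf '%s\n' 'id score group' '1 3.5 42' '2 7.25 17' '3 9.0 42' |
  awk '
    BEGIN {
      fmt[1] = "%s %s %s\n"      # header: every field is a plain string
      fmt[2] = "%d %.2f %d\n"    # data: numeric conversions per column
    }
    NR == 1 || $3 == "42" {
      # Pick the header format for line 1, the data format otherwise.
      printf(fmt[NR == 1 ? 1 : 2], $1, $2, $3)
    }')

printf '%s\n' "$fmt_out"
```

Both formats take the same number of arguments, so one printf call can serve the header and the data rows.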
Original Poster
thanks sooo much grail !!!
Yep, that's actually the way the files are: after VERSION, the headers and the data columns don't line up one for one, and when FAIL is set to TRUE no more data is written to that line.
I know I can't have a different number of argument types in the format statement of fmt[2], but is there a way to "PAD" the control characters in fmt[1]?
Unfortunately, it seems I have led you astray regarding fmt[2]. It should have the same format as fmt[1] but use different modifiers to display the data you need, like
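As a hypothetical illustration of that idea (not the poster's own example), giving matching field widths to both formats pads the header so it lines up with the data. The widths, columns, and filter below are assumptions chosen only to show the mechanism.

```shell
pad_out=$(printf '%s\n' 'id score group' '1 3.5 42' '2 7.25 17' '3 9.0 42' |
  awk '
    BEGIN {
      # Same widths in both formats; only the conversion letters differ.
      fmt[1] = "%-6s %8s %6s\n"      # header: padded strings
      fmt[2] = "%-6d %8.2f %6d\n"    # data: padded numbers
    }
    NR == 1 || $3 == "42" {
      printf(fmt[NR == 1 ? 1 : 2], $1, $2, $3)
    }')

printf '%s\n' "$pad_out"
```

A leading minus sign left-justifies within the width, and the width itself supplies the padding, so the header needs no extra arguments.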