LinuxQuestions.org

amwink 11-11-2009 05:19 PM

Replace a field for a whole line in the same file, with awk.
 
Dear programmers,

I'm trying to write a script with awk that would replace a field in a line for an entire line in the same file.

E.g. if the field $3 contains the value "57", replace it for the content of line #57 of the same file.

Any suggestion?

lewc 11-11-2009 08:32 PM

Use sed and sh. Why bother with awk when the command is only one line long from the terminal?

ghostdog74 11-11-2009 08:52 PM

Quote:

Originally Posted by amwink (Post 3753460)
Dear programmers,

I'm trying to write a script with awk that would replace a field in a line for an entire line in the same file.

E.g. if the field $3 contains the value "57", replace it for the content of line #57 of the same file.

Any suggestion?

Post what you have and what is not working. Provide sample input and output. Have you read the awk docs?

ghostdog74 11-11-2009 08:54 PM

Quote:

Originally Posted by lewc (Post 3753615)
Use sed and sh. Why bother with awk when the command is only one line long from the terminal?

Why bother with sed or sh when awk is much simpler to understand and does the job of sed+sh?

lewc 11-12-2009 04:36 AM

A single question: does awk allow you to do this in one line? If so, continue with awk; otherwise just keep it simple.

jschiwal 11-12-2009 04:51 AM

The OP asked for an AWK solution.

Such a contrived problem seems like it is a homework assignment. So the OP should post what he has done so far, and we can help with that.

AWK is usually the best tool to use when the datafile consists of lines of fields. In this example, one of the fields will be replaced with contents from another line based on the value of a certain field, so this involves more than regular expression pattern matching & replacing which is where SED excels.

Amwink: For questions about awk and grep, it is usually necessary to provide a short input sample, and what the output should look like. The awk command will need to know what delimiter is used for example.

lewc 11-12-2009 05:12 AM

:S Sorry, I must have misunderstood. I thought he wanted to replace a field on a particular line.

ghostdog74 11-12-2009 05:14 AM

Quote:

Originally Posted by lewc (Post 3753985)
Does Awk allow you to do this in one line

of course.

lewc 11-12-2009 05:16 AM

I meant allow you to do this in one line of code, I know AWK operates on lines and fields, perhaps a rethink of my sentence structure :p

jschiwal 11-12-2009 05:19 AM

lewc: response to post #7

If I read the question correctly: if the third field of the fourth line contains "57", then replace the third field with the contents of the entire record of the 57th line. I don't see how this would be useful, because you will be replacing a line (record) of N fields with a line of 2*N-1 fields. If $3 in a later line references an earlier line in the file, the number of fields increases even more.

A general description of what the OP is trying to accomplish would be useful. Are the values in $3 of all the lines unique so that the number of fields in the output will be constant? What should happen if this is a constraint but is violated? Or does the input file have sections, such that you just process the first 50 lines and take the values from the next 50 lines?
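Taken literally, the replacement in the question could be sketched with a two-pass read of the same file. This is only an illustration under assumptions: the sample data and the function name replace_field are made up here, fields are taken to be whitespace-separated, and $3 is taken to hold a plain line number.

```shell
# Hypothetical helper: replace $3 with the whole line whose number $3 names.
replace_field() {
    awk '
        NR == FNR { line[FNR] = $0; next }   # pass 1: store each line under its number
        ($3 in line) { $3 = line[$3] }       # pass 2: swap $3 for the line it names
        { print }
    ' "$1" "$1"                              # same file given twice = two passes
}

# Made-up sample: only the second line has a line number in $3.
sample=$(mktemp)
printf '%s\n' 'alpha beta gamma' 'one two 1' 'x y z' > "$sample"
replace_field "$sample"
rm -f "$sample"
```

With this sample, the second line comes out as "one two alpha beta gamma": 3 fields become 5, which is the 2*N-1 growth described above.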

ghostdog74 11-12-2009 05:39 AM

Quote:

Originally Posted by lewc (Post 3754029)
I meant allow you to do this in one line of code,

i say again, YES!

amwink 11-12-2009 02:04 PM

Thanks to all for the feedback.
Quote:

Originally Posted by ghostdog74 (Post 3753633)
Post what you have and what is not working. Provide sample input and output. Have you read the awk docs?

Yes, I have been reading about awk for quite a while. This is actually the second time I have faced the same problem in disparate applications. The first time was for a less important project that would relate 2 databases for use with GMT tools. I ended up not working on that project after all and just gave up. Although this time I have already solved the problem by other means (see below), I would still like to learn the principles for doing this in awk.

Quote:

Originally Posted by jschiwal (Post 3753996)
The OP asked for an AWK solution.

Such a contrived problem seems like it is a homework assignment. So the OP should post what he has done so far, and we can help with that.
[...]
Amwink: For questions about awk and grep, it is usually necessary to provide a short input sample, and what the output should look like. The awk command will need to know what delimiter is used for example.

It's not homework in the sense that I'm not taking classes anymore. It's for real life. The reason for not posting what I have is that I don't have anything to show so far. I couldn't even formulate a way to organise the code from the very beginning in terms of awk logic. It's not so much a problem of syntax as of how to organise the code, from which I could build the rest.

Quote:

Originally Posted by jschiwal (Post 3754033)
[...] Or does the input file have sections, such that you just process the first 50 lines and take the values from the next 50 lines?

Precisely. The first section of the file contains the coordinates of the vertices of a mesh in 3D space (x,y,z), one vertex per line. The second section of the file contains a list of triangular faces, one face per line, each with the indices of the 3 vertices that it is made of. I can split the two parts into separate files if needed, but even in that case, I still wouldn't know how to replace a vertex index, say vertex #57, with its coordinate triplet on line #57 (x,y,z).

The task is to compute the area of each face using an analytic formula, and print the sum over all faces. The idea is that after replacement, I would have, in each line of the second part, 9 values corresponding to the coordinates of the vertices A, B and C of each face (xA,yA,zA,xB,yB,zB,xC,yC,zC), from which awk itself could then be used to compute some determinants (the areas of the projections of each triangle onto each of the orthogonal planes, XY, YZ and XZ), then the area of each face in the 3D space, then the sum for all faces of the mesh, all simple maths. I'm sure there are other ways to get the same using awk, but it always ends up having to search for a value found at one point in the file for its match elsewhere in the same file or in another.
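Written out, the calculation described above would look roughly like this. It is the usual projected-area formulation; the symbols A_xy, A_yz, A_xz, A_face and A_mesh are labels introduced here for illustration, not names from any tool.

```latex
\[
A_{xy} = \frac{1}{2}
\begin{vmatrix}
x_A & y_A & 1 \\
x_B & y_B & 1 \\
x_C & y_C & 1
\end{vmatrix},
\qquad
A_{yz} = \frac{1}{2}
\begin{vmatrix}
y_A & z_A & 1 \\
y_B & z_B & 1 \\
y_C & z_C & 1
\end{vmatrix},
\qquad
A_{xz} = \frac{1}{2}
\begin{vmatrix}
x_A & z_A & 1 \\
x_B & z_B & 1 \\
x_C & z_C & 1
\end{vmatrix}
\]
\[
A_{\text{face}} = \sqrt{A_{xy}^{2} + A_{yz}^{2} + A_{xz}^{2}},
\qquad
A_{\text{mesh}} = \sum_{\text{faces}} A_{\text{face}}
\]
```

The signs of the three determinants depend on the vertex order, but they drop out once the terms are squared.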

It can also be done in a number of different ways using different languages, and since yesterday, when I opened the thread, I have already written a bulky Matlab function that does the job, so the actual need is solved. However, I'm sure awk would do the trick in a more elegant way, with much shorter code that is faster to run, and that I could also submit to a cluster (i.e. a computer grid) to do it faster at large scale here.

Thanks again!

jschiwal 11-13-2009 07:51 AM

I would recommend downloading the source for the gawk package. Besides the usual make and make install targets, there is also a "make pdf" target to produce the book "GAWK: Effective AWK Programming". This book is produced from the .texi files used to produce the info files. The pdf (or ps) version is a very print worthy book and is an excellent guide for learning awk.

---

Your input file presents the references to the data before the data itself, which isn't read until the second section of the file. You would probably produce your output in the END{ ... } block.

Be sure to read the section on arrays in AWK. They are sparse associative arrays. You could build up one array in the first part of the file and then in the second section, enter values in a second array, and then in the END block, iterate through the first array, and use the value of $3 as the index of the second array.
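A minimal sketch of that two-array approach, assuming the layout described in this post (face references first, then the coordinates), a blank line between the two sections, and vertex indices that start at 1. The function name resolve_faces and the sample data are hypothetical.

```shell
# Hypothetical helper: print the 9 coordinates of every face.
resolve_faces() {
    awk '
        NF == 0 { section++; next }       # a blank line moves to the next section
        section == 0 {                    # first section: one face per line
            nfaces++
            a[nfaces] = $1; b[nfaces] = $2; c[nfaces] = $3
            next
        }
        {                                 # second section: one vertex per line
            nverts++
            coord[nverts] = $1 " " $2 " " $3
        }
        END {                             # every vertex is known only now
            for (i = 1; i <= nfaces; i++)
                print coord[a[i]], coord[b[i]], coord[c[i]]
        }
    ' "$@"
}

# Made-up sample: two faces of a unit square, then its four corners.
resolve_faces <<'EOF'
1 2 3
1 3 4

0 0 0
1 0 0
1 1 0
0 1 0
EOF
```

The END block is needed here only because the faces are read before the vertices they refer to. If the vertices come first, each face can be resolved as soon as its line is read.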

Remember the AWK model: the awk script is executed for each line of text. Since your file consists of two parts, you can use pattern matching (on the line of input) to determine which part of the file you are in. If the line patterns for the two parts are distinct, your awk script could look like this:

BEGIN { ... }
/coordinates pattern/ {
    # awk commands for a coordinates line
}
/faces pattern/ {
    # awk commands for a faces line
}
END {
    # awk commands to produce the output
}
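As an illustration of this pattern-per-section layout, here is a sketch that sums the face areas. It assumes, purely for the example, that vertex lines start with "v " and face lines with "f ", that vertices are numbered from 1 in file order, and that they appear before the faces that use them. The function name mesh_area is made up. The area is computed with the cross product of two edge vectors, whose components are twice the projected areas on the YZ, XZ and XY planes.

```shell
# Hypothetical helper: total surface area of a triangle mesh.
mesh_area() {
    awk '
        /^v / { n++; x[n] = $2; y[n] = $3; z[n] = $4 }   # coordinates pattern
        /^f / {                                          # faces pattern
            a = $2; b = $3; c = $4
            # edge vectors AB and AC
            ux = x[b] - x[a]; uy = y[b] - y[a]; uz = z[b] - z[a]
            vx = x[c] - x[a]; vy = y[c] - y[a]; vz = z[c] - z[a]
            # cross product AB x AC
            cx = uy * vz - uz * vy
            cy = uz * vx - ux * vz
            cz = ux * vy - uy * vx
            total += 0.5 * sqrt(cx * cx + cy * cy + cz * cz)
        }
        END { printf "%.6f\n", total }
    ' "$@"
}

# Made-up sample: a unit square split into two triangles.
mesh_area <<'EOF'
v 0 0 0
v 1 0 0
v 1 1 0
v 0 1 0
f 1 2 3
f 1 3 4
EOF
```

For this sample the two triangles each have area 0.5, so the script prints 1.000000.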

You can instead use ranges. Suppose that each section has a header, and the two parts have a blank line in between.
/coordinates header/,/^$/ {
    # awk commands for the coordinates section
}
/faces header/,0 {   # 0 is never true, so the range runs to end of file
    # awk commands for the faces section
}
END {
    # process the values in your arrays and produce the output
}
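A small sketch of the range form, assuming the vertices come first (as amwink describes), so each face can be resolved as it is read and no END block is needed. The headers VERTICES and FACES, the function name expand_faces and the sample data are invented for the example. Note that the header line and the closing blank line fall inside the range, so the actions have to skip them.

```shell
# Hypothetical helper: replace each vertex index with its coordinates.
expand_faces() {
    awk '
        /^VERTICES/,/^$/ {
            if (NF == 3) { n++; v[n] = $0 }   # skip the header and the blank line
            next
        }
        /^FACES/,0 {                          # 0 never matches: range runs to end of file
            if (NF == 3) print v[$1], v[$2], v[$3]
        }
    ' "$@"
}

# Made-up sample with one face.
expand_faces <<'EOF'
VERTICES
0 0 0
1 0 0
1 1 0

FACES
1 2 3
EOF
```

For this sample the single face line becomes "0 0 0 1 0 0 1 1 0", i.e. the 9 values xA,yA,zA,xB,yB,zB,xC,yC,zC.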

Good Luck!


All times are GMT -5.