Reading files in shell scripting
Hi,
I am trying to read the fields of a file and manipulate them, record by record. Let's say I use awk:
awk -F":" '{print $1 $2 $3 $4 $5}' TrackMsgFile.0806
This prints my fields on screen. But I don't want to print the fields while reading the records; instead I want to store them in variables and manipulate them as per my logic. Does awk or some other shell command provide something for this? |
Assign the results to a shell variable:
Code:
var=$(awk -F":" '{print $1 $2 $3 $4 $5}' TrackMsgFile.0806) |
If you want each field as a separate var:
Code:
awk -F":" '{print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do |
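The loop above is cut off after "do". A minimal sketch of how such a loop could be completed is shown here; the temporary file and its two records are made-up stand-ins for TrackMsgFile.0806, and the echo is only a placeholder for whatever processing is needed.

```shell
# Hypothetical sample data standing in for TrackMsgFile.0806.
sample=$(mktemp)
printf '%s\n' 'a:b:c:d:e' 'f:g:h:i:j' > "$sample"

# awk splits each record on ":" and prints the fields space-separated;
# read then assigns one word to each shell variable.
result=$(
  awk -F":" '{print $1,$2,$3,$4,$5}' "$sample" |
  while read -r one two three four five; do
    echo "one=$one two=$two three=$three four=$four five=$five"
  done
)
echo "$result"
rm -f "$sample"
```

Note that in bash and dash the loop on the right-hand side of a pipe runs in a subshell, so the five variables are only usable inside the loop body.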
@OP, you might want to define clearly what you want to do. If you want to store your lines in some variable for LATER use, use an array
Code:
awk '{array[++d]=$0} |
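The awk command above is cut off. One possible way to use such an array, as a sketch only, is to collect every record and then process them in an END block once the whole file has been read. The sample records and the "print in reverse order" step are assumptions chosen purely for illustration.

```shell
# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' '1:2:3:4:5' '6:7:8:9:10' > "$sample"

# Store every record in array[1..d], then use the stored lines in END.
# Here they are simply printed last-to-first.
reversed=$(awk '{ array[++d] = $0 }
     END { for (i = d; i >= 1; i--) print array[i] }' "$sample")
echo "$reversed"
rm -f "$sample"
```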
Quote:
Code:
awk -F":" '{<do some stuff with $1-$5>; print $1 $2 $3 $4 $5}' TrackMsgFile.0806
Code:
awk -F":" -f prog.awk TrackMsgFile.0806
See http://en.wikipedia.org/wiki/Awk |
But let's say one record looks like this:
1::3::5
Here I have left fields 2 and 4 empty for the sake of explanation. With
awk -F":" '{print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five;
3 will be stored in variable two and 5 in variable three.
Reason: the awk command prints 1, 3 and 5 separated only by spaces (the blank 2nd and 4th fields contribute nothing but the separator), and "while read one two three four five" then takes 3 and 5 for variables two and three respectively.
=> Is there a mechanism to avoid this?
Likewise, it will show a wrong result for records like
1:2 3: 4:5
where the second field contains "2 3", even though we are separating fields using ":" as the field separator. |
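The effect described above can be reproduced with a single record. This is a small sketch using the 1::3::5 record from the post, fed in with printf instead of a file.

```shell
# awk prints the empty fields as nothing, so only the separating spaces remain.
line=$(printf '1::3::5\n' | awk -F":" '{print $1,$2,$3,$4,$5}')
echo "[$line]"

# read with the default IFS treats any run of blanks as a single separator,
# so the values shift left into the wrong variables.
read -r one two three four five <<EOF
$line
EOF
echo "one=$one two=$two three=$three four=$four five=$five"
```

The second echo shows 3 landing in two and 5 in three, with four and five left empty.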
Code:
awk -F":" '{for (x=1;x<=5;x++){if($x==" "){$x="*"}};print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
There are mechanisms, however, yes. For example, the above will make an "empty" field (a field that is a single space) return a "*" instead. So you check your returned variables to see whether one is a "*"; if so, you know that field was empty. Maybe a "*" is not the best choice of character to use; a better idea would be to return the word "EMPTY" instead. That would be sensible. :) |
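A sketch of the "EMPTY" variant suggested above follows. Unlike the posted command, it matches a field that is completely empty as well as one that holds only spaces; that regex test is an addition made here, not part of the original post. It also assumes the word EMPTY never occurs in the real data.

```shell
# Hypothetical sample record: field 2 is empty, field 4 is a single space.
sample=$(mktemp)
printf '%s\n' '1::3: :5' > "$sample"

# Replace any field that is empty or contains only spaces with EMPTY.
awk -F":" '{
  for (x = 1; x <= 5; x++)
    if ($x ~ /^ *$/) $x = "EMPTY"
  print $1, $2, $3, $4, $5
}' "$sample" > "$sample.out"

# Every field now holds a visible word, so default word splitting lines up.
read -r one two three four five < "$sample.out"
echo "one=$one two=$two three=$three four=$four five=$five"
rm -f "$sample" "$sample.out"
```

This still breaks on fields with embedded spaces such as "2 3", because read continues to split on blanks.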
Assuming that we don't use "*" in any of our records, the above code will work.
But it will still fail to parse records with embedded spaces, for example
1:2 3: 4:5
(the second field contains "2 3").
awk -F":" '{for (x=1;x<=5;x++){if($x==" "){$x="*"}};print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
echo "Here's your vars: $one $two $three $four $five"
done
The awk part on its own returns
1 2 3 4 5
so the whole pipeline is the same as feeding "1 2 3 4 5" into "while read one two three four five". But since the field separator for the read command is a space " ", the above command takes:
two = 2
three = 3
So we still need some modifications. |
I don't really understand all of what you've written in the above post, especially the part about embedded whitespace; embedded whitespace should be irrelevant going into the awk, because the field separator is ":", not " ".
In the future, please use some formatting in your posts, and code tags ( http://www.phpbb.com/community/faq.php?mode=bbcode#f2r1 ) to show us code and output. It would also help to show us exactly what your input file contains; a few example lines would be good.
Now, if I understand post #9 at all, try this: Code:
IFS=":"
Note that, for demonstration purposes, I put single quotes around the output variables so that any whitespace will be obvious. |
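Only the first line of the posted code survives, so the following is a sketch of the general technique rather than the poster's exact script: let read do the splitting itself with IFS set to ":". The sample file and its two records (taken from earlier posts) are assumptions for illustration.

```shell
# Hypothetical sample data using the two problem records from the thread.
sample=$(mktemp)
printf '%s\n' '1:2 3: 4:5' '1::3::5' > "$sample"

# IFS=":" applies only to this read; -r keeps backslashes literal.
# Because ":" is not whitespace, empty fields and embedded spaces survive.
out=$(
  while IFS=":" read -r one two three four five; do
    echo "one='$one' two='$two' three='$three' four='$four' five='$five'"
  done < "$sample"
)
echo "$out"
rm -f "$sample"
```

Setting IFS as a prefix to read, instead of globally, avoids having to restore it afterwards. Reading the file directly also removes the need for awk when the goal is only to get the fields into shell variables.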
I apologize if you are confused by post #9. There is nothing wrong with the awk part; the problem is with the code "while read one two three four five". Please consider the following example for clarification.
Let's say the input file TrackMsgFile contains a record like:
1:2 3: 4:5
Note that the second field here contains "2 3". Now if you run the command Code:
awk -F":" '{print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
two: 2 |
OK, the use of $IFS in post #10 should fix that. Did you try it? Please tell me if that addresses this issue.
|
For the record:
1:2 3: 4:5
Code:
awk -F":" '{IFS=":"; print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
two: 2
Hence incorrect. |
Re-read the code in post #10 carefully; what you just executed does not match that code. IFS is a shell (bash) variable, not something to set inside awk. OFS is the one for use in awk.
And notice that I set IFS=":" before running the code. |
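The difference between the two variables can be shown side by side. This is a sketch under the assumption that awk is kept in the pipeline: OFS is set on the awk side (here with -v) so the output stays colon-separated, and IFS is set on the shell side so read splits on colons again.

```shell
# OFS belongs to awk: it controls how print joins the fields.
joined=$(printf '1:2 3: 4:5\n' | awk -F":" -v OFS=":" '{print $1,$2,$3,$4,$5}')
echo "$joined"

# IFS belongs to the shell: it controls how read splits that line.
IFS=":" read -r one two three four five <<EOF
$joined
EOF
echo "two='$two'"
```

The trailing colon in the awk output appears because the record has only four fields, so $5 is printed as an empty string.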
Consider an input file TestFile that contains four records:
TestFile:
1::2:3:4:5
4:5:6:7:8
7:8:9:10
7: 8 9:1:2:3
Code:
awk -F":" '{OFS=":"; print $1, $2, $3, $4, $5}' TestFile
Code:
awk -F":" '{OFS=":";print $1,$2,$3,$4,$5}' TestFile | while read one two three four five; do echo "two : $two"; done |
You are not paying attention to details! You did not set IFS in your second codeblock above. Watch:
Code:
root@reactor: awk -F":" '{OFS=":"; print $1, $2, $3, $4, $5}' TestFile |
Code:
IFS=":"
But I could not solve the empty space issue raised in post #8. Thanks. |
OK, I thought the empty space issue was addressed by returning the "*" character when a field contains only an empty space. I did however adjust the code in post #10 so that if either a space or an empty field (between two colons) is detected, a "*" is returned.
If this doesn't address any remaining issue, then I'm afraid I'm missing something about this situation. If none of these ideas work, and the ideas presented by the other members in this thread are of no help either, then it may be a good idea to state the requirements of this problem anew.
The way I understand it is: you have an input file containing 5 fields, delimited by colons. Fields might be empty or contain a space (meaning the field is empty), but the colon delimiters are always there. You want each of the 5 fields' values returned into a shell variable for processing each time a line from the file is read in.
If a space is contained within one of the fields, it doesn't matter, right? I mean, if a field contains "John Smith", would you prefer to have "JohnSmith" or "John*Smith" returned?
So, I thought the requirements had been met. If there's something I've not got right in that, please correct me, and:
--Show us the exact input file.
The part about assigning the fields to the variables is easy enough; the problem here seems to have to do with spaces, but I don't understand just what that problem is.
Thank you for your patience!
EDIT: Plus, if there's someone else reading this who sees what I'm missing, by all means, jump in! |