LinuxQuestions.org


barunparichha 10-25-2010 06:34 AM

Reading files in shell scripting
 
Hi,
I am trying to read the fields of a file and manipulate them, record by record.

Let's say I use awk:
awk -F":" '{print $1 $2 $3 $4 $5}' TrackMsgFile.0806

This prints my fields on screen. But I don't want to print the fields while reading the records; instead I want to store them in variables and manipulate them according to my logic.

Does "awk" or some other shell command provide something for this?

GrapefruiTgirl 10-25-2010 06:36 AM

Assign the results to a shell variable:
Code:

var=$(awk -F":" '{print $1 $2 $3 $4 $5}' TrackMsgFile.0806)

echo "$var"


GrapefruiTgirl 10-25-2010 06:40 AM

If you want each field as a separate var:
Code:

awk -F":" '{print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
    echo "Here's your vars: $one $two $three $four $five"
done

Note the commas separating the output, so the variables aren't all jumbled together in one.
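
To see what the commas change, here is a quick check on a made-up record (the sample line is an assumption; the real contents of TrackMsgFile.0806 are not shown in this thread):

```shell
# Hypothetical one-line sample record, colon-separated like the real file.
record='a:b:c:d:e'

# Without commas, awk concatenates the fields with nothing in between.
joined=$(printf '%s\n' "$record" | awk -F":" '{print $1 $2 $3 $4 $5}')

# With commas, awk inserts OFS (a single space by default) between the fields.
spaced=$(printf '%s\n' "$record" | awk -F":" '{print $1,$2,$3,$4,$5}')

echo "no commas: $joined"
echo "commas:    $spaced"
```

The second form is what lets read split the line back into separate variables.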

ghostdog74 10-25-2010 07:01 AM

Quote:

Originally Posted by GrapefruiTgirl (Post 4138484)
If you want each field as a separate var:
Code:

awk -F":" '{print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
    echo "Here's your vars: $one $two $three $four $five"
done

Note the commas separating the output, so the variables aren't all jumbled together in one.

But the whole awk+while operation still iterates over the file and processes each record as it goes. Therefore, the while loop really is redundant: the same processing can be done inside awk.

ghostdog74 10-25-2010 07:03 AM

@OP, you might want to define clearly what you want to do. If you want to store your lines in some variable for LATER use, use an array

Code:

awk '{array[++d]=$0}
{
 ....
 # some other processing here.
}
END{
  # here , you process your array
  for(i=1;i<=d;i++){
    print "do something with " array[i]
  }
}'
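
A runnable variant of the outline above might look like this. The sample data, the temporary file, and the choice to replay the records in reverse order are all assumptions made purely for illustration:

```shell
# Hypothetical sample data standing in for the real input file.
sample=$(mktemp)
printf '%s\n' 'a:1' 'b:2' 'c:3' > "$sample"

# Store every record while reading, then process them all in END.
stored=$(awk -F":" '
{ line[++n] = $0 }                 # remember the whole record
END {
    for (i = n; i >= 1; i--)       # here: replay the records in reverse order
        print "do something with " line[i]
}' "$sample")

echo "$stored"
rm -f "$sample"
```

Because the array is filled before END runs, the END block can look at any record, not just the current one.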


fbobraga 10-25-2010 07:10 AM

Quote:

Originally Posted by barunparichha (Post 4138478)
Does "awk" or some other shell command provides something for this ?

Or you can use an awk program to manipulate the fields:

Code:

awk -F":" '{<do some stuff with $1-$5> ;print $1 $2 $3 $4 $5}' TrackMsgFile.0806
You can also keep the awk program in a separate text file (prog.awk in the following example) and load it with the -f switch:
Code:

awk -F":" -f prog.awk TrackMsgFile.0806


see http://en.wikipedia.org/wiki/Awk
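
As a minimal sketch of the -f approach: the file names, the sample records, and the upper-casing of field 2 below are assumptions chosen only to have something to run.

```shell
# Hypothetical working directory, input data and program file.
workdir=$(mktemp -d)
printf '%s\n' 'id1:alice:x' 'id2:bob:y' > "$workdir/input.txt"

cat > "$workdir/prog.awk" <<'EOF'
# OFS is set once, before any record is read.
BEGIN { OFS = ":" }
{
    $2 = toupper($2)   # modifying a field rebuilds $0 using OFS
    print
}
EOF

result=$(awk -F":" -f "$workdir/prog.awk" "$workdir/input.txt")
echo "$result"
rm -rf "$workdir"
```

Keeping the program in its own file avoids shell quoting problems once the awk code grows beyond a line or two.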

barunparichha 10-25-2010 08:26 AM

But let's say one record looks like
1::3::5

Here fields 2 and 4 are empty, for the sake of explanation.
awk -F":" '{print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five;

3 will be stored in variable two, and 5 in three.

Reason:
awk -F":" '{print $1,$2,$3,$4,$5}' TrackMsgFile.0806 prints
1  3  5 (the 2nd and 4th fields are empty, so only the separating spaces are printed)

Piping "1  3  5" into while read one two three four five assigns

3 and 5 to variables two and three respectively.

=>
Is there a mechanism to avoid this?


Likewise, it will give the wrong result for records like
1:2 3: 4:5 (the second field contains "2 3", since we are using ":" as the field separator)

GrapefruiTgirl 10-25-2010 08:38 AM

Code:

awk -F":" '{for (x=1;x<=5;x++){if($x==" "){$x="*"}};print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
    echo "Here's your vars: $one $two $three $four $five"
done

Technically I'd have thought a space character would count as a character to be read into the appropriate variable, but for whatever reason (maybe multiple spaces count as one?) that doesn't happen.

There are mechanisms however, yes. For example, the above will make an "empty" field (a field that is a single space) return a "*" instead. So, you check your returned variables to see if it's a "*" and if so, then you know it was empty. Maybe a "*" is not the best choice of character to use - a better idea would be to return the word "EMPTY" instead. That would be sensible. :)
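
On the guess above about multiple spaces: read splits its input on the characters in IFS, and a run of IFS whitespace characters (space, tab, newline) counts as a single separator, whereas each non-whitespace IFS character such as ":" is a separator on its own. A small sketch using the 1::3::5 record from post #8 (the helper variable names are assumptions):

```shell
# "1::3::5" printed by awk with the default OFS becomes "1  3  5" (two spaces per gap).
line='1  3  5'

# Default IFS: a run of spaces counts as ONE separator, so the empty fields vanish.
read -r one two three four five <<EOF
$line
EOF
echo "default IFS: two='$two' three='$three' four='$four'"
default_two=$two

# IFS set to a colon for this read only: each colon is its own separator,
# so empty fields stay empty.
IFS=: read -r one two three four five <<EOF
1::3::5
EOF
echo "IFS=':'    : two='$two' three='$three' four='$four'"
colon_two=$two; colon_three=$three
```

With the default IFS the value 3 lands in two; with a colon as the separator, two stays empty and 3 lands in three.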

barunparichha 10-25-2010 08:53 AM

Assuming we don't use "*" in any of our records, the above code will work.

But it will still fail to parse records with embedded spaces:
1:2 3: 4:5 (the second field contains "2 3")

awk -F":" '{for (x=1;x<=5;x++){if($x==" "){$x="*"}};print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
echo "Here's your vars: $one $two $three $four $five"
done


awk -F":" '{for (x=1;x<=5;x++){if($x==" "){$x="*"}};print $1,$2,$3,$4,$5}' TrackMsgFile.0806, returns
1 2 3 4 5
awk -F":" '{for (x=1;x<=5;x++){if($x==" "){$x="*"}};print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five
is the same as
1 2 3 4 5 | while read one two three four five

But since the field separator for the read command is a space " ", the above command assigns:
two = 2
three =3




So we still need some modifications.

GrapefruiTgirl 10-25-2010 09:12 AM

I don't really understand all of what you've written in the above post, especially the part about embedded whitespace; embedded whitespace should be irrelevant going into the awk, because the field separator is ":", not " ".

In the future, please use some formatting in your posts, and code tags ( http://www.phpbb.com/community/faq.php?mode=bbcode#f2r1 ) to show us code and output. Also it would help to show us exactly what your input file contains -- a few example lines would be good.

Now.. If I understand at all post #9, try this:
Code:

IFS=":"
awk -F":" '{OFS=":";for(x=1;x<=5;x++){if($x==" " || $x==""){$x="*"}};print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
 echo "Here's your vars: '$one' '$two' '$three' '$four' '$five'"
done

If the problem you're describing still persists with that modification, you'll have to please re-explain what the problem is.
Note that for demonstration purposes I put single quotes around the printed variables, so that any whitespace will be obvious.
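
A possible variation, offered as a sketch rather than as the fix for this thread: IFS can be set for the read command alone, so the rest of the script keeps its normal word splitting, and for a plain colon-separated file awk is then not needed at all. The temporary file and its single record (taken from post #9) are assumptions for the demo:

```shell
# Hypothetical sample file using the record from this thread.
f=$(mktemp)
printf '%s\n' '1:2 3: 4:5' > "$f"

old_ifs=$IFS
# IFS=: applies to this read command only, so the rest of the script keeps its normal IFS.
# -r stops read from treating backslashes as escapes.
while IFS=: read -r one two three four five; do
    echo "two: '$two'  three: '$three'"
    second=$two
done < "$f"
rm -f "$f"

[ "$IFS" = "$old_ifs" ] && echo "IFS unchanged after the loop"
```

Since the loop reads from a redirection instead of a pipe, variables assigned inside it are still available after done.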

barunparichha 10-25-2010 09:33 AM

I'm sorry if post #9 was confusing. There is nothing wrong with awk; the problem is with the "while read one two three four five" part. Please consider the following example for clarification.
Let's say the input file TrackMsgFile contains a record like:

1:2 3: 4:5

Note, here the second field contains "2 3".
Now if you run the command

Code:

  awk -F":" '{print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
    echo "two: $two"
done

Result:
two: 2

GrapefruiTgirl 10-25-2010 09:35 AM

OK, the use of $IFS in post #10 should fix that. Did you try it? Please tell me if that addresses this issue.

barunparichha 10-25-2010 09:58 AM

For record :
1:2 3: 4:5

Code:

awk -F":" '{IFS=":"; print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
    echo "two: $two"
 done

shows:
two: 2

Hence incorrect.

GrapefruiTgirl 10-25-2010 10:02 AM

Re-read the code in post #10 carefully- what you just executed is not according to that code. IFS is a bash thing, not for use inside the AWK. OFS is for use in AWK.

And, notice I set IFS=":" before running the code.

barunparichha 10-25-2010 10:19 AM

Consider the input file TestFile contains four records:
TestFile:
1::2:3:4:5
4:5:6:7:8
7:8:9:10
7: 8 9:1:2:3

Code:

awk -F":" '{OFS=":"; print $1, $2, $3, $4, $5}' TestFile

Returns :
1::2:3:4
4:5:6:7:8
7:8:9:10:
7: 8 9:1:2:3
::::

Therefore,

Code:

awk -F":" '{OFS=":";print $1,$2,$3,$4,$5}' TestFile | while read one two three four five; do echo "two : $two"; done

Returns
two :
two :
two :
two : 8
two :

So using "OFS" and "IFS" does not help. Using an array with awk might help.

GrapefruiTgirl 10-25-2010 10:48 AM

You are not paying attention to details! You did not set IFS in your second codeblock above. Watch:

Code:

root@reactor: awk -F":" '{OFS=":"; print $1, $2, $3, $4, $5}' TestFile
1::2:3:4
4:5:6:7:8
7:8:9:10:
7: 8 9:1:2:3
root@reactor: IFS=":"; awk -F":" '{OFS=":";print $1,$2,$3,$4,$5}' TestFile | while read one two three four five; do echo "two : $two"; done
two :
two : 5
two : 8
two :  8 9
root@reactor:

Now, output appears, to me anyway, to be correct. Is it not?

barunparichha 10-26-2010 01:41 AM

Code:

IFS=":"
awk -F":" '{OFS=":";for(x=1;x<=5;x++){if($x==" "){$x="*"}};print $1,$2,$3,$4,$5}' TrackMsgFile.0806 | while read one two three four five; do
 echo "Here's your vars: '$one' '$two' '$three' '$four' '$five'"
done

solves the issue raised in post #11.
But it could not solve the empty space issue raised in post #8.

Thanks.

GrapefruiTgirl 10-26-2010 05:18 AM

OK, I thought the empty space issue was addressed by returning the "*" character when a field contains only an empty space. I did however adjust the code in post #10 so that if either a space or an empty field (between two colons) is detected, a "*" is returned.
If this doesn't address the remaining issue, then I'm afraid I'm missing something about this situation. If none of these ideas work, and the ideas presented by the other members in this thread are of no help either, then it may be a good idea to restate the requirements of this problem:

The way I understand it is: you have an input file containing 5 fields, delimited by colons. Fields might be empty or contain a space (meaning it is empty), but the colon delimiters are always there. You want each of the 5 fields' values returned into a shell variable for processing each time a line from the file is read in. If a space is contained within one of the fields, it doesn't matter, right? I mean, if a field contains "John Smith", would you prefer to have returned "JohnSmith" or "John*Smith"?

So, I thought the requirements have been met. If there's something I've not got right in that, please correct me, and:

--Show us the exact input file.

The part about assigning the fields to the variables is easy enough; the problem here seems to have to do with spaces, but I don't understand just what that problem is.
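
For reference, here is one way the requirements as understood above could be sketched end to end. The records are copied from TestFile earlier in the thread; mark_empty is a hypothetical helper name, and reporting "EMPTY" for blank fields is an assumption about what is wanted:

```shell
# Sample records copied from TestFile earlier in the thread.
f=$(mktemp)
printf '%s\n' '1::2:3:4:5' '4:5:6:7:8' '7:8:9:10' '7: 8 9:1:2:3' > "$f"

# mark_empty is a hypothetical helper: prints EMPTY for a field that is
# empty or whitespace-only, otherwise prints the field unchanged.
mark_empty() {
    case $1 in
        *[![:space:]]*) printf '%s\n' "$1" ;;
        *)              printf '%s\n' 'EMPTY' ;;
    esac
}

# awk trims each record to five colon-separated fields; read then splits on ":" only.
report=$(awk -F":" 'BEGIN { OFS = ":" } { print $1, $2, $3, $4, $5 }' "$f" |
    while IFS=: read -r one two three four five; do
        echo "two='$(mark_empty "$two")' five='$(mark_empty "$five")'"
    done)
echo "$report"
rm -f "$f"
```

Fields with embedded spaces such as " 8 9" come through untouched, and only fields with nothing but whitespace are reported as EMPTY.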

Thank you for your patience!

EDIT: Plus, if there's someone else reading this, who sees what I'm missing, by all means, jump in!


All times are GMT -5.