
utahnix 12-06-2007 06:30 PM

Shell script to parse csv-like output, row by row
Here goes...

I'm trying to write a shell script which parses the output of the Cyrus IMAP's quota command, which spits out quota usage information for each user. The idea is that if a user is over quota, certain actions will be taken by the script.

The output is space separated (in contrast to commas or tabs). Each row in the output constitutes a user.

Now, I can pull the values I want (quota usage, user name) out of each row with gawk with no problems. The problem I'm having is finding a way to iterate through the output line by line (so that each line is one iteration of the loop), and I'd prefer not to save temp files in the process.

I suppose I could write a perl script to do all of this, but given the situation (long story), I'd rather do it with a shell script.


weibullguy 12-06-2007 07:49 PM

Have you tried feeding your file to a while loop?

while read FIELD1 FIELD2 FIELD3; do
    # some stuff based on $FIELD1
    # some other stuff based on $FIELD2
    # even more stuff based on $FIELD3
    :   # no-op placeholder so the loop body is valid until real commands are added
done < /path/to/input/file

May not be exactly what you need, but maybe it'll get you started.

utahnix 12-07-2007 09:05 PM

This is my problem:

I run this command:
su cyrus -c /usr/lib/cyrus/bin/quota

It spits formatted data like this:
Quota % Used Used Root
102400 0 0 user/
102400 24 333 user/
262144 84 220504 user/
102400 104 3 user/

I want a loop that will give me each row as a string, one per index.

i.e. at index 1, the string would contain " 102400 0 0 user/". I can then process this line as needed.

I guess the other part of the question is, how do I get the info into the loop without saving it to a file? Do I do VAR=`cyrus_cmd_above` with the tick marks, and then somehow use that VAR in the loop? Or do I pipe the actual command in somehow?

I need very specific instructions. Help is greatly appreciated.

utahnix 12-07-2007 09:41 PM

Okay, after some research and experimentation, this is what I came up with (and it works great):

# Get the quota info in a variable
QUOTAS=`su cyrus -c /usr/lib/cyrus/bin/quota`

# Determine number of rows (users) that we have
NUMROWS=`echo -e "$QUOTAS" | wc -l`

# Begin user processing loop. We start with a value of 2 because the first line constitutes column headers
for i in `seq 2 $NUMROWS`; do

# Pull the next user out of the list
UINFO=`echo -e "$QUOTAS" | head -$i | tail -1`

# Pull the quota usage pct and email address from this record
pct=`echo $UINFO | gawk '{print $2}'`
addr=`echo $UINFO | gawk '{print $4}'`

# $pct = percentage of quota used (80% would mean $pct=80)
# $addr = "user/"

# ... blah, blah, blah (processing of pct and addr vars here)

done


The "echo -e" was key, as well as the combination of head and tail. Nuts, but it works really well for what I need it for.

rupertwh 12-07-2007 10:12 PM

Or you could do it *much* simpler, as weibullguy already pointed out:


while read qt pct used addr rest ; do
        test "$qt" = "Quota" && continue

        # do something with pct and addr here...

done < <(su cyrus -c /usr/lib/cyrus/bin/quota)
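
To try this without a Cyrus server, here is a minimal sketch of the same loop. The sample_quota function is a hypothetical stand-in that prints the sample rows posted above, and the 100% threshold and the over_count/over_list variables are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for: su cyrus -c /usr/lib/cyrus/bin/quota
sample_quota() {
    cat <<'EOF'
Quota % Used Used Root
102400 0 0 user/
102400 24 333 user/
262144 84 220504 user/
102400 104 3 user/
EOF
}

LIMIT=100       # assumed threshold, in percent
over_count=0
over_list=""

while read -r qt pct used addr rest; do
    # Skip the column-header row
    [ "$qt" = "Quota" ] && continue

    # Record every user whose usage percentage exceeds the threshold
    if [ "$pct" -gt "$LIMIT" ]; then
        over_count=$((over_count + 1))
        over_list="$over_list$addr:$pct "
    fi
done < <(sample_quota)

echo "over quota: $over_count"
echo "$over_list"
```

Swap sample_quota for the real su command once the logic does what you want.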

utahnix 12-08-2007 01:26 AM

I understand that the last line:

done < <(su cyrus -c /usr/lib/cyrus/bin/quota)

is a way of directing the output of the quota command into the while loop. I understand that the less-than sign directs the data inward.

But what I don't understand (and if you can enlighten me, I would greatly appreciate it) is why there are two arrows and why they are separated by a space. Moreover, why the parentheses?

I guess I don't see why I can't do this:

done < `su cyrus -c /usr/lib/cyrus/bin/quota`


I just want to understand what is happening instead of just blindly following directions.

That, or why can't I pipe su cyrus -c /usr/lib/cyrus/bin/quota to the while loop:

su cyrus -c /usr/lib/cyrus/bin/quota | while x x x x; do

jschiwal 12-08-2007 01:50 AM

You could do it that way as well. You left out the "read" command after while in your generalized description but I think I caught the meaning.
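
One difference is worth knowing if you go the pipe route: in bash with default settings, every part of a pipeline runs in a subshell, so variables set inside a piped while loop are gone once the loop ends. With the < <(...) form the loop runs in the current shell and the variables survive. A small sketch, where gen is a hypothetical stand-in for the quota command and the counter names are made up for the demonstration:

```shell
#!/usr/bin/env bash
# Hypothetical input: three lines standing in for the quota output
gen() { printf '%s\n' a b c; }

piped=0
gen | while read -r line; do
    piped=$((piped + 1))       # incremented inside a subshell
done

procsub=0
while read -r line; do
    procsub=$((procsub + 1))   # incremented in the current shell
done < <(gen)

# In bash with default settings this prints: piped=0 procsub=3
echo "piped=$piped procsub=$procsub"
```

If the loop only prints or sends mail and you don't need any variables afterwards, the pipe is fine.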

I use the other form quite often at the end of cp and mv commands with the -i (interactive) option:
< <(yes n)

This will automatically respond "n" to any question.

You can use this form "<( command statements )" to use the output of a command in place of a filename.
I will use this form to do things like sort the input file. For example, the "comm" command can provide common entries or unique entries in two files. It is required that the files be sorted.
So if file1 & file2 are already sorted, then you could use "comm -23 file1 file2" to print out items unique to file1. If they aren't presorted, you could use: "comm -23 <(sort file1) <(sort file2)".
This allows you to use the output of 2 programs as input to the command that expects 2 filenames.
You could use something like "sort file2 | comm -23 file1 -" if only file2 were unsorted, but the <(command) form allows you to do this with both files. Plus, the <(...) sits in the same place a filename would occupy in the arguments, which makes it readable once you learn what it is doing.
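
That also answers the question about the two arrows: the first < is the ordinary input redirection, and <(...) is a separate construct that bash replaces with a filename-like path connected to the command's output. Backticks would not work after <, because the shell would try to open the command's output text as if it were a filename. A minimal sketch of the comm case, using throwaway files with made-up contents in a temporary directory:

```shell
#!/usr/bin/env bash
# Two unsorted throwaway files (contents are made up for the demo)
tmpdir=$(mktemp -d)
printf '%s\n' pear apple cherry > "$tmpdir/file1"
printf '%s\n' cherry banana > "$tmpdir/file2"

# Each <(...) is replaced by a path that comm reads like a regular file,
# so both inputs can be sorted on the fly.
# -23 suppresses columns 2 and 3, leaving lines unique to file1.
only_in_file1=$(comm -23 <(sort "$tmpdir/file1") <(sort "$tmpdir/file2"))
echo "$only_in_file1"

rm -rf "$tmpdir"
```

Here "cherry" is in both files, so only "apple" and "pear" are printed.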

utahnix 12-08-2007 02:12 AM

Do you know of a good web resource that explains in greater detail what you are describing? I'm not sure what keywords to use. (thanks for all the info so far everyone)

jschiwal 12-08-2007 05:03 AM

For bash, the best is the Advanced Bash Scripting Guide on the website. (The pdf file is named "abs-guide.pdf".) It consists almost entirely of examples that you can try yourself, which is the best way to learn. Don't let the word "advanced" scare you. The word "thorough" would be a better description. If there is something that you don't understand in the "info bash" manual, it will probably be explained with examples in "abs-guide.pdf". Not having enough examples is the problem I have with a lot of documentation. After covering command substitution, there are 15 links to other scripts in the Guide.

One of the first chapters deals with special characters. You can look there to find out what && does, or <<.
