OK, so now I have this. It is producing some promising results, but not what I was expecting.
Code:
#!/bin/bash
set -xv
MONTH=""
DAY=""
HOUR=""
DL_SIZE=0
DL_RATE=0
while read -a line; do
CURR_MONTH=${line[1]}
CURR_DAY=${line[2]}
CURR_HOUR=${line[3]}
CURR_DL_SIZE=${line[6]}
CURR_DL_RATE=${line[8]}
if [[ $CURR_MONTH == $MONTH ]] && [[ $CURR_DAY == $DAY ]] && [[ $CURR_HOUR == $HOUR ]]
then
$((DL_SIZE=$DL_SIZE+$CURR_DL_SIZE))
echo "DL_SIZE is $DL_SIZE"
$((DL_RATE=$DL_RATE+$CURR_DL_RATE))
echo "DL_RATE is $DL_RATE"
else # Update $MONTH and $DAY and $HOUR
if [[ $MONTH != "" ]]
then
echo "$MONTH $Day $HOUR total $DL_SIZE bytes \n" # introduce average DL rate later
fi
MONTH=${line[1]}
DAY=${line[2]}
HOUR=${line[3]}
fi
done < simplified.vsftpd.log
# If we have reached EOF, print out the final month
echo "$MONTH $Day $HOUR total $DL_SIZE bytes "
Here is the output:
Code:
./ftp_analyzer.sh: line 19: 208431030: command not found
DL_SIZE is 208431030 \n
./ftp_analyzer.sh: line 21: 47604: command not found
DL_RATE is 47604 \n
./ftp_analyzer.sh: line 19: 416860806: command not found
DL_SIZE is 416860806 \n
./ftp_analyzer.sh: line 21: 125911: command not found
DL_RATE is 125911 \n
./ftp_analyzer.sh: line 19: 625289612: command not found
DL_SIZE is 625289612 \n
./ftp_analyzer.sh: line 21: 211343: command not found
DL_RATE is 211343 \n
./ftp_analyzer.sh: line 19: 833720399: command not found
DL_SIZE is 833720399 \n
... skipping some of the repetitive stuff here
./ftp_analyzer.sh: line 19: 5658923994: command not found
DL_SIZE is 5658923994 \n
./ftp_analyzer.sh: line 21: 765490: command not found
DL_RATE is 765490 \n
Oct 13 total 5658923994 bytes \n
Oct 09 total 5658923994 bytes
So DL_SIZE is incrementing as it should, but the output is formatted weirdly, those "command not found" messages are very strange, and I am skipping the first DL_RATE for some reason.
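2. $() expands the command inside it and returns that command's output. () on its own creates either an array or a sub-shell, depending on the usage. $(( )) evaluates the arithmetic and substitutes the result, so on a line by itself bash tries to run the resulting number as a command, which is where your "command not found" messages come from. So $(( )) is of no particular use if you are not assigning to a variable: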
Code:
$((DL_SIZE=$DL_SIZE+$CURR_DL_SIZE))
# should be
((DL_SIZE=$DL_SIZE+$CURR_DL_SIZE))
# or
(( DL_SIZE = DL_SIZE + CURR_DL_SIZE ))
# or
(( DL_SIZE += CURR_DL_SIZE ))
The white space I have used is not required, but I think it looks clearer.
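To see the difference for yourself, here is a minimal sketch. The values are made up purely for illustration; only the two arithmetic forms matter.

```bash
#!/bin/bash
# Made-up starting values, for illustration only.
DL_SIZE=100
CURR_DL_SIZE=25

# (( )) evaluates the arithmetic and performs the assignment. Nothing is
# substituted back onto the command line, so there is no command to run.
(( DL_SIZE += CURR_DL_SIZE ))
echo "DL_SIZE is $DL_SIZE"

# $(( )) is for when you want the value itself, for example on the
# right-hand side of an assignment or inside a string.
TOTAL=$(( DL_SIZE + CURR_DL_SIZE ))
echo "TOTAL is $TOTAL"
```

If you put a bare $(( ... )) on a line by itself instead, bash substitutes the number and then tries to execute it, which matches the messages in your output.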
Once you fix the above, you are going to receive a new error, because you are placing the wrong items from the array into your variables.
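A quick way to check which index holds which field is to dump the array for one line. This is only a sketch: the sample line below is hypothetical and your real vsftpd log layout may differ. The point is that read -a fills the array starting at index 0.

```bash
#!/bin/bash
# Hypothetical log line; substitute one real line from your own file.
sample="Oct 7 13 user1 file.iso 208431030 bytes 47604"

# read -a splits the line on whitespace into an array, starting at index 0.
read -r -a line <<< "$sample"

# Print every field with its index so you can see which is which.
for i in "${!line[@]}"; do
    echo "line[$i] = ${line[$i]}"
done
```

With a line shaped like this sample, the month is line[0] and the size is line[5], not line[1] and line[6].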
So I am beginning to think bash isn't the best way to tackle my problem. But I do believe I am close.
This is my output now. I am not sure how to fix the last bit: instead of outputting the Oct 06 data, it just prints Oct 7 again. I think it has to do with the way the loop is structured. By the time I should be printing the Oct 6th line, it has already moved on to the next line.
Code:
# ./ftp_analyzer.sh
Oct 7 13 total 5867354816 bytes
Oct 7 13 total 0 bytes
Oct 7 13 total 0 bytes
total 0 bytes
Below is what my code looks like now:
Code:
#!/bin/bash
#set -xv
#declare everything
MONTH=""
DAY=""
HOUR=""
DL_SIZE=0
DL_RATE=0
counter=0
isFirstLine=1
#if same month, same day and same hour, add the DL size and DL rate, DL rate for avg hourly transfer rate, which will be implemented later
while read -a line; do
if [[ $isFirstLine -eq 1 ]]
then # if this is the first line we are reading we want both current time and previous time to match
CURR_MONTH=${line[0]}
CURR_DAY=${line[1]}
CURR_HOUR=${line[2]}
MONTH=${line[0]}
DAY=${line[1]}
HOUR=${line[2]}
isFirstLine=0
else
#if this is not the first line then we just need to read the new time
CURR_MONTH=${line[0]}
CURR_DAY=${line[1]}
CURR_HOUR=${line[2]}
fi
CURR_DL_SIZE=${line[5]}
CURR_DL_RATE=${line[7]}
if [[ $CURR_MONTH -eq $MONTH && $CURR_DAY -eq $DAY && $CURR_HOUR -eq $HOUR ]]
then
((DL_SIZE=$DL_SIZE+$CURR_DL_SIZE))
((DL_RATE=$DL_RATE+$CURR_DL_RATE))
((counter=$counter+1))
else # Update the current $MONTH and $DAY and $HOUR
# we have hit a new time, print result of old size
echo "$MONTH $DAY $HOUR total $DL_SIZE bytes" # introduce average DL rate later
#reset DL Size and Rate counter
DL_SIZE=0
DL_RATE=0
counter=0
CURR_MONTH=${line[0]}
CURR_DAY=${line[1]}
CURR_HOUR=${line[2]}
fi
done < simplified.vsftpd.log
# If we have reached EOF, print out the final month
echo "$CURR_MONTH $CURR_DAY $CURR_HOUR total $DL_SIZE bytes"
OK, I am getting a little lost in your logic now, so we will start at the top:
1. Testing for the first line is rather pointless when you consider that you can simply set MONTH, DAY and HOUR at the bottom of the loop and it will have the same effect.
2. Even if you keep the test for the first line, there is no need to repeat the setting of the current items in both parts of the 'if'. Simply perform the test for true and set those items, then set the current items after the 'if'.
3. In the 'else' portion of the next 'if' you set the current items for a third time. I do not see that this is needed at all.
4. Why is the final echo using different variables than the previous one? On leaving the while loop all the variables will have been set, and both the current and the standard items will hold the same values.
5. I assume we are currently ignoring the download rate and counter, as they are not used or displayed.
6. You must be using an alternate file format, as the download size is not in the sixth column in previous examples, hence your new script does not work for me.
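To make points 1 to 4 concrete, here is a rough sketch of the loop shape I mean. It is not a drop-in replacement: the function name summarize, the combined "month day hour" key, and the sample lines are all my own assumptions, with the size taken from field 5 as in your latest script.

```bash
#!/bin/bash
# Sum field 5 (size) per "month day hour" group. The field positions and
# the sample lines below are assumptions for illustration only.
summarize() {
    local prev="" key size total=0
    local -a line
    while read -r -a line; do
        (( ${#line[@]} > 5 )) || continue   # skip blank or short lines
        key="${line[0]} ${line[1]} ${line[2]}"
        size=${line[5]}
        # A new group has started: report the one we just finished.
        if [[ -n $prev && $key != "$prev" ]]; then
            echo "$prev total $total bytes"
            total=0
        fi
        (( total += size ))
        prev=$key    # the "previous" values are set once, at the bottom
    done
    # The last group is printed with the same variables as all the others.
    [[ -n $prev ]] && echo "$prev total $total bytes"
}

summarize <<'EOF'
Oct 7 13 user1 a.iso 100 bytes 10
Oct 7 13 user2 b.iso 200 bytes 20
Oct 6 09 user1 c.iso 50 bytes 5
EOF
```

With those three sample lines it prints "Oct 7 13 total 300 bytes" followed by "Oct 6 09 total 50 bytes". Comparing one combined string also sidesteps the question of how to compare month names and zero-padded hours separately.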
Thanks for your comments. I have been running the logic through my head as well, and you are right: the first-line check wasn't necessary. I added it because I noticed while debugging that the first line wasn't being included, when in fact it is just a matter of updating the right variables.
I have also been sanitizing the input file a bit, to simplify the scripting logic.
Code:
#!/bin/bash
#need to target this per host
set -xv
#declare everything
MONTH=""
DAY=""
HOUR=""
CURR_MONTH=""
CURR_DAY=""
CURR_HOUR=""
DL_SIZE=0
DL_RATE=0
counter=0
#if same month, same day and same hour, add the DL size and DL rate, DL rate for avg hourly transfer rate, which will be implemented later
while read -a line; do
MONTH=${line[0]}
DAY=${line[1]}
HOUR=${line[2]}
CURR_DL_SIZE=${line[5]}
CURR_DL_RATE=${line[7]}
# $((10#$HOUR)) is needed because bash treats numbers with a leading 0 as octal, not base 10
if [[ $CURR_MONTH -eq $MONTH && $CURR_DAY -eq $DAY && $((10#$CURR_HOUR)) -eq $((10#$HOUR)) ]]
then
((DL_SIZE=$DL_SIZE+$CURR_DL_SIZE))
((DL_RATE=$DL_RATE+$CURR_DL_RATE))
((counter=$counter+1))
else # Update the current $MONTH and $DAY and $HOUR
# we have hit a new time, print result of old size
echo "$CURR_MONTH $CURR_DAY $CURR_HOUR total $DL_SIZE bytes" # introduce average DL rate later
#reset DL Size and Rate counter
DL_SIZE=0
DL_RATE=0
counter=0
CURR_MONTH=${line[0]}
CURR_DAY=${line[1]}
CURR_HOUR=${line[2]}
fi
done < simplified.vsftpd.log
# If we have reached EOF, print out the final month
echo "$CURR_MONTH $CURR_DAY $CURR_HOUR total $DL_SIZE bytes"
And here is the output without debugging:
Code:
total 0 bytes
Oct 7 13 total 5658923994 bytes
Oct 6 09 total 1170448 bytes
total 0 bytes
So currently we have just one problem: it skips the first line and the last line and prints that bare "total" statement instead, and I am not sure how to stop it. Below is the result with debugging.
Code:
#!/bin/bash
#need to target this per host
set -xv
#declare everything
MONTH=""
DAY=""
HOUR=""
CURR_MONTH=""
CURR_DAY=""
CURR_HOUR=""
DL_SIZE=0
DL_RATE=0
counter=0
isFirstLine=1
#if same month, same day and same hour, add the DL size and DL rate, DL rate for avg hourly transfer rate, which will be implemented later
while read -a line; do
if [[ $isFirstLine -eq 1 ]]
then # if this is the first line we are reading we want both current time and previous time to match
CURR_MONTH=${line[0]}
CURR_DAY=${line[1]}
CURR_HOUR=${line[2]}
MONTH=${line[0]}
DAY=${line[1]}
HOUR=${line[2]}
isFirstLine=0
else
#if this is not the first line then we just need to read the new time
CURR_MONTH=${line[0]}
CURR_DAY=${line[1]}
CURR_HOUR=${line[2]}
fi
CURR_DL_SIZE=${line[5]}
CURR_DL_RATE=${line[7]}
# $((10#$HOUR)) is needed because bash treats numbers with a leading 0 as octal, not base 10
if [[ $CURR_MONTH -eq $MONTH && $CURR_DAY -eq $DAY && $((10#$CURR_HOUR)) -eq $((10#$HOUR)) ]]
then
((DL_SIZE=$DL_SIZE+$CURR_DL_SIZE))
((DL_RATE=$DL_RATE+$CURR_DL_RATE))
((counter=$counter+1))
else # Update the current $MONTH and $DAY and $HOUR
# we have hit a new time, print result of old size
echo "$MONTH $DAY $HOUR total $DL_SIZE bytes over $counter downloads" # introduce average DL rate later
#reset DL Size and Rate counter
DL_SIZE=0
DL_RATE=0
counter=0
isFirstLine=1
((DL_SIZE=$DL_SIZE+$CURR_DL_SIZE))
((DL_RATE=$DL_RATE+$CURR_DL_RATE))
((counter=$counter+1))
fi
done < simplified.vsftpd.log
output:
Code:
Oct 7 13 total 5867354816 bytes over 15 downloads
Oct 6 09 total 2340897 bytes over 2 downloads
./ftp_analyzer.sh: line 58: ((: DL_SIZE=0+: syntax error: operand expected (error token is "+")
./ftp_analyzer.sh: line 59: ((: DL_RATE=0+: syntax error: operand expected (error token is "+")
Yeah, I don't know why these two errors are happening. I think they are caused by hitting EOF or something, but everything else looks right.
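The error token gives a hint: "DL_SIZE=0+" means $CURR_DL_SIZE expanded to nothing, so (( )) was left with a plus sign and no operand after it. My guess, and it is only a guess since I cannot see your log, is that the file contains a blank or short line (probably at the very end), so ${line[5]} is empty. The sketch below reproduces the message and shows one possible guard; the helper name add_size is hypothetical.

```bash
#!/bin/bash
DL_SIZE=0
CURR_DL_SIZE=""    # what ${line[5]} holds after reading a blank line

# Reproduce the failure and capture the message bash prints.
err=$( { (( DL_SIZE = $DL_SIZE + $CURR_DL_SIZE )); } 2>&1 )
echo "bash says: $err"

# Guard (hypothetical helper): only add when the field is all digits.
add_size() {
    local field=$1
    [[ $field =~ ^[0-9]+$ ]] || return 0    # ignore empty or non-numeric
    (( DL_SIZE += 10#$field ))              # 10# forces base 10
    return 0
}

add_size ""           # ignored
add_size "1170448"    # added
echo "DL_SIZE is $DL_SIZE"
```

Skipping the whole line when it has too few fields would work just as well; the important part is never to hand an empty string to (( )).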
Code:
MONTH=""
DAY=""
HOUR=""
CURR_MONTH=
CURR_DAY=""
CURR_HOUR=""
DL_SIZE=0
DL_RATE=0
counter=0
#if same month, same day and same hour, add the DL size and DL rate, DL rate for avg hourly transfer rate, which will be implemented later
while read -a line; do
MONTH=${line[0]}
DAY=${line[1]}
HOUR=${line[2]}
# ${HOUR#0} strips a leading zero, because bash treats numbers with a leading 0 as octal, not base 10
if [[ "$CURR_MONTH" ]] && ( [[ "$CURR_MONTH" != "$MONTH" ]] || (( CURR_DAY != DAY || ${CURR_HOUR#0} != ${HOUR#0} )) )
then
# we have hit a new time, print result of old size
echo "$CURR_MONTH $CURR_DAY $CURR_HOUR total $DL_SIZE bytes" # introduce average DL rate later
#reset DL Size and Rate counter
DL_SIZE=0
DL_RATE=0
counter=0
fi
CURR_MONTH=${line[0]}
CURR_DAY=${line[1]}
CURR_HOUR=${line[2]}
(( DL_SIZE += ${line[5]} ))
(( DL_RATE += ${line[7]} ))
(( counter++ ))
done < simplified.vsftpd.log
# If we have reached EOF, print out the final month
echo "$CURR_MONTH $CURR_DAY $CURR_HOUR total $DL_SIZE bytes"
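Two details in the version above are worth a closer look: the month is compared with != rather than -eq, and the hour has its leading zero stripped before it goes into (( )). A small sketch of why, using made-up values:

```bash
#!/bin/bash
# Inside [[ ]], -eq is an arithmetic test: "Oct" and "Nov" are read as
# (unset) variable names, both worth 0, so they compare as equal.
if [[ Oct -eq Nov ]]; then month_eq="equal"; else month_eq="different"; fi
echo "-eq says Oct and Nov are $month_eq"

# == and != compare the text, which is what we want for month names.
if [[ Oct != Nov ]]; then month_ne="different"; else month_ne="equal"; fi
echo "!= says Oct and Nov are $month_ne"

# A leading zero means octal in bash arithmetic, so 08 and 09 are invalid
# as they stand. Either of these two forms gets around that.
HOUR="09"
echo "10# prefix: $(( 10#$HOUR ))"     # force base 10
echo "strip zero: $(( ${HOUR#0} ))"    # drop one leading zero
```

Of the two, 10# is probably the safer choice if a field could ever be a single "0", because ${HOUR#0} would then leave an empty string and (( )) would complain about a missing operand. The same octal caveat would apply to the day if your log ever pads it as 08 or 09.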