LinuxQuestions.org
Old 09-30-2014, 02:54 PM   #1
master-of-puppets
Member
 
Registered: Jun 2011
Posts: 49

Rep: Reputation: Disabled
awk behavior unpredictable


On my Debian wheezy box, running df with the -m option (1 MB blocks) and using awk to print the "Size" and "Used" columns gives this:
Code:
[19:07:35] billsb@sideshow:/share/es-ops/scripts/BUILD_FARM $ df -m /export/ws/bob | awk '{print $2, $3}'
1M-blocks Used
1032124 531937
So to get my values you'd think I'd have to use:
Code:
df -m /export/ws/bob | awk '{print $2, $3}' | sed '1d'
But the loop that awks these values uses printf and doesn't seem to need the | sed '1d':
Code:
df -m "${WORKSPACES2[@]/#//export/ws/}" | awk '
BEGIN  { "date +'%m-%d-%y'" | getline date;
             printf "%s",date }
    NR > 1 { printf ",%s,%s", $2, $3; }
    END    { printf "\n"}' >> "$OUTPUT_DIR/$HOSTNAME.csv"
On my squeeze boxes, the same df -m command piped through awk to print "Size" and "Used" gives this:
Code:
[19:16:31] billsb@simpsons:/share/es-ops/scripts/BUILD_FARM $ df -m /export/ws/bart | awk '{print $2, $3}'
1M-blocks Used

1140818 56858
It puts a blank line between the headers and the values. When I run it with sed '1d' it looks like this:
Code:
[19:18:14] billsb@simpsons:/share/es-ops/scripts/BUILD_FARM $ df -m /export/ws/bart | awk '{print $2, $3}' | sed '1d'

1140820 56856
I have four hosts: sideshow, simpsons, moes, and flanders. The directories whose "Size" and "Used" values I'm trying to extract are:
Code:
case "$HOSTNAME" in
	simpsons) WORKSPACES=(bart_avail bart_used homer_avail homer_used lisa_avail lisa_used marge_avail marge_used releases_avail releases_used rt-private_avail rt-private_used simpsons-ws0_avail simpsons-ws0_used simpsons-ws1_avail simpsons-ws1_used simpsons-ws2_avail simpsons-ws2_used vsimpsons-ws_avail vsimpsons-ws_used) ;;
	moes)     WORKSPACES=(barney_avail barney_used carl_avail carl_used lenny_avail lenny_used moes-ws2_avail moes-ws2_used) ;;
	flanders) WORKSPACES=(flanders-ws0_avail flanders-ws0_used flanders-ws1_avail flanders-ws1_used flanders-ws2_avail flanders-ws2_used maude_avail maude_used ned_avail ned_used rod_avail rod_used todd_avail todd_used to-delete_avail to-delete_used) ;;
esac
The path base is:
Code:
BASE=/export/ws
On sideshow (debian wheezy) the output looks like this:
Code:
,,,
sideshow
,bob_size,bob_used,mel_size,mel_used,sideshow-ws2_size,sideshow-ws2_used
09-25-14,1032124,508509,1032124,683647,1032108,46787
09-28-14,1032124,519385,1032124,690727,1032108,178159
09-29-14,1032124,519385,1032124,691161,1032108,178159
09-30-14,1032124,520456,1032124,711363,1032108,180249
Which is the desired output. On the other hosts the output gets all messed up. Can anybody tell from the df output on Debian squeeze what might be happening?

I'm sure it's just a difference in df and awk between the systems.

This is squeeze:
Code:
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 6.0.7 (squeeze)
Release:        6.0.7
Codename:       squeeze
Code:
GNU Awk 3.1.7
This is wheezy:
Code:
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 7.3 (wheezy)
Release:        7.3
Codename:       wheezy
Code:
mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan

compiled limits:
max NF             32767
sprintf buffer      2040

Last edited by master-of-puppets; 09-30-2014 at 09:28 PM.
 
Old 09-30-2014, 05:51 PM   #2
master-of-puppets
Member
 
Registered: Jun 2011
Posts: 49

Original Poster
Rep: Reputation: Disabled
Nobody wants to help? Hey, at least I tried the best I could on my own before asking for help. Sheesh!
 
Old 09-30-2014, 08:01 PM   #3
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
I doubt you'll find many people who want to dig through all that code. Have you looked at the actual output from df on each system? One problem might be that df can insert extra line breaks when mount point names get too long. You can use the "-P" (--portability) option to prevent that.

My other suggestion is to try the standard debugging technique of putting a "set -x" command at the top of the script and then examining the output in detail to see what is actually getting executed. Saving the output from df in a file rather than piping it directly to awk might also prove instructive.
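A minimal sketch of the "save it to a file and look at it" idea, in case it helps. The helper name show_fields and the path "/" are only placeholders, not anything from the original script. It prints how many fields awk sees on each line, which makes an unexpected extra or wrapped line easy to spot. As far as I know, the wrapping in older GNU df is triggered by a long device name in the first column, and -P keeps every filesystem on one line.

```shell
# show_fields FILE: print each line number and field count as awk sees them
# (hypothetical helper, used here only for inspection)
show_fields() {
    awk '{ printf "line %d, %d fields: %s\n", NR, NF, $0 }' "$1"
}

snapshot=$(mktemp)
# -P asks for the POSIX output format; "/" is just a stand-in for a real path
df -P -m / > "$snapshot"
show_fields "$snapshot"
rm -f "$snapshot"
```

Running the same thing once without -P on a squeeze host and once with it should show whether the extra line comes from df itself.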
 
Old 09-30-2014, 08:09 PM   #4
metaschima
Senior Member
 
Registered: Dec 2013
Distribution: Slackware
Posts: 1,982

Rep: Reputation: 492
I agree that it is a lot of code and a rather vague description of what you need, and you expected someone to respond within 3 hours.

It shouldn't be too difficult for you to debug it, but nearly impossible for us with the amount of info you gave. Look at the output of 'df' on each system and you are likely to figure it out, if not, post the output. Also, adding functions will help clarify what is going on.
 
Old 09-30-2014, 09:03 PM   #5
master-of-puppets
Member
 
Registered: Jun 2011
Posts: 49

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by rknichols View Post
I doubt you'll find many people who want to dig through all that code. Have you looked at the actual output from df on each system? One problem might be that df can insert extra line breaks when mount point names get too long. You can use the "-P" (--portability) option to prevent that.

My other suggestion is to try the standard debugging technique of putting a "set -x" command at the top of the script and then examining the output in detail to see what is actually getting executed. Saving the output from df in a file rather than piping it directly to awk might also prove instructive.
I'll add the output of df at the top of my post and I will use pseudocode to show what's supposed to be happening. I have been pulling my hair out all day trying to figure it out. Please stay with me.
 
Old 09-30-2014, 09:05 PM   #6
master-of-puppets
Member
 
Registered: Jun 2011
Posts: 49

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by metaschima View Post
I agree that it is a lot of code and a rather vague description of what you need, and you expected someone to respond within 3 hours.

It shouldn't be too difficult for you to debug it, but nearly impossible for us with the amount of info you gave. Look at the output of 'df' on each system and you are likely to figure it out, if not, post the output. Also, adding functions will help clarify what is going on.
I'll add the output of df at the top of my post and I will use pseudocode to show what's supposed to be happening. I have been pulling my hair out all day trying to figure it out. Please stay with me.
 
Old 09-30-2014, 10:41 PM   #7
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
So a few things:

1. Others have already advised to look at your df output

2. The sed command is not needed with the original awk script, because its NR > 1 pattern already means "process every line except the first"

3. That obviously does not work on squeeze, because there the data starts on the third line. So I see your options as:

a. See point 1

b. NR > 2

c. NR > 1 && NF


You are going to have to learn more about awk if you are going to continue to use it. May I suggest looking at the manual
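For what it's worth, here is a small sketch of how those patterns behave. The thread never shows the raw df output from a squeeze host, so both samples are invented: the first assumes a truly empty line after the header, the second assumes the extra line holds a long device name (fileserver:/vol/export/ws/bart is a made-up name, and all numbers are illustrative).

```shell
# Invented sample 1: header, an empty line, then one data row
blank='Filesystem     1M-blocks   Used Available Use% Mounted on

/dev/sdb1        1140818  56858   1026010   6% /export/ws/bart'

# Invented sample 2: the device name wraps onto its own line
wrapped='Filesystem     1M-blocks   Used Available Use% Mounted on
fileserver:/vol/export/ws/bart
                 1140818  56858   1026010   6% /export/ws/bart'

# Original pattern: the empty line (NR == 2) also matches and adds ",,"
orig=$(printf '%s\n' "$blank" | awk 'NR > 1 { printf ",%s,%s", $2, $3 }')

# Option b: skip the first two lines by position
opt_b=$(printf '%s\n' "$blank" | awk 'NR > 2 { printf ",%s,%s", $2, $3 }')

# Option c: skip the header and any line that has no fields (NF == 0)
opt_c=$(printf '%s\n' "$blank" | awk 'NR > 1 && NF { printf ",%s,%s", $2, $3 }')

# Option c on the wrapped sample: the name line has NF == 1, so it is kept
wrapped_c=$(printf '%s\n' "$wrapped" | awk 'NR > 1 && NF { printf ",%s,%s", $2, $3 }')

printf 'original:  %s\noption b:  %s\noption c:  %s\nwrapped c: %s\n' \
    "$orig" "$opt_b" "$opt_c" "$wrapped_c"
```

With the empty-line sample, options b and c both give ,1140818,56858 while the original pattern gives ,,,1140818,56858. With the wrapped sample, option c still emits the stray ",," and the columns shift left by one, so $2 and $3 become Used and Available. That is why option a (fixing the df output itself, for example with -P) is the more robust route if the lines really are wrapped.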
 
Old 09-30-2014, 11:25 PM   #8
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,862
Blog Entries: 1

Rep: Reputation: 1869
My df(1) manual suggests using the '-P' option to get a 'standard, portable output format'.
 
Old 09-30-2014, 11:35 PM   #9
master-of-puppets
Member
 
Registered: Jun 2011
Posts: 49

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by grail View Post
So a few things:

1. Others have already advised to look at your df output

2. The sed command is not needed with the original awk script, because its NR > 1 pattern already means "process every line except the first"

3. That obviously does not work on squeeze, because there the data starts on the third line. So I see your options as:

a. See point 1

b. NR > 2

c. NR > 1 && NF


You are going to have to learn more about awk if you are going to continue to use it. May I suggest looking at the manual
You are awesome.

The first pass:
Code:
[21:32:31] billsb@sideshow:/share/es-ops/scripts/BUILD_FARM $ bash -x ./test.sh
+ OUTPUT_DIR=/share/es-ops/Build_Farm_Reports/WorkSpace_Reports
+ BASE=/export/ws
++ date +%m-%d-%y
+ TODAY=09-30-14
++ hostname
+ HOSTNAME=sideshow
+ case "$HOSTNAME" in
+ WORKSPACES3=("bob_avail" "bob_used" "mel_avail" "mel_used" "sideshow-ws2_avail" "sideshow-ws2_used")
+ '[' -f test.csv ']'
++ hostname
+ '[' sideshow == sideshow ']'
+ echo sideshow
+ separator=,
+ for v in '"${WORKSPACES3[@]}"'
+ echo -n ,bob_avail
+ for v in '"${WORKSPACES3[@]}"'
+ echo -n ,bob_used
+ for v in '"${WORKSPACES3[@]}"'
+ echo -n ,mel_avail
+ for v in '"${WORKSPACES3[@]}"'
+ echo -n ,mel_used
+ for v in '"${WORKSPACES3[@]}"'
+ echo -n ,sideshow-ws2_avail
+ for v in '"${WORKSPACES3[@]}"'
+ echo -n ,sideshow-ws2_used
+ echo
+ case "$HOSTNAME" in
+ WORKSPACES4=("bob" "mel" "sideshow-ws2")
+ df -m /export/ws/bob /export/ws/mel /export/ws/sideshow-ws2
+ awk '
BEGIN  { "date +%m-%d-%y" | getline date;
                         printf "%s",date }
        NR > 1 && NF { printf ",%s,%s", $2, $3; }
        END    { printf "\n"}'

[21:32:35] billsb@sideshow:/share/es-ops/scripts/BUILD_FARM $ cat test.csv
sideshow
,bob_avail,bob_used,mel_avail,mel_used,sideshow-ws2_avail,sideshow-ws2_used
09-30-14,1032124,531937,1032124,722301,1032108,151337
Good so far, and now the second pass:
Code:
[21:32:40] billsb@sideshow:/share/es-ops/scripts/BUILD_FARM $ bash -x ./test.sh
+ OUTPUT_DIR=/share/es-ops/Build_Farm_Reports/WorkSpace_Reports
+ BASE=/export/ws
++ date +%m-%d-%y
+ TODAY=09-30-14
++ hostname
+ HOSTNAME=sideshow
+ case "$HOSTNAME" in
+ WORKSPACES3=("bob_avail" "bob_used" "mel_avail" "mel_used" "sideshow-ws2_avail" "sideshow-ws2_used")
+ '[' -f test.csv ']'
+ '[' -f test.csv ']'
++ hostname
+ '[' sideshow == sideshow ']'
+ case "$HOSTNAME" in
+ WORKSPACES5=("bob" "mel" "sideshow-ws2")
+ df -m /export/ws/bob /export/ws/mel /export/ws/sideshow-ws2
+ awk '
BEGIN  { "date +%m-%d-%y" | getline date;
                         printf "%s",date }
        NR > 1 && NF { printf ",%s,%s", $2, $3; }
        END    { printf "\n"}'

[21:33:59] billsb@sideshow:/share/es-ops/scripts/BUILD_FARM $ cat test.csv
sideshow
,bob_avail,bob_used,mel_avail,mel_used,sideshow-ws2_avail,sideshow-ws2_used
09-30-14,1032124,531937,1032124,722301,1032108,151337
09-30-14,1032124,531937,1032124,722301,1032108,151337
Thanks so much, this is SOLVED!
 
Old 10-01-2014, 12:48 AM   #10
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,838

Rep: Reputation: 7308
I would not put that date call into awk, instead:
Code:
awk -v A="$(date +%m-%d-%y)" ' BEGIN { printf A } '
looks better at least for me.

To select only the valid lines, you can also do the following:
Code:
df -m | awk '/% \// { printf ",%s,%s", $2, $3; }'
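Put together, the two ideas might look like the sketch below. The sample text is invented so that the result is reproducible, and the variable names today and row are only illustrative. One small change from the one-liner above: the date is passed as an argument to a literal "%s" format rather than used as the format string itself, which avoids surprises if the value ever contains a % character.

```shell
today=$(date +%m-%d-%y)

# Invented df-like sample: header, an empty line, one data row
sample='Filesystem     1M-blocks   Used Available Use% Mounted on

/dev/sdb1        1140818  56858   1026010   6% /export/ws/bart'

row=$(printf '%s\n' "$sample" |
    awk -v today="$today" '
        BEGIN  { printf "%s", today }        # literal format, date as argument
        /% \// { printf ",%s,%s", $2, $3 }   # only rows containing "N% /mountpoint"
        END    { printf "\n" }')
printf '%s\n' "$row"
```

The header line contains "Use% Mounted", which does not match the "% /" pattern, so neither it nor the empty line contributes any output.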
 
Old 10-01-2014, 02:37 AM   #11
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
Is there something wrong with awk's date abilities?
Code:
awk 'BEGIN{print strftime("%m-%d-%Y")}'
 
Old 10-01-2014, 09:53 AM   #12
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,780

Rep: Reputation: 2081
Quote:
Originally Posted by grail View Post
Is there something wrong with awk's date abilities?
Portability:

Quote:
9.1.5 Time Functions

gawk provides the following functions for working with timestamps. They are gawk extensions; they are not specified in the POSIX standard. However, recent versions of mawk (see Other Versions) also support these functions.
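One way to hedge against that, sketched under the assumption that an awk without strftime either fails or prints nothing: try awk first and fall back to date(1). Note that the strftime example earlier in the thread uses %Y (four-digit year), while the original script uses %y (two-digit year); the sketch keeps %y so the CSV format stays the same.

```shell
# Try awk's strftime; stderr is discarded because an awk lacking it will complain
stamp=$(awk 'BEGIN { print strftime("%m-%d-%y") }' 2>/dev/null)

# Fall back to date(1) if awk produced nothing
[ -n "$stamp" ] || stamp=$(date +%m-%d-%y)

printf '%s\n' "$stamp"
```

Since date is needed as the fallback anyway, calling it once in the shell and handing the result to awk with -v is arguably the simpler choice for a script that must run under both gawk and an old mawk.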
 
Old 10-01-2014, 03:22 PM   #13
master-of-puppets
Member
 
Registered: Jun 2011
Posts: 49

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by pan64 View Post
I would not put that date call into awk, instead:
Code:
awk -v A="$(date +%m-%d-%y)" ' BEGIN { printf A } '
looks better at least for me.

To select only the valid lines, you can also do the following:
Code:
df -m | awk '/% \// { printf ",%s,%s", $2, $3; }'
I'll try it when I get some time, thanks. Plus I'm taking an awk tutorial to save you guys headaches.

---------- Post added 10-01-14 at 03:23 PM ----------

Quote:
Originally Posted by grail View Post
Is there something wrong with awk's date abilities?
Code:
awk 'BEGIN{print strftime("%m-%d-%Y")}'
Thanks, I'll give it a try when I get the chance.
 
Old 10-01-2014, 03:23 PM   #14
master-of-puppets
Member
 
Registered: Jun 2011
Posts: 49

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by ntubski View Post
Portability:
Awesome information, thank you very much.
 
Old 10-01-2014, 03:25 PM   #15
master-of-puppets
Member
 
Registered: Jun 2011
Posts: 49

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by grail View Post
So a few things:

1. Others have already advised to look at your df output

2. The sed command is not needed with the original awk script, because its NR > 1 pattern already means "process every line except the first"

3. That obviously does not work on squeeze, because there the data starts on the third line. So I see your options as:

a. See point 1

b. NR > 2

c. NR > 1 && NF


You are going to have to learn more about awk if you are going to continue to use it. May I suggest looking at the manual
How do I give you credit for solving the issue? I looked around for a "Solved" button but couldn't find one.
 
  

