LinuxQuestions.org
Old 04-19-2012, 07:53 AM   #16
Linux_Kidd
Member
 
Registered: Jan 2006
Location: USA
Posts: 737

Original Poster
Rep: Reputation: 78

Quote:
Originally Posted by grail View Post
You don't happen to work in a bank by chance? I had a very similar experience where a process could only be run over the weekend as it took almost 40hrs to complete a single run. After I had been there 3 months we were able to run it adhoc whenever we liked as it took about 3 minutes

not a bank. i am a security consultant; my client is a city retirement system. i have discovered lots of processes that are human-heavy and should be automated. in this case the task was being done by the Infosec group: getting audit reports for the mainframe. the needed data was being manually pulled out of the reports (txt files), then copied into an excel sheet which is later imported into an access db.

so, i have two versions of my script. the 1st produces some wacky output in the "last" part: it seems to repeat the same data, as if "last" doesn't get updated on the next line read. the input doesn't have more than 12 fields, so in the 2nd script i accommodate up to 13, but the beauty of the "for" loop is that i wouldn't need to worry about how many fields there are.

bad
Code:
#!/bin/bash
awk '
BEGIN {
        OFS="|";
}
{
        if ( NF == 0 || $1 ~ /^(TOP|-|+|=|0$|1\/|\/\/|PASSWORD|1E)/ ) {}
        else {
                # "last" is never cleared here, so it keeps growing from one record to the next
                for (i=6; i<=NF; i++) {
                        last = last FS $i;
                }
                print $1,$2,$3,$4,$5,last;
        }
} ' | sed 's/^0\(.*\)/\1/'
good
Code:
#!/bin/bash
awk '
BEGIN {
        OFS="|";
}
{
        if ( NF == 0 || $1 ~ /^(TOP|-|+|=|0$|1\/|\/\/|PASSWORD|1E)/ ) {}
        else {
                last = $6 FS $7 FS $8 FS $9 FS $10 FS $11 FS $12 FS $13;
                print $1,$2,$3,$4,$5,last;
        }
} ' | sed 's/^0\(.*\)/\1/'
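The repeated data in the loop version comes from awk variables persisting across input records: "last" is appended to on every line but never cleared. A minimal sketch of a loop version that rebuilds "last" for each record is below. The function name join_tail is made up for illustration, and the single-character alternatives in the regex are folded into a bracket expression, [-+=], which is assumed to match the same lines. Unlike the fixed-width version, it leaves no trailing spaces when a line has fewer than 13 fields.

```shell
#!/bin/bash
# Sketch only: join_tail is a hypothetical name, not part of the original script.
join_tail() {
    awk '
    BEGIN { OFS = "|" }
    # Skip empty lines and header/separator lines.
    NF == 0 || $1 ~ /^(TOP|[-+=]|0$|1\/|\/\/|PASSWORD|1E)/ { next }
    {
        last = $6                      # start fresh for this record
        for (i = 7; i <= NF; i++)
            last = last FS $i          # append the remaining fields, space-separated
        print $1, $2, $3, $4, $5, last
    }' | sed 's/^0\(.*\)/\1/'
}
```

It reads standard input like the two scripts above, so it could be called as join_tail < report.txt, where report.txt stands in for one of the audit reports.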
 
Old 04-19-2012, 08:37 AM   #17
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
Did you look at the options I suggested around the fifth field? Also, the single-character items in your regex could just go in square brackets, i.e. [-+=].

Also, I notice you call the awk from within a bash script, unless there is more being done by the script you can just as easily make it an awk script, interpreter is simply:
Code:
#!/usr/bin/awk -f
Of course alter if your path to awk is elsewhere.
 
Old 04-19-2012, 09:08 AM   #18
Linux_Kidd
Member
 
Registered: Jan 2006
Location: USA
Posts: 737

Original Poster
Rep: Reputation: 78
yes, there will be more bash stuff in there. i will eventually pipe sed out to a file "$HOME/out.$2.txt" ,etc.
i am not sure what you mean by unique 5th field. all the fields can be unique.
 
Old 04-19-2012, 09:34 AM   #19
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
Let me see if an example helps. I will only use 5 items total but third is the one to watch for:

The following two examples can be dealt with:
Code:
one two three four two
one two three four three
So the first line shows that the third field is unique from all others and the second shows it to be the first occurrence of the value so we could use:
Code:
awk '{print gensub(".*"$3" ","","1")}' file
So this could accommodate the last variable.

Where this does not work is something like:
Code:
three two three four five
Hope that helps explain what I was meaning.
 
Old 04-19-2012, 10:20 AM   #20
Linux_Kidd
Member
 
Registered: Jan 2006
Location: USA
Posts: 737

Original Poster
Rep: Reputation: 78
hmmm, i cannot verify that the field would be unique. in this case i really only care about the "fields" and not what's in them, etc.

btw, my gawk is gnu v3.1.5
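When only the field positions matter, one option is to cut the first five fields off the front of the line by position instead of matching on a field's value. The sketch below is an illustration under that assumption, not something posted in the thread: split_tail is a made-up name, and it sticks to plain sub() calls (no gensub, no interval expressions), so the gawk version should not matter. It also keeps the original spacing inside the tail.

```shell
#!/bin/bash
# Sketch only: split_tail is a hypothetical name used for illustration.
split_tail() {
    awk '
    BEGIN { OFS = "|" }
    {
        rest = $0
        for (i = 1; i <= 5; i++)
            sub(/^[ \t]*[^ \t]+/, "", rest)   # drop one leading field
        sub(/^[ \t]+/, "", rest)              # drop the separator before field 6
        print $1, $2, $3, $4, $5, rest
    }'
}
```

Because nothing is matched against field contents, a line such as "three two three four five six" is handled the same way as any other.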

this has been a quick learning exercise, and a quick analysis on cost savings looks like this. thnx all.

(salary #’s are for example only, the other #’s are real)

Code:
John Doe salary
80k/yr
$38.46/hr = $0.0107/sec

181 txt files (9 months worth) processed manually = estimated 40 man hrs = $1538.40/9mo = $170.93 per one months worth of data processing.

The script processed the same 181 files in 6sec = $0.0642/9mo = $0.007133 per one months worth of data processing.

However, we need to account for the time spent developing/testing the script in the total cost evaluation/analysis.

We will estimate the person who can script has hourly rate 1.25x that of the person processing the data, so that’s 80k * 1.25 = 100k/yr = $48.08/hr

A guru scriptor can develop/test this script probably in under 2hrs, but for me I took longer, 4hrs. So development/testing cost is $192.32


So, yearly analysis:

•	Manually processing the files = $170.93 * 12 = $2051.16/yr
•	Automated script = $192.32 + ($0.007133 * 12) = $192.41/yr for the 1st year, years 2+ = $0.086/yr
•	that's a savings of about 90.6% in year 1 (manual costs roughly 10.7x as much), and a savings of almost 100%, about $2051.07/yr, for years 2+

Conclusion - automate where possible, it saves lots of $$.
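The arithmetic above can be cross-checked with a few lines of awk. This is a sketch under stated assumptions: cost_report is a made-up name, and 2080 working hours per year (40 hr/week * 52 weeks) is assumed because that is what $38.46/hr for 80k/yr implies. It uses unrounded rates, so the totals differ by a few cents from the hand-rounded figures, and it reports the year-1 saving as a fraction of the manual cost.

```shell
#!/bin/bash
# Sketch only: cost_report is a hypothetical name; salaries are the example figures.
cost_report() {
    awk 'BEGIN {
        hours_per_year = 2080                      # assumption: 40 hr/week * 52 weeks
        manual_rate = 80000 / hours_per_year       # $/hr for the person doing it by hand
        script_rate = 100000 / hours_per_year      # $/hr for the script writer (1.25x)

        manual_year = 40 * manual_rate / 9 * 12           # 40 man hrs per 9 months
        run_year    = 6 * manual_rate / 3600 / 9 * 12     # 6 seconds per 9 months
        dev_cost    = 4 * script_rate                      # 4 hrs of development/testing
        auto_year1  = dev_cost + run_year

        printf "manual per year:     $%.2f\n", manual_year
        printf "automated, year 1:   $%.2f\n", auto_year1
        printf "automated, years 2+: $%.3f\n", run_year
        printf "year-1 saving:       %.1f%%\n", 100 * (manual_year - auto_year1) / manual_year
    }'
}
```

Calling cost_report prints the four figures; changing the salaries or hours in the BEGIN block is enough to rerun the comparison for another task.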

Last edited by Linux_Kidd; 04-19-2012 at 10:24 AM.
 
  

