ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Notices
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
My company is currently receiving a production release from our customer that is very difficult to read (this week's release is 365 pages!) It contains 100+ item numbers, and fields that I want to pull out, seperated by spaces or tabs. Here is the format for each item:
EDIT - see formatted code in my post below...
The above repeats for each item number. What I want to do is to pull out the "Item Number" value towards the top left, along with the following:
- In Transit Qty (top right)
- Cum Received (top right)
- Cum Req Qty at a certain date row ("matrix" at bottom)
I'm trying to write a shell script to filter this information out for each part number, and echo to an output file for easier reading. Any suggestions would help a bunch. Thanks! ...Steve
You should put your example input between code tags [ code ]text goes here[ /code ]. You must not use the spaces between the square bracket and the word 'code'. Take a look here for all the possible code tag's and their meaning.
About your question:
What do you mean by this: - Cum Req Qty at a certain date row ("matrix" at bottom)?
Do you need to see this for all dates present or a specific date (if so, how's that determined?)?
Could you give a rough example of the output you would like (if at all possible based on part of a 'real' infile)? I think it's something like this:
Code:
Item Number: XXXXXXX-XXX0
Transit Qty : 1,638.0
Cum Received : 47,502.0
Cum Req Qty : 1,234.0
tem Number: XXXXXXX-XXX1
Transit Qty : 2,336.0
Cum Received : 51,401.0
Cum Req Qty : 671.0
I won't promise a working version, but I will take a good look at it
That's great feedback, thanks for working with me! The output I would like to see is exactly what you did. For the "Cum Req Qty", I need the value at a date (no addition or anything). I'm good with just making this date a variable I can prompt for and read in to the script. Here is the code again:
Code:
SUPPLIER SCHEDULE / MATERIAL RELEASE
Supplier: XXXXXX Ship-To: XXXX
XXXXXXX XXXXXXXX X.X.X.X. X XX XX XX XX
XXX X XXXXXXXXX, XXX XXX (XXXXXX XXXXX XXX XXXXX)
XXXXXX XXXXXXXXX # X XXX. XXXXXX
XXXXXXX, XX XXXXX XXXXXX XXXXXXXXXX XXXXX
UNITED STATES OF AMERICA XXXXX XXXXXX, XXXX XXXXX
XXXXXX
Attention: Attention:
Telephone: Telephone:
Fax/Telex: Fax/Telex:
Release ID: 20061106-001 Release Date: 11/06/06
Purchase Order: XXXXXXXX Buyer: SAPPO
Item Number: XXXXXXX-XXXX UM: EA In Transit Qty: 1,638.0
XXX,XXX XXX XXXXX Receipt Date: 11/03/06 23:53
Receipt Quantity: 1,764.0
Supplier Item: Cum Received: 47,502.0
Contact:
Packing Slip/Shipper: 140611
Ship/Delv Pattern: 12
Item Ship Ship
Int Rev Date Time Reference Q Req Qty Cum Req Qty Net Req Qty
--- ---- -------- ----- ---------- - ------------ ------------- ------------
Prior 47,502.0 0.0
W 10/30/06 P 504.0 48,006.0 504.0
11/06/06 P 3,150.0 51,156.0 3,150.0
11/13/06 P 0.0 51,156.0 0.0
11/20/06 P 1,764.0 52,920.0 1,764.0
11/27/06 P 1,890.0 54,810.0 1,890.0
12/04/06 P 1,260.0 56,070.0 1,260.0
12/11/06 P 1,134.0 57,204.0 1,134.0
12/18/06 P 2,142.0 59,346.0 2,142.0
M 12/25/06 P 0.0 59,346.0 0.0
01/29/07 P 0.0 59,346.0 0.0
02/26/07 P 0.0 59,346.0 0.0
03/26/07 P 0.0 59,346.0 0.0
Fab Authorization Cum Qty: 51,156.0 Thru: 11/19/06
Raw Authorization Cum Qty: 52,920.0 Thru: 12/03/06
#!/bin/bash
#
# filter.sh - show relevant information
#
# Usage: filter.sh <date> <infile>
# date => MM/DD/YY
case $# in
2);;
*|1) echo "Usage: $0 <date> <infile>";exit;;
esac;
thisDate=$1
inFile=$2
awk -v thisDate=${thisDate} '
/ Item Number: / {
print ""
print "Item Number : " $3
print " Transit Qty : " $9
}
/ Cum Received: / { print " Cum Received : " $5 }
$1 == thisDate { print " Cum Req Qty : " $4 }
' $inFile
A sample run with your input (times 3 to mimick 'huge' file):
Code:
$ ./filter.sh 12/18/06 infile
Item Number : XXXXXXX-XXXX
Transit Qty : 1,638.0
Cum Received : 47,502.0
Cum Req Qty : 59,346.0
Item Number : XXXXXXX-XXXX
Transit Qty : 1,638.0
Cum Received : 47,502.0
Cum Req Qty : 59,346.0
Item Number : XXXXXXX-XXXX
Transit Qty : 1,638.0
Cum Received : 47,502.0
Cum Req Qty : 59,346.0
Holy crap, that's exactly what I was looking for :-) Let me play with it some more, but what a huge help for us! I'll probably insert some math to calculate required shipping numbers for each week, but that's easy enough. Thanks again! Do you do any commercial contract work for bigger jobs? I'm not a computer guy, but I'm pitching Linux at my company.
Holy crap, that's exactly what I was looking for :-) Let me play with it some more, but what a huge help for us! I'll probably insert some math to calculate required shipping numbers for each week, but that's easy enough. Thanks again! Do you do any commercial contract work for bigger jobs? I'm not a computer guy, but I'm pitching Linux at my company.
You might want to learn 0a little about textflow-oriented a scripting language such as perl, awk, or even sed. It comes in handy in a lot of situations (even if your not a computer guy) and it would probably make your pitch for linux more qualified-sounding to a manager
Holy crap, that's exactly what I was looking for :-) Let me play with it some more, but what a huge help for us! I'll probably insert some math to calculate required shipping numbers for each week, but that's easy enough. Thanks again!
Glad to be of service
The script isn't that complicated (from my point of view), if you have some AWK experience it shouldn't be too hard to edit/add to/change the above code. And I guess you'll ask if you cannot get the changes to work the way you want them to.
Osor gave some good advice: Do learn the basics of sed/awk, it will make your computer life a lot easier. I don't know about the perl part. If you occasionally have to write a script/tool learning perl could be a bit much. On the other hand: Perl is a lot more powerfull then sed and awk combined.
Quote:
Do you do any commercial contract work for bigger jobs? I'm not a computer guy, but I'm pitching Linux at my company.
I'm an Unix System Admin by trade and have been playing with computers since late 1980. Helping people, whenever possible, on this forum is just for fun and relaxation. And to answer your question: Nope, I'm not for hire
I do have a case where the "Cum Req Qty" does not work correctly. I think the issue is when there is a character in front of the ship date (like the "4" in this case, or sometimes there is a "W" or "M" as the first string)...
Code:
Item Ship Ship
Int Rev Date Time Reference Q Req Qty Cum Req Qty Net Req Qty
--- ---- -------- ----- ---------- - ------------ ------------- ------------
Prior 1,383,750.0 0.0
W 4 10/30/06 P 49,500.0 1,433,250.0 49,500.0
4 11/06/06 P 141,750.0 1,575,000.0 141,750.0
4 11/13/06 P 0.0 1,575,000.0 0.0
Is there an easy way to make it work whether this string is the 4th, 5th, or 6th string in the line?
Thanks for being patient with me. I've been a Slackware user for years at home, but haven't done anything like this. I'm actually not even a computer guy, but I love it. Thanks! ...Steve
In general awk reads one input line at the time. Taking my script as an example, every line is checked against the 3 'triggers':
1) / Item Number: / => a regular expression that looks for a space followed by Item Number: and a space. If that is the case 3 things happen: print an empty line (print ""), print Item Number :and the third field, print Transit Qty : and the ninth field.
2) / Cum Received: / => look for Cum Received (again with a space in front and behind). If a hit is made print Cum Received : and field 5.
3) $1 == thisDate => If field number one equals the datestring given it will print Cum Req Qty : and the fourth field.
If a line does not meet any of the above criteria, it is discarded (= not printed to screen).
If 'words' on a line are separated by spaces/tabs, awk sees this 'word' as a field. The following is seen as a line with three fields:
foo bar whatever
foo can be represented as $1, bar as $2 and whatever as $3.
The so called field separator can be set to be just about anything you like, by default it is a space/tab. This for example will print all users and the uid in the /etc/passwd file: awk -F: '{ print $1, $3 }' /etc/passwd. The field separator is changed to : (the -F: part).
About the '4' in you second example, if it is irregular (I did not see it in the other infile example) it could render my third rule 'useless'. If I'm honest, the lines with W or M in front are also discarded in the above script But this can be solved (and maybe the 4/number problem):
Add these 2 lines to the script (bold are the new lines):
/ Cum Received: / { print " Cum Received : " $5 } $1 ~ /[0-9MW]/ && $2 == thisDate { print " Cum Req Qty : " $5 }
$1 ~ /[0-9MW]/ && $3 == thisDate { print " Cum Req Qty : " $6 }
$1 == thisDate { print " Cum Req Qty : " $4 }
This will take care of the W, M and/or 4 (actually: 0-9).
And finally. You can redirect the output to a file. Easiest way:
./filter.sh 11/06/06 infile > /path/to/outfile
I hope this cleared things up a bit, if not...... Just ask!
Your suggestion for the irregularities works perfectly as I assumed :-) Plus, I now understand everything in the script!
I have modified what we have to output very close to what I'm looking for. Only 2 things now that I'm stuck on...
- I know how to format the output per "trigger", but I can't find how to format between "triggers". What I want to do is to make the output all on the same row, with a space separation. I can do this easily with the first 1 fields, but I want to keep the other fields on the same line as well. This will help us import into a spreadsheet and use through the whole plant.
- I haven't been able to get a mathematical formula to work within the awk command. What I want to do is to create a new variable that will equal "Cum Req Qty" minus "Cum Received" minus "In Transit Qty" and output for each "Item Number".
I found the answer to my first question in the man page... my solution is to re-define the ORS separator. Works perfectly!
With some further research, I'm thinking the reason why my equations do not work is because awk is seeing strings, not numbers? The man page says to convert, "add 0 to it", but I'm not sure how I'm supposed to do that?
I'm having fun now... next thing you know, the only Linux box in the whole company is going to be busy doing everyone's busy-work all day!
About your first question: Changing the ORS is one way of solving the output problem you have/had. Another would be using printf instead of print, which is a more powerful and elegant solution. Elegance isn't always important, sometimes the script just has to do what you want it to do, no matter what
About the arithmetic: variables with a comma in them are seen as string, not a decimal integer. Placing a 0 (zero) in front will not help in this situation. You need to rewrite some parts of the script and add some extra lines. I came up with the following:
$ ./filter.sh 10/30/06 infile
Item Number Transit Qty Cum Received Cum Req Qty Calculated
XXXXXXX-XXX1 2638.0 47502.0 52920.0 2780.0
XXXXXXX-XXX2 638.0 502.0 433250.0 432110.0
XXXXXXX-XXX3 1638.0 47502.0 1575000.0 1525860.0
As you can see I introduced some 'new' awk stuff
I'll let you try to figure it all out, I know you will ask if something is unclear.
Before you start looking for printf in the awk manuals: Check man printf first. printf is used in different languages (C, perl, awk to name just 3).
I thought I mentioned it already, but I haven't: If you like a paper reference, take a look at the O'Reilly's Sed & Awk book.
Apparently Slackware isn't packaged with nawk, so I just combined some of my solutions with yours, and I can now do my math, and output exactly as I wanted. Thanks! With your guidance, I have this script now finding and calculating current numbers, along with forecasts directly into a comma-separated file my co-workers can open into Excel.
Not to keep this thread going forever, but... there's 1 more thing I can't find how to do. We have a production team that is responsible for each "Item Number". Is there an easy way to direct my awk outputs to a different file depending on what team is responsible for that item? I was trying to do this with an array, but it hasn't worked so far. Obviously I would need to produce some input that includes what "Item Number" goes with each team. Right now I am directing my outputs like this:
Sorry about the nawk/awk mix up. I was testing it at work on a SUN box, I won't bore you with details, but you need to use nawk instead of awk on a SUN box. My mistake, it will work with awk on a 'normal' linux box.
I'm not sure if I understand your question. I think you want to send (Item Number) XXXXXXX-XXXX to group X, XXXXXXX-YYYYY to group Y and XXXXXXX-ZZZZ to group Z (send to a specific group based on item number).
You will end up with the following (input file has 4 entries. 2 XXXX, 1 YYYY and 1 ZZZZ):
Code:
$ ./filter.sh 12/18/06 infile
$ ls -l *team*
-rw-r----- 1 druuna internet 128 Nov 11 17:17 x-team.outfile
-rw-r----- 1 druuna internet 64 Nov 11 17:17 y-team.outfile
-rw-r----- 1 druuna internet 64 Nov 11 17:17 z-team.outfile
$ cat x-team.outfile
XXXXXXX-XXXX 1638.0 47502.0 59346.0 10206.0
XXXXXXX-XXXX 1638.0 47502.0 59346.0 10206.0
$ cat y-team.outfile
XXXXXXX-YYYY 1638.0 47502.0 59346.0 10206.0
$ cat z-team.outfile
XXXXXXX-ZZZZ 1638.0 47502.0 59346.0 10206.0
This solution is only practical if there aren't too many different item numbers and/or too many new item numbers. This will also make it possible to send 2 or more different item numbers to the same team (see dummy XXXXX-AAAA entry).
Another solution would be to use the item numbers as filename. But this only works if you (your co-workers) know which item number is for which team.
I decided to not print the header line. If the output(file) is imported in excel, you don't need it (ok, I'm lazy as well, takes a lot of extra lines to put a header line in the file )
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.