To extract certain columns with only header
Hi,
I have data which has many columns but i need to extract only 5 columns. three columns are direct extract but for two columns logic needs to be applied before getting output. I have pasted sample content of the files. 1. I need to pick only those record under MOVEMENT CODE 26 which is in header section of each record other than MOVEMENT CODE 26 should not come in output. 2. only 2 digits needs to display from DOC NO. 3.To display DDMM colmmn also but with year in file records only date and month i.e 05 date and month 12 but year also needs be appended. year logic is to read from the file name. file name = R950CMA_01JAN15 so data is for the past month means dec14 so year is 14 in a same way go on.. so total 5 column needs to be extracted ( ORD NO,P A R T N U M B E R,DDMM,INV NO / SER NO,DOC NO). Input file content: Code:
1 OUTPUT: Code:
P A R T N U M B E R ORD NO DOC NO DDMM PGC / SER NO Appreciate your assistance by AWK. |
Hi azheruddin,
If your preference is to do this by awk, then you should have already made some attempt to do this. Please post what you have so far for others to review and offer their suggestions towards. As you know from your 6 years with LinuxQuestions.org, LQ is volunteer members who are here to help you learn the information you need to accomplish your goals versus an on-demand script provider. |
awk has some built-in variables like FNR and NR which can be used to change what the script does. So before a certain line number, you can print one set of data, after that you can print the five columns you want.
|
Hi,
awk ' /^1/ {SKIP = NR + 5 + HDFND } NR < SKIP {next } !HDFND {MX = split (COLUMNS, HD, ",") for (i=1; i<=MX; i++) {match ($0, HD[i] " *") P[i] = RSTART L[i] = RLENGTH } HDFND = 2 } {for (i=1; i<=MX; i++) printf "%s ", substr ($0, P[i], L[i]) printf RS } ' COLUMNS=" ORD NO, P A R T N U M B E R,INV NO / SER NO" file Got output as below. Code:
ORD NO P A R T N U M B E R PGC / SER NO 1. records should be picked only for the movement code 26 as mentioned in top of every record header ,can be hard coded. 2. addition column only 2 digits needs to display from DOC NO so as per below record 39,36,11,11 3. To display DDMM colmmn also but with year in file records only date and month i.e 05 date and month 12 but year also needs be appended. year logic is to read from the file name. file name = R950CMA_01JAN15 so data is for the past month means dec14 so year is 14 in a same way go on.. expected output: Code:
P A R T N U M B E R ORD NO DOC NO DDMM PGC / SER NO |
Have to say, the last post made me laugh. You have gone to extraordinary lengths to put all your data in code tags, but when it came to your actual code you left it twisting in the breeze so
there is zero formatting :lol: Looking at you further requirements: 1. Check line contains the words :- /MOVEMENT CODE 26/, if not found, continue until the next blank line and start searching again 2. Look at substr function to get just the digits required 3. You seem to tell us that a file is of the format R950CMA_01JAN15, but in your code example it is simply called "file". I will assume this was a typo and advise you can either use the date functions provided or design your own function to perhaps use an array to retrieve the previous month and year |
Quote:
Quote:
I do realize it is just example data for the question, however the discontinuity confused me as to what was being sought. Still feel that using awk to process a line and then print out the fields from the line in the sequence you desire is more desirable of a solution. From something I googled: Code:
NR==1 |
[duplicate due to keybounce]
|
Ok. I see the pattern now.
One way is to use a pair of flags and toggle them on and off, only printing when the both are set. The sets are separated by empty lines (or at least lines with only whitespace). You're in the right set when you have "MOVEMENT CODE 26" or whatever. The data starts after the line with many dashes. When you are in the right set and on a data line, print. Otherwise go to the next line of input. Maybe that's the 7th-grade approach but it should work with awk and can be don with a lot of nexts and ifs. However, the field KEYWORD will cause awk to choke with data like "DMNMDB PE" and "D DJKND" because of the spaces. Are those typos? Or are the main fields separated by tabs, I hope? |
Quote:
Code:
awk '/MOVEMENT.*26/ {getline;getline;getline;getline;while (NF==10) {print $2,$4,$5,$6,$7;getline}}' <Input file> You can add a header to the output by adding a print command in a BEGIN rule. Code:
BEGIN {OFS="\t";print "PART NUMBER","ORD NO","DOC NO","DDMM","PGC/SER NO"} |
I agree with rtmistler that the data needs to be clarified, ie how is it separated? and things like the 'JAN' field, will these ever contain data?
You may opt to use something like ruby/perl/python and then use the unpack function to set fixed widths for each field |
Hello Guys,
sorry for the delayed feedback. lot of queries raised, here are the answers Quote:
filename always as would be in format R950CMA_01JAN15 accordingly -1 month need to manipulated, in sample it was just given as a sample Quote:
Quote:
Quote:
Now somebody can help? |
awk has a variable called FILENAME, so you can use that to get the data wanted. As previously advised you will need to take said data and manipulate it with either awk related functions or create your own.
Also the requirement for only part of one of the fields is done using substr. As you have advised that bogus spaces will not be in real data, here is a quick knock up to get the data from current input: Code:
#!/usr/bin/awk -f |
Hi ,
getting some issue to merge code with mine, can anyone send me the code to complete the complete requirement on top of my code or new one. |
Ultimately you are asking if someone will do the job for you, the short answer is no.
I am not following why it is so complicated. If using my example code, run it, check the output and slowly add in the pieces that are missing that you want. You will never get any more proficient at this if someone else is doing the work for you and you will simply end up back here next time asking someone to do your next task. At least present what you have done and advise where you are stuck??? |
Quote:
http://www.linuxquestions.org/questi...141/page2.html http://www.linuxquestions.org/questi...rn-4175516737/ http://www.linuxquestions.org/questi...on-4175438073/ http://www.linuxquestions.org/questi...pt-4175440649/ http://www.linuxquestions.org/questi...pt-4175444590/ http://www.linuxquestions.org/questi...at-4175448079/ http://www.linuxquestions.org/questi...ng-4175509609/ After being here for six years now, and essentially being told many times that you have to write your own scripts, I'm with Grail here...show us your efforts FIRST, and tell us where you're stuck. |
All times are GMT -5. The time now is 02:39 PM. |