Well, I played with it over the weekend and came up with more pieces (sorry guys, I'm really stumbling over how to do this in one awk script), then I put the pieces together in a bash script to run it all. Here are the pieces in the bash script:
Code:
#!/usr/bin/awk -f
Code:
split -d -a 3 -l 210 one_big_padded_file.dat
Code:
for f in x* ; do mv "$f" "file_$f" ; done
I'd rather do it with this awk command, and when I run this awk script from the command line after I run grail's, it all works:
Code:
awk '!(NR%210) {i++;} {print > "file_"i".dat";}' i=1 giant_padded_file.txt
I'm lost,
Tabby |
Well my first point would be that there is no need for 2 for loops as you can append to the start and end of the variable as you have done in awk.
To help with putting the awks together, look at the original awk and you will see how the file name was being created. The only difference now is that instead of changing the name every 6 rows (NR % 6) you are now going to change it at a different point. The gotcha is, it will not be changing every 210 rows, as the new file created with the second awk script (giant_padded_file.txt) has had additions. The math is fairly trivial though. Let me know how you get on. |
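A rough sketch of that math, using the figures given in this thread (6-row blocks, 15 rows of zeros added per block, 210 padded lines per output file). The variable names are only illustrative and the result is one reading of the hint, not grail's own working:

```shell
# Figures from this thread; the variable names are only illustrative.
rows_per_block=6        # original rows in one block
pad_rows=15             # rows of zeros added to each block
lines_per_file=210      # padded lines wanted in each output file

padded_rows_per_block=$((rows_per_block + pad_rows))          # 6 + 15 = 21
blocks_per_file=$((lines_per_file / padded_rows_per_block))   # 210 / 21 = 10
input_rows_per_file=$((blocks_per_file * rows_per_block))     # 10 * 6 = 60

echo "change the output file name every $input_rows_per_file input rows"
```

So a script that reads the unpadded input would switch file names on a multiple of 60 input rows rather than 210.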
...I have to take another step back
Say the input file has 240 lines. I want to break up the 240 lines every 6 lines, so now I have 40 blocks of data. I want to add to each block 15 columns of zeros and 15 lines of zeros, so now I have 21 lines per block and 40 blocks of data, which gives an output file with 840 lines. That's what I've been calling "one_giant_padded_file.txt". grail wrote the awk script that does it, and it works perfectly, many many thanks!
Now for my awk command line (which kinda works, sorta). I take the one giant file with 840 lines and run this command:
Code:
awk '!(NR%210) {i++;} {print > "file_"i".dat";}' i=1 one_giant_padded_file.txt
file_1.dat has 209 lines and the file is missing its last line.
grail, can you please help me? I've been trying to fix this since yesterday afternoon.
thank you,
Tabby |
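The 209-line symptom can be reproduced on a stand-in input. This is only a sketch: the numbers from seq stand in for the real padded file, and the parentheses around the output file name are an addition for portability across awk implementations, not part of the original command.

```shell
# Assumption: 840 numbered lines stand in for one_giant_padded_file.txt.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 840 > one_giant_padded_file.txt

# Same logic as the one-liner above: the counter is bumped BEFORE the
# 210th line is printed, so that line lands in the next file.
awk '!(NR%210) {i++} {print > ("file_" i ".dat")}' i=1 one_giant_padded_file.txt

first_file_lines=$(awk 'END {print NR}' file_1.dat)
second_file_first=$(head -n 1 file_2.dat)
fifth_file_lines=$(awk 'END {print NR}' file_5.dat)

echo "file_1.dat has $first_file_lines lines"
echo "file_2.dat starts with input line $second_file_first"
echo "file_5.dat has $fifth_file_lines line(s)"
```

Line 210 ends up as the first line of file_2.dat, and the very last input line spills into a fifth file.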
You need to think about order of execution.
Code:
!(NR%210) {i++;}
Once you have this ... you can then simply add this into the original script ;) |
grail's approach to answering questions is to provide a fish hook, pole, and line; I prefer to offer a little advice about how to use the fishing equipment and something about where the fish live...
So, a suggestion: See if your system responds to pinfo gawk or the older info gawk. Here's an UNTESTED modification of grail's program, with some added comments.
Code:
#!/usr/bin/gawk -f |
grail
I got it, I got it, wohoooo :) :) :)
Code:
awk 'NR%210==1 {"file_"i".dat";i++;} {print > "file_"i".dat"}' i=0 giant_input.file
If there is something that I should change to stop any errors/bugs that I don't know about, please tell me.
PTrenholme, I'll give yours a look over too...
excited/happy
Tabby |
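One small observation on that command: the bare "file_"i".dat" inside the first action is an expression whose value is thrown away, so it can be dropped without changing the result. A sketch of the trimmed form, run on a stand-in 840-line input (the seq data and the parentheses around the file name are assumptions added here):

```shell
# Assumption: 840 numbered lines stand in for the real padded input.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 840 > giant_input.file

# Bump the counter on the FIRST line of every 210-line group
# (lines 1, 211, 421, 631), then print the line to the current file.
awk 'NR%210==1 {i++} {print > ("file_" i ".dat")}' i=0 giant_input.file

file_count=$(ls file_*.dat | wc -l)
for n in 1 2 3 4; do
    echo "file_$n.dat: $(awk 'END {print NR}' "file_$n.dat") lines"
done
```

Starting at i=0 matters here, because the counter is incremented on line 1 before anything is printed.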
well guys I think that does it. thanks sooooo much for all your help!
Tabby |
I agree with PTrenholme's analogy that I provide direction as opposed to answers, but generally only to those that seem to be following :)
Glad you found a solution. Now that you have one, here is what I would look at:
1. Your final solution works, which is cool, but what I was pointing at in my last advice was that by simply changing the position of the increment you would achieve the same effect:
Code:
awk '{print > "file_"i".dat"}!(NR%210) {i++}' i=1 one_giant_padded_file.txt
Code:
#!/usr/bin/awk -f |
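To see why the position of the increment matters, it can help to print which file each line would go to around the 210-line boundary. The sketch below is only an illustration on stand-in data; it prints the target name instead of writing any files.

```shell
# Assumption: 840 numbered lines stand in for one_giant_padded_file.txt.
input=$(mktemp)
seq 1 840 > "$input"

# Increment first, then report: line 210 is already counted as file 2.
increment_first=$(awk '!(NR%210) {i++}
    NR>=209 && NR<=211 {print "line " NR " -> file_" i ".dat"}' i=1 "$input")

# Report first, then increment: line 210 still belongs to file 1.
print_first=$(awk 'NR>=209 && NR<=211 {print "line " NR " -> file_" i ".dat"}
    !(NR%210) {i++}' i=1 "$input")

echo "increment before print:"; echo "$increment_first"
echo "print before increment:"; echo "$print_first"
```

awk runs its pattern-action pairs top to bottom for every input line, so whichever rule comes first sees the counter before or after the bump.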
good morning Grail, you get up too early, even I'm up too early today :)
I did try to follow your direction of moving the iterator, but...
Code:
warning these 2 awk commands do not work
What I couldn't think past was not having the separating statement
Code:
!(NR%210)
The "combined script" is definitely beyond my coding ability. I've never had a class in any sort of programming, so I'm learning awk, sed, and bash writing on my own, because a lot of what I do is re-formatting and re-configuring files and directories to run in already existing programs. I do write pseudocode to organize my thoughts, but putting it into real code is much tougher for me. I can understand what for and while loops do, but I'll pull my hair out trying to write one, and having 3 print statements in that combined script, no way would I have got. So I'll keep working at it, and thank you so much for your help!
Thanks so much,
Tabby
Please read my PM to you. Ahhhh, I haven't figured out how to do that, is there a link somewhere? |
I found out I can't send a PM, so out in the open: the people I work with asked me to change my username here, so I did. My new username here is "tabbyagirl"
|
Well I would need more information on any error messages to help with them.
Looking at the 2 lines you have in post #24, neither would work well for what you want, but I shall try to explain:
Code:
awk '!(NR%210) {print > "file_"i".dat";} {i++;} ' i=1 one_giant_padded_file.txt
1. The 'i' variable is now going to increase for every line read in the file, i.e. by the end of the script it will be 841
2. As you now have the condition '!(NR%210)' prior to your print command, it will only print every 210th line, i.e. only 4 single lines, one per file, will be printed
Code:
awk '!(NR%210) {print > "file_"i".dat" i++;} ' i=1 one_giant_padded_file.txt
If you look at my example:
Code:
awk '{print > "file_"i".dat"}!(NR%210) {i++}' i=1 one_giant_padded_file.txt
!(NR%210) {i++} - This tells awk that when NR is evenly divisible by 210, the variable 'i' will be increased by 1, hence our file of 840 lines will force the variable to be increased 4 times
Note: As 'i' starts at 1 and is increased 4 times, its last value is 5, but that value is never used
Lastly, instead of comparing the new script from post #23 to the previous version in post #15, compare it instead to the one in post #7, as apart from a slight change in the BEGIN section the following is the only new line:
Code:
!(NR%60){ file_name = sprintf("file_%02d.dat",++cnt) } |
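As a hedged illustration of the sprintf naming idea only (this is not the combined script from the thread): the sketch below applies the same pattern to an already padded stand-in file, so it rotates every 210 lines instead of every 60. The BEGIN rule and the close() call are assumptions added for the sketch.

```shell
# Assumption: 840 numbered lines stand in for the padded file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 840 > stand_in_padded.txt

awk '
BEGIN { file_name = sprintf("file_%02d.dat", ++cnt) }  # first name is file_01.dat
{ print > file_name }                                    # write the current line
!(NR % 210) {                                            # after every 210th line:
    close(file_name)                                     #   release the finished file
    file_name = sprintf("file_%02d.dat", ++cnt)          #   and build the next name
}
' stand_in_padded.txt

ls file_*.dat
```

The %02d format zero-pads the counter, so the names sort in order (file_01.dat, file_02.dat, ...). A fifth name is built after line 840 but nothing is ever written to it, so no fifth file appears.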
That helps VERY much in learning what's going on as it steps through the lines of code.
A friend of mine has some C code development tool that lets him step through each line so he can see what's happening, kinda like you explained up above. Do they have such a thing for scripting languages? Converting over to C seems like a BIG step, IDK
Tabby |
Scripting in bash, you can use the following as the second line in a script to set logging of a sort:
Code:
set -xv
Redirect the output to a separate file (or leave it on screen if only a few lines) and then you can track down where things have gone wrong. Other options like the one above or something like gdb to step through C code can be adopted later when executing much larger programs / scripts :) |
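A minimal sketch of that tracing setup, assuming a throwaway script called demo.sh (a hypothetical name, as are the log file names). The trace goes to standard error, so it is redirected separately from the script's normal output:

```shell
workdir=$(mktemp -d)

# A tiny script with tracing switched on in its second line.
cat > "$workdir/demo.sh" <<'EOF'
#!/bin/bash
set -xv
count=3
echo "count is $count"
EOF

# Normal output goes to output.log, the -x/-v trace (stderr) to trace.log.
bash "$workdir/demo.sh" > "$workdir/output.log" 2> "$workdir/trace.log"

cat "$workdir/trace.log"
```

With -v bash echoes each line as it reads it, and with -x it prints each command after expansion, prefixed with "+", which makes it easy to see what value a variable really had.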
There is also a "full-fledged" gawk debugger available. It's described in the gawk info file to which I referred you above.
Basically, instead of, for example,
gawk '{print > "file_"i".dat"}!(NR%210) {i++}' i=1 one_giant_padded_file.txt
you would use
dgawk '{print > "file_"i".dat"}!(NR%210) {i++}' i=1 one_giant_padded_file.txt
If your 'C' friend is familiar with gdb usage, dgawk commands are similar to those. The info section on "debugging" describes the usage fairly well.
A comment that, hopefully, will help you understand where you're losing the track: In a "one-line" command like awk '{print > "file_"i".dat"}!(NR%210) {i++}' i=1 one_giant_padded_file.txt, the "stuff" between the single quotes is a gawk program and the rest of the line is the arguments for that program. You could, instead of that "one-line" program, have done this:
Code:
$ cat > tmp.gawk |
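A sketch of the same idea, program in a file and everything else as arguments, using a hypothetical file name (split_demo.awk) and stand-in data. The file contents are an assumption based on the one-liner discussed above, not a copy of the listing in this post:

```shell
# Assumption: 840 numbered lines stand in for one_giant_padded_file.txt.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 840 > stand_in_padded.txt

# The awk program lives in its own file instead of between single quotes.
cat > split_demo.awk <<'EOF'
# Write each line to the current output file, then advance the
# file number after every 210th line.
{ print > ("file_" i ".dat") }
!(NR % 210) { i++ }
EOF

# -f names the program file; i=1 and the input file are its arguments.
awk -f split_demo.awk i=1 stand_in_padded.txt

ls file_*.dat
```

Keeping the program in a file also leaves room for comments and makes it easier to edit than a long quoted one-liner.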