[SOLVED] replicate a block of data throughout a file
Location: a warm beach, cool ocean breeze, nice waves, and a Margaritta
Distribution: RHEL 5.5 Tikanga
Posts: 63
hi guys,
i have a very large text file (largefile.txt) that has data organized in blocks (in the file they're called Tables), and i need to replace one of those blocks (the one called Table 10) with a different Table 10 block that i have in another file called table10.dat. i need to do the Table 10 switch-out throughout the whole file. my table10.dat file has the same format as Table 10 in largefile.txt, so it's really a "cut and paste", just a gazillion times.
all of the blocks of data have 12 lines (including a blank/return line at the end of each block), and Table 10 first shows up on line 2271, then 2684, then 3097, repeating every 413 lines until eof.
so i think i could use a counter to paste in my table10.dat, or i could do a string replace every time "Table 10," shows up.
is one way better than the other?
i started working on this by modifying a command i already know (colucix helped me with it)
Code:
awk 'NR<=2270{print $0}{for (i = 2; i <= NF; i = i+413) print $i}' largefile.txt table10.dat > out.txt
this of course doesn't work because nowhere in the awk program itself does it do anything with table10.dat. i don't know how to "feed" in the table10.dat
thanks so much for whatever help you can provide!!!
Looks more like a job for perl... Something like:
Code:
#!/usr/bin/perl
use strict;
use warnings;
my $file1 = shift(@ARGV); # file to be mangled
my $file2 = shift(@ARGV); # file to be included
my $size  = shift(@ARGV); # line to start on
my $count = shift(@ARGV); # number of lines to remove from file1
open(INP, "<", $file1) or die "can't open source file: $!";
open(INP2, "<", $file2) or die "can't open file to include: $!";
my $not_done = 1;
my $i = 0;
while (<INP>) {
    $i++;
    if ($i < $size && $not_done) {
        print;
    } elsif ($not_done) {
        # the line just read is the first of the $count lines to drop,
        # so discard only the remaining $count - 1 records
        for (my $j = 1; $j < $count; $j++) {
            my $discard = <INP>;
        }
        while (<INP2>) { # copy the new input
            print;
        }
        $not_done = 0; # finished copying the replacement
    } else {
        print;
    }
}
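The output is sent to stdout, so usage would be "name input1 replacement startingline skip >newinput1", where replacement is the file to include, startingline is the first line to replace, and skip is the number of lines to remove.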
Note, this has not been debugged. But the perl code will run faster than awk or any combination of awk and shell... I think only python would go faster.
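Since the script above swaps the block in only once, here is a minimal awk sketch of the counter idea from the question that repeats the swap at a fixed period. It is only a sketch under stated assumptions: the sample data is a made-up miniature (3-line blocks, target block starting on line 4 and recurring every 9 lines), and the variable names start, period and len are hypothetical. For the file described in the question the values would presumably be start=2271, period=413, len=12.

```shell
#!/bin/sh
# Hypothetical miniature data; the real largefile.txt and table10.dat
# are not shown in the thread.
tmp=$(mktemp -d)
printf 'a1\na2\n\nb1\nb2\n\nc1\nc2\n\na1\na2\n\nb1\nb2\n\nc1\nc2\n\n' > "$tmp/largefile.txt"
printf 'B1\nB2\n\n' > "$tmp/table10.dat"

awk -v start=4 -v period=9 -v len=3 '
    FNR == NR { repl[++n] = $0; next }      # first file: store the replacement lines
    {
        # offset of this line within the current period, or -1 before the first target block
        off = (FNR >= start) ? (FNR - start) % period : -1
        if (off == 0)                       # first line of a target block: emit the replacement
            for (k = 1; k <= n; k++) print repl[k]
        if (off < 0 || off >= len) print    # outside a target block: keep the original line
    }
' "$tmp/table10.dat" "$tmp/largefile.txt" > "$tmp/out.txt"

cat "$tmp/out.txt"
```

The replacement file is listed first on the command line so that FNR == NR can tell it apart from the large file. This variant relies purely on line numbers, so it only works if every Table 10 really does sit exactly one period apart.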
Original Poster
Quote:
Originally Posted by jpollard
Looks more like a job for perl... Something like:
[script and notes snipped, same as in the post above]
hi j,
i've almost never used perl before so i know even less of it than awk, and my awk is not good
i don't think i understand the command line elements, are they
"name" = js_perl_script.pl
"input1" = my largefile.txt
"input2" = my table10.dat
"replacement startingline" = my 2271
"skip" = my 413
so putting it all together it would look like this?
I believe the skip, i.e. the last argument, should be 12. This is the number of lines to drop from the original file, which are then replaced by the new data.
I would agree that Perl may well be faster, but thought I would put up an awk for you to see how it may be done:
Original Poster
yea grail, that looks more like what i know. i don't care how long it takes, i'll run down to *$
so grail, is yours looking for the string "Table 10"?
is the RS="\n\n" what tells awk to look for the table10.dat file? (in my first code post i didn't know that with awk you put table10.dat before largefile.txt)
jay, i won't give up on yours either, but perl is a STEEP learning curve and i'm only now, after a loooong time, catching on to some of the stuff in awk
Looking again at the perl, you may have been right with 413 ... I just noticed how your original idea was set up
And yes, happy to explain:
Code:
RS="\n\n" - I put this first as it is actually interpreted prior to the script running, and it basically says that each record is delimited by 2 newlines, i.e. the one ending the record and the one on the empty line
FNR==NR{new=$0;next} - This expression is only true while reading the first file (table10.dat), and since that file holds a single block under the same record separator, it effectively reads the entire file into the variable new
/Table 10/{$0 = new} - Once the string has been found within a record, substitute our saved block for the current record
1 - always true, so every record gets printed, replaced or not
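Putting the four pieces above together, a runnable sketch might look like the following. It is not necessarily the exact command that was posted. Two deliberate swaps are assumptions of this sketch: it uses awk's paragraph mode (RS="") instead of RS="\n\n", because a multi-character RS is only handled as intended by some awks (gawk and mawk treat it as a regex), and it sets ORS="\n\n" so the blank line after each block is written back out. The pattern is also anchored as /^Table 10,/ so that a hypothetical "Table 100" would not match, and the sample data is invented.

```shell
#!/bin/sh
# Hypothetical sample data; the real files are not shown in the thread.
tmp=$(mktemp -d)
printf 'Table 9, old\n1 2 3\n\nTable 10, old\n4 5 6\n\nTable 11, old\n7 8 9\n\nTable 10, old\n4 5 6\n\n' > "$tmp/largefile.txt"
printf 'Table 10, new\n40 50 60\n\n' > "$tmp/table10.dat"

# RS=""       : paragraph mode, each blank-line-separated block is one record
# ORS="\n\n"  : put the blank line back after every block on output
awk 'BEGIN { RS = ""; ORS = "\n\n" }
     FNR == NR    { new = $0; next }   # first file: remember the replacement block
     /^Table 10,/ { $0 = new }         # second file: swap in the saved block
     1                                 # always true: print every record
' "$tmp/table10.dat" "$tmp/largefile.txt" > "$tmp/out.txt"

cat "$tmp/out.txt"
```

As in the explanation, table10.dat has to come before largefile.txt on the command line, since FNR==NR only holds while the first file is being read. Note that paragraph mode collapses runs of several blank lines into one separator, so this fits only if blocks are separated by single blank lines.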