SED Help (Pattern Buffer Overflow I think?)
$ sed --version
GNU sed version 4.1.5 Copyright (C) 2003 Free Software [...]
I'm using Ubuntu 9.04. I want to search for "XXX[^~]*XXY" and replace XXY with a new value. The reason I suspect a buffer overflow is that the command I wrote works correctly on a 1kb file, but not on the original 8kb file (the 1kb file is a subset of the 8kb file).
--Doesn't Work--
$ sed -n '1h;1!H;${;g;s/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/;p;}' MYPATCH-8kb.patch
--Works--
$ sed -n '1h;1!H;${;g;s/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/;p;}' MYPATCH-1kb.patch
I can post my patch file if needed, but basically it is just the following:
**Lots of Stuff Including Tildes** 0x23F00000 **Some stuff that doesn't include a Tilde** 0x2[137]F00000 **Lots of Stuff Including Tildes**
I think I have to modify the sed command so that it does not put everything into the hold space, and only starts collecting when it finds a "0x23F00000", but I'm unsure if this is possible. Or is there a command line option I missed that increases sed's buffer size? Alternatively, I'm open to other methods that would be simpler than using sed. I've tried ssed, but it had the same problem. I started looking at awk but haven't finished testing it yet.
Thanks for any help. |
Hi, welcome to LQ!
It would help if you substituted "doesn't work" with an actual error message. If there's none then there's a good chance the problem is with the actual data (hard to say w/o having seen it). Cheers, Tink |
There are no error messages, but I can do diffs to illustrate.
--1kB Works--
Code:
nvrbst@kubuntu-pc:~/test$ sed -n '1h;1!H;${;g;s/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/;p;}' MYPATCH-1kb.patch > new1k.patch
--8kB Doesn't Work--
Code:
nvrbst@kubuntu-pc:~/test$ sed -n '1h;1!H;${;g;s/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/;p;}' MYPATCH-8kb.patch > new8k.patch
If you think it'll help, I don't mind making an 8kb+ test file to simulate the problem. You should be able to make it yourself, though, by doing the following:
1. Paste 7kb of junk characters (anything, including newlines).
2. Add "0x23F00000".
3. Paste more junk characters (no tildes; newlines are okay).
4. Add "0x21F00000".
5. Paste 1kb of junk characters (anything, including newlines).
6. Save as MYPATCH-8kb.patch.
7. Run the commands in the first post. |
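For anyone who wants to try the seven steps above without pasting by hand, here is a rough sketch of them as a script. Only the file names and the two addresses come from the thread; the filler text, the `write_junk` helper and the `sep` variable are made up for illustration, and the whitespace in front of the second address is assumed to be a single space.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Filler content is arbitrary.
out=MYPATCH-8kb.patch
filler=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
sep=' '   # assumption: one space; set sep=$(printf '\t') to try a tab

# write_junk N: append N junk lines (roughly 72 bytes each, tildes included)
write_junk() {
    n=0
    while [ "$n" -lt "$1" ]; do
        printf '%04d ~ %s\n' "$n" "$filler" >> "$out"
        n=$((n + 1))
    done
}

: > "$out"                                    # start with an empty file
write_junk 100                                # step 1: about 7kb of junk
printf '0x23F00000\n' >> "$out"               # step 2
printf 'more junk\nwithout the excluded character\n' >> "$out"   # step 3
printf '%s0x21F00000\n' "$sep" >> "$out"      # step 4
write_junk 15                                 # step 5: about 1kb of junk

# step 7: the command from the first post
sed -n '1h;1!H;${;g;s/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/;p;}' "$out" > new8k.patch
wc -c "$out"
```

With `sep` left as a space, new8k.patch should contain NEWADDRESS even though the file is over 8kb, which suggests the size alone is not the trigger.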
I'm by no means a sed expert; it's just that I've never seen any documentation referring to a limit on buffer size or line length in GNU sed. The only thing it doesn't seem to like is NUL characters, which is fair enough, and which is why I thought it might be the data. |
Code:
5. GNU sed's Limitations and Non-limitations
I'm no sed expert either, by far, and Tinkster knows much more about it than I do. I am aware that the "Last Updated" date on the page I linked above is 2003, but 2003 is also the only date I see on the sed man page on my machine. Finally, there are other pages (maybe more recent, maybe less recent) which do NOT say the same as quoted above. My guess, based on all this, is that it is *possible* for *something* to be limiting at least the buffer space, if not the line length, which every source seems to agree is unlimited. :twocents: Sasha |
An 8k file should not cause a problem.
I don't see why you need to store the entire file in the hold space. Why not let sed apply the rule to each line? The substitution command should be enough. Is it because you only want the first match to be substituted? |
Sed doesn't deal with newlines on its own - it's a stream editor that works one line at a time. The OP indicated there could be (possibly multiple) newlines in the text between the two addresses.
If there is the possibility of multiple consecutive newlines, then that even rules out the next option - awk with RS set to the empty string (paragraph mode). Maybe just translate the newlines to some other known unused character, do a normal sed substitution on the data, then translate the newlines back. |
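The translate-and-restore idea suggested above could look roughly like this. It is only a sketch: the sample data and the file name sample.patch are invented, and it assumes the byte \001 never appears in the real patch, which you would need to check for your own file.

```shell
#!/bin/sh
# Sketch: turn newlines into a placeholder, substitute on the resulting
# single line, then turn the placeholder back into newlines.
# Assumption: byte \001 does not occur anywhere in the input.
printf 'a ~ b\n0x23F00000\nstuff\n 0x21F00000\nc ~ d\n' > sample.patch

result=$(tr '\n' '\001' < sample.patch |
    sed 's/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/' |
    tr '\001' '\n')

printf '%s\n' "$result"
```

The substitution itself is the same one from the first post; only the hold-space bookkeeping goes away. The whole file still ends up as one long line inside sed, so this does not use less memory than the original command.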
Quote:
However, I'm very sorry, I have found my problem. Tinkster was correct: it had to do with the data, not the buffer size. Somehow all the tabs got converted to spaces somewhere along the way in the 1kb file, so the regular expression I was using, "\(0x23F00000[^~]*\) 0x2[137]F00000", matched the spaces there but did not match the tabs in the original (I thought " " matched any blank space? hehe). When I changed it to "\(0x23F00000[^~]*\)\t0x2[137]F00000" it worked for the original 8kb file. Sorry for the inconvenience :) |
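A quick way to spot this kind of tab/space difference between two files is to count the lines that contain a literal tab. A minimal sketch, with made-up file names (tabbed.patch, spaced.patch) and one-line sample data standing in for the real patches:

```shell
#!/bin/sh
# Sketch: check whether a file contains literal tab characters.
tab=$(printf '\t')

printf 'stuff\t0x21F00000\n' > tabbed.patch   # stands in for the original file
printf 'stuff 0x21F00000\n'  > spaced.patch   # stands in for the converted copy

# grep -c prints the number of matching lines (and exits non-zero when it is 0)
tabbed_count=$(grep -c "$tab" tabbed.patch)
spaced_count=$(grep -c "$tab" spaced.patch || true)

echo "lines with tabs: tabbed.patch=$tabbed_count spaced.patch=$spaced_count"
```

If the two counts differ between a file that "works" and one that doesn't, the whitespace in the pattern is worth a second look before blaming buffer sizes.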
Use [[:space:]] in future - I didn't even see that space character.
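To illustrate the suggestion, here is a small sketch comparing the literal-space pattern from the first post with a [[:space:]] version on a line where the separator is a tab. The sample line and the file name tab_sample.patch are invented for the demo.

```shell
#!/bin/sh
# Sketch: a literal space in the pattern versus the [[:space:]] class.
# Sample line: the second address is preceded by a tab, not a space.
printf 'x 0x23F00000 mid\t0x21F00000 y\n' > tab_sample.patch

# Pattern from the first post: a literal space does not match the tab.
literal=$(sed 's/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/' tab_sample.patch)

# Same pattern with [[:space:]], which matches spaces and tabs alike.
class=$(sed 's/\(0x23F00000[^~]*\)[[:space:]]0x2[137]F00000/\1 NEWADDRESS/' tab_sample.patch)

printf 'literal space: %s\n' "$literal"
printf '[[:space:]]:   %s\n' "$class"
```

Note that [[:space:]] also matches a newline, so when the whole file is loaded into the pattern space (as in the first post) it will match across a line break too; [[:blank:]] is the narrower class if only spaces and tabs are wanted.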
|