[SOLVED] SED Help (Pattern Buffer Overflow I think?)
$ sed --version
GNU sed version 4.1.5
Copyright (C) 2003 Free Software [...]
I'm using Ubuntu 9.04. I want to search for "XXX[^~]*XXY" and replace XXY with a new value. The reason I think it has something to do with a buffer overflow is that the line I created works correctly on a 1kb file, but it does not work correctly on the original 8kb file (the 1kb file is a subset of the 8kb file).
--Doesn't Work--
$ sed -n '1h;1!H;${;g;s/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/;p;}' MYPATCH-8kb.patch
--Works--
$ sed -n '1h;1!H;${;g;s/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/;p;}' MYPATCH-1kb.patch
I can post my patch file if needed, but basically it is just the following:
**Lots of Stuff Including Tildes** 0x23F00000 **Some Stuff That Doesn't Include Tildes** 0x2[137]F00000 **Lots of Stuff Including Tildes**
I think I have to modify the sed command so that it does not put everything into the hold space and only starts doing so when it finds a "0x23F00000", but I'm unsure if this is possible. Or is there a way to increase sed's buffer size via a command-line option that I missed?
Alternatively I'm open to other methods that would be simpler than using sed. I've tried ssed but it had the same problem. I started looking at awk but haven't finished testing it yet. Thanks for any help.
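For readers unfamiliar with the idiom in the two commands above, here is a commented sketch of the same script run on a tiny inline sample. The sample text and the variable name `slurped` are made up for illustration; the sed commands are the ones from the post, only spread over several lines.

```shell
#!/bin/sh
# Illustrative only: the sample input is invented, the sed script is the
# one from the post, reformatted with one command per line.
slurped=$(printf 'a ~ b\n0x23F00000 x\ny 0x27F00000\nz ~\n' | sed -n '
# First line: copy the pattern space into the hold space.
1h
# Every other line: append a newline plus the line to the hold space.
1!H
# On the last line, pull the accumulated file back, substitute once, print.
${
g
s/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/
p
}
')
printf '%s\n' "$slurped"
```

Because the whole file ends up in the pattern space as one string, `[^~]*` can run across newlines, which is what lets the match span several lines.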
It would help if you substituted "doesn't work" with an actual error
message. If there's none then there's a good chance the problem is
with the actual data (hard to say w/o having seen it).
To make the 1kb patch I simply copied the 8kb patch and deleted the lines that are not relevant (so I can see what sed prints to stdout while testing). I don't see how it can be the data itself, since at the very least the pattern should match the same part of the file (unless there is a sed buffer size issue, which is what I am guessing).
If you think it'll help I don't mind making a 8kb+ test file to simulate the problem. You should be able to make it yourself though by doing the following.
1. PASTE 7kb of Junk Characters (anything including newlines).
2. Add "0x23F00000".
3. PASTE More Junk Characters (no tildes, newlines are okay).
4. Add "0x21F00000".
5. PASTE 1kb of Junk Characters (anything including newlines).
6. Save as MYPATCH-8kb.patch
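The steps above could be scripted roughly as follows. This is only a sketch: the junk text, the line counts, and the helper name `junk` are arbitrary assumptions, and only the two addresses and the file name come from the post.

```shell
#!/bin/sh
# Sketch only: builds a file shaped like steps 1-6 above.
# The junk text and line counts are arbitrary assumptions.
out=MYPATCH-8kb.patch

# Print N numbered filler lines, each containing a tilde (72 bytes per line).
junk() {
    i=0
    while [ "$i" -lt "$1" ]; do
        printf 'junk line %03d with a tilde ~ and some padding to fill the line up......\n' "$i"
        i=$((i + 1))
    done
}

{
    junk 100                                  # step 1: roughly 7 KB of junk
    printf '0x23F00000\n'                     # step 2
    printf 'no tildes here\nstill none here'  # step 3: tilde-free junk
    printf ' 0x21F00000\n'                    # step 4: preceded by one space
    junk 15                                   # step 5: roughly 1 KB of junk
} > "$out"                                    # step 6

size=$(wc -c < "$out")
echo "$out: $size bytes"
```

The whitespace in front of the second address is written out explicitly in step 4 so that it is easy to vary while testing.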
I'm by no means a sed expert, it's just that I've never seen any
documentation referring to a limit on buffer size or line length
in GNU sed. The only thing it doesn't seem to like is NUL characters,
which is fair enough, and which is where my thought that it might be
the data came from.
5. GNU sed's Limitations and Non-limitations
For those who want to write portable sed scripts, be aware that some
implementations have been known to limit line lengths
(for the pattern and hold spaces) to be no more than 4000 bytes.
The POSIX standard specifies that conforming sed implementations shall
support at least 8192 byte line lengths. GNU sed has no built-in limit
on line length; as long as it can malloc() more (virtual) memory,
you can feed or construct lines as long as you like.
However, recursion is used to handle subpatterns and indefinite
repetition. This means that the available stack space may limit
the size of the buffer that can be processed by certain patterns.
I came across some stuff about limits the other day, and don't know where exactly, but just did a search of the net for a second and found this above, from here : http://www.delorie.com/gnu/docs/sed/sed_31.html
I'm no sed expert either by far, and Tinkster knows much more than I about it, AND I am aware that the "Last Updated" date at the page I linked above is 2003, but 2003 is also the only date I see on the sed man page on my machine. Finally, there are other pages (maybe more recent, maybe less recent) which do NOT say the same as quoted above. My guess, based on all this, is that it is *possible* for *something* to be limiting at least the buffer space, if not the line length, which seems to be unlimited according to everything I've read.
Sasha
Last edited by GrapefruiTgirl; 01-13-2010 at 05:21 PM.
An 8k file should not cause a problem.
I don't see why you need to store the entire file in the Hold register. Why not let sed apply the rule to each line. The substitution command should be enough. Is it because you only want the first match to be substituted?
Sed doesn't deal with newlines - it's a stream editor. The OP indicated there could be (possibly multiple) newlines in the test field.
If there is the possibility of multiple consecutive newlines, then that even rules out the next option - awk with the RS set to null.
Maybe just translate the newlines to some other known unused character, do a normal sed substitution on the data, then set the newlines back.
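A minimal sketch of that translate-and-restore idea, assuming the byte `\001` never occurs in the data (that choice, the sample text, and the variable name `result` are assumptions, not something from the thread):

```shell
#!/bin/sh
# Sketch: swap newlines for \001, substitute on the single long line,
# then swap \001 back. Assumes \001 does not appear in the input.
result=$(printf 'keep ~ this\n0x23F00000 first\nsecond line 0x21F00000\ntail ~ end\n' |
    tr '\n' '\001' |
    sed 's/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/' |
    tr '\001' '\n')
printf '%s\n' "$result"
```

Note that sed still sees the whole input as one long line here, so memory use is comparable to collecting the file in the hold space; the gain is only that the sed script itself becomes a plain substitution.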
Last edited by syg00; 01-13-2010 at 05:42 PM.
Reason: RS not FS
Aye, I only want the 2nd match to be substituted (and there are newlines between them). So I couldn't figure out how to do a line by line thing (if it processed the file in reverse i'd be able to get it working I think).
However, I'm very sorry: I have found my problem. Tinkster was correct, it had to do with the data, not the buffer size. Somehow all the tabs got converted to spaces somewhere along the way in the 1kb file, and the regular expression I was using, "\(0x23F00000[^~]*\) 0x2[137]F00000", matched the spaces but didn't match the tabs (I thought " " matched any blank space? hehe).
When I converted it to "\(0x23F00000[^~]*\)\t0x2[137]F00000" it then worked for the original 8kb file. Sorry for the inconvenience.
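The space-versus-tab difference can be seen on a one-line sample. The sketch below uses made-up data and variable names; it swaps in the POSIX class `[[:blank:]]` (space or tab) as an alternative to the `\t` escape, which is a GNU sed extension.

```shell
#!/bin/sh
# Sample line with a TAB (not a space) before the second address.
line=$(printf '0x23F00000 stuff\t0x21F00000')

# A literal space in the pattern matches only a space, so nothing changes.
with_space=$(printf '%s\n' "$line" |
    sed 's/\(0x23F00000[^~]*\) 0x2[137]F00000/\1 NEWADDRESS/')

# [[:blank:]] matches either a space or a tab, so the substitution applies.
with_blank=$(printf '%s\n' "$line" |
    sed 's/\(0x23F00000[^~]*\)[[:blank:]]0x2[137]F00000/\1 NEWADDRESS/')

printf '%s\n' "$with_space" "$with_blank"

# The "l" command prints the line unambiguously, showing tabs as \t.
printf '%s\n' "$line" | sed -n 'l'
```

Running `sed -n 'l'` (or a similar tool) over both patch files is a quick way to spot whitespace that was silently converted.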