[SOLVED] looking for a cross between strace and gdb
I want to run a program under ... um ... something like strace, something like gdb. Let's call the something Fred. Every time I run a particular program under Fred, it detects when a system call is about to take place or a signal has occurred, and traps to Fred, which decides dynamically how to respond, whether the stimulus is a signal or an attempted open(), close(), read(), write(), socket(), connect(), listen(), select(), ioctl(), time(), or whatever. Although strace does this marvelously, what I'd like Fred to do is use code that I supply to doctor up what the subject program sees. In the case of write(), it should be able to modify what actually gets written.
In other words, a hard shell environment around the program which completely mediates between the subject program and the outside world.
I'll start with the source code for strace and gdb if I have to. But has this been done already?
So long as the program you're wanting to wrap is dynamically linked, you can coax the dynamic linker into using your own versions of the system call wrappers, with the possibility of using the real system calls as a back-end if you really want. There's an example of how to do this in this blog post of mine, which (as it happens) shows how to intercept calls to write() and modify its behaviour.
If the program isn't dynamically linked then I don't think there's anything you can do without modifying the kernel, but then not many binaries are statically linked.
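For readers who haven't seen the interposition trick, here is a minimal sketch of what such a wrapper might look like. It is not the code from the blog post: the file name shout.c, the helper upcase_copy(), and the choice to upper-case everything written to stdout are all made up for illustration, and it calls the raw system call through syscall(2) as the back-end (looking up the original with dlsym(RTLD_NEXT, "write") is the other common way).

```c
/* shout.c: hypothetical LD_PRELOAD wrapper, for illustration only. */
#define _GNU_SOURCE
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy n bytes from src to dst, upper-casing ASCII letters on the way. */
void upcase_copy(char *dst, const char *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    for (size_t i = 0; i < n; i++)
        dst[i] = (char)toupper((unsigned char)src[i]);
}

/* Same signature as libc's write(); the dynamic linker resolves the
 * program's calls to this definition when the library is preloaded. */
ssize_t write(int fd, const void *buf, size_t count)
{
    /* Leave everything except non-empty writes to stdout alone. */
    if (fd != STDOUT_FILENO || count == 0)
        return syscall(SYS_write, fd, buf, count);

    char *copy = malloc(count);
    if (copy == NULL)                 /* fall back to the unmodified data */
        return syscall(SYS_write, fd, buf, count);

    upcase_copy(copy, buf, count);
    ssize_t written = syscall(SYS_write, fd, copy, count);
    free(copy);
    return written;
}
```

Assuming gcc, something like "gcc -shared -fPIC -o shout.so shout.c" followed by "LD_PRELOAD=./shout.so ./some_program" should be enough to try it. Note that this only catches calls that go through the dynamically resolved write() symbol; calls made from inside libc itself (stdio, for instance) may bypass it, and a statically linked program bypasses it entirely.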
I'm not interested in library calls like read() itself; I'm interested in the corresponding system calls.
I want to make no assumptions about how the program is linked. strace traps system calls, doesn't it? If it traps before a system call is made, and then single steps through that call, then I'm in (for annoying values of "in", as explained below). I can use the same techniques that gdb uses for modifying memory before and/or after the system call.
By "annoying values of 'in'", I mean that developing this will be a lot of work. I want to find out whether someone has already produced such a framework, so all I have to do is fill in the (hundreds of) stubs to do exactly what I want done on each system call.
It seems that gdb (the current 7.2, not the old 6.8 which was shipped with the debian lenny I'm running) has the "catch syscall" capability. In principle, this is all I need. I can change the source of gdb 7.2 to run C code that I add to that source at gdb compile time instead of coming back with a prompt.
That would mean I don't need to look at strace at all. Once I've modified gdb to do what I want, I can strip out the parts of gdb that I don't need.
But is there any project out there that has already developed a framework for such a shell around a program?
My remark that I'm interested in system calls rather than library calls like read() probably sounded a bit snotty. There are distinct advantages to the library approach. One is portability of the wrapping code; another is execution speed. If you want to wrap one program around another and have a production-variety new animal, the method outlined by JohnGraham is ideal. It's just that my goal is to put a program with unknown characteristics into a sandbox, not only so it can't reach outside the sandbox, but possibly not even letting it see outside the sandbox.
Currently I'm back to looking at strace, because strace seems to have more knowledge of the individual system calls than gdb does. Also, strace is far simpler.
Both gdb and strace are based on the ptrace(2) system call, so you'll probably want to look at that.
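As a rough idea of what the skeleton of such a tracer looks like, here is a minimal sketch (Linux-specific, and not taken from strace or gdb). The function name count_syscall_stops() is made up; it runs a command under ptrace, stops at every system call entry and exit, and simply counts the stops. The spot marked "hook" is where code to inspect or doctor the call would go, and reading the syscall number there is architecture-specific, so it is left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kill and reap the tracee, then report failure. */
static long give_up(pid_t child)
{
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return -1;
}

/* Run argv[0] under ptrace. Returns the number of syscall-entry and
 * syscall-exit stops observed, or -1 if tracing could not be set up. */
long count_syscall_stops(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(126);
        raise(SIGSTOP);            /* let the parent set options first */
        execvp(argv[0], argv);
        _exit(127);
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;                 /* child died before we could trace it */

    /* TRACESYSGOOD marks syscall stops as SIGTRAP|0x80, so they can be
     * told apart from real signals; EXITKILL kills the tracee if we die. */
    long opts = PTRACE_O_TRACESYSGOOD | PTRACE_O_EXITKILL;
    if (ptrace(PTRACE_SETOPTIONS, child, NULL, (void *)opts) == -1)
        return give_up(child);

    long stops = 0;
    int sig = 0;                   /* signal to pass on when resuming */
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)(long)sig) == -1)
            return give_up(child);
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;

        sig = 0;
        if (WSTOPSIG(status) == (SIGTRAP | 0x80))
            stops++;               /* hook: a system call is entering or leaving */
        else if (WSTOPSIG(status) != SIGTRAP)
            sig = WSTOPSIG(status);    /* ordinary signal: deliver it */
        /* A plain SIGTRAP here is assumed to be the post-exec trap and
         * is swallowed; a real tracer would need to be more careful. */
    }
    return stops;
}
```

Calling it with an argument vector such as { "ls", "-l", NULL } should return a count of a few hundred on a typical system, since each system call produces two stops.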
Exactly. And it looks as though I'll be able to play Dr. Frankenstein and bend strace to my will, so I probably won't learn as much about ptrace() as I should. :)
I've tweaked strace and created a simple program using ptrace to perform a task similar to yours. I was particularly interested in tracing and modifying the write() system call, and working with ptrace directly was a lot easier. From my experience, and given the fact that you are ready to play with strace, it shouldn't take much time to develop what you are looking for.
Here's a tutorial on ptrace: http://www.linuxjournal.com/article/6100
One of the examples illustrates how a write() call is traced. Just add a few lines of code and you're ready to modify the contents of write().
Thank you, wagaboy. At first I was all, "Let's just stick with strace, because it knows about all the system calls." But then I was like, "Welllll, maybe it wouldn't hurt to look at the link." And I was impressed. I'll still keep the strace source as a reference for how the different system calls work, but I can write a cleaner program from scratch, and the LJ article (as well as maybe its successor as part of the tarball) provides just what I need.
And thanks also to JohnGraham for providing a different perspective and a possibility for behavior modification for dynamically linked precompiled programs in a production environment.
And as ntubski points out, none of this is possible without whoever thought up ptrace().
I have followed the link provided by wagaboy, and the Linux Journal article is quite informative. It also includes a link to the source code. I exploded that link, and not everything would compile. I think that's because things have changed over the years. This is a script you can use to make everything compile correctly. The breakpoint program didn't work for me, but everything else did, and I'm not interested in breakpoints right now. The script assumes that it's sitting in the same directory as 6011.tgz, the tarball of the source files, and that you don't have a "work" subdirectory under that directory with anything you want to keep. :)
For anyone interested in the source code tarball for the LJ articles, I've found a bug in the author's putdata() function. Since I've fixed that, the breakpoint program now works. The script for compiling everything, complete with this bug fix in the four programs which contain function putdata(), is here. It's now too long to just copy and paste into this post. I plan to keep it there at least through the rest of 2010.
The reason I came across the bug was this. I was going to use his getdata() and putdata() functions in my code, pretty much unmodified. But if you're familiar with ptrace(), you know that PTRACE_PEEKDATA and PTRACE_POKEDATA work with one machine word of data at a time (four bytes on 32-bit x86). This means that if you want to use putdata() to change, oh, say, one byte, or two, or three, then putdata() had better contain not only PTRACE_POKEDATA, but also PTRACE_PEEKDATA, to get the full word first, right? (And that's just the simple case.) But no. putdata() doesn't (or didn't) cause any PTRACE_PEEKDATA to happen.
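To make the peek-then-poke idea concrete, here is a minimal sketch of a word-safe poke. It is not the fixed putdata() from the script; the names patch_word() and poke_bytes() are invented for this example, and it assumes the tracee identified by pid is already attached and stopped.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Overlay n bytes from src onto `word`, starting at byte offset `off`,
 * and return the patched word. The other bytes are left untouched. */
long patch_word(long word, size_t off, const void *src, size_t n)
{
    assert(off <= sizeof word && n <= sizeof word - off);
    memcpy((char *)&word + off, src, n);
    return word;
}

/* Write len bytes from src to address addr in the stopped tracee `pid`.
 * Returns 0 on success, -1 on failure (errno set by ptrace). */
int poke_bytes(pid_t pid, unsigned long addr, const void *src, size_t len)
{
    const char *p = src;

    while (len > 0) {
        /* Work on aligned words so a poke never spills past the word
         * that contains the bytes we were asked to change. */
        unsigned long base = addr & ~(unsigned long)(sizeof(long) - 1);
        size_t off = addr - base;
        size_t n = sizeof(long) - off;
        if (n > len)
            n = len;

        long word = 0;
        if (n < sizeof(long)) {
            /* Partial word: fetch the existing contents first.
             * PEEKDATA returns the data itself, so -1 is only an
             * error if errno was set as well. */
            errno = 0;
            word = ptrace(PTRACE_PEEKDATA, pid, (void *)base, NULL);
            if (word == -1 && errno != 0)
                return -1;
        }

        word = patch_word(word, off, p, n);
        if (ptrace(PTRACE_POKEDATA, pid, (void *)base, (void *)word) == -1)
            return -1;

        addr += n;
        p += n;
        len -= n;
    }
    return 0;
}
```

With this, changing a single byte becomes poke_bytes(pid, addr, "X", 1): the surrounding bytes of the word are read back and written out unchanged instead of being clobbered.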