[SOLVED] looking for a cross between strace and gdb

wje_lq · 12-06-2010, 12:07 AM

I want to run a program under ... um ... something like strace, something like gdb. Let's call the something Fred. Every time I run a particular program under Fred, it detects when a system call is about to take place or a signal has occurred, and traps to Fred, which decides dynamically how to respond, whether the stimulus is a signal or an attempted open(), close(), read(), write(), socket(), connect(), listen(), select(), ioctl(), time(), or whatever. Although strace does this marvelously, what I'd like Fred to do is use code that I supply to doctor up what the subject program sees. In the case of write(), it should be able to modify what actually gets written.

In other words, a hard shell environment around the program which completely mediates between the subject program and the outside world.

I'll start with the source code for strace and gdb if I have to. But has this been done already?

JohnGraham · 12-06-2010, 04:06 AM

So long as the program you want to wrap is dynamically linked, you can coax the dynamic linker into using your own versions of the system call wrappers, with the option of using the real system calls as a back-end if you really want. There's an example of how to do this in this blog post of mine, which (as it happens) shows how to intercept calls to write() and modify its behaviour.

If the program isn't dynamically linked then I don't think there's anything you can do without modifying the kernel, but then not many binaries are statically linked.
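
For readers who have not seen the interposition trick described above, here is a minimal sketch of the idea. It is not the code from the blog post (which is not reproduced in this thread). It assumes Linux with glibc and the LD_PRELOAD mechanism, and the upper-casing is just an arbitrary, visible modification chosen for illustration.

```c
#define _GNU_SOURCE
#include <ctype.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Replacement for the C library's write() wrapper.
 * Illustrative assumption: the data is upper-cased before it is handed
 * to the real system call, which is reached through syscall(2).
 * At most 4096 bytes are handled per call; returning a short count is
 * legal for write(), and a well-behaved caller retries with the rest.
 */
ssize_t write(int fd, const void *buf, size_t count)
{
    char tmp[4096];
    const unsigned char *src = buf;
    size_t n = count < sizeof tmp ? count : sizeof tmp;

    for (size_t i = 0; i < n; ++i)
        tmp[i] = (char)toupper(src[i]);

    /* The real system call is the back-end. */
    return syscall(SYS_write, fd, tmp, n);
}
```

Built as a shared object (for example `gcc -shared -fPIC -o shim.so shim.c`) and loaded with `LD_PRELOAD=./shim.so some_program`, this replaces only the calls that reach write() through the dynamic linker. A statically linked binary, or code that issues the system call directly, never sees it.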

wje_lq · 12-06-2010, 06:32 AM

Quote:

Originally Posted by JohnGraham

If the program isn't dynamically linked then I don't think there's anything you can do without modifying the kernel

I'm not interested in library calls like read() itself; I'm interested in the corresponding system calls.

I want to make no assumptions about how the program is linked. strace traps system calls, doesn't it? If it traps before a system call is made, and then single steps through that call, then I'm in (for annoying values of "in", as explained below). I can use the same techniques that gdb uses for modifying memory before and/or after the system call.

By "annoying values of 'in'", I mean that developing this will be a lot of work. I want to find out whether someone has already produced such a framework, so all I have to do is fill in the (hundreds of) stubs to do exactly what I want done on each system call.

wje_lq · 12-06-2010, 07:32 AM

It seems that gdb (the current 7.2, not the old 6.8 which was shipped with the debian lenny I'm running) has the "catch syscall" capability. In principle, this is all I need. I can change the source of gdb 7.2 to run C code that I add to that source at gdb compile time instead of coming back with a prompt.

In principle, this is all I need, and I don't need to look at strace at all. Once I've modified gdb to do what I want, I can strip out the parts of gdb that I don't need.

But is there any project out there that has already developed a framework for such a shell around a program?

wje_lq · 12-06-2010, 06:26 PM

Quote:

Originally Posted by wje_lq

I'm not interested in library calls like read() itself; I'm interested in the corresponding system calls.

That probably sounded a bit snotty. There are distinct advantages to the library approach. One is portability of the wrapping code; another is execution speed. If you want to wrap one program around another and have a production-variety new animal, the method outlined by John Graham is ideal. It's just that my goal is to put a program with unknown characteristics into a sandbox, not only so it can't reach outside the sandbox, but possibly not even letting it see outside the sandbox.

Currently I'm back to looking at strace, because strace seems to have more knowledge of the individual system calls than gdb does. Also, strace is far simpler.

ntubski · 12-06-2010, 07:23 PM

Both gdb and strace are based on the ptrace(2) system call, so you'll probably want to look at that.
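
As a rough illustration of what building on ptrace(2) looks like, the sketch below runs a program under trace and stops it at every system call entry and exit. It assumes Linux; the function name `count_syscall_stops` is made up for this example, and a real mediator would inspect or rewrite registers and memory at each stop instead of merely counting.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] under ptrace and return how many syscall stops
 * (entries plus exits) were observed, or -1 if fork/wait failed.
 */
long count_syscall_stops(char *const argv[])
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: ask to be traced, then become the subject program. */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execvp(argv[0], argv);
        _exit(127);                 /* exec failed */
    }

    int status;
    int sig = 0;
    long stops = 0;

    /* A traced child stops with SIGTRAP once exec succeeds. */
    if (waitpid(child, &status, 0) < 0)
        return -1;

    /* Make syscall stops distinguishable: they report SIGTRAP | 0x80. */
    if (WIFSTOPPED(status))
        ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD);

    while (WIFSTOPPED(status)) {
        /* Resume until the next syscall entry or exit, or a signal. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)(long)sig) < 0)
            break;
        if (waitpid(child, &status, 0) < 0)
            break;
        sig = 0;
        if (!WIFSTOPPED(status))
            break;                  /* child exited or was killed */
        if (WSTOPSIG(status) == (SIGTRAP | 0x80))
            ++stops;                /* this is where mediation would go */
        else
            sig = WSTOPSIG(status); /* forward an ordinary signal */
    }
    return stops;
}
```

Calling it with something like `char *argv[] = { "ls", NULL };` gives a feel for how many stops even a small program generates, and the loop also covers the "a signal has occurred" case from the first post, since signal deliveries show up as stops in the same place.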

wje_lq · 12-06-2010, 07:26 PM

Quote:

Originally Posted by ntubski

Both gdb and strace are based on the ptrace(2) system call, so you'll probably want to look at that.

Exactly. And it looks as though I'll be able to play Dr. Frankenstein and bend strace to my will, so I probably won't learn as much about ptrace() as I should. :)

wagaboy · 12-06-2010, 10:16 PM

Quote:

Originally Posted by wje_lq

Exactly. And it looks as though I'll be able to play Dr. Frankenstein and bend strace to my will, so I probably won't learn as much about ptrace() as I should. :)

I've tweaked strace and also written a simple program using ptrace to perform a task similar to yours. I was particularly interested in tracing and modifying the write() system call, and working with ptrace directly was a lot easier. From my experience, and given that you are ready to play with strace, it shouldn't take much time to develop what you are looking for.

Here's a tutorial on ptrace: http://www.linuxjournal.com/article/6100
One of the examples illustrates how a write() call is traced. Just add a few lines of code and you're ready to modify the contents of write().

wje_lq · 12-07-2010, 12:19 AM

Thank you, wagaboy. At first I was all, "Let's just stick with strace, because it knows about all the system calls." But then I was like, "Welllll, maybe it wouldn't hurt to look at the link." And I was impressed. I'll still keep the strace source as a reference for how the different system calls work, but I can write a cleaner program from scratch, and the LJ article (as well as maybe its successor as part of the tarball) provides just what I need.

And thanks also to JohnGraham for providing a different perspective and a possibility for behavior modification for dynamically linked precompiled programs in a production environment.

And as ntubski points out, none of this is possible without whoever thought up ptrace().

wje_lq · 12-07-2010, 08:17 AM

I have followed the link provided by wagaboy, and the Linux Journal article is quite informative. It also includes a link to the source code. I unpacked the tarball behind that link, and not everything would compile. I think that's because things have changed over the years. This is a script you can use to make everything compile correctly. The breakpoint program didn't work for me, but everything else did, and I'm not interested in breakpoints right now. The script assumes that it's sitting in the same directory as 6011.tgz, the tarball of the source files, and that you don't have a "work" subdirectory under that directory with anything you want to keep. :)

Code:

rm -rf work
mkdir work
cd work
cp ../6011.tgz .
tar -xovzf 6011.tgz
tar -xovf 6100-progs.tar
patch <<EOD
*** attach.c    Sat Jun  1 09:06:29 2002
--- attach.c.1  Tue Dec  7 02:57:05 2010
***************
*** 1,8 ****
  #include <sys/ptrace.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>
- #include <linux/user.h>
  
  int main(int argc, char *argv[])
  {   pid_t traced_process;
--- 1,10 ----
  #include <sys/ptrace.h>
  #include <sys/types.h>
+ #include <sys/user.h>
  #include <sys/wait.h>
+ #include <stdio.h>
+ #include <stdlib.h>
  #include <unistd.h>
  
  int main(int argc, char *argv[])
  {   pid_t traced_process;
EOD
patch <<EOD
*** changesyscall.c     Tue Dec  7 03:28:14 2010
--- changesyscall.c.1   Tue Dec  7 03:49:35 2010
***************
*** 1,9 ****
  #include <sys/ptrace.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>
- #include <linux/user.h>
- #include <sys/syscall.h>
  
  const int long_size = sizeof(long);
  
--- 1,12 ----
  #include <sys/ptrace.h>
+ #include <sys/syscall.h>
  #include <sys/types.h>
+ #include <sys/user.h>
  #include <sys/wait.h>
+ #include <stddef.h>
+ #include <stdlib.h>
+ #include <string.h>
  #include <unistd.h>
  
  const int long_size = sizeof(long);
  
***************
*** 88,108 ****
              wait(&status);
              if(WIFEXITED(status))
                  break;
!             orig_eax = ptrace(PTRACE_PEEKUSER, child, 4 * ORIG_EAX, NULL);
             
              if(orig_eax == SYS_write) {
                  if(toggle == 0) {
                      toggle = 1;
  
!                     params[0] = ptrace(PTRACE_PEEKUSER, child, 4 * EBX, NULL);
!                     params[1] = ptrace(PTRACE_PEEKUSER, child, 4 * ECX, NULL);
!                     params[2] = ptrace(PTRACE_PEEKUSER, child, 4 * EDX, NULL);
  #if 0
                  printf("Write called with params %ld, %s, %ld\n", 
                          params[0], str, params[2]);
                  printf("Changing write params\n");
  #endif
!                     str = (char *)calloc((params[2] + 1) * sizeof(char));
                      getdata(child, params[1], str, params[2]);
                      reverse(str);
                      putdata(child, params[1], str, params[2]);
--- 91,111 ----
              wait(&status);
              if(WIFEXITED(status))
                  break;
!             orig_eax = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,orig_eax), NULL);
             
              if(orig_eax == SYS_write) {
                  if(toggle == 0) {
                      toggle = 1;
  
!                     params[0] = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,ebx), NULL);
!                     params[1] = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,ecx), NULL);
!                     params[2] = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,edx), NULL);
  #if 0
                  printf("Write called with params %ld, %s, %ld\n", 
                          params[0], str, params[2]);
                  printf("Changing write params\n");
  #endif
!                     str = (char *)calloc((params[2] + 1), sizeof(char));
                      getdata(child, params[1], str, params[2]);
                      reverse(str);
                      putdata(child, params[1], str, params[2]);
EOD
patch <<EOD
*** freespaceinject.c   Tue Dec  7 04:00:36 2010
--- freespaceinject.c.1 Tue Dec  7 04:02:19 2010
***************
*** 1,9 ****
  #include <sys/ptrace.h>
  #include <sys/types.h>
  #include <sys/wait.h>
- #include <unistd.h>
- #include <linux/user.h>
  #include <stdio.h>
  
  const int long_size = sizeof(long);
  
--- 1,11 ----
  #include <sys/ptrace.h>
  #include <sys/types.h>
+ #include <sys/user.h>
  #include <sys/wait.h>
  #include <stdio.h>
+ #include <stdlib.h>
+ #include <string.h>
+ #include <unistd.h>
  
  const int long_size = sizeof(long);
  
EOD
patch <<EOD
*** inject.c    Tue Dec  7 04:04:18 2010
--- inject.c.1  Tue Dec  7 04:05:03 2010
***************
*** 1,9 ****
  #include <sys/ptrace.h>
  #include <sys/types.h>
  #include <sys/wait.h>
- #include <unistd.h>
- #include <linux/user.h>
  #include <stdio.h>
  
  const int long_size = sizeof(long);
  
--- 1,11 ----
  #include <sys/ptrace.h>
  #include <sys/types.h>
+ #include <sys/user.h>
  #include <sys/wait.h>
  #include <stdio.h>
+ #include <stdlib.h>
+ #include <string.h>
+ #include <unistd.h>
  
  const int long_size = sizeof(long);
  
EOD
patch <<EOD
*** registers.c Tue Dec  7 04:06:44 2010
--- registers.c.1       Tue Dec  7 04:10:25 2010
***************
*** 1,9 ****
  #include <sys/ptrace.h> 
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>
- #include <linux/user.h>     /* For constants ORI_EAX etc */
- #include <sys/syscall.h>    /* For SYS_write etc */
  
  int main()
  {   pid_t child;
--- 1,11 ----
  #include <sys/ptrace.h> 
+ #include <sys/syscall.h>    /* For SYS_write etc */
  #include <sys/types.h>
+ #include <sys/user.h>     /* For constants ORI_EAX etc */
  #include <sys/wait.h>
+ #include <stddef.h>
+ #include <stdio.h>
  #include <unistd.h>
  
  int main()
  {   pid_t child;
***************
*** 23,29 ****
              wait(&status);
              if(WIFEXITED(status))
                  break;
!             orig_eax = ptrace(PTRACE_PEEKUSER, child, 4 * ORIG_EAX, NULL);
              if(orig_eax == SYS_write) {
                  if(insyscall == 0) {    /* Syscall entry */
                      insyscall = 1;
--- 25,31 ----
              wait(&status);
              if(WIFEXITED(status))
                  break;
!             orig_eax = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,orig_eax), NULL);
              if(orig_eax == SYS_write) {
                  if(insyscall == 0) {    /* Syscall entry */
                      insyscall = 1;
***************
*** 32,38 ****
                          regs.ebx, regs.ecx, regs.edx);
                  }
                  else { /* Syscall exit */ 
!                     eax = ptrace(PTRACE_PEEKUSER, child, 4 * EAX, NULL);
                      printf("Write returned with %ld\n", eax);
                      insyscall = 0;
                  }
--- 34,40 ----
                          regs.ebx, regs.ecx, regs.edx);
                  }
                  else { /* Syscall exit */ 
!                     eax = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,eax), NULL);
                      printf("Write returned with %ld\n", eax);
                      insyscall = 0;
                  }
EOD
patch <<EOD
*** simple.c    Tue Dec  7 04:13:03 2010
--- simple.c.1  Tue Dec  7 04:15:42 2010
***************
*** 1,8 ****
  #include <sys/ptrace.h> 
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>
- #include <linux/user.h>   /* For constants ORI_EAX etc */
  
  int main()
  {   pid_t child;
--- 1,10 ----
  #include <sys/ptrace.h> 
  #include <sys/types.h>
+ #include <sys/user.h>   /* For constants ORI_EAX etc */
  #include <sys/wait.h>
+ #include <stddef.h>
+ #include <stdio.h>
  #include <unistd.h>
  
  int main()
  {   pid_t child;
***************
*** 15,21 ****
      }
      else {
          wait(NULL);
!         orig_eax = ptrace(PTRACE_PEEKUSER, child, 4 * ORIG_EAX, NULL);
          printf("The child made a system call %ld\n", orig_eax);
          ptrace(PTRACE_CONT, child, NULL, NULL);
      }
--- 17,23 ----
      }
      else {
          wait(NULL);
!         orig_eax = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,orig_eax), NULL);
          printf("The child made a system call %ld\n", orig_eax);
          ptrace(PTRACE_CONT, child, NULL, NULL);
      }
EOD
patch <<EOD
*** singlestep.c        Tue Dec  7 04:17:01 2010
--- singlestep.c.1      Tue Dec  7 04:17:52 2010
***************
*** 1,9 ****
  #include <sys/ptrace.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>
- #include <linux/user.h>
- #include <sys/syscall.h>
  
  int main()
  {   pid_t child;
--- 1,10 ----
  #include <sys/ptrace.h>
+ #include <sys/syscall.h>
  #include <sys/types.h>
+ #include <sys/user.h>
  #include <sys/wait.h>
+ #include <stdio.h>
  #include <unistd.h>
  
  int main()
  {   pid_t child;
EOD
patch <<EOD
*** syscallparams.c     Tue Dec  7 04:25:43 2010
--- syscallparams.c.1   Tue Dec  7 04:27:27 2010
***************
*** 1,9 ****
  #include <sys/ptrace.h> 
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>
- #include <linux/user.h>     /* For constants ORI_EAX etc */
- #include <sys/syscall.h>    /* For SYS_write etc */
  
  int main()
  {   pid_t child;
--- 1,11 ----
  #include <sys/ptrace.h> 
+ #include <sys/syscall.h>    /* For SYS_write etc */
  #include <sys/types.h>
+ #include <sys/user.h>     /* For constants ORI_EAX etc */
  #include <sys/wait.h>
+ #include <stddef.h>
+ #include <stdio.h>
  #include <unistd.h>
  
  int main()
  {   pid_t child;
***************
*** 22,40 ****
              wait(&status);
              if(WIFEXITED(status))
                  break;
!             orig_eax = ptrace(PTRACE_PEEKUSER, child, 4 * ORIG_EAX, NULL);
              if(orig_eax == SYS_write) {
                  if(insyscall == 0) {    /* Syscall entry */
                      insyscall = 1;
!                     params[0] = ptrace(PTRACE_PEEKUSER, child, 4 * EBX, NULL);
!                     params[1] = ptrace(PTRACE_PEEKUSER, child, 4 * ECX, NULL);
!                     params[2] = ptrace(PTRACE_PEEKUSER, child, 4 * EDX, NULL);
  
                      printf("Write called with %ld, %ld, %ld\n", 
                          params[0], params[1], params[2]);
                  }
                  else { /* Syscall exit */ 
!                     eax = ptrace(PTRACE_PEEKUSER, child, 4 * EAX, NULL);
                      printf("Write returned with %ld\n", eax);
                      insyscall = 0;
                  }
--- 24,42 ----
              wait(&status);
              if(WIFEXITED(status))
                  break;
!             orig_eax = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,orig_eax), NULL);
              if(orig_eax == SYS_write) {
                  if(insyscall == 0) {    /* Syscall entry */
                      insyscall = 1;
!                     params[0] = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,ebx), NULL);
!                     params[1] = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,ecx), NULL);
!                     params[2] = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,edx), NULL);
  
                      printf("Write called with %ld, %ld, %ld\n", 
                          params[0], params[1], params[2]);
                  }
                  else { /* Syscall exit */ 
!                     eax = ptrace(PTRACE_PEEKUSER, child, offsetof(struct user_regs_struct,eax), NULL);
                      printf("Write returned with %ld\n", eax);
                      insyscall = 0;
                  }
EOD
patch <<EOD
*** dummy2.c    Tue Dec  7 04:28:32 2010
--- dummy2.c.1  Tue Dec  7 04:29:18 2010
***************
*** 1,3 ****
--- 1,5 ----
+ #include <stdio.h>
+ 
  int main() 
  {   int i;
      for(i = 0;i < 10; ++i) {
EOD
patch <<EOD
*** breakpoint.c        Tue Dec  7 04:59:06 2010
--- breakpoint.c.1      Tue Dec  7 05:04:24 2010
***************
*** 1,8 ****
  #include <sys/ptrace.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>
- #include <linux/user.h>
  
  const int long_size = sizeof(long);
  
--- 1,11 ----
  #include <sys/ptrace.h>
  #include <sys/types.h>
+ #include <sys/user.h>
  #include <sys/wait.h>
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <string.h>
  #include <unistd.h>
  
  const int long_size = sizeof(long);
  
***************
*** 92,97 ****
--- 95,101 ----
  
      /* Setting the eip back to the original instruction to let */
      /* the process continue */
+     regs.eax=-1;
      ptrace(PTRACE_SETREGS, traced_process, NULL, &regs);
     
      ptrace(PTRACE_DETACH, traced_process, NULL, NULL);
EOD
make
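
The recurring change in the patches above replaces `4 * ORIG_EAX` with `offsetof(struct user_regs_struct, orig_eax)` as the offset handed to PTRACE_PEEKUSER. The sketch below shows the same idea in isolation. The i386 branch mirrors the patches; the x86-64 branch (field `orig_rax`) is an addition that is not part of the script, and the function name `syscall_number_offset` is made up for this example.

```c
#include <stddef.h>
#include <sys/user.h>

/*
 * Byte offset, within the traced process's USER area, of the slot that
 * holds the system call number. Returns -1 on architectures this
 * sketch does not know about.
 */
long syscall_number_offset(void)
{
#if defined(__x86_64__)
    /* Assumption beyond the script: 64-bit x86 names the field orig_rax. */
    return (long)offsetof(struct user_regs_struct, orig_rax);
#elif defined(__i386__)
    /* Same value as the old 4 * ORIG_EAX on 32-bit x86. */
    return (long)offsetof(struct user_regs_struct, orig_eax);
#else
    return -1;
#endif
}
```

The result would be used as `ptrace(PTRACE_PEEKUSER, child, syscall_number_offset(), NULL)`. Letting the compiler compute the offset avoids hard-coding both the register index and the word size.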

gnashley · 12-07-2010, 10:59 AM

I was trying to remember the name of this the other day and couldn't, but today it came back: trackfs
http://www.mr511.de/software/english.html

"trackfs runs the child program(s) with tracing enabled and tracks the system calls" (using ptrace):
http://www.softpile.com/linux/trackfs.html

Might save you some time or give you some ideas.

wje_lq · 12-07-2010, 02:18 PM

I'm not sure what this does that strace does not. It's good to know that it's out there, however. Thank you.

wje_lq · 12-07-2010, 07:19 PM

For anyone interested in the source code tarball for the LJ articles, I've found a bug in the author's putdata() function. Since I've fixed that, the breakpoint program now works. The script for compiling everything, complete with this bug fix in the four programs which contain function putdata(), is here. It's now too long to just copy and paste into this post. I plan to keep it there at least through the rest of 2010.

The reason I came across the bug was this. I was going to use his getdata() and putdata() functions in my code, pretty much unmodified. But if you're familiar with ptrace(), you know that PTRACE_PEEKDATA and PTRACE_POKEDATA work with one word of data at a time (four bytes on 32-bit x86). This means that if you want to use putdata() to change, oh, say, one byte, or two, or three, then putdata() had better contain not only PTRACE_POKEDATA, but also PTRACE_PEEKDATA, to get the full four bytes, right? (And that's just the simple case.) But no. putdata() doesn't (or didn't) cause any PTRACE_PEEKDATA to happen.
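
The read-modify-write idea described above can be sketched roughly as follows. This is not the fixed putdata() from the script (that code is not shown in this thread); the names `merge_partial_word` and `poke_bytes` are made up for the example, and the word size is taken as `sizeof(long)` rather than hard-coded to four.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Overlay the first n bytes of src on old_word, keeping the rest. */
long merge_partial_word(long old_word, const void *src, size_t n)
{
    if (n > sizeof old_word)
        n = sizeof old_word;
    memcpy(&old_word, src, n);
    return old_word;
}

/*
 * Copy len bytes from src into the traced process at addr.
 * Full words are poked directly; a trailing partial word is first
 * peeked so that the bytes beyond len are written back unchanged.
 * Returns 0 on success, -1 on a ptrace error.
 */
int poke_bytes(pid_t pid, char *addr, const char *src, size_t len)
{
    while (len > 0) {
        size_t n = len < sizeof(long) ? len : sizeof(long);
        long word = 0;

        if (n < sizeof(long)) {
            /* PEEKDATA can legitimately return -1, so check errno. */
            errno = 0;
            word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
            if (word == -1 && errno != 0)
                return -1;
        }
        word = merge_partial_word(word, src, n);
        if (ptrace(PTRACE_POKEDATA, pid, addr, (void *)word) == -1)
            return -1;

        addr += n;
        src += n;
        len -= n;
    }
    return 0;
}
```

Without the peek, the bytes after the ones you meant to change get overwritten with whatever happened to be in the local word, which is the kind of damage that only shows up when the length is not a multiple of the word size.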