LinuxQuestions.org

ASM: Intercepting (and using) Errors (like segfaults) in asm (aka SEH)

Posted 02-03-2012 at 04:10 PM by rainbowsally

This is for i386 and x86_64 CPUs, but with a bit of poking around you can probably find a similar way to do this on other CPUs.

Note: This asm requires 'sigaction'. (See the test code. Bear in mind that this is not an attempt to create a full-blown system of signal handlers. Check the libc docs for 'sigaction' for ideas on how you can use something like this as a seed for a more extensive application.)

There used to be a lot of cool little programs for the terminal that did all kinds of fun stuff. There were BASICs, FORTHs, etc., but nowadays it seems we have confused the GUI with the nuts and bolts of the underlying programs. In essence, a 'fast' GUI does nothing more than type faster than a slow one. And though the simple programs would have looked nicer with a nice GUI, choosing which GUI to use is one problem that may have caused them to disappear, and not having a great system for catching errors (structured error handling) may have been another.

Let's look at the structured error handling (SEH) problem in asm and C, since the GUI issue will continue to be one of preferences and a tendency toward idiosyncrasy in Linux programs at any level above the kernel.

SEH.

First of all, we need to know where the CPU's saved register state (called the "context" in Windows) is stored when we catch a signal like "SIGSEGV" (11), which is the example we'll use here.

If you've ever tried to find that in 32- or 64-bit asm, this is probably all you need.

That struct pointer is on the stack, and you get to it like this (this might also be where the context is for other CPUs, but I haven't tried it):

For 32 bits.
Code:
    //  int* pi = &signum;
    //  int z = pi[2]; 
    //  z += 20;
    //  sig_context* ps = (void*)z;

    // translation: get parameter 3.
    // set sig_context pointer to param3 + 5 stack_cells

    asm(
        "   lea    8(%ebp),%eax;"         // int *eax = &signum
        "   mov    8(%eax),%eax;"         // signum, p2, pointers
        "   lea    20(%eax),%eax;"        // pointers+20 -> psig_context
        "   mov    %eax,psig_context;"    // and now we're set to go...
       );

For 64 bits.
Code:
  // Since the first 6 parameters are held in registers on an x86_64 CPU,
  // taking the address of the first parameter to reach the others can't be done.
    //  int* pi = &signum; <-- will not work in 64 bits
    //  int z = pi[2];     <-- can't be done because of prob above
    //  z += 20; // 5 cells
    //  sig_context* ps = (void*)z;
  
    // translation: get parameter 3.
    // set sig_context pointer to param3 + 5 stack_cells
    // For 64 bits we do it this way...
  
  // so we need another way to get the third parameter, and for this 
  // we'll use a C-callable asm routine called get_rdx() which just
  // returns rdx in rax.
  
    void**p;
    
    p = (void**)get_rdx();
    p+=5;
    psig_context = (sig_context*)p;
The above are written in C but shouldn't be too hard to adapt to your preferred dialect of asm, and from here your code should start working.

But if you also need the 'picture' of the data for the cpu, here are the 32 and 64 bit versions of what we're calling the sig_context.


For 32 bits (longs here are dword size)
Code:
typedef struct _sig_context {
    // this comes in on the stack during an exception
    unsigned short gs, __gsh;
    unsigned short fs, __fsh;
    unsigned short es, __esh;
    unsigned short ds, __dsh;
    unsigned long edi;
    unsigned long esi;
    unsigned long ebp;
    unsigned long esp;
    unsigned long ebx;
    unsigned long edx;
    unsigned long ecx;
    unsigned long eax;
    unsigned long trapno;
    unsigned long err;
    unsigned long eip;
    unsigned short cs, __csh;
    unsigned long eflags;
    unsigned long esp_at_signal;
    unsigned short ss, __ssh;
    void* fpstate; // n/u for now
    unsigned long oldmask;
    unsigned long cr2;
    // there's more if you need it, but you will have to find the header 
    // in the kernel dev stuff if you want it
}sig_context;
For 64 bits (longs here are qword size).
Code:
typedef struct _sig_context {
  // this comes in on the stack during an exception
  unsigned long r8;
  unsigned long r9;
  unsigned long r10;
  unsigned long r11;
  unsigned long r12;
  unsigned long r13;
  unsigned long r14;
  unsigned long r15;
  unsigned long rdi;
  unsigned long rsi;
  unsigned long rbp;
  unsigned long rbx;
  unsigned long rdx;
  unsigned long rax;
  unsigned long rcx;
  unsigned long rsp;
  unsigned long rip;
  unsigned long efl;
  unsigned long csgsfs;   /* actually short cs, gs, fs, __pad0.  */
  unsigned long err;
  unsigned long trapno;
  unsigned long oldmask;
  unsigned long cr2;
  // there's more if you need it, but you will have to find the header 
  // in the kernel dev stuff if you want it
}sig_context;
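
Because 'unsigned long' changes size with the build mode, one way to make sure a hand-copied layout stays put is to spell it with fixed-width types and let the compiler check a few offsets. The sketch below does that for the 64-bit layout. sig_context64 and cell_index are hypothetical names for this example, and the cell numbers are simply counted from the struct above.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-width copy of the 64-bit layout: every cell is exactly 8 bytes,
   whatever mode this file is compiled in. */
typedef struct sig_context64 {
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;   /* cells 0..7   */
    uint64_t rdi, rsi, rbp, rbx, rdx, rax, rcx, rsp; /* cells 8..15  */
    uint64_t rip;                                    /* cell 16      */
    uint64_t efl;                                    /* cell 17      */
    uint64_t csgsfs;                                 /* cell 18      */
    uint64_t err, trapno, oldmask, cr2;              /* cells 19..22 */
} sig_context64;

/* Compile-time checks: the build fails if a field moves. */
_Static_assert(offsetof(sig_context64, rsp) == 15 * 8, "rsp should be cell 15");
_Static_assert(offsetof(sig_context64, rip) == 16 * 8, "rip should be cell 16");
_Static_assert(offsetof(sig_context64, cr2) == 22 * 8, "cr2 should be cell 22");

/* Convert a byte offset into a cell index, handy when writing the
   matching displacement by hand in asm. */
size_t cell_index(size_t byte_offset)
{
    return byte_offset / sizeof(uint64_t);
}
```

For example, cell_index(offsetof(sig_context64, rip)) gives the cell to multiply by 8 for the rip displacement in asm.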
Now for the fun part. Some test code.

Let's write a simple function to catch segment violations and return a flag indicating whether a memory location is read-only, read-write, or neither. And we'll use the 32-bit sig_context (compiled as 32-bit code) so it will work on either CPU type.

Note that the call will always return, so we don't crash or get the glibc dump. In fact the test itself will catch two crashes in order to give us the three printouts, only one of which is the result of a mem_test() call that doesn't crash.

No jumpbuf. Nothing up my sleeve.
file: src/main.c
Code:
// main.c for seh-test

#include <stdio.h>  // printf() 
#include <signal.h> // sigaction()

void dbg(){}

void exception_dispatcher(int signum);

static int mem_test_var;

static struct sigaction new_action;

void init_sighandler(void (*fn)(int))
{
  new_action.sa_handler = fn;
  sigemptyset (&new_action.sa_mask);
  new_action.sa_flags = SA_SIGINFO;
  // we only need to catch segfault, but let's anticipate
  // playing with any signal from 1 through 15 (SIGKILL, 9,
  // can't be caught, so that one sigaction() call just fails).
  int i;
  for( i = 1; i < 16; i++ )
  {
    sigaction(i, &new_action, 0);
  }    
}


// sets mem_test_var = 1 if no access, 2 if read only, 3 if read-write.
void mem_test(void* address);
void print_result(const char* name);

int main(int argc, char** argv)
{
  dbg();
  init_sighandler(exception_dispatcher);
  
  const char* ro_test = "some text";   // read only
  char rw_test[4];                     // readwrite
  char* none_test = 0;                 // inaccessible
  
  mem_test((void*)ro_test);
  print_result("ro_test");
  
  mem_test((void*) rw_test);
  print_result("rw_test");
  
  mem_test((void*) none_test);
  print_result("none_test");
  
  return 0;
}

typedef struct _sig_context {
    // this comes in on the stack during an exception
  unsigned short gs, __gsh;
  unsigned short fs, __fsh;
  unsigned short es, __esh;
  unsigned short ds, __dsh;
  unsigned long edi;
  unsigned long esi;
  unsigned long ebp;
  unsigned long esp;
  unsigned long ebx;
  unsigned long edx;
  unsigned long ecx;
  unsigned long eax;
  unsigned long trapno;
  unsigned long err;
  unsigned long eip;
  unsigned short cs, __csh;
  unsigned long eflags;
  unsigned long esp_at_signal;
  unsigned short ss, __ssh;
  void* fpstate; // n/u for now
  unsigned long oldmask;
  unsigned long cr2;
    // there's more if you need it, but you will have to find the header 
    // in the kernel dev stuff if you want it
}sig_context;


static sig_context* psig_context; // a pointer we can get at in asm

void asm_section() // creates a place for asm function(s) in a C file
{
  asm(
      // the asm declaration, code section, local (static)
      ".text;\n"
      ".align 4;\n"
      ".local mem_test;\n"
      ".type mem_test, @function;\n"
      "mem_test:\n"                 // (that's a colon)
      "movl $0, %eax;\n"            // clear and get const 0
      "movl %eax, mem_test_var;\n"  // init test type = 0
      "movl 4(%esp), %edx;\n"       // get address
      "incl mem_test_var;\n"        // crash on read = 1
      "movb (%edx), %al;\n"         // read one byte
      "incl mem_test_var;\n"        // crash on write = 2
      "movb %al, (%edx);\n"         // write it back 
      "incl mem_test_var;\n"        // read/write ok = 3
      "ret;\n"
     );
}

void exception_dispatcher(int signum)
{
  // TODO: filter and handle signals by signum.
  
  // This function will always have the stack frame 
  // so if it's written in asm, include that in your
  // code.
  
  asm(
      "   lea    8(%ebp),%eax;"         // int *eax = &signum
      "   mov    8(%eax),%eax;"         // signum, p2, pointers
      "   lea    20(%eax),%eax;"        // pointers+20 -> psig_context
      "   mov    %eax,psig_context;"    // and now we're set to go...
     );
  
  // Model the cpu doing a 'ret' opcode in C.  
  // See the disassembly if you aren't familiar with 
  // how C deals with operand sizes.
  
  void** esp = (void**)psig_context->esp; // stack ptr
  void* eip = *esp++;                     // pop eip
  
  // and update the context from our modeled registers
  psig_context->esp = (long)esp;
  psig_context->eip = (long)eip;  
}

void print_result(const char* name)
{
  const char* result;
  switch(mem_test_var)
  {
    case 1:
      result = "inaccessible";
      break;
    case 2:
      result = "read only";
      break;
    case 3:
      result = "read write";
      break;
    default:
      result = "unknown";
  }
  printf("%s:\tmemory is %s\n", name, result);
}
To test this with a makefile generated by our makefile-creator, make sure the source file is in a subdirectory named src.
Code:
makefile-creator seh-test # straight C, output name is seh-test
make clean; make          # generate the app
seh-test                  # and run it
To add debug info change the '-O2' flag to '-g3' in the Makefile and repeat with a debugger.

Here's the printout from when I did this.
Code:
$> seh-test
ro_test:        memory is read only
rw_test:        memory is read write
none_test:      memory is inaccessible
Obviously the first and last tests segfaulted, but all three ran and main() returned normally.
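
The reason every mem_test() call returns is the last few lines of exception_dispatcher(): the fault happens inside mem_test before it has pushed anything, so the saved stack pointer still points at the return address, and the handler pops that into eip just as a 'ret' would. Here is a minimal sketch of that one step on a made-up context (fake_context and model_ret are hypothetical names, and the "stack" is just an array), so it can be tried without crashing anything:

```c
#include <stdint.h>

/* Stand-in for the two context fields the handler touches. */
typedef struct fake_context {
    uintptr_t sp;   /* saved stack pointer (esp in the 32-bit struct)       */
    uintptr_t ip;   /* saved instruction pointer (eip in the 32-bit struct) */
} fake_context;

/* Do to the saved context what a 'ret' opcode would do to the CPU:
   load ip from the top of the stack, then move sp up one cell. */
void model_ret(fake_context *ctx)
{
    uintptr_t *sp = (uintptr_t *)ctx->sp;  /* view the stack as cells  */
    ctx->ip = *sp++;                       /* pop the return address   */
    ctx->sp = (uintptr_t)sp;               /* write back the new top   */
}
```

When the signal handler returns, the kernel reloads the registers from the (now edited) context, so execution continues in the caller right after the call to mem_test().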

And there you have it. At least part of it. How you actually implement structured error handling in nested calls, save and restore sigaction structs, implement priorities, or handle code where these things may go into a loop is up to you.

And yes, of course it can be used in GUI apps. But I don't know of many asm coders writing GUI apps. (yet.)

:-)
All times are GMT -5.
