Buffer overflow in Linux

jcafaro10 · 04-06-2009, 11:36 AM

I'm trying to learn about buffer overflow for class from the tutorial located here: http://insecure.org/stf/smashstack.html
but I'm having some trouble getting any of the sample code to work. I'm programming in Ubuntu and it just seems like everything in memory is in the wrong place.

This is the code I want to get working:

Code:

void function(int a,int b, int c)
{
	char buffer1[5];
	int *ret;
	ret = buffer1+12;
	(*ret) += 8;
}
 
void main()
{ 
	int x;
	x=0;
	function(1,2,3);
	x=1;
	printf("%d\n",x);
}

The problem is that I'm not sure how memory in Linux works. Is it word addressable? Is it byte addressable? If it's word addressable, then buffer1+12 should take me past buffer (8 bytes) and past the sfp (4 bytes) to the return address. However, this doesn't seem to be the case.

David1357 · 04-06-2009, 01:36 PM

Quote:

Originally Posted by jcafaro10

The problem is that, I'm not sure how memory in linux works...is it word addressable? Is it byte addressable?

No, your problem is that you don't realize that the compiler hides the addressing from you. Even if the target architecture only supports word access, the compiler can generate instructions that effectively give you byte access.

For example, if you wanted to change a byte in a word, the compiler could read the word into a register, use an AND operation with a mask to keep the bits that will not change, OR your desired value into the bits that the mask cleared, and save the result back to memory. In assembly, it might look something like this:

Code:

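    ldr r0, [r1]                @ load the whole word that contains the byte
    and r0, r0, #0xFFFFFF00     @ keep the upper 24 bits, clear the low byte
    and r2, r2, #0xFF           @ keep only the low byte of the new value
    orr r0, r0, r2              @ merge the new byte into the word
    str r0, [r1]                @ store the result back to memory

Now that is ARM assembly syntax (with the word's address in r1 and the new byte value in r2), but it should illustrate the principle.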

Automating that process was one of the early motivations for the development of compilers.

Quote:

Originally Posted by jcafaro10

If it's word addressable then buffer1+12 should take me past buffer (8 bytes) and pass the sfp (4 bytes) to the return address. However this doesn't seem to be the case.

Nope, it's not the case. buffer1 is a byte pointer, so you are getting the address of buffer1 plus 12 bytes. If you want the address of buffer1 plus 12 machine words, you need to use one of the following:

Code:

// Add 12 machine words to buffer1 the easy way
ret = ((int *) buffer1) + 12;
// Add 12 machine words to buffer1 the other easy way
ret = (int *)(buffer1 + (12 * sizeof(int)));
// Add 12 machine words to buffer1 the easy to read way
ret = (int *)&buffer1[12 * sizeof(int)];
// Add 12 machine words to buffer1 the slightly less easy to read way
ret = &((int *)buffer1)[12];

Figuring out the actual location of the return address on the fly may be harder than you think. Even if you figure it out by looking at the generated assembly, the next time you compile your code, you may turn on some option that causes the compiler to optimize away some variable or keep the return address in a register. The only way to guarantee the results you want is to write your own assembly language.

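Also, if your target is x86, memory is byte addressable and the processor can read or write bytes, words, and double words, so the compiler has many ways to lay out and access the stack. That makes your calculations even harder.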

bgeddy · 04-06-2009, 02:13 PM

As David1357 said, the addressing is handled by the compiler and depends on what you are pointing at. To help answer your questions, why don't you just print some debugging values from your code, something like this:

Code:

#include <stdio.h>
#include <stddef.h>

void function(int a, int b, int c)
{
    char buffer1[5];
    char *ret;
    ptrdiff_t diff;                  /* a pointer difference is not a pointer */
    ret = buffer1 + 8;
    printf("\nbuffer is :%p", (void *)buffer1);   /* %p is the format for pointers */
    printf("\nret is :%p", (void *)ret);
    diff = ret - buffer1;
    printf("\nDifference in decimal :%td", diff);
//    (*ret) += 8;
}

int main(void)
{
    int x;
    x = 0;
    function(1, 2, 3);
    x = 1;
    printf("\n%d\n", x);
    return 0;
}

Printing out values is a common debugging technique and can help figure things out. How useful this is for what you are trying is doubtful, but it may give you some background. In any case, I do hope you are not trying to write cracking code!

jcafaro10 · 04-06-2009, 11:12 PM

In Ubuntu I got it to work by doing this:

Code:

void function(int a,int b,int c)
{
   char buffer[10];
   int *ret;
   ret = &buffer[18];
   (*ret)+=7;
}

void main()
{
  int x;
  x = 4;
  function(1,2,3);
  x = 1;
  printf("%d\n",x);
}

When I looked in the disassembly it showed me that buffer was 14 bytes away from the stack pointer, so by adding 14 bytes, I would be at the stack pointer and by accessing the 18th byte I'm at the return address. Then I change the return address by 7 bytes because that gets me to skip over the correct instruction. The thing I'm still a little confused on is if ret is a pointer to an int, then if I increment ret by 7, won't it try to do 7 words, not 7 bytes? Or did ret change when I set it equal to the address of a byte of the array?

scoban · 04-07-2009, 06:32 AM

Try this article:
http://www.inundation.org/learn/shellcode/

bgeddy · 04-07-2009, 07:54 AM

Quote:

The thing I'm still a little confused on is if ret is a pointer to an int, then if I increment ret by 7, won't it try to do 7 words, not 7 bytes? Or did ret change when I set it equal to the address of a byte of the array?

I don't see your confusion here. ret points to an integer, and when you add seven to an integer it increases by seven. You are increasing what ret points to, not ret itself.

David1357 · 04-07-2009, 08:47 AM

Quote:

Originally Posted by jcafaro10

In Ubuntu I got it to work by doing this...

Maybe this will help:

Code:

   (*ret)+=7;  // Adds 7 to the value pointed to by ret
   ret += 7;   // Adds 7 * sizeof(int) = 28 to ret

Quote:

Originally Posted by jcafaro10

The thing I'm still a little confused on is if ret is a pointer to an int, then if I increment ret by 7, won't it try to do 7 words, not 7 bytes?

You added 7 to the address stored in the memory location pointed to by ret. Your confusion comes from a lack of knowledge of pointers. Keep experimenting and asking questions. One day, the light will turn on.

jcafaro10 · 04-07-2009, 10:19 AM

That's a big help! Thanks that cleared up a lot for me.

One thing that's still a little troublesome though is that depending on which Linux distro I'm using, I have different results. In Ubuntu without messing with any of the settings I can do this buffer exploit. In (what I presume to be) RedHat (which I'm remotely logging into) I get a segfault when I try it. I suppose for this project, I better figure out what distro I'm working with.

Sergei Steshenko · 04-07-2009, 10:28 AM

Quote:

Originally Posted by jcafaro10

That's a big help! Thanks that cleared up a lot for me.

One thing that's still a little troublesome though is that depending on which Linux distro I'm using, I have different results. In Ubuntu without messing with any of the settings I can do this buffer exploit. In (what I presume to be) RedHat (which I'm remotely logging into) I get a segfault when I try it. I suppose for this project, I better figure out what distro I'm working with.

Regarding distros - I think you are looking at this from a wrong angle.

There is a thing called alignment. In practical terms this means there can be pieces of unused memory between variables/buffers.

So, as long as you haven't overwritten the next buffer/variable, you technically (kind of) don't yet have a buffer overflow.

Of course, your code is still wrong.

The difference between distros is most likely due to different gcc + linker versions, i.e. different default alignments.

David1357 · 04-07-2009, 11:16 AM

Quote:

Originally Posted by jcafaro10

That's a big help! Thanks that cleared up a lot for me.

You're welcome. Hit that big "thumbs up" button to "thank" me.

Quote:

Originally Posted by jcafaro10

One thing that's still a little troublesome though is that depending on which Linux distro I'm using, I have different results.

This is not a distribution problem. It is a compiler problem. It is very hard to predict what code the compiler will generate. You end up having to look at the assembly it produces, and even that will change if the compiler version changes.

If you run "gcc --version" on each target, what does it show? Also, what default flags are being passed to each compiler? There are a lot of variables here. You might get better feedback if you told us exactly what you are trying. However, if what you are trying is to hack a remote box, you will get no further help.

jcafaro10 · 04-07-2009, 11:17 AM

That makes more sense actually. I guess I have to find out the exact gcc+linker version.

Why is my code wrong?

Sergei Steshenko · 04-07-2009, 11:21 AM

Quote:

Originally Posted by jcafaro10

That makes more sense actually. I guess I have to find out the exact gcc+linker version.

Why is my code wrong?

Well, I think you meant your code crosses buffer boundaries. If I'm wrong, then your code is not wrong WRT buffer overflow.

David1357 · 04-07-2009, 11:42 AM

Quote:

Originally Posted by Sergei Steshenko

Well, I think you meant your code crosses buffer boundaries. If I'm wrong, then your code is not wrong WRT buffer overflow.

The more I think about some of the things he has been saying about his local and remote machines and the more I look at his code, the more I think he has the source code for an application running on a remote machine and is trying to build a string that the application will load, overflow its buffer, smash its stack, and replace the return address with his desired return address.

Basically, this stinks like hacking, since he doesn't even know what distro is running on the remote machine. Anyone familiar with RedHat knows how to find the distro version, so that makes it stink even more.

David1357 · 04-07-2009, 11:49 AM

OK. I read some of the text at the link from the original post. This has me really wound up:

"How can we place arbitrary instruction into its address space? The answer is to place the code with are trying to execute in the buffer we are overflowing, and overwrite the return address so it points back into the buffer."

From the section titled "Shell Code".

jcafaro10 · 04-07-2009, 11:54 AM

That's exactly what I'm trying to do!

Except it's not intended to be actually malicious. Our final project for our CS class is to teach us about buffers/socket programming. Our professor set up a vulnerable virtual machine that we have to "hack" into. We have the source code for the server and we have to find a way to take advantage of it.