Programming: This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
I'm trying to learn about buffer overflow for class from the tutorial located here: http://insecure.org/stf/smashstack.html
but I'm having some trouble getting any of the sample code to work. I'm programming in Ubuntu and it just seems like everything in memory is in the wrong place.
This is the code I want to get working:
Code:
void function(int a,int b, int c)
{
char buffer1[5];
int *ret;
ret = buffer1+12;
(*ret) += 8;
}
void main()
{
int x;
x=0;
function(1,2,3);
x=1;
printf("%d\n",x);
}
The problem is that I'm not sure how memory in Linux works. Is it word addressable? Is it byte addressable? If it's word addressable, then buffer1+12 should take me past buffer (8 bytes) and past the sfp (4 bytes) to the return address. However, this doesn't seem to be the case.
Quote:
Originally Posted by jcafaro10
The problem is that, I'm not sure how memory in linux works...is it word addressable? Is it byte addressable?
No, your problem is that you don't realize that the compiler hides the addressing from you. Even if the target architecture only supports word access, the compiler can generate instructions that effectively give you byte access.
For example, if you wanted to change a byte in a word, the compiler could read the word into a register, use an AND operation to save the bits that will not change, OR your desired value with the bits the mask set to zero, and save the result back to memory. In assembly, it might look something like this:
Code:
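ldr r0, [r1]              ; load the whole word from memory
and r0, r0, #0xFFFFFF00   ; keep the bits that will not change
ldr r2, [r1, #4]          ; load the new value into a separate register
and r2, r2, #0xFF         ; keep only its low byte
orr r0, r0, r2            ; merge the new byte into the word
str r0, [r1]              ; store the result back to memory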
Now that is ARM assembly syntax, but it should illustrate the principle.
Automating that process was one of the early motivations for the development of compilers.
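For anyone who reads C more easily than assembly, here is a minimal sketch of the same mask-and-merge idea. The function name set_low_byte is made up for illustration, and it assumes a 32-bit word where the byte to change is the low one.

```c
#include <stdint.h>

/* Hypothetical helper: replace the low byte of a 32-bit word using
 * only whole-word operations, the way a compiler might on a target
 * that has no byte stores. */
uint32_t set_low_byte(uint32_t word, uint8_t value)
{
    uint32_t kept = word & 0xFFFFFF00u;  /* AND: keep the three high bytes */
    return kept | (uint32_t)value;       /* OR: merge in the new low byte  */
}
```

Calling set_low_byte(0x12345678, 0xAB) gives 0x123456AB: only the low byte changes.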
Quote:
Originally Posted by jcafaro10
If it's word addressable then buffer1+12 should take me past buffer (8 bytes) and pass the sfp (4 bytes) to the return address. However this doesn't seem to be the case.
Nope, it's not the case. buffer1 is a byte pointer, so you are getting the address of buffer1 plus 12 bytes. If you want the address of buffer1 plus 12 machine words, you need to use one of the following:
Code:
// Add 12 machine words to buffer1 the easy way
ret = ((int *) buffer1) + 12;
// Add 12 machine words to buffer1 the other easy way
ret = (int *) (buffer1 + (12 * sizeof(int)));
// Add 12 machine words to buffer1 the easy to read way
ret = (int *) &buffer1[12 * sizeof(int)];
// Add 12 machine words to buffer1 the slightly less easy to read way
// (the & matters: without it you read the int instead of taking its address)
ret = &((int *) buffer1)[12];
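If you want to see the scaling without touching the stack, here is a small sketch that stays inside a normal array. The names step_as_char and step_as_int are made up for this example, and the array is deliberately large enough that adding 12 stays in bounds.

```c
#include <stddef.h>

/* Bytes advanced when 12 is added to a char pointer. */
size_t step_as_char(void)
{
    static int storage[16];        /* big enough that +12 stays in bounds */
    char *base = (char *)storage;
    char *p = base + 12;           /* char arithmetic moves 12 bytes */
    return (size_t)(p - base);
}

/* Bytes advanced when 12 is added to an int pointer. */
size_t step_as_int(void)
{
    static int storage[16];
    int *base = storage;
    int *p = base + 12;            /* int arithmetic moves 12 ints */
    return (size_t)((char *)p - (char *)base);
}
```

step_as_char() returns 12 and step_as_int() returns 12 * sizeof(int), which is 48 on a platform with 4-byte ints.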
Figuring out the actual location of the return address on the fly may be harder than you think. Even if you figure it out by looking at the generated assembly, the next time you compile your code, you may turn on some option that causes the compiler to optimize away some variable or keep the return address in a register. The only way to guarantee the results you want is to write your own assembly language.
Also, if your target is x86, it is byte-addressable and supports a wide range of addressing modes. That makes your calculations even harder.
Last edited by David1357; 04-06-2009 at 01:39 PM.
Reason: Added note about x86 supporting all types of addressing.
As David1357 said, the addressing is handled by the compiler and depends on what you are pointing at. To help answer your questions, why don't you print some debugging values from your code, something like this:
Code:
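#include <stdio.h>
#include <stddef.h>

void function(int a, int b, int c)
{
    char buffer1[5];
    char *ret;
    ptrdiff_t diff;

    /* Past the end of buffer1; computed for display only, never dereferenced */
    ret = buffer1 + 8;
    /* %p is the portable way to print a pointer */
    printf("\nbuffer is :%p", (void *)buffer1);
    printf("\nret is :%p", (void *)ret);
    /* Pointer subtraction yields a ptrdiff_t, not a pointer */
    diff = ret - buffer1;
    printf("\nDifference in decimal :%td", diff);
    // (*ret) += 8;
}

int main(void)
{
    int x;
    x = 0;
    function(1, 2, 3);
    x = 1;
    printf("\n%d\n", x);
    return 0;
}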
Printing out values is a common debugging technique and can help figure things out. How useful this is for what you are trying is doubtful, but it may give you some background. In any case, I do hope you are not trying to write cracking code!
Code:
void function(int a,int b,int c)
{
char buffer[10];
int *ret;
ret = &buffer[18];
(*ret)+=7;
}
void main()
{
int x;
x = 4;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
When I looked in the disassembly it showed me that buffer was 14 bytes away from the stack pointer, so by adding 14 bytes, I would be at the stack pointer and by accessing the 18th byte I'm at the return address. Then I change the return address by 7 bytes because that gets me to skip over the correct instruction. The thing I'm still a little confused on is if ret is a pointer to an int, then if I increment ret by 7, won't it try to do 7 words, not 7 bytes? Or did ret change when I set it equal to the address of a byte of the array.
Quote:
The thing I'm still a little confused on is if ret is a pointer to an int, then if I increment ret by 7, won't it try to do 7 words, not 7 bytes? Or did ret change when I set it equal to the address of a byte of the array.
I don't see your confusion here. ret points to an integer, and when you add seven to an integer it increases by seven. You are increasing what ret points to, not ret itself.
(*ret)+=7; // Adds 7 to the value pointed to by ret
ret += 7; // Adds 7 * sizeof(int) = 28 to ret
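Here is a minimal sketch of the difference between those two lines, using an ordinary array so nothing outside it is touched. The function names add_to_pointee and add_to_pointer are made up for the example.

```c
#include <stddef.h>

/* Returns the value in slot 0 after (*p) += 7. */
int add_to_pointee(void)
{
    int values[8] = {100, 0, 0, 0, 0, 0, 0, 0};
    int *p = values;
    (*p) += 7;          /* changes the int that p points to; p is unchanged */
    return values[0];
}

/* Returns how many bytes p moved after p += 7. */
size_t add_to_pointer(void)
{
    int values[8] = {0};
    int *p = values;
    p += 7;             /* moves p forward by 7 ints; still inside values[] */
    return (size_t)((char *)p - (char *)values);
}
```

add_to_pointee() returns 107, while add_to_pointer() returns 7 * sizeof(int), which is the 28 mentioned above when int is 4 bytes.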
Quote:
Originally Posted by jcafaro10
The thing I'm still a little confused on is if ret is a pointer to an int, then if I increment ret by 7, won't it try to do 7 words, not 7 bytes?
You added 7 to the address stored in the memory location pointed to by ret. Your confusion comes from a lack of knowledge of pointers. Keep experimenting and asking questions. One day, the light will turn on.
That's a big help! Thanks that cleared up a lot for me.
One thing that's still a little troublesome though is that depending on which Linux distro I'm using, I have different results. In Ubuntu, without messing with any of the settings, I can do this buffer exploit. In (what I presume to be) RedHat (which I'm remotely logging into) I get a segfault when I try it. I suppose for this project, I better figure out what distro I'm working with.
Regarding distros - I think you are looking at this from a wrong angle.
There is a thing called alignment. In practical terms this means there can be pieces of unused memory between variables/buffers.
So, as long as you haven't overwritten the next buffer or variable, you technically (kind of) don't yet have a buffer overflow.
Of course, your code is still wrong.
The difference between distros is most likely due to different gcc + linker versions, i.e. different default alignments.
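To make the alignment point concrete, here is a rough sketch that uses a struct as a stand-in. That is an assumption on my part: the compiler is free to lay out stack variables differently from struct members, so the numbers are only illustrative. The names struct example and padding_after_buffer are made up.

```c
#include <stddef.h>

#define BUFFER_LEN 5

/* Stand-in for "a 5-byte buffer followed by an int". */
struct example {
    char buffer[BUFFER_LEN];
    int  next;
};

/* Number of unused bytes the compiler placed between buffer and next. */
size_t padding_after_buffer(void)
{
    return offsetof(struct example, next) - BUFFER_LEN;
}
```

On a typical platform with 4-byte, 4-byte-aligned ints this comes out as 3, meaning bytes 5 through 7 belong to nothing and writing to them would not clobber next. A different compiler or ABI may give a different number, which is the point.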
Quote:
Originally Posted by jcafaro10
That's a big help! Thanks that cleared up a lot for me.
You're welcome. Hit that big "thumbs up" button to "thank" me.
Quote:
Originally Posted by jcafaro10
One thing thats still a little troublesome though is that depending on which linux distro I'm using, I have different results.
This is not a distribution problem. It is a compiler problem. It is very hard to predict what code the compiler will generate. You end up having to look at the assembly it produces, and even that will change if the compiler version changes.
If you run "gcc --version" on each target, what does it show? Also, what default flags are being passed to each compiler? There are a lot of variables here. You might get better feedback if you told us exactly what you are trying. However, if what you are trying is to hack a remote box, you will get no further help.
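Besides running gcc --version by hand, you can have the program report how it was built. This is only a sketch and it assumes a GCC-compatible compiler: __VERSION__ and the __SSP__ family are predefined macros there (the latter are defined when one of the -fstack-protector options is in effect), and other compilers may not define them. The function names are made up.

```c
#include <stdio.h>

/* Returns 1 if the compiler defined a stack-protector macro, else 0.
 * Assumption: GCC-style macros; absent macros are reported as "off". */
int stack_protector_enabled(void)
{
#if defined(__SSP__) || defined(__SSP_ALL__) || defined(__SSP_STRONG__)
    return 1;
#else
    return 0;
#endif
}

/* Prints the compiler version string and the stack-protector status. */
void report_build(void)
{
#ifdef __VERSION__
    printf("compiler: %s\n", __VERSION__);
#else
    printf("compiler: unknown\n");
#endif
    printf("stack protector: %s\n",
           stack_protector_enabled() ? "on" : "off");
}
```

Building the same file on both machines and comparing what report_build() prints is a quick way to spot a difference in compiler version or default flags.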
Well, I think you meant your code crosses buffer boundaries. If I'm wrong, then your code is not wrong WRT buffer overflow.
The more I think about some of the things he has been saying about his local and remote machines and the more I look at his code, the more I think he has the source code for an application running on a remote machine and is trying to build a string that the application will load, overflow its buffer, smash its stack, and replace the return address with his desired return address.
Basically, this stinks like hacking, since he doesn't even know what distro is running on the remote machine. Anyone familiar with RedHat knows how to find the distro version, so that makes it stink even more.
ok. I read some of the text at the link from the original post. This has me really wound up:
"How can we place arbitrary instruction into its address space? The answer is to place the code with are trying to execute in the buffer we are overflowing, and overwrite the return address so it points back into the buffer."
Except it's not intended to be actually malicious. Our final project for our CS class is to teach us about buffers/socket programming. Our professor set up a vulnerable virtual machine that we have to "hack" into. We have the source code for the server and we have to find a way to take advantage of it.