The pack function creates a "binary structure" using components indicated by the first parameter out of all the other parameters. In this case, the first param is 'l', therefore the value of $ret is used as a signed long int.
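To make the "binary structure" idea concrete, here is a minimal C sketch of what packing a 32-bit value into four bytes looks like on a little-endian machine such as IA32. The function name pack_le32 is made up for illustration; Perl's 'l' template uses the machine's native byte order, so this only mirrors it on little-endian hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): store a 32-bit value as
 * four bytes, least significant byte first, which is the order an
 * IA32 machine uses natively. */
void pack_le32(uint32_t value, unsigned char out[4])
{
    for (size_t i = 0; i < 4; i++) {
        /* shift the wanted byte down to the low 8 bits, then mask it */
        out[i] = (unsigned char)((value >> (8 * i)) & 0xFFu);
    }
}
```

For example, packing 0x11223344 this way yields the bytes 0x44, 0x33, 0x22, 0x11 in increasing address order.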
The %ENV global variable (a hash) holds information about your current environment. Hence, %ENV = () clears all environment variable information, and $ENV{CC} = $shellcode assigns the shellcode variable to the environment variable CC, which is typically used to name the C compiler.
The exec function replaces the current process image with another one (that of the first parameter) with the given arguments (the rest of the parameters). In this case, exec will run the program /home/san/exploit/vulnerable with eight arguments, each the value of new_retword.
Practically, I'm not sure what it does; maybe the vulnerable program has an exploitable buffer overflow that lets you reach environment variables? In any event, this Perl code is not bullet-proof: for example, the use of the CC bareword in the $ENV{CC} assignment is considered a bad idea because of ambiguity. (Is it a string or a function with no arguments?)
Quote:
for example, the use of the CC bareword in the $ENV{CC} assignment is considered a bad idea because of ambiguity. (Is it a string or a function with no arguments?)
So, Mr. taylor_venable, what would you suggest replacing 'CC' with?
That Perl code is demo code that exploits the following C program on 2.4.20-8:
/* vulnerable.c
 *
 *
 * authors: san, alert7, eyas, watercloud
 *
 * Vulnerable program on the IA32 architecture.
 */
#include <stdio.h>
#include <string.h>

int main (int argc, char *argv[])
{
    char vulnbuff[16];

    strcpy (vulnbuff, argv[1]);
    printf ("\n%s\n", vulnbuff);
    getchar(); /* for debug */
}
The value in the scalar $shellcode looks like a series of instructions: architecture-specific ones, since they are (probably) opcodes for IA32. It could be data, but from context it looks like opcodes.
Well, the shellcode string is 24 characters (bytes in C) and vulnbuff[] has room for 16 characters (again, 16 bytes), which means 8 bytes get written past the end of vulnbuff[]. That overflows the statically allocated storage (i.e. a "buffer overflow") and therefore overwrites 8 bytes of memory allocated below vulnbuff[].
On IA32, statically allocated memory is located just after the current stack pointer, which is four bytes (32 bits). Just below the stack pointer is a return address, which indicates to which instruction address control should jump after the current function finishes execution. Again, the address (being on a 32-bit machine) is 4 bytes, so we can presume that the last four bytes of the shellcode will be written to the return address variable.
That means that when your vulnerable program is finished executing, instead of exiting normally, it will jump to whatever address in memory is pointed to by the last four bytes of the $shellcode variable. Then execution will continue at that point.
Are you sure that this array will be allocated static before the return address? If overwriting is the issue, it must be overwriting add/data below it/etc. Otherwise, I would presume this compiler has a bad memory management design, as all local arrays would run the risk of overwriting return addresses, which seems possibly more severe than overwriting local data below itself (but eh, who's to say what is 'worse'; it would at least be easier to notice). I have taken a few courses lately that pertained to memory organization in various languages, and everything I can verify shows these as local variables on the stack. Even in Ada, an inner procedure's closure would be contained as local data on the stack.
I am not sure about specifics, however, since it is not clear to me how $shellcode is being passed as an argument to the C procedure. I assumed this is just changing the execution environment by modifying the CC env variable. Clearly, however, strcpy is a bad choice to fill vulnbuff[16], as there are no checks for overflow. As you mentioned, there will be N-16 bytes of overflow, where N is the length (in bytes) of argv[1]. But at what point does the env variable (CC) become a parameter to the C procedure? I am confused. Perhaps there is something about the calling convention that is unclear to me.
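As a rough sketch of that overflow arithmetic, the following C function computes how many bytes strcpy would write past a destination of a given size. The name strcpy_overrun is invented for this example, and note that it also counts the terminating '\0', which strcpy copies too, so the result is one more than the N-16 figure above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): number of bytes that
 * strcpy(dst, src) would write beyond a dst buffer of dst_size bytes.
 * Returns 0 when src fits, terminator included. */
size_t strcpy_overrun(const char *src, size_t dst_size)
{
    size_t needed = strlen(src) + 1;   /* strcpy also copies the '\0' */
    return needed > dst_size ? needed - dst_size : 0;
}
```

With a 16-byte destination, a 15-character string fits exactly, a 16-character string overruns by one byte (the terminator), and a 24-character string overruns by nine.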
I invite you to take a look at: http://www1.cs.columbia.edu/~sedward...review.9up.pdf and search the document for "local arrays". In most organizations I know of, these are allocated in the same order/placement as local variables in the stack frame. Perhaps there is something I am missing...? I think that some array literals are statically allocated in C (e.g. char), but I have never experienced the order that you are describing.
I don't understand the point of setting the CC environment variable; I cannot see how that would have an impact on anything. However, I am quite certain about the order of variables on the stack. A program's stack space (for statically-allocated memory) grows "up"; that is, each variable has a higher address than that of the variable which was allocated before it. When a function call occurs, stack space is allocated for the new function, which goes "on top of" the existing stack, preserving the old contents. (This is why recursion works, for example.) The first space newly allocated on the stack is for the function's parameters. Then comes four bytes for the return address, then four bytes for the stack frame pointer. After that go local variables for the current function.
It becomes possible then, using the address of a local variable as a starting point, to write backwards (which is actually forwards, or downwards on the stack) and overwrite the values of the stack frame pointer and return address. That means it's possible to change the execution flow of the program. This is what is known as "stack smashing". A really good way to do this is through a buffer overflow, taking advantage of the fact that, in C at least, you can write to pretty much any memory you control. Here's some example code I wrote:
Code:
#include <stdio.h>
#include <stdlib.h>   /* exit() */

void bar(void);
void foo(void);

int main(int argc, char** argv) {
    printf("<< Calling function bar()\n");
    bar();
    printf("<< Back from function bar()\n");
    return 0;
}

void bar(void) {
    char s[8];
    /* assumes a 32-bit build, where a function address fits in an unsigned int */
    unsigned int foo_addr = (unsigned int)(unsigned long)foo;

    printf(">> Function bar() starts.\n\n");
    printf("s = 0x%lX\n", (unsigned long)s);
    /*
     * s[] is only 8 bytes long
     * after s comes the frame pointer (32 bit)
     * then is the return address (32 bit)
     * return addr starts at s + 12
     * (this layout depends on the compiler and its options;
     * writing past the end of s[] is undefined behavior)
     */
    printf("s + 12 = 0x%lX\n", (unsigned long)(s + 12));
    printf("return = 0x%X\n", *((unsigned int*)(s + 12)));
    printf("main = 0x%lX\n", (unsigned long)main);
    printf("foo = 0x%lX\n", (unsigned long)foo);
    /*
     * remember intel endian-ness
     * pack them in the opposite order
     * (unsigned masks avoid overflowing a signed int in 255 << 24)
     */
    s[12] = (foo_addr & 255u);
    s[13] = (foo_addr & (255u << 8)) >> 8;
    s[14] = (foo_addr & (255u << 16)) >> 16;
    s[15] = (foo_addr & (255u << 24)) >> 24;
    printf("new ret = 0x%X\n\n", *((unsigned int*)(s + 12)));
    printf(">> Function bar() ends.\n");
}

void foo(void) {
    printf("<< Function foo() starts.\n\n");
    printf("Looks like your stack was smashed!\n");
    exit(1);
}
The main() function calls bar(), which has a statically-allocated eight-byte array. At the beginning of the ninth byte (that is, s + 8) is the stack frame pointer. We can forget about that because it's not what we're trying to change. At the beginning of the 13th byte (hmm, that's unlucky; s + 12) is the return address. This is what we want to change. In the example code, I find the address of function foo() and pick it apart into bytes. Keeping in mind that Intel CPUs are little-endian, I put the bytes into the memory taken by the return address in reverse order (at least, it looks backwards when you read it like a human). To be safe, I print out the new return address to check that it matches the address of foo(). It does, so when function bar() reaches the end of its execution, it does a "return" to that address, which is function foo(). The end result is that main() calls bar(), which "returns" to foo(). The stack was successfully smashed and the return address for function bar() was altered to direct the program along a new execution path.
That's why I think that the exploit for the vulnerable program was trying to smash the stack to leap out of the program and jump into another address. Then at this new address execution would continue, doing who-knows-what. Knowing that the parameters for a function are allocated at the start of that function's stack, it would be possible to send parameters to an arbitrary function that exists in memory. If, for example, your program used an exec() call, it would be possible to set the parameters for that function, then jump into it by changing a return address, thereby calling exec() with the parameters you provided. If the program were originally running as root, and then compromised in this way because of a buffer overflow, this would result in completely arbitrary execution as UID 0.
Obviously, these kinds of situations (unconstrained assignment into static arrays) should be avoided.
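One way to avoid it, sketched here under my own assumptions rather than taken from the vulnerable program, is to check the length before copying. The function name copy_checked and its return convention are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy src into dst only if
 * the whole string, terminator included, fits in dst_size bytes.
 * Returns 0 on success and -1 (leaving dst untouched) otherwise. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0 || len >= dst_size) {
        return -1;              /* would overflow: refuse the copy */
    }
    memcpy(dst, src, len + 1);  /* len + 1 copies the '\0' as well */
    return 0;
}
```

In vulnerable.c this would mean replacing the strcpy call with something like copy_checked(vulnbuff, sizeof vulnbuff, argv[1]) and bailing out when it returns -1. Refusing the copy, rather than silently truncating, is a design choice of this sketch.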
I assumed this is just changing the execution environment by modifying the CC env variable.
elyk1212, you are right. $shellcode is not being passed as an argument to the C procedure. $new_retword x 8 is the argument.
In vulnerable.c, vulnbuff[16]'s actual size in memory is 24 bytes (you can verify this using gdb).
$new_retword is the address of $shellcode. 0xbffffffc is the top of the stack, so "0xbffffffc - (length($path)+1) - (length($shellcode)+1)" is the address of $shellcode: $shellcode is located in the environment, and its address is lower than that of $path.
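The expression quoted above is plain subtraction, and can be written out as a small C sketch. The function and parameter names are invented for illustration, and it only restates the formula from this post; it does not verify the stack layout it assumes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): evaluate
 *   stack_top - (path_len + 1) - (value_len + 1)
 * where each "+ 1" accounts for a string's terminating '\0'.
 * Assumes 32-bit addresses, as on IA32. */
uint32_t env_string_address(uint32_t stack_top, size_t path_len,
                            size_t value_len)
{
    return stack_top - (uint32_t)(path_len + 1) - (uint32_t)(value_len + 1);
}
```

Taking the 28-character path /home/san/exploit/vulnerable and the 24-byte shellcode length mentioned earlier in the thread, env_string_address(0xbffffffc, 28, 24) gives 0xbfffffc6.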
The length of $new_retword is 4, so the length of $new_retword x 8 is 32. When the code 'exec "$path",$new_retword x 8' is executed, ($new_retword x 8) will be passed to vulnerable as argv[1], which will be copied to vulnbuff[] by strcpy. Bingo: the actual size of vulnbuff[] in memory is 24, after the memory of vulnbuff comes EBP, and after EBP comes the return address of main(). So, if you want to overwrite the return address of main(), 32 bytes are needed. And ($new_retword x 8) does exactly this (its size is 32).
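The byte count can be checked with a short C sketch. The names here are made up, the 4-byte word size assumes IA32, and the 24-byte span is the figure measured with gdb above rather than something the code can discover on its own.

```c
#include <stddef.h>

#define WORD_SIZE 4u   /* assumption: 32-bit IA32 words */

/* Hypothetical helper (not from the thread): bytes from the start of
 * the buffer through the end of the return address, given how many
 * bytes the buffer actually occupies in the frame. */
size_t bytes_through_return_address(size_t buffer_span)
{
    return buffer_span
         + WORD_SIZE    /* saved EBP */
         + WORD_SIZE;   /* return address */
}

/* Number of whole 4-byte words needed to cover that many bytes. */
size_t words_needed(size_t buffer_span)
{
    size_t total = bytes_through_return_address(buffer_span);
    return (total + WORD_SIZE - 1) / WORD_SIZE;   /* round up */
}
```

For a span of 24 bytes this gives 24 + 4 + 4 = 32 bytes, i.e. 8 words, which matches the "x 8" in the Perl code.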
So when strcpy() returns, the content of vulnbuff[] will be "$new_retword$new_retword$new_retword$new_retword$new_retword$new_retword", EBP will be "$new_retword", and the return address of main() will be "$new_retword". "$new_retword" is the address of $shellcode in the environment.
When main() returns, the program will jump to the address of $shellcode, and then $shellcode will be executed. A stack overflow will be accomplished.
At first, I did not understand some parts of the Perl code. After taylor_venable's explanation, everything is clear.
Ah, so $shellcode was opcode. I figured as much, but the internals were confusing. Thanks chopsticks.
Quote:
However, I am quite certain about the order of variables on the stack. A program's stack space (for statically-allocated memory) grows "up"; that is, each variable has a higher address than that of the variable which was allocated before it. When a function call occurs, stack space is allocated for the new function, which goes "on top of" the existing stack, preserving the old contents. (This is why recursion works, for example.)
Thank you for your explanation, Taylor. I appreciate the time you have given in your response. I actually am familiar with this material, I am a Senior Computer Engineering/Computer Science double Major. It is interesting to note, recursion works without static stack-type allocation also (e.g. Scheme, and other dynamic binding function calls).
But my point was about the order of the data on the stack (calling convention), which usually has parameters, then the return address, and then locals, which are usually placed on the stack by the subroutine code itself. Most depictions I have seen of a stack also show this order (or return->params->locals). I guess I have to read up on GCC calling conventions, as I was not aware of the order of local space.
Other than that, I was also asking how CC became a parameter to the main function, as I do not see the connection (but it was assumed in your solution). I wanted to know how you came to this conclusion. Chopsticks has clarified this now, I think. I will also check out your code example, it looks good.
Saves 3 instructions (but they are *probably* only saved at compilation time, since it is likely evaluated and stored as an immediate). Eh, easier to read?
I had to do a lot of embedded systems programming this year with General Purpose IO (GPIO) registers and things, so I am still in masking mode. It had the same byte order.