[SOLVED] I need help with understand a recursive assembly program.
ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Notices
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.
Having a problem logging in? Please visit this page to clear all LQ-related cookies.
Get a virtual cloud desktop with the Linux distro that you want in less than five minutes with Shells! With over 10 pre-installed distros to choose from, the worry-free installation life is here! Whether you are a digital nomad or just looking for flexibility, Shells can put your Linux machine on the device that you want to use.
Exclusive for LQ members, get up to 45% off per month. Click here for more info.
One last tiny question, and I'm sure I could get this by reading it maybe ... but just in case:
I'm still a little thrown off ... in _start, before the first call to factorial, why only one addl $4, %esp after all that?
The rest of them don't clear the arguments from the stack...? I understand how it's working now, much better ... I'm just not sure I understand this quite yet.
This is just the first book I've ever read on assembly, and I'm still getting used to it, and this example really threw me off, I was learning a lot and really understanding things well up until this.
Code:
movl %ebp, %esp
popl %ebp
ret
what about clearing the stack of the argument?
maybe I'm still confused.
Normally you would fix up the stack pointer at the end of each function by using the frame pointer in bp. The main factorial code does this at the end:
Code:
movl %ebp, %esp
In the initial function, they save a few instructions by not using a frame pointer, and just resetting the stack pointer directly (effectively a pop)
Now, the most obvious question I could possibly ask. How do the values of ALL of the parameters to ALL of the recursive calls get combined into one return value? Then, where do all of the parameters passed in recursively go, before you get back to _start and addl $4, %esp?
Isn't it happening in the opposite direction, then?
Wouldn't it call the factorial function recursively, until the parameter was 1, then 1 is in %eax, then %ebx is multiplied by that %eax in the factorial function with the parameter of 2, then %ebx is multiplied by that %ebx in the factorial function with the parameter of 3, etc.?
It isn't like an actual factorial, 4 * 3 * 2 * 1,
it ends up being 1 * 2 * 3 * 4?
Yes, it is happening in the opposite way to the way you would typically express a factorial. Fortunately multiply operations are commutative (meaning you can swap the order and get the same result)!
They are also associative, which means you can also do
5 * (4 * (3 * (2 * (1)))) and get the same result.
So there are several ways to calculate the answer.
Interestingly, this is not true of all other mathematical objects. The chess game example I gave is one where you would have to be more careful about doing things in the correct order when recursing.
Last edited by neonsignal; 08-12-2009 at 08:54 PM.
I'm sorry to drag this out like this, but I'm very thoroughly under this impression:
the way the stack is for the function, is this:
the argument
the return address
old %ebp <- ebp
When the kaleidoscope of functions here come to an end, this happens:
Code:
movl %ebp, %esp
popl %ebp
ret
for every one.
Every one of them has "the argument" at the highest address in the stack.
then it's this:
Code:
movl %ebp, %esp:
the argument
the return address
old %ebp <- %esp
pop %ebp:
the argument
the return address
ret:
the argument (since ret pops the return address)
This happens for EVERY call to the function, then at the end of ALL of these calls, you use:
addl $4, %esp
So after all those recursive calls to the factorial function, you clear the stack of one argument. The argument for each individual function is not cleared in the end_factorial code. Where did they go?
Thanks for the additional information ... I did know that, though. I know about the commutative and associative and distributive properties, etc. I'm just not really advanced in algebra.
Thanks for the additional information, though.
The argument for each individual function is not cleared in the end_factorial code. Where did they go?
It is cleared each time by the copying of 'ebp' into 'esp' (or, not so much cleared as skipping back over it). 'esp' is the actual stack pointer, so overwriting it causes arguments to be lost from the stack. Actually, all a 'pop' does too is to copy something from the stack and then adjust the stack pointer.
So adjusting (or setting) the stack pointer is the same as one or more pops, while ignoring what was on the stack. Effectively the end_factorial code does simply add 4 to the stack pointer (since the value in ebp is esp+4).
In older architectures, there was no frame pointer, and the stack had to be fixed up explicitly. There are big advantages to having a frame pointer: it makes it easier to access parameters from a known point; and it makes it trivial to restore the stack pointer at the end.
In reality, everything is still in the stack memory, but all that matters is where the stack pointer is pointing to. The next time you do a function call, the old information will get overwritten.
Last edited by neonsignal; 08-12-2009 at 09:07 PM.
They don't all get combined at once. Each level multiplies the parameter into the result, so the result is 'accumulated' in eax.
movl 8(%ebp), %ebx
imull %ebx, %eax
This and the other stuff in this post, and the one I thanked you for before, is what I was really *missing*, where I thought I had no idea what I was looking at.
-----
In this book, it's this way through the whole book:
push the parameter
call the function
push %ebp
mov %esp, %ebp
then the local variables or whatever go here.
this would leave the parameter untouched, if you use:
mov %ebp, %esp
pop %ebp
ret
it changes the stack from:
parameter
return address
old %ebp <- %ebp
local variables, or whatever
to:
parameter
and I've had a good understanding of the stack pointer ... but in this book, it is always mov %ebp, %esp and pop %ebp goes back to where the return address is, then you use ret, then use the addl $4, %esp(or whatever the case may be) to clear the parameter from the calling function. In this case, it's a single addl $4, %esp, after the recursive function having the parameters pushed and not cleared repeatedly. It appears.
Code:
movl 8(%ebp), %eax #This moves the first argument to %eax
#4(%ebp) holds the return address, and
#8(%ebp) holds the first parameter
I've just been thinking that since I have been learning a lot and had very good comprehension of the rest of the examples in this book, so far, maybe I should just read the rest of the book, and worry about recursive functions later, or when I absolutely have to. I don't think there are any other recursive functions in this book, and it has a lot of information in it.
Yeah, I can see that it could be confusing. They haven't cleaned up the stack directly after the call to factorial, because it is taken care of in the end code. It is a common optimization; it doesn't matter how much you push on the stack in the middle of the function, the end code will take care of it.
Interestingly, compilation of some other high level languages had the opposite convention; it was the callee that cleaned up the stack, rather than the caller. The nature of the C language (particular in allowing variable numbers of arguments) has tended to favor the caller cleaning up the stack. There are also some advantages in optimization, since the caller knows what the subsequent code is, so the cleanup can be delayed as much as possible).
But I agree - if you already understand the recursive process in C, you won't learn much more by being able to follow every line of the assembler. Understanding recursion is much more important than being able to follow a few assembler optimizations.
Last edited by neonsignal; 08-12-2009 at 09:37 PM.
Well, thank you for your time and help, but I still don't really see how the parameters are being cleaned up in the factorial function or afterward ... can you give a brief description? It can be a little technical, if I don't really understand I'll just look it up ... but I've been able to understand what's in this book up until this example very easily ... I know it doesn't seem like it, but I guess it's just this example.
The examples of C recursion are easy for me to understand.
Okay, this is the relevant bits of the code. Let us say that esp starts at the value 42.
0. the call to the function pushes the return address. esp is now 38.
1. ebp is saved on the stack. esp is now 34.
2. esp is copied to ebp. both are 34.
3. eax is pushed. esp is now 30.
4. factorial is called. Let us assume for the sake of the argument that it has fixed up the stack at the end. so esp is still 30.
5. ebp is moved to esp. this sets esp to 34 (see step 3)
6. ebp is restored to the calling frame pointer. esp is now 38.
7. the function returns by popping the return address to the program counter. esp is now 42 (back to the original point).
Notice that even if they added 4 to esp after the recursive call to factorial, it would have made no difference, because ebp was used to overwrite esp in step 5 anyway.
Even if the recursive call to factorial totally messed up the stack (it can't, or its return would have failed), the stack would still be fixed up by step 5.
Last edited by neonsignal; 08-12-2009 at 09:59 PM.
I see. It's just that the factorial function uses the stack and the function call in a different way than _start and pretty much all the other examples in this book, other than this one.
Thanks. This was very helpful.
I know a few others in this book, they use the stack and things differently, but for some reason this example really got to me. It was very annoying, and seemed simple at first.
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.