LinuxQuestions.org
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Old 03-18-2013, 10:16 PM   #1
Fritz_Doll
LQ Newbie
 
Registered: Dec 2010
Posts: 20

Rep: Reputation: Disabled
x86 Assembly - Byte Accounting


My real issue with the assembly is that 88 bytes are being allocated on the stack for the local variable (and possibly something else), but the buffer is only 64 bytes, which on x86 should not need any padding. So why are there 24 extra bytes lying about?

target.c ;; compiled with `gcc target.c`
Code:
#include <string.h>
void func(const char *str){
  char buff[64];
  strcpy(buff, str);
}
int main(int argc, char **argv){
  func(argv[1]);
  return 0;
}
The output of `objdump -d a.out` for func:
Code:
080483e4 <func>:
 80483e4:       55                    push   %ebp
 80483e5:       89 e5                 mov    %esp,%ebp
 80483e7:       83 ec 58              sub    $0x58,%esp
 80483ea:       8b 45 08              mov    0x8(%ebp),%eax
 80483ed:       89 44 24 04           mov    %eax,0x4(%esp)
 80483f1:       8d 45 b8              lea    -0x48(%ebp),%eax
 80483f4:       89 04 24              mov    %eax,(%esp)
 80483f7:       e8 20 ff ff ff        call   804831c <strcpy@plt>
 80483fc:       c9                    leave
 80483fd:       c3                    ret
Perhaps somebody with a better understanding of x86 assembly or gcc can help me understand what is being done here, and why there are extra bytes lying about.

using:
gcc 4.6.1
objdump 2.21.1.20110627
 
Old 03-19-2013, 05:19 AM   #2
millgates
Member
 
Registered: Feb 2009
Location: 192.168.x.x
Distribution: Slackware
Posts: 852

Rep: Reputation: 389
I think it has something to do with the way gcc optimizes the alignment of the stack.
See the -mpreferred-stack-boundary option in your gcc manual.
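
For reference, the gcc manual describes -mpreferred-stack-boundary=num as keeping the stack aligned to 2 raised to num bytes, with a default of 4 (16 bytes) on x86. A minimal C sketch of that mapping follows; the helper name boundary_bytes is made up for illustration and is not part of gcc.

```c
#include <assert.h>

/* Hypothetical helper: turn the num of -mpreferred-stack-boundary=num
   into the byte boundary it asks for, which is 2 raised to num. */
unsigned boundary_bytes(unsigned num)
{
    assert(num < 32);   /* keep the shift well-defined for a 32-bit unsigned */
    return 1u << num;
}
```

With this, boundary_bytes(4) is 16 (the default) and boundary_bytes(2) is 4.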

Last edited by millgates; 03-19-2013 at 05:55 AM.
 
Old 03-19-2013, 07:12 AM   #3
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
An extra 8 bytes are for passing two parameters to strcpy.

The rest is a 16-byte alignment thing: the stack frame holds eip and ebp (4 bytes each) plus 88 bytes, which is 96 in total, a multiple of 16, so 16-byte alignment is maintained.

The stack is 16-byte aligned just before eip is pushed by a call. So after ebp is pushed, esp is 8 bytes off from aligned.

The 8 bytes "wasted" above buff serve to make buff itself 16-byte aligned (I'm not sure why it should be). Because those 8 bytes were wasted, another 8 bytes must be wasted to keep the overall alignment.
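
The accounting above can be sketched in C. This is only a back-of-the-envelope model that happens to reproduce the numbers in this thread, not a description of how gcc actually lays out frames; the function names and the idea of counting "depth" below the aligned pre-call %esp are assumptions made for illustration.

```c
#include <assert.h>

/* Round n up to the next multiple of align (align must be a power of two). */
unsigned round_up(unsigned n, unsigned align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Depth is counted in bytes below the aligned %esp that exists just before
   the call instruction pushes the return address. Returns the depth at
   which the local buffer starts so that it sits on an align boundary. */
unsigned buff_depth(unsigned local_bytes, unsigned align)
{
    unsigned saved = 4 + 4;                 /* return eip + saved ebp */
    return round_up(saved + local_bytes, align);
}

/* Operand of "sub $N,%esp" under this model: room for the locals plus the
   outgoing arguments, padded so %esp ends up aligned again. */
unsigned frame_sub(unsigned local_bytes, unsigned arg_bytes, unsigned align)
{
    unsigned saved = 4 + 4;
    unsigned depth = round_up(buff_depth(local_bytes, align) + arg_bytes, align);
    return depth - saved;
}
```

For the function in this thread, buff_depth(64, 16) - 8 comes out as 0x48 (the offset of buff below %ebp) and frame_sub(64, 8, 16) as 0x58.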

Last edited by johnsfine; 03-19-2013 at 07:30 AM.
 
Old 03-19-2013, 02:03 PM   #4
Fritz_Doll
LQ Newbie
 
Registered: Dec 2010
Posts: 20

Original Poster
Rep: Reputation: Disabled
Not knowing where those extra bytes were coming from was annoying.

Thanks for the help,
I guess it's one more thing gcc is doing that I am unaware of.
Will have to add that to the list of flags to track.
 
Old 03-19-2013, 03:27 PM   #5
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
16 byte alignment of the stack frame is probably a good thing. Millgates provided the keyword needed to search for documentation and/or to modify the behavior.

I'm more curious about the 16-byte alignment of char buff[64].

I wouldn't know where to look in gcc documentation for the discussion of why that happens. I believe ordinary scalar variables are aligned according to their size (a double is 8 byte aligned, an int is 4 byte aligned, a char is 1 byte aligned). So why/when does an array have stricter alignment than elements of the array need?

Or have I misinterpreted one (B) of the three chunks of eight bytes?

A) One is for parameters to strcpy.
B) Another is either for alignment of buff or I'm confused
C) The third is for alignment of the stack frame only because of the second.
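
One way to check chunk B on a given build is to compare what the language requires with what the compiler actually did. The sketch below assumes a C11 compiler; buff_misalignment is a made-up name, and its result depends on compiler, version, flags and target, so no particular value should be expected.

```c
#include <assert.h>
#include <stdint.h>

/* What the language itself requires: an array is only as aligned as its
   element type, so char[64] needs just 1-byte alignment. */
static_assert(_Alignof(char[64]) == 1,
              "char arrays need only 1-byte alignment");

/* What a particular build actually did: the remainder of buff's address
   modulo 16. A result of 0 means buff landed on a 16-byte boundary. */
unsigned buff_misalignment(void)
{
    char buff[64];
    buff[0] = 0;   /* touch the array so it is really used */
    return (unsigned)((uintptr_t)buff % 16u);
}
```

If this returns 0 even though the static_assert shows only 1-byte alignment is required, the extra alignment is a choice made by the compiler rather than something the C type demands.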
 
Old 03-19-2013, 10:01 PM   #6
Fritz_Doll
LQ Newbie
 
Registered: Dec 2010
Posts: 20

Original Poster
Rep: Reputation: Disabled

Going back through the code more carefully, and now taking the 16-byte alignment into account, it makes more sense.

Here is how I'm reading it now:

[----] := 4-bytes
w := "wasted" bytes

call pushes %eip to the indicated location

[%eip][%ebp][wwww][wwww] [----][----][----][----]
[----][----][----][----] [----][----][----][----]
[----][----][----][---*] [wwww][wwww][*str][buff]


Then %ebp gets pushed to the stack. To make the array 16-byte aligned we need to subtract 0x40 (char buff[64]) + 0x8 (to 16-byte align the stack after ebp and eip);
we get the buffer at -0x48 (array start location indicated by *).
Then, to maintain the 16-byte alignment for the 2 parameters to strcpy, we need to subtract another 0x10 from the stack, thus the 0x58.

Going through the function code, the input address gets loaded to the location indicated by *str and the address of the array (*) gets loaded to the location indicated by buff.

Code:
080483e4 <func>:
 80483e4:       55                    push   %ebp
 80483e5:       89 e5                 mov    %esp,%ebp
 80483e7:       83 ec 58              sub    $0x58,%esp
 80483ea:       8b 45 08              mov    0x8(%ebp),%eax         ;; mov the str argument to eax
 80483ed:       89 44 24 04           mov    %eax,0x4(%esp)         ;; mov eax to [esp + 4]
 80483f1:       8d 45 b8              lea    -0x48(%ebp),%eax       ;; load effective address of "buff" into eax
 80483f4:       89 04 24              mov    %eax,(%esp)            ;; mov eax to [esp]
 80483f7:       e8 20 ff ff ff        call   804831c <strcpy@plt>
 80483fc:       c9                    leave
 80483fd:       c3                    ret
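
The reading above can also be written down as a C struct and checked at compile time. This is a hypothetical model of func's frame on 32-bit x86, lowest address first; the field names and the EBP_OFF macro are invented for illustration, and 32-bit pointers are modelled as uint32_t so the sketch behaves the same on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical picture of func's stack frame, lowest address first. */
struct func_frame {
    uint32_t arg_dest;      /* (%esp):    strcpy's 1st argument, &buff  */
    uint32_t arg_src;       /* 0x4(%esp): strcpy's 2nd argument, str    */
    uint8_t  pad_low[8];    /* unused, keeps %esp 16-byte aligned       */
    char     buff[64];      /* -0x48(%ebp)                              */
    uint8_t  pad_high[8];   /* unused, puts buff on a 16-byte boundary  */
    uint32_t saved_ebp;     /* (%ebp)                                   */
    uint32_t return_eip;    /* 0x4(%ebp)                                */
    uint32_t str;           /* 0x8(%ebp): argument stored by main       */
};

/* Offset of a field relative to where %ebp points (the saved_ebp slot). */
#define EBP_OFF(field) \
    ((long)offsetof(struct func_frame, field) - \
     (long)offsetof(struct func_frame, saved_ebp))

static_assert(offsetof(struct func_frame, saved_ebp) == 0x58, "sub $0x58,%esp");
static_assert(EBP_OFF(buff) == -0x48, "lea -0x48(%ebp),%eax");
static_assert(EBP_OFF(str) == 0x8, "mov 0x8(%ebp),%eax");
```

If the file compiles, the three offsets used by the disassembly (0x58, -0x48 and 0x8) are consistent with this layout.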
Now main makes sense accounting for 16-byte alignment.
Code:
080483fe <main>:
 80483fe:       55                    push   %ebp
 80483ff:       89 e5                 mov    %esp,%ebp
 8048401:       83 e4 f0              and    $0xfffffff0,%esp  ;; 16-byte align %esp (was wondering what this was for)
 8048404:       83 ec 10              sub    $0x10,%esp        ;; reserve 16 bytes (only 4 are used)
 8048407:       8b 45 0c              mov    0xc(%ebp),%eax    ;; eax = argv
 804840a:       83 c0 04              add    $0x4,%eax         ;; eax = &argv[1]
 804840d:       8b 00                 mov    (%eax),%eax       ;; eax = argv[1]
 804840f:       89 04 24              mov    %eax,(%esp)       ;; store argv[1] as the argument to func
 8048412:       e8 cd ff ff ff        call   80483e4 <func>
 8048417:       b8 00 00 00 00        mov    $0x0,%eax
 804841c:       c9                    leave
 804841d:       c3                    ret
 804841e:       90                    nop
 804841f:       90                    nop
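
The `and $0xfffffff0,%esp` in main can be sketched in C as a plain bit mask. The function name align_down_16 and the sample address in the note below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same operation as "and $0xfffffff0,%esp": clear the low four bits so the
   value is rounded down to a multiple of 16. The stack grows toward lower
   addresses, so rounding down only ever reserves extra space. */
uint32_t align_down_16(uint32_t esp)
{
    uint32_t aligned = esp & UINT32_C(0xfffffff0);
    assert(aligned % 16 == 0 && esp - aligned < 16);
    return aligned;
}
```

For example, align_down_16(0xbffff6a8) gives 0xbffff6a0, and a value that is already a multiple of 16 is returned unchanged.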
Figured I'd give learning x86 another go, and that it'd be more effective for me to learn through disassembly than through forward engineering.
So quite a bit of this is new to me.
 
Old 03-20-2013, 07:28 AM   #7
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by Fritz_Doll
Figured I'd give learning x86 another go, and that it'd be more effective for me to learn through disassembly than through forward engineering.
I think disassembly is a great path to learning x86 asm. But I will suggest that 64-bit x86 asm is more useful to learn than 32-bit. If you care, you can easily find several other threads in which I give more detailed opinions on learning asm.
 
  

