LinuxQuestions.org
Old 03-06-2009, 07:25 AM   #1
dayalan_cse
Member
 
Registered: Oct 2006
Posts: 132

Rep: Reputation: 15
disassemble, assembly question


Hello All,


0x0000002a9b0017c5 <func_instance+661>: mov %r12,%rdi
0x0000002a9b0017c8 <func_instance+664>: lea 0x48(%rsp),%rsi
0x0000002a9b0017cd <func_instance+669>: mov 0x8(%r12),%rax
0x0000002a9b0017d2 <func_instance+674>: callq *0x7b8(%rax)

Can somebody help me understand what the above instructions do?
The above output is from gdb: disassemble func_instance.

thank you in advance.

Thanks,
Deenadayalan
 
Old 03-06-2009, 08:53 AM   #2
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
At the start of this code r12 points to some object.

mov %r12,%rdi
rdi is used to pass the first argument to a function. But in object oriented programming (which this seems to be) the address of the object is a hidden first argument.
So the first instruction puts the address of some object into rdi.

lea 0x48(%rsp),%rsi
rsi is used for the second argument (first explicit argument for object oriented). The lea instruction sets rsi to the address of some object that is a local variable of the current function (an offset from rsp). Since just the address is loaded, the one explicit argument of the function must be passed by either pointer or reference (in asm code you can't tell the difference between pointer and reference).

mov 0x8(%r12),%rax
This must be getting the vtable address. I don't know why the vtable address is at offset 8 of the object, instead of offset 0. Maybe this isn't C++ (which I'm used to looking at). Anyway, the instruction loads some kind of pointer from offset 8 of the object that is now pointed to by both r12 and rdi.

callq *0x7b8(%rax)
Then the function is called, I think indirectly through the vtable. But if that really is a vtable, it's a very big one (lots of virtual functions). rax points to a vtable or something like a vtable. offset 0x7b8 in that table contains a pointer to a function. That is the function called.

All the above is just an estimate. There is no way to look at disassembly and be certain what it means (in high level programming terms).

Last edited by johnsfine; 03-06-2009 at 11:44 AM.
 
Old 03-06-2009, 11:25 AM   #3
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Like johnsfine said, the assembly snippet is taking the address of some local variable/data object on the stack ("lea 0x48(%rsp),%rsi"), dereferencing an address ("mov 0x8(%r12),%rax"), then calling the subroutine or method it finds through that address ("callq *0x7b8(%rax)").

Definitely sounds like an object method call (possibly C++, possibly not) to me, too.

Also, the "callq" implies that maybe this is a 64-bit executable.

'Hope that helps.. PSM

Last edited by paulsm4; 03-06-2009 at 11:27 AM.
 
Old 03-07-2009, 01:44 AM   #4
dayalan_cse
Member
 
Registered: Oct 2006
Posts: 132

Original Poster
Rep: Reputation: 15
Johnsfine and paulsm4,

Thank you very much for your valuable information.

I would like to understand and learn this assembly language. Can you please point me to some online tutorials, and what is the best way to learn this assembly language (GNU disassembler assembly)?

If you have any assembly tutorials on your computer, please send them to my email id dayalan_cse@gmail.com

Thank you in advance for your help.

Thanks,
Deenadayalan
 
Old 03-07-2009, 07:46 AM   #5
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
There are three major topics you need to understand to read that kind of disassembly:

1) AMD64 assembly language. For this example, it would tell you things like what each instruction opcode does, how the addressing modes work, etc.

2) GNU assembly language. GNU uses certain generic assembly language rules across architectures, so the GNU assembly language for AMD64 is different from AMD64 assembly language in several ways, such as:
Operand order: On some architectures, including AMD64, the official assembly language uses a TO,FROM operand order. GNU assembly on all platforms uses FROM,TO
% as a prefix for each register name
Size suffixes on some instructions, such as the q on the call instruction.

3) The AMD64 ABI, which tells things such as the use of rdi and rsi as the first two (non floating point) arguments in a function call.

I got all the AMD64 documentation I have from AMD's developer web site (which has been restructured since last I looked, so I can't link to exactly what I have). This is a good (at least today) link for documentation
http://developer.amd.com/documentati...s/default.aspx

In the middle of that page, it says
AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions
That is the main resource for my item (1) above.

I got the ABI from AMD's site as well. I can't find it there now, but it is also at x86-64.org
http://www.x86-64.org/documentation/
If you already know 32-bit GNU assembly, there is also a link there to a page that explains the basic changes from 32-bit to 64-bit assembly.

As for item (2) above, tutorials (mainly 32-bit) are linked to from so many threads at LQ, I assume you can do your own search.

Last edited by johnsfine; 03-07-2009 at 08:03 AM.
 
Old 03-08-2009, 08:04 AM   #6
cloud9repo
Member
 
Registered: Oct 2008
Location: Middle TN
Posts: 134

Rep: Reputation: 19
Quote:
Originally Posted by dayalan_cse View Post
Hello All,


0x0000002a9b0017c5 <func_instance+661>: mov %r12,%rdi
0x0000002a9b0017c8 <func_instance+664>: lea 0x48(%rsp),%rsi
0x0000002a9b0017cd <func_instance+669>: mov 0x8(%r12),%rax
0x0000002a9b0017d2 <func_instance+674>: callq *0x7b8(%rax)

Can some body help me to understand what the above instructions does?
the above output is from gdb, dissassemble func_instance.

thank you in advance.

Thanks,
Deenadayalan
The rsi is usually denoted as: Remove Symbolic Instruction, and lea seams to be a user defined intruct.

the additions of memory pokes leads me to believe it's hack code, specifically for destabilizing memory addressing. Could be what some of us call "Ghost Code", offered as a pkg. for chasers of it...
 
Old 03-08-2009, 08:51 AM   #7
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
I don't get the joke.

Quote:
Originally Posted by cloud9repo View Post
The rsi is usually denoted as: Remove Symbolic Instruction, and lea seams to be a user defined intruct.

the additions of memory pokes leads me to believe it's hack code, specifically for destabilizing memory addressing. Could be what some of us call "Ghost Code", offered as a pkg. for chasers of it...
That is such complete nonsense, I checked a bunch of your other posts to see if you were in the habit of posting intentional nonsense. I can't tell for sure, but I don't think you are.

I do have a habit of failing to see the joke in a technical post. I really don't see it this time. I hope no one trying to learn things from this thread takes your post seriously.
 
Old 03-08-2009, 11:02 PM   #8
dayalan_cse
Member
 
Registered: Oct 2006
Posts: 132

Original Poster
Rep: Reputation: 15
Hi Johnsfine,

0x0000002a9b0017c5 <func_instance+661>: mov %r12,%rdi
0x0000002a9b0017c8 <func_instance+664>: lea 0x48(%rsp),%rsi
0x0000002a9b0017cd <func_instance+669>: mov 0x8(%r12),%rax
0x0000002a9b0017d2 <func_instance+674>: callq *0x7b8(%rax)

I really appreciate your help explaining the above assembly code.

I have another question. I understand that writing data into unallocated memory in the application would cause a core dump, but what about the above registers (%rax, %rdi ...)? My understanding is that these are machine registers, so no memory needs to be allocated for them and writing values into them would never cause a crash. Is that correct?

Thanks,
Deenadayalan
 
Old 03-09-2009, 06:33 AM   #9
dayalan_cse
Member
 
Registered: Oct 2006
Posts: 132

Original Poster
Rep: Reputation: 15
Hi Johnsfine,


=============================================================
0x0000002a9b0017c5 <func_instance+661>: mov %r12,%rdi
0x0000002a9b0017c8 <func_instance+664>: lea 0x48(%rsp),%rsi
0x0000002a9b0017cd <func_instance+669>: mov 0x8(%r12),%rax
0x0000002a9b0017d2 <func_instance+674>: callq *0x7b8(%rax)
=============================================================

To be precise, i am doing the below.

emp *e1;
e1 = (void*) ( Array[0] | ( (long) Array[1] << 32 ) )

assume Array[0] 0x735ec100
assume Array[1] 0x0
Now ( (long) Array[1] <<32) => should return 0x 00 00 00 00 00 00 00 00

Then i am doing type casting using (void*)
now i assume the e1 assignment becomes $r12 (its object).

What value you would expect in $r12?
Here is my observation is ==> $r12 is 0xff ff ff ff 73 5e c1 00

The crash happens at "mov 0x8(%r12),%rax" because I see "0x ff ff ff ff" in the upper 32 bits. It should not be there; the value should be "0x 00 00 00 00 73 5e c1 00".

some thing is wrong, is it gcc/machine/my code?

Please help me to understand.

Thanks,
Deenadayalan
 
Old 03-09-2009, 08:13 AM   #10
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by dayalan_cse View Post
To be precise, i am doing the below.

emp *e1;
e1 = (void*) ( Array[0] | ( (long) Array[1] << 32 ) )
If I understand you correctly, you are saying that is the source code for the part of your function immediately before the part for which you showed the disassembly.

So you didn't show the disassembly for the part for which you showed source and you didn't show source for the part for which you showed disassembly.

I don't know whether to trust your guess that you are even looking at the section of disassembly you think.

Quote:
assume Array[0] 0x735ec100
assume Array[1] 0x0
Now ( (long) Array[1] <<32) => should return 0x 00 00 00 00 00 00 00 00

Then i am doing type casting using (void*)
now i assume the e1 assignment becomes $r12 (its object).

What value you would expect in $r12?
Here is my observation is ==> $r12 is 0xff ff ff ff 73 5e c1 00
That all sounds plausible. Without more context, I can't guess what is wrong.

Quote:
some thing is wrong, is it gcc/machine/my code?
It is safe to assume it is your code. gcc has very few bugs and no hardware problem would act that way.
 
Old 03-09-2009, 12:20 PM   #11
dayalan_cse
Member
 
Registered: Oct 2006
Posts: 132

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by johnsfine
I don't know whether to trust your guess that you are even looking at the section of disassembly you think.

Yes, i am looking at correct section of disassembly.


Do you have any idea, why the Array[8] and ((long) Array[9]<<32) conversion in e1 (that is $r12) the upper limit has "0x ff ff ff ff" but Array[9] has 0.

Please help me to understand.
 
Old 03-09-2009, 01:45 PM   #12
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by dayalan_cse View Post
Do you have any idea, why the Array[8] and ((long) Array[9]<<32) conversion in e1 (that is $r12) the upper limit has "0x ff ff ff ff" but Array[9] has 0.
If I assume you are incorrect about the value you mentioned (0x735ec100) and assume the true value was at least 0x80000000 and I assume that Array contains some signed type, then I would say the 0xFFFFFFFF came from sign extending Array[8] rather than from shifting Array[9].

But if I must start guessing which thing(s) you have said are incorrect, I could guess almost anything. If you provided more info, I wouldn't need to guess.

The disassembly of the code that computes e1 might help a lot to explain how e1 got the unexpected value.
 
Old 03-09-2009, 11:14 PM   #13
dayalan_cse
Member
 
Registered: Oct 2006
Posts: 132

Original Poster
Rep: Reputation: 15
Hi Johnsfine,

<
assume the true value was at least 0x80000000
>
You are right, I gave the wrong address.
The correct address is 0x89 06 00 70
As you said, it is a signed integer variable.

How does it get extended for a signed integer variable? (I understand that an integer is 4 bytes on amd64, that values below 2^32/2 are positive, and that anything at or above that is represented as a negative value.) And how does the 0x ff ff ff ff get added in the above conversion (Array[8] | ( (long) Array[9] <<32) )?

Please help me to understand.

Thanks,
Deenadayalan
 
Old 03-10-2009, 07:59 AM   #14
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
When you convert a signed 32 bit value to 64 bits, the low 31 bits of the new value will be the low 31 bits of the old value and each of the high 33 bits of the new value will be equal to the high 1 bit of the old value.

Since you don't want that to happen, you need to do your cast a different way. Here are two good choices instead of:

Code:
e1 = (void*) ( Array[8] | ( (long) Array[9] << 32 ) );
1) Do the whole operation in one cast (instead of shifting and ORing)
Code:
e1 = ((void**)Array)[8/2];
Notice I needed to divide the index (8) by 2 because a void* is twice as large as an int, and that only works when the index is an even number. The cast has to be applied to Array before the indexing, hence the extra parentheses.

2) Cast to unsigned before the implicit cast to long
Code:
e1 = (void*) ( (unsigned int)Array[8] | ( (long) Array[9] << 32 ) );
Array[9] is cast from int to long with a time wasting sign extension, then shifted left 32 bits (discarding all consequences other than time of that sign extension).
Array[8] is cast from int to unsigned int. Then because it is to be combined with a long across the '|' the compiler implicitly casts it to long. Because it is unsigned, that implicit cast to long happens without the time cost or the error introduced by sign extension.

Last edited by johnsfine; 03-10-2009 at 08:03 AM.
 
Old 03-11-2009, 11:47 PM   #15
dayalan_cse
Member
 
Registered: Oct 2006
Posts: 132

Original Poster
Rep: Reputation: 15
Hi Johnsfine,

I really appreciate your valuable information. Thank you very much.

<
When you convert a signed 32 bit value to 64 bits, the low 31 bits of the new value will be the low 31 bits of the old value and each of the high 33 bits of the new value will be equal to the high 1 bit of the old value.
>

Do you mean that bits 32-63 of the new value are all set to binary 1 by default, so it becomes 0x ff ff ff ff <0-31 old value>?

Thanks,
Deenadayalan
 
All times are GMT -5. The time now is 04:26 AM.