LinuxQuestions.org


HellSinker 11-18-2012 08:10 PM

Segmentation Fault when calling ASM from C
 
Hello, I'm currently in the process of migrating from Windows to Linux.

I've written a fairly simple pair of programs to help me get acquainted with the differences between the environments.

I have a simple ASM program, which I list here:

Code:

[BITS 64]
GLOBAL Test
Test:
mov QWORD [rsp+8], r8
mov QWORD [rsp+16], r9
mov QWORD [rsp+24], rcx
mov QWORD [rsp+32], rdx
sub rsp, 40
add rsp, 40
mov rax, 1974
ret 0

For starters, I am curious whether we even still store our parameters the same way, i.e. on the stack with sub rsp and add rsp.

I also have another program, written in C, listed here:

Code:

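#include <stdio.h>

/* Test is defined in the NASM source above */
unsigned int Test(int a, int b, int c, int d);

int main(void)
{
    printf("%u\r\n", Test(0, 0, 0, 0));
    return 0;
}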

Compiling the ASM with NASM and the C with cc, then linking with gcc, produces an executable that works (i.e. it displays the number '1974') but also segfaults on exit.

Could someone explain the reason for this Segmentation Fault?

I am using SUSE 12.2 Studio, with very little more than the JeOS base...

pan64 11-19-2012 03:03 AM

You probably don't need the add rsp, 40, but I'm not really sure.
You can insert a second line into the C code (another printf) and step through the program in a debugger.

linosaurusroot 11-19-2012 04:47 AM

At the end of main() you have no exit(), so it will be trying to return, which is perhaps what fails after your stack manipulation.

Really there's no need for assembler in most linux work (a few fields of work excepted). C is the lowest level most people need and they prefer to use higher levels such as Perl and Python.

johnsfine 11-19-2012 07:36 AM

Quote:

Originally Posted by HellSinker (Post 4832285)
For starters, I am curious whether we even still store our parameters the same way, i.e. on the stack with sub rsp and add rsp.

Store the parameters any valid way you like, but you ought to know that the standard for passing parameters is different. (There is no standard for storing parameters within the asm function.)

AMD defined a parameter passing standard when inventing the X86-64 architecture. Linux still uses that standard. Microsoft decided to ignore the standard and invent their own. In this particular case, Microsoft's version is technically superior to the X86-64 ABI standard (especially regarding floating point), but ignoring a standard (even an inferior one) causes problems for those like you who want to port asm source. It caused an even bigger problem (eventually solved) for the people who created the 64 bit version of mingw.

In the integer/pointer part of the Microsoft standard, the first four parameters are passed in rcx, rdx, r8 and r9. Then the function must preserve rdi, rsi, rbx, rbp and r12-r15.

In the integer/pointer part of the X86-64 ABI standard (what you need to use with Linux) the first six parameters are passed in rdi, rsi, rdx, rcx, r8 and r9. Then the function must preserve rbx, rbp and r12-r15.

The floating point side of those standards differs more.
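As a rough illustration of the Linux register order described above, here is a minimal sketch that assumes GCC or Clang on x86-64 Linux. The function name third_arg and the check function are hypothetical, invented for this example. GCC's top-level asm uses AT&T syntax ("mov source, destination", registers prefixed with %), not the NASM syntax used elsewhere in this thread. The #else branch is only a plain C stand-in so the file still builds on other platforms.

```c
#include <assert.h>

/* Hypothetical helper: returns its third integer argument. */
long third_arg(long a, long b, long c, long d);

#if defined(__x86_64__) && defined(__linux__)
/* On Linux the integer arguments arrive in rdi, rsi, rdx, rcx, r8, r9,
   so the third one is in rdx. Under the Microsoft convention it would
   be in r8 instead. */
__asm__(
    ".pushsection .text\n"
    ".globl third_arg\n"
    ".type third_arg, @function\n"
    "third_arg:\n"
    "    movq %rdx, %rax\n"   /* return value goes in rax */
    "    ret\n"
    ".size third_arg, .-third_arg\n"
    ".popsection\n"
);
#else
/* Plain C stand-in for other platforms (an assumption of this sketch). */
long third_arg(long a, long b, long c, long d)
{
    (void)a; (void)b; (void)d;
    return c;
}
#endif

/* Small self-check: the third argument must come back unchanged. */
void check_third_arg(void)
{
    assert(third_arg(10, 20, 30, 40) == 30);
}
```

Porting the same function from Windows would mean changing only the source register, which is why asm written for one convention silently reads the wrong values under the other.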

Quote:

Could someone explain the reason for this Segmentation Fault?
Quote:

Originally Posted by HellSinker (Post 4832285)
Code:

[BITS 64]
GLOBAL Test
Test:
mov QWORD [rsp+8], r8
mov QWORD [rsp+16], r9
mov QWORD [rsp+24], rcx
mov QWORD [rsp+32], rdx
sub rsp, 40
add rsp, 40
mov rax, 1974
ret 0


I expect you wanted to sub rsp,40 before using positive offsets from rsp to store (the wrong) registers.

The fact that you have the wrong registers would matter if your asm function tried to do something useful, but doesn't matter yet.

I'm not certain exactly how your main function was compiled so I don't know its exact stack layout. The locations you overwrote (rsp+8 through rsp+39 on entry to your asm function) are parts of the stack of that main function. You obviously overwrote at least one of the saved rbp or rip in main's stack frame, so that when main tries to return you get a seg fault.
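To make the ordering problem concrete, the following is a sketch of what a repaired Test might look like, again assuming GCC or Clang on x86-64 Linux and written as top-level asm in AT&T syntax rather than NASM. The prototype, the choice of offsets 0 through 24, and the check function are assumptions of this example, not the OP's code. The key point is that rsp is lowered before anything is stored at a positive offset, so the stores land in space the function owns instead of in main's frame.

```c
#include <assert.h>

/* Assumed prototype; the OP's C file declared nothing for Test. */
unsigned long Test(unsigned long a, unsigned long b,
                   unsigned long c, unsigned long d);

#if defined(__x86_64__) && defined(__linux__)
__asm__(
    ".pushsection .text\n"
    ".globl Test\n"
    ".type Test, @function\n"
    "Test:\n"
    "    subq $40, %rsp\n"        /* reserve 40 bytes first */
    "    movq %rdi, 0(%rsp)\n"    /* 1st argument (Linux order) */
    "    movq %rsi, 8(%rsp)\n"    /* 2nd argument */
    "    movq %rdx, 16(%rsp)\n"   /* 3rd argument */
    "    movq %rcx, 24(%rsp)\n"   /* 4th argument */
    "    movq $1974, %rax\n"      /* return value */
    "    addq $40, %rsp\n"        /* restore rsp before returning */
    "    ret\n"
    ".size Test, .-Test\n"
    ".popsection\n"
);
#else
/* Plain C stand-in for other platforms (an assumption of this sketch). */
unsigned long Test(unsigned long a, unsigned long b,
                   unsigned long c, unsigned long d)
{
    (void)a; (void)b; (void)c; (void)d;
    return 1974;
}
#endif

/* Small self-check: the call must return 1974 and come back cleanly. */
void check_test(void)
{
    assert(Test(0, 0, 0, 0) == 1974);
}
```

With this ordering, the return address sits at rsp+40 during the stores and is never touched, so the caller's frame stays intact.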

Quote:

Originally Posted by linosaurusroot (Post 4832462)
At the end of main() you have no exit(), so it will be trying to return, which is perhaps what fails after your stack manipulation.

That is correct (and pretty obvious given that the printf worked). I hope the OP read my long post far enough to see why that stack overwrite was wrong, rather than just infer the stack manipulation was wrong as the only plausible reason the return from main() might crash.

Quote:

Originally Posted by linosaurusroot (Post 4832462)
Really there's no need for assembler in most linux work (a few fields of work excepted). C is the lowest level most people need and they prefer to use higher levels such as Perl and Python.

We at LQ answer a lot of "how to" questions with "this is why you shouldn't even want to". I've answered quite a few that way myself. But I think it is a really rotten answer this time.

There are lots of good reasons for the OP to want to do what he asks. Writing ASM functions to be called from C, is a very useful learning exercise that will help you with other programming and debugging even if you never write a line of ASM for production use. The OP seems to have ASM code already written and this is a porting question. So the decision (right or wrong) to use ASM is "water under the bridge". Please answer constructively to help solve the OP's problem, rather than posting non constructive discouragement.

HellSinker 11-19-2012 11:55 PM

Thank you for your answers.

sundialsvcs 11-20-2012 07:39 AM

The most common reason for "seg-faults during cross language calls" (of any sort) is the so-called calling convention: that is to say, precisely how parameters are passed between them.

Also note that the most common place to find assembly code is in an inline asm block inside a C/C++ program (GCC spells it asm("...") or __asm__("...")).

resetreset 11-21-2012 07:06 AM

This is a VERY IMPORTANT question: where did you pick up 64-bit assembly? Are there any books?

resetreset 11-21-2012 07:11 AM

Also, I don't get what the "sub rsp,40" and "add rsp,40" are for.

What is "ret 0"? What's the "0" for?

johnsfine 11-21-2012 07:24 AM

Quote:

Originally Posted by resetreset (Post 4833990)
where did you pick up 64-bit assembly? Are there any books?

Since I already knew x86-32 assembly and many other assembly languages, it was easiest for me to learn X86-64 from reference manuals (pdf files downloaded from AMD's website) rather than look for any resource that teaches assembly using X86-64.

The most useful of those reference manuals is a pdf file with 24594 in its name (inside the pdf file, the document name is AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions)

I just did a google search for AMD 24594 and the first hit looks like it is a current link to that pdf file, but I didn't test it. My old links to AMD pages that link to the full set of these manuals are no longer valid.

Other useful pdf files in that set included 24592, 24593, 26568 and 26569.

Quote:

Originally Posted by resetreset (Post 4833995)
I don't get what the "sub rsp,40" and "add rsp,40" are for.

In the exact code quoted earlier in this thread, those instructions are of course useless.

If the sub rsp,40 had been in its intended place before the instructions that use positive offsets of rsp for storage, then its purpose is to make those positive offsets OK to use.

In user-mode X86-64 code under the Linux ABI, you can safely use a limited amount of stack space (the 128-byte "red zone") via negative offsets from rsp, as long as your function calls nothing else. So you might never need the sub from rsp at all. But if your asm function calls any other functions, you need to adjust rsp to account for your stack use before calling another function.

If you change rsp that way, you need to restore it before returning, so that is the purpose of the add rsp,40.
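The negative-offset case can be sketched as follows, assuming GCC or Clang on x86-64 Linux, where the ABI reserves 128 bytes below rsp (the "red zone") for a function that calls nothing else. The name sum_via_red_zone and the check function are hypothetical. The asm is AT&T syntax, and the #else branch is only a C stand-in for other platforms.

```c
#include <assert.h>

/* Hypothetical leaf function: spills both arguments below rsp,
   reloads them, and returns their sum. */
long sum_via_red_zone(long a, long b);

#if defined(__x86_64__) && defined(__linux__)
__asm__(
    ".pushsection .text\n"
    ".globl sum_via_red_zone\n"
    ".type sum_via_red_zone, @function\n"
    "sum_via_red_zone:\n"
    "    movq %rdi, -8(%rsp)\n"    /* store below rsp, no sub needed */
    "    movq %rsi, -16(%rsp)\n"
    "    movq -8(%rsp), %rax\n"    /* reload first argument */
    "    addq -16(%rsp), %rax\n"   /* add second argument */
    "    ret\n"                    /* rsp never changed, nothing to undo */
    ".size sum_via_red_zone, .-sum_via_red_zone\n"
    ".popsection\n"
);
#else
/* Plain C stand-in for other platforms (an assumption of this sketch). */
long sum_via_red_zone(long a, long b)
{
    return a + b;
}
#endif

/* Small self-check of the leaf function. */
void check_red_zone(void)
{
    assert(sum_via_red_zone(2, 3) == 5);
}
```

This only works because the function makes no calls. A call instruction would push a return address at rsp-8 and overwrite the first spilled value, which is exactly the situation where the sub/add pair becomes necessary.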


Quote:

What is "ret 0"? What's the "0" for?
The 0 there is useless. A nonzero value there would change the meaning to a form of return not used by the standard calling convention. See 24594 for specifics.

Without the 0, the ret instruction would perform the same operation as with the 0. With the 0, that asm syntax implies a three byte instruction that does the same job as the usual one byte ret instruction. I don't know whether the nasm assembler optimizes that. There are a few cases where an assembler emits the shorter version of an instruction even when the programmer has coded a functionally equivalent longer instruction.
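The equivalence of the two encodings can be checked with a small sketch, assuming GCC or Clang on x86-64 Linux. To avoid depending on what any assembler does with "ret 0", the three-byte form is emitted directly as the bytes C2 00 00 (near return with a 16-bit immediate of zero), while the other function uses the ordinary one-byte ret. Both function names and the check function are hypothetical.

```c
#include <assert.h>

/* Hypothetical pair: same body, different return encodings. */
long ret_one_byte(void);
long ret_three_bytes(void);

#if defined(__x86_64__) && defined(__linux__)
__asm__(
    ".pushsection .text\n"
    ".globl ret_one_byte\n"
    ".type ret_one_byte, @function\n"
    "ret_one_byte:\n"
    "    movl $1974, %eax\n"
    "    ret\n"                       /* one-byte return, opcode C3 */
    ".size ret_one_byte, .-ret_one_byte\n"
    ".globl ret_three_bytes\n"
    ".type ret_three_bytes, @function\n"
    "ret_three_bytes:\n"
    "    movl $1974, %eax\n"
    "    .byte 0xc2, 0x00, 0x00\n"    /* return and pop 0 extra bytes */
    ".size ret_three_bytes, .-ret_three_bytes\n"
    ".popsection\n"
);
#else
/* Plain C stand-ins for other platforms (an assumption of this sketch). */
long ret_one_byte(void)    { return 1974; }
long ret_three_bytes(void) { return 1974; }
#endif

/* Small self-check: both forms return normally with the same value. */
void check_ret_forms(void)
{
    assert(ret_one_byte() == 1974);
    assert(ret_three_bytes() == 1974);
}
```

A nonzero immediate would make the instruction discard that many extra bytes of stack after popping the return address, which is what callee-cleanup conventions use and what the Linux convention does not.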

resetreset 11-21-2012 07:27 AM

Actually I was asking the OP, not you John :) But anyway, that's bad news, cuz I really prefer to get my info from books, preferably ones written in a conversational style (was spoiled by Michael Abrash :) ).

Also, could you answer my question above about the "sub rsp, 40" and the "ret 0"?

johnsfine 11-21-2012 07:56 AM

Quote:

Originally Posted by resetreset (Post 4834005)
Actually I was asking the OP, not you John

Maybe the OP will also answer. But based on the first post, I would not estimate the OP ever did "pick up 64-bit assembly". I didn't bother to ask how the OP came to be porting asm code he doesn't really understand. I didn't need to know why in order to give appropriate help.

I expect the OP's partial understanding of the subject at some point came from some online tutorial. But maybe there is some book. I've never looked for one.

Quote:

But anyway, that's bad news, cuz I really prefer to get my info from books, preferably ones written in a conversational style
I think the audience for a book on x86-64 assembly would be much smaller than it was (even than it still is) for books on obsolete architecture assembly language.

Very few programmers learn x86-64 assembly and I think most of those are self motivated enough to get what they need from reference manuals.

Surprisingly many college courses still teach asm. But every one of those classes I've ever heard of teaches an obsolete architecture. This sustains the market for books on the obsolete architectures. I assume the instructors are neither self motivated enough nor intelligent enough to learn x86-64 asm themselves. They teach obsolete stuff because that is all they know how to teach.

Quote:

Also, could you answer my question above about the "sub rsp, 40" and the "ret 0"?
I tend to press "submit" part way through typing a post to avoid losing too much in case of a malfunction, then I edit to finish typing. So you saw my post before I finished. Reread my earlier post to see the rest of the answer.

resetreset 11-21-2012 08:02 AM

Gosh, you're a lifesaver John, thanks :)

By the way, may I PM you? I'd like to make friends.... :)

johnsfine 11-21-2012 08:29 AM

Quote:

Originally Posted by resetreset (Post 4834025)
By the way, may I PM you?

OK, but I usually prefer discussions in forums to private discussions. On technical subjects, it is nice to leave a trail that others who search for it later might find.

johnsfine 11-21-2012 09:03 AM

Quote:

Originally Posted by johnsfine (Post 4834023)
But every one of those classes I've ever heard of teaches an obsolete architecture.

It is so easy to use google to prove myself wrong. Maybe I should have tried that google search earlier.

I just found
http://www.cs.uaf.edu/2012/fall/cs30..._21_stack.html

I tried searching something from the first post that I thought would be an exact match in the online tutorial that I was guessing was the source of the OP's coding style. Zero hits!

Trying a slightly looser search for something similar, I still found no online tutorial on x86-64 asm (which doesn't mean it doesn't exist, just that I'm not searching well).

But that looser search did find the above page that seems to be the lecture notes from a lecture in the middle of an asm class that uses (some) x86-64 asm (what I said I'd never heard of). The top level link is
http://www.cs.uaf.edu/2012/fall/cs301/

The start of that series seems to focus on PowerPC asm and only uses x86-64 for contrast. I didn't read all the notes to see how much of the later focus is on x86-64

resetreset 11-22-2012 12:04 AM

John, I PM'ed you..... no reply? Please PM me back, don't email, because I lost the password to that address a long time ago.....


All times are GMT -5.