syntax converter for inline assembly?

maxreason · 02-04-2008, 10:03 PM

I am porting my 3D simulation/graphics engine from windoze/OpenGL to fedora8-linux/OpenGL (with the eclipse IDE). The latest set of errors is caused by the SIMD/SSE2 inline assembly language I wrote for the most speed-critical routines: (1) multiply two 4x4 matrices; (2) multiply 4x4 matrix times 4x1 vectors; (3) misc.

The inline assembly language syntax is wildly different, which makes me wonder. Has anyone written a converter from conventional "intel" syntax to "gcc" syntax?

osor · 02-05-2008, 11:54 AM

Quote:

Originally Posted by maxreason

The inline assembly language syntax is wildly different, which makes me wonder. Has anyone written a converter from conventional "intel" syntax to "gcc" syntax?

There are two things needed here:

1. Convert “intel” syntax to “AT&T/as” syntax.
2. Convert MSVC inline-assembly to gcc inline-assembly.

The first one can be done quite easily. In fact, you can even use intel syntax (where the destination comes before the source) instead of ATT syntax (where the destination comes after the source) inside gcc inline assembly. For example, this:

Code:

asm("movl %ecx, %eax");

and this:

Code:

asm(".intel_syntax noprefix\n\t"
    "mov eax, ecx\n"
    ".att_syntax prefix");

are equivalent within gcc. Of course other people reviewing your code might get confused, since it is conventional to use AT&T (as) syntax. Also, I am not sure how to accomplish the same using extended inline assembly (where registers usually have two ‘%’ prefixing them). There are also programs which will convert asm source files from one syntax to the other. In this case, you would need to extract all the inline assembly to a real assembly source file, convert it, and then reformat the converted output to be suitable for inlining in gcc.

As for 2., I am not aware of any tool which does this automatically. Personally, I am unfamiliar with using inline assembly in microsoft compilers.

sundialsvcs · 02-05-2008, 06:02 PM

Within, say, the Linux source-code ("arch" subdirectory) there are plenty of examples of asm-routines in gcc.

One issue that you need to be very mindful of in Linux-land is the plethora of architectures that are supported! One version of your "speed-critical routines," available as a configure-option, must be straight-C that is guaranteed to work.

You might also find that this version, with gcc, is fast enough...

I trust that there will be no difficulties using your software with a usual configure/make sequence on my ... PowerPC based Macintosh. Or my 64-bit Intel uber-system.
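The plain-C fallback idea can be sketched roughly as below. This is only an illustration under assumptions: the names `mat4_mul`, `mat4_mul_c`, `mat4_mul_sse2` and the `USE_SSE2_ASM` switch are hypothetical (nothing in this thread defines them), and the matrices are assumed to be row-major arrays of 16 floats.

```c
#include <stddef.h>

/* Hypothetical configure-style switch (an assumption, not from the thread).
   Left at 0, only the portable C path is compiled. */
#ifndef USE_SSE2_ASM
#define USE_SSE2_ASM 0
#endif

/* Portable reference version: out = a * b for row-major 4x4 matrices.
   `out` must not alias `a` or `b`. */
static void mat4_mul_c(const float a[16], const float b[16], float out[16])
{
    for (size_t row = 0; row < 4; ++row) {
        for (size_t col = 0; col < 4; ++col) {
            float sum = 0.0f;
            for (size_t k = 0; k < 4; ++k)
                sum += a[row * 4 + k] * b[k * 4 + col];
            out[row * 4 + col] = sum;
        }
    }
}

/* Dispatcher: the assembly version is used only when it was requested
   and the target is x86; every other build gets the C version. */
static void mat4_mul(const float a[16], const float b[16], float out[16])
{
#if USE_SSE2_ASM && (defined(__i386__) || defined(__x86_64__))
    mat4_mul_sse2(a, b, out);   /* hypothetical asm routine, not shown */
#else
    mat4_mul_c(a, b, out);
#endif
}
```

Keeping the C version around also gives you something to compare the assembly output against when testing.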

Dox Systems - Brian · 02-06-2008, 08:05 AM

Quote:

Originally Posted by osor

In fact, you can even use intel syntax (where the destination comes before the source) instead of ATT syntax (where the destination comes after the source) inside gcc inline assembly.

Does gcc do the conversion, or is that passed through to gas? Not too familiar with the inner workings of the GNU compiler/assembler setup.

Would like to use them, but having grown up and spent many, many years with Intel syntax, I find AT&T quite unwieldy :-) Would be nice to be able to get the tools to work the way I want them to...

osor · 02-06-2008, 11:27 AM

Quote:

Originally Posted by Dox Systems - Brian

Does gcc do the conversion, or is that passed through to gas? Not too familiar with the inner workings of the GNU compiler/assembler setup.

All gcc inline assembly is passed directly to gas. This is the reason you need to have explicit newlines and tabs, because otherwise, you have a bunch of strings concatenated, which is meaningless to gas. So gas itself has the .intel_syntax directive (which is documented in the gas manual).

osor · 02-06-2008, 11:37 AM

Additionally, gcc has the “-masm=intel” switch, which tells gcc to emit its own intermediate assembly output in intel syntax. I am not sure whether there is a clash if you don’t specify this switch to gcc when you do as I showed in post 2. I think a more correct version (if you don’t use that switch) would be to tell gas that the rest of the code should be in att syntax. E.g.,

Code:

asm(".intel_syntax noprefix\n\t"
    "mov eax, ecx\n"
    ".att_syntax prefix" /* This line tells as that the following code (emitted by gcc) is in att syntax */
);

I am not sure if that line is needed or not as I haven’t really tried using intel syntax inside gcc. I just assumed it was possible because it is possible using plain gas.

Edit: that line is required.

osor · 02-06-2008, 12:18 PM

Okay, I did some experimenting with simple inline assembly, and here’s what I found out. I will give four different versions of the same program.

inline1.c

Code:

int main()
{
	asm("movl $1, %ebx\n\t"
	    "movl $1, %eax\n\t"
	    "int $0x80");

	return 0;
}

inline2.c

Code:

int main()
{
	asm(".intel_syntax noprefix\n\t"
	    "mov ebx,1\n\t"
	    "mov eax,1\n\t"
	    "int 0x80\n"
	    ".att_syntax prefix");

	return 0;
}

inline3.c

Code:

int main()
{
	asm(".intel_syntax noprefix\n\t"
	    "mov ebx,1\n\t"
	    "mov eax,1\n\t"
	    "int 0x80"
	    //"\n.intel_syntax prefix"
	);

	return 0;
}

inline4.c

Code:

int main()
{
	asm("mov %ebx,1\n\t"
	    "mov %eax,1\n\t"
	    "int 0x80");

	return 0;
}

The first program is the usual way of doing things with gcc inline assembly. The program will call sys_exit with exit code 1, and will terminate before it ever reaches “return 0”.

The second version will do the exact same thing, but will temporarily change to intel syntax without prefixes, and subsequently change back to att syntax with prefixes.

The third one will change to intel syntax without register prefixes, so in order to use it, you need to pass “-masm=intel”. Without that switch, the code will not assemble correctly. There is a commented line which tells the assembler that the following code has prefixed registers. This line might make the program more correct, but is not necessary, since a prefixed register is still accepted even in noprefix mode.

The fourth program uses the compiler’s -masm=intel switch, which means that it will agree with all the code emitted by the compiler. Notice that I did use prefixes here, since -masm=intel only changes the addressing and operand order, but does not (by itself) remove prefixes. Had I neglected the prefixes, the assembler would have errored out.

So if you followed that, these are some variations on simple gcc inline assembly in various syntaxes. I am not sure how this plays with extended inline assembly (where you normally have doubly-prefixed registers), but I suspect you use the form of the fourth program.

maxreason · 02-08-2008, 03:33 AM

Quote:

Originally Posted by osor

Okay, I did some experimenting with simple inline assembly, and here’s what I found out. I will give four different versions of the same program.
[snip]
So if you followed that, these are some variations on simple gcc inline assembly in various syntaxes. I am not sure how this plays with extended inline assembly (where you normally have doubly-prefixed registers), but I suspect you use the form of the fourth program.

Thanks! Since I have such long stretches of intel-syntax inline assembly in certain places, it will help me considerably to be able to leave the code in intel syntax. Unless it becomes too much effort, I want to keep a single set of code that compiles on both windoze and linux. Though I plan to more-or-less abandon windoze, I like being able to compare performance of the two versions for a couple years, at least.

Here is a question that may (or may not) make the "extended syntax" issue easy for me to deal with.

I only have assembly language in a few places, and every one is a long section of SIMD/SSE2+ math. Nonetheless, the first thing each section does is to load the addresses of 2 or 3 input arrays and 1 or 2 output arrays into registers, along with a couple counts. These are all addresses or values of C variables, assigned to the registers at the beginning of the assembly language section. Thus, all 6 of the x86 registers are clobbered (except ebp and esp, of course).

I really don't know what I'm talking about here, so I want to ask whether the following makes sense.

It seems to me that I can avoid the excruciating hassle of converting ANY of my assembly language to "extended syntax" in one of two ways:

1: At the very end of the section, just add one assembly language statement with a "nop" --- but specify all 6 registers as "clobbered". Not sure whether this means I need to put all the code inside one literal string inside one single asm("xxx") or not, but I would prefer NOT (I prefer doing a mass edit that puts one asm("xxx") on each line of assembly - though frankly, it should be okay either way).

2: These sections of assembly language are large (dozens of lines of assembly), AND typically execute/loop many times before they complete (transforming dozens of vertex coordinates/normal-vectors/tangent-vectors/bitangent-vectors from one coordinate system to another). Therefore, the overhead added to push all 6 registers at the beginning, then pop them back into place at the end imposes trivial overhead --- in which case the whole block of code clobbers no registers at all.

Still, I may need to write a few lines of "extended assembly" at the beginning, since the C variables I need to put into registers are either local variables, or structure elements inside an array element (complex to compute in assembly). But that is a small price to pay, compared to converting the whole pile to "extended assembly".

What say you about these ideas? Oh, I do need to make sure the C compiler doesn't try to jam any other code inside my assembly language OR change where the assembly language code happens inside my function! Hopefully it would not be so naughty as to do such a thing. Or would it?

OOPS. Later worry...

Will the .intel_syntax directive make the assembler correctly interpret all the other aspects of intel assembly language in the code - like array/offset addressing, etc? For example, will it assemble the following lines?

Code:

mov edi, r // r is a local variable
mloopasm: // a label
movddup xmm0, [edx + 0x20] // offset addressing
add edx, 0x20 // hexadecimal value
jnz mloopasm // label reference

osor · 02-08-2008, 08:29 PM

Quote:

Originally Posted by maxreason

Unless it becomes too much effort, I want to keep a single set of code that compiles on both windoze and linux. Though I plan to more-or-less abandon windoze, I like being able to compare performance of the two versions for a couple years, at least.

You can and should keep your code as portable as possible. As sundialsvcs said, always make sure you have a plain-C fallback for your code, and that the inline assembly is conditionally used depending on the architecture. Also, you can use gcc inline assembly where the target is windows (either with a cross-compiler or natively with MinGW-gcc). Always attempt to keep the target space for your program as large as possible without sacrificing ease of implementation.

Quote:

Originally Posted by maxreason

Here is a question that may (or may not) make the "extended syntax" issue easy for me to deal with.

I looked into the extended assembly usage, and it turns out that the role of the ‘%%’ prefixing a register is to let gcc know that it should be passed to the assembler as a literal, single ‘%’ prefix. So if you have prefixing enabled (as in inline1.c and inline4.c), you will need to double-prefix your registers. If you don’t have prefixing enabled (inline2.c and inline3.c), you should be fine with no prefixing whatsoever. The rest of extended assembly should work as intended (in particular, the prefix on a clobbered register is optional, no matter what mode you’re in).
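As a rough illustration of the double prefix, here is a minimal sketch in the default AT&T dialect (so it assumes gcc is run without -masm=intel). The function name `add_one` is made up for the example. The hard-coded register is written as %%eax inside the template, while %0 and %1 are operands that gcc fills in, and the clobber list names the register without any prefix.

```c
/* Hypothetical example function: returns value + 1, routed through a
   hard-coded register to show the %% escape. */
static int add_one(int value)
{
    int result;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("movl %1, %%eax\n\t"   /* %1 = input operand chosen by gcc */
             "addl $1, %%eax\n\t"   /* %%eax reaches gas as %eax */
             "movl %%eax, %0"       /* %0 = output operand chosen by gcc */
             : "=r"(result)         /* outputs */
             : "r"(value)           /* inputs */
             : "eax", "cc");        /* clobbers: no prefix needed here */
#else
    result = value + 1;             /* non-x86 or non-gcc fallback */
#endif
    return result;
}
```

Because eax is listed as clobbered, gcc will not place either operand in it and will preserve any live value it held across the asm statement.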

Quote:

Originally Posted by maxreason

It seems to me that I can avoid the excruciating hassle of converting ANY of my assembly language to "extended syntax" in one of two ways:

1: At the very end of the section, just add one assembly language statement with a "nop" --- but specify all 6 registers as "clobbered". Not sure whether this means I need to put all the code inside one literal string inside one single asm("xxx") or not, but I would prefer NOT (I prefer doing a mass edit that puts one asm("xxx") on each line of assembly - though frankly, it should be okay either way).

2: These sections of assembly language are large (dozens of lines of assembly), AND typically execute/loop many times before they complete (transforming dozens of vertex coordinates/normal-vectors/tangent-vectors/bitangent-vectors from one coordinate system to another). Therefore, the overhead added to push all 6 registers at the beginning, then pop them back into place at the end imposes trivial overhead --- in which case the whole block of code clobbers no registers at all.

Creating extended syntax inline assembly might not be as excruciating as you think (see above). I don’t see 1 working the way you think it does. In particular, if you have your code in normal inline assembly, followed by a single nop in extended assembly (with all registers being clobbered), gcc will only save the registers before the nop, and will not know to do so before the real code.

As for 2, it seems to accomplish what you want, but you could accomplish the same thing by putting everything in extended assembly.

Quote:

Originally Posted by maxreason

Still, I may need to write a few lines of "extended assembly" at the beginning, since the C variables I need to put into registers are either local variables, or structure elements inside an array element (complex to compute in assembly). But that is a small price to pay, compared to converting the whole pile to "extended assembly".

You can also force a particular local variable to occupy a particular register:

Code:

register int *foo asm ("edi");

But I am not sure if this is safe to use with optimization. I think extended inline assembly is safer.

Quote:

Originally Posted by maxreason

What say you about these ideas? Oh, I do need to make sure the C compiler doesn't try to jam any other code inside my assembly language OR change where the assembly language code happens inside my function! Hopefully it would not be so naughty as to do such a thing. Or would it?

I don’t think it would. Overall, I think you should consider using the extended syntax—most of your code will remain the same (if you are using noprefix), and you won’t have to resort to trickery.
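To give an idea of what the extended form buys you, here is a small sketch (AT&T dialect, assuming no -masm=intel) where gcc itself loads the C variables into registers, so no hand-written loads or push/pop pairs are needed. The function `sum_ints` is a made-up stand-in for a real SIMD loop; the operand names in square brackets are chosen just for this example.

```c
/* Hypothetical helper (not from the thread): sums `count` ints. */
static int sum_ints(const int *values, int count)
{
    int total = 0;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* gcc picks the registers for %[p], %[n] and %[sum]; "+r" marks
       each one as both read and written by the asm. */
    __asm__ volatile (
        "testl %[n], %[n]\n\t"      /* skip the loop when count <= 0 */
        "jle 2f\n"
        "1:\n\t"
        "addl (%[p]), %[sum]\n\t"   /* total += *values */
        "add $4, %[p]\n\t"          /* advance pointer by sizeof(int) */
        "decl %[n]\n\t"
        "jnz 1b\n"
        "2:"
        : [sum] "+r"(total), [p] "+r"(values), [n] "+r"(count)
        :
        : "cc", "memory");          /* flags change; memory is read */
#else
    for (int i = 0; i < count; ++i) /* portable fallback */
        total += values[i];
#endif
    return total;
}
```

The numeric local labels (1: and 2:) avoid duplicate-symbol errors if gcc inlines the function more than once.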

Quote:

Originally Posted by maxreason

Will the .intel_syntax directive make the assembler correctly interpret all the other aspects of intel assembly language in the code - like array/offset addressing, etc?

Yes, the different addressing modes are interpreted correctly when using .intel_syntax.

aonks · 06-06-2008, 12:35 PM

can someone please tell me the inline assembly code to find the absolute value of a double variable ??

it is very urgent ...

reply soon

osor · 06-06-2008, 03:50 PM

Quote:

Originally Posted by aonks

can someone please tell me the inline assembly code to find the absolute value of a double variable ??

Since the sign bit (which is set for negative numbers) is the most significant bit of a double (on most machines), you can just do a bitwise AND with a value that has every bit set except that one.
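A minimal sketch of that mask, assuming a 64-bit IEEE 754 double. The function names are made up for the example. The first version is plain C; the second does the same AND with the SSE2 andpd instruction through gcc extended inline assembly (AT&T dialect, so it assumes no -masm=intel) and falls back to the C version where SSE2 is not available.

```c
#include <stdint.h>
#include <string.h>

/* Plain C: clear the sign bit through an integer copy of the bits. */
static double abs_double(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits &= UINT64_C(0x7FFFFFFFFFFFFFFF);   /* everything but the sign bit */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Same operation with inline assembly where SSE2 is enabled. */
static double abs_double_asm(double x)
{
#if defined(__GNUC__) && defined(__SSE2__)
    const uint64_t mask_bits = UINT64_C(0x7FFFFFFFFFFFFFFF);
    double mask;
    memcpy(&mask, &mask_bits, sizeof mask);
    /* "x" asks gcc for an SSE register; %0 is x (read and written),
       %1 is the mask. AT&T order: source first, destination second. */
    __asm__ ("andpd %1, %0" : "+x"(x) : "x"(mask));
    return x;
#else
    return abs_double(x);                    /* fallback without SSE2 */
#endif
}
```

In practice fabs() from math.h usually compiles down to the same single instruction, so the assembly is mostly of interest as an exercise.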

aonks · 06-06-2008, 04:09 PM

i know, i thought of it.
if i and the number as follows:
Number = Number AND $7FFFFFFFFFFFFFFF
.. i'll probably get the result

but i dont have any clue of the syntax ..
could u help me out with that ?

TB0ne · 06-07-2008, 10:23 AM

Quote:

Originally Posted by aonks

i know, i thought of it.
if i and the number as follows:
Number = Number AND $7FFFFFFFFFFFFFFF
.. i'll probably get the result

but i dont have any clue of the syntax ..
could u help me out with that ?

Wow...the very same question, but in another thread.