Can't figure out how to fix "Invalid mix of operators and operands" error in assembly

steviebob · 07-04-2010, 09:34 PM

Hello, I have been trying to learn assembly lately, and I have to say, I like it a bit more than C, but that's just my taste. Now on to the question. Pardon me if I sound like a n00b, but I cannot for the life of me figure out what is wrong with my code here, which is just a simple assembly equivalent of a C/C++ IF statement. I'm trying to create a simple little loop that runs until the variable "num" reaches "10", then prints out ":D" and quits the program. I hope my code will clear things up a bit more: (I get the error on line 20, which I have marked for you)

Code:

;loop from 1 to 10
	global _start
_start:
section .data
	m:	db	":D",10
	mL:	equ	$ - m
	ten:	equ	"10"
	num:	db	"0"
	numL:	equ	$ - num
	one:	equ	"1"
section .text
	mov	AX, num
	cmp	AX, ten
	jne	addOne
	jng	endProg
addOne:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, num
	add	num, 1		;line 20
	mov	edx, numL
	int	80h
endProg:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, m
	mov	edx, mL
	int	80h

	mov	eax, 1
	mov	ebx, 0
	int	80h

A few notes about my code and my system: Yes, I am using Linux (and proud of it!) and I am writing this code in the Intel syntax. I hope I can get this sorted out.
Also, if you could, would you proofread it and suggest any, how could I say, "tricks" for optimizing this code? I am rather enjoying learning assembly and want to get good at it! Many thanks ahead of time,

~Steve AKA The Angry Banker Man

johnsfine · 07-05-2010, 06:34 AM

I assume you're using nasm. I haven't used NASM for anything real in so long I'd forgotten its syntax rules. Meanwhile I knew at one time or other so many different versions of Intel syntax that now they all run together.

I just tried assembling your code with nasm. Even most of the instructions that don't generate errors mean something different from what you intend.

I read the nasm man page. I didn't find any options that look like they get you closer to the masm syntax you seem to be using (but IIRC, what you're doing would be wrong in other ways even if it were masm syntax).

In the default nasm syntax, the instruction you want is properly written

Code:

   add  byte [num], 1

but remember most of your other instructions are also wrong for the default nasm syntax.

If I made the wrong assumption about your assembler, please tell us exactly the command line used to assemble your code.

steviebob · 07-05-2010, 01:08 PM

I modified the code accordingly. Assembling went fine, but linking generated an error. But first, I would like you to elaborate on what you meant by:

Quote:

Originally Posted by johnsfine

I just tried assembling your code with nasm. Even most of the instructions that don't generate errors mean something different from what you intend.

Now for the error code:

Code:

loop.o: In function `_start`:
loop.asm:(.text+0x2):relocation truncated to fit R_386_16 against `data`

johnsfine · 07-05-2010, 03:25 PM

Some of your code is incorrect enough I can't guess what you intended it to do.

Other lines, I might guess and correct.

Code:

	ten:	equ	"10"
	num:	db	"0"

ten is the two byte value "10".
but num is the address of a byte containing '0'.

Code:

	mov	AX, num
	cmp	AX, ten

This tries to move the address num into AX and compare it to the value ten. The compare is OK, but since you are in 32 bit mode you can't move an address into AX. Obviously you intended to move the value stored at num rather than the address. But the value stored at num is only one byte.
mov AX, [num]
would move that byte together with whatever happens to follow it in memory into AX. So that might be what you want, but only after fixing other things.
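
As a rough illustration of loading only the one byte, here is a minimal sketch. It assumes 32-bit NASM on Linux (for example built with nasm -f elf32 and linked with ld -m elf_i386); using the exit status to show the result is only a device for this example, not something from the original program.

```nasm
; Sketch only: load the single byte at num without dragging in its neighbour.
        global  _start

section .data
num:    db      "0"             ; one byte holding the character "0" (0x30)

section .text
_start:
        movzx   eax, byte [num] ; read exactly one byte, zero-extend into EAX
        sub     eax, "0"        ; character code -> numeric value 0
        mov     ebx, eax        ; use that value as the exit status
        mov     eax, 1          ; sys_exit
        int     80h
```

Writing mov al, [num] would also read a single byte, because the size of AL tells the assembler how much to load.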

steviebob · 07-05-2010, 08:52 PM

Oh, okay. I see what you mean. I guess since it was close to one in the morning I wasn't thinking straight and I never got around to looking at the code today, due to family being over. Haha.. and one question: could I assign "num" as a double byte, or should I keep it a single byte, since it will be changing due to the code?

johnsfine · 07-06-2010, 05:58 AM

Certainly you could change num to be two bytes. But

1) What would you want in the two bytes while the value represents a single digit number?

2) How would you increment it? If you increment '9' by any simple method, you get ':', not "10".

steviebob · 07-06-2010, 02:52 PM

Excellent questions. As you should already know, I'm rather new to assembly, so some of this stuff doesn't occur to me that fast. But, I have an answer to your second question:

Since this is a simple program that will increment num by one until it reaches ten, I would just use the 'inc num' command, and then jump back to the beginning of the program, am I correct? As I said earlier, I'm kind of new to assembly, so excuse me if I sound noobish haha.

johnsfine · 07-06-2010, 03:09 PM

Quote:

Originally Posted by steviebob

Since this is a simple program that will increment num by one until it reaches ten, I would just use the 'inc num' command, and then jump back to the beginning of the program, am I correct?

Do you understand the difference between 0 and "0" in nasm?

If you increment 0 up to 9 and then increment it again, the next value is 10.

If you increment "0" up to "9" and then increment it again, the next value is ":".

But I am not saying you should use 0 instead of "0". Because "0" is something you can directly print, while 0 needs to be translated when printing it, in order to get meaningful output.

Quote:

As I said earlier, I'm kind of new to assembly, so excuse me if I sound noobish.

But do you understand some other programming language, such as C? The difference between 0 and "0" in nasm is the same as the difference between 0 and '0' in C ('0' vs. "0" in nasm vs. C is a more complicated question).
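
To make the translation step concrete, here is a minimal sketch of printing a numeric value by converting it to a character first. The names value, buf and bufL are made up for this example, and it assumes 32-bit NASM on Linux and a value between 0 and 9.

```nasm
; Sketch only: keep a numeric value and translate it to a character to print.
        global  _start

section .data
value:  db      7               ; the number 7, not the character "7"
buf:    db      0, 10           ; room for one digit, then a newline
bufL:   equ     $ - buf

section .text
_start:
        mov     al, [value]     ; AL = 7
        add     al, "0"         ; 7 + 0x30 = 0x37, the character "7"
        mov     [buf], al       ; store the printable digit

        mov     eax, 4          ; sys_write
        mov     ebx, 1          ; stdout
        mov     ecx, buf        ; address of the bytes to write
        mov     edx, bufL       ; number of bytes (2)
        int     80h

        mov     eax, 1          ; sys_exit
        mov     ebx, 0
        int     80h
```

Values of 10 and above need more than one digit, so this single add is not enough for them.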

steviebob · 07-06-2010, 04:49 PM

I've messed around with C, yes. Thanks for the answer, anyway! You've helped quite a lot, as I hope to become better at assembly. I have nothing better to do right now, anyway, it's my holiday from work. Haha.

ArthurSittler · 07-06-2010, 07:47 PM

The other comments about the semantics of your code are probably relevant. Please do not give up with assembly, but you may need to consult a tutorial or programmers more familiar with assembly language. Which is why you are here, I guess!

Code:

section .data
	m:	db	":D",10
	mL:	equ	$ - m
	ten:	equ	"10"
	num:	db	"0"
	numL:	equ	$ - num
	one:	equ	"1"

You defined num as an address in data segment. So num is the address of a byte that contains "0" which is 0x30.
I expect numL should equal 0x00000001, because numL was defined as the difference in the data segment counter after you defined num and the location of num. I do not know the exact syntax used in the assembler. It might actually create a nul terminated string 0x30, 0x00 for "0". In that case numL would equal 00000002. Such behavior would be unusual for an assembler.
The assembler might choose some other size of storage for purpose of alignment. Physical memory is actually 32-bit or 64-bit words on most computers today. If you want to access a string of bytes it may be convenient to align the start of the string on a memory word boundary. There is not too much advantage for doing this with x86 architecture because the memory interface in the CPU includes hardware for fiddling with bytes that are not word aligned.

Code:

addOne:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, num
	add	num, 1		;line 20
	mov	edx, numL
	int	80h

In line 20 you tried to add 1 to num. You can not do this because num is a label of an address. A label of an address is a constant that exists at assembly time. You can not add 1 to num because you do not have any destination (the first operand) that can be used to receive the result of the addition.
In the following discussion I am assuming that you were trying to add one to the contents of memory at label num, but there are other possibilities.
The x86 architecture must use a register to manipulate data. Some architectures permit manipulating memory contents by putting the address of the memory location in a register and manipulating the memory contents via this pointer indirection. I have less recent experience with x86 assembly than with some other architectures, but I think it might support such access.
The most likely way to be sure to work is to load the memory contents to a register, manipulate the register, and store the memory contents back to memory. I would replace the code around line 20 as below:

Code:

addOne:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, [num]      ;fetch data from address num
	add	ecx, 1		;add 1 to the data
	mov	edx, numL
	int	80h

johnsfine · 07-06-2010, 08:47 PM

Quote:

Originally Posted by ArthurSittler

I do not know the exact syntax used in the assembler. It might actually create a nul terminated string 0x30, 0x00 for "0".

I didn't know that either. I expected just 0x30 without the null, but I wasn't sure. So when I tested the code with nasm, I used the -l option to create a listing file to see what it actually did.

Quote:

The x86 architecture must use a register to manipulate data.

Not so. The following instruction works

Code:

add byte [num],1

Under some conditions on some models of cpu, an instruction like that might take as long to execute as six simple instructions. Using a register, you could do the same work with three simple instructions. So the single instruction version might take twice as long to execute.

If the objective is to learn asm, using fewer instructions, even if they are slower, exercises a wider range of the instruction set and helps you learn more.

If the objective is to do better than a C compiler at generating fast code, you can't apply any general rules about selecting between few complex instructions and more simple instructions. You need to understand the specific timing of the specific code sequence on the specific model of CPU. Even an expert won't beat the compiler at that for ordinary code. So you need to also know more about the algorithm you are coding than C (even with extra __attributes) can communicate to the optimizer.

Quote:

Some architectures permit manipulating memory contents by putting the address of the memory location in a register and manipulating the memory contents via this pointer indirection. I have less recent experience with x86 assembly than with some other architectures, but I think it might support such access.

It can do that as well.

Code:

mov eax, num
add byte [eax], 1

but there is no advantage to that.

Quote:

I would replace the code around line 20 as below:

I think you misunderstand the intent of the original code there.

steviebob · 07-06-2010, 10:33 PM

Wow, that's quite a mouthful. But I'll try to reply as completely as possible on this.

I am doing this because I am trying to learn asm and want to write fast, efficient code, but I have to start somewhere, no? I only started learning a few days ago, too, haha.

Quote:

Originally Posted by johnsfine

Code:

mov eax, num
add byte [eax], 1

Couldn't I just do

Code:

 mov ecx, num
inc ecx

If what little I understand about asm right now is correct, inc just increases it by one, does it not?

Code:

section .data
	m:	db	":D",10
	mL:	equ	$ - m
	ten:	equ	"10"
	num:	db	"0"
	numL:	equ	$ - num
	one:	equ	"1"

Would this section make more sense if I just got rid of the quotes/would it make it more appropriate for manipulating (addition/subtraction and the like)?

Quote:

Originally Posted by ArthurSittler

The other comments about the semantics of your code are probably relevant. Please do not give up with assembly, but you may need to consult a tutorial or programmers more familiar with assembly language. Which is why you are here, I guess!

Thank you for supporting me, not many people seem to do that, for some reason :/
And yes, I am here for help from more experienced programmers ^^

Here is a revised version that compiled and linked fine, but for some reason doesn't seem to want to do anything.

Code:

;I'm trying to make the equivalent of an "IF" statement in C, I marked the
;revised areas
	global _start
_start:

section .data
	m:	db	":D",10
	mL:	equ	$ - m
	ten:	db	10	;<-------Here
	num:	db	0	;<-------And here
	numL:	equ	$ - num
	one:	equ	1	;<-------And here, too
section .text
	mov	esi, num
	cmp	esi, ten
	jne	addOne
	jng	endProg

addOne:
	mov	eax, 4
	mov	ebx, 1
	mov	ecx, num	;<-------Also here
	inc	ecx		;<-------And here
	mov	edx, numL
	int	80h

endProg:
	mov	eax, 1
	mov	ebx, 0
	int	80h

I tend to really scratch my head and go, "Hm, I wonder what could be wrong, now?" when things like this happen, haha.

Also, thank you and thanks again for helping me. Plus some thanks in advance. I really enjoy programming in asm, it's quite fun, really. Haha.

~Steve

johnsfine · 07-07-2010, 05:17 AM

Quote:

Originally Posted by steviebob

Couldn't I just do

Code:

 mov ecx, num
inc ecx

First, you need to understand that num is an address. [num] is the value stored at that address.

So the code you suggested moves that address into ecx then increments the address in ecx. I assume you wanted to increment the contents.

Quote:

Would this section make more sense if I just got rid of the quotes/would it make it more appropriate for manipulating (addition/subtraction and the like)?

Maybe. But as I said earlier, that makes the values harder to print.

Code:

	mov	esi, num
	cmp	esi, ten

That puts the address num into esi, then compares that address to the address ten. They will never be equal.
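
For comparison, a minimal sketch of comparing the stored values rather than the addresses might look like this. It assumes 32-bit NASM on Linux and keeps ten and num as single bytes, as in the revised code; the labels equal and done, and the use of the exit status, are only for illustration.

```nasm
; Sketch only: compare what is stored at num and ten, not where they live.
        global  _start

section .data
ten:    db      10
num:    db      0

section .text
_start:
        mov     al, [num]       ; AL = the byte stored at num (0)
        cmp     al, [ten]       ; compare with the byte stored at ten (10)
        je      equal           ; taken only when the two values match
        mov     ebx, 1          ; exit status 1: values differ
        jmp     done
equal:
        mov     ebx, 0          ; exit status 0: values are equal
done:
        mov     eax, 1          ; sys_exit
        int     80h
```

Using the byte register AL matters here, because both variables are one byte wide and a 32-bit load would pick up the neighbouring bytes as well.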

Code:

	mov	eax, 4
	mov	ebx, 1
	mov	ecx, num	;<-------Also here
	inc	ecx		;<-------And here
	mov	edx, numL
	int	80h

I haven't looked up the meaning of the system calls your code makes. (I seem to have misplaced the URL where I last found that documentation). So I don't know how much of the above is wrong. I'm sure what you're doing there with ecx is wrong, but I don't know about the rest.

steviebob · 07-07-2010, 08:09 PM

The system call [4] is the write system call, used to write the things onto the screen. ECX is for what to put onto the screen and EDX is how much of it to put onto the screen, the length of whatever is in ECX.

Would I be able to fix some of the code if I put it like this?

Code:

mov esi, [num]
cmp esi, [ten]

From what I am hearing (and understanding, I think) is that the brackets would mean I'm talking about the contents, so it would make more sense?

Also, if the above is true, then would this fix some of the other errors as well?

Code:

	mov	eax, 4
	mov	ebx, 1
	mov	ecx, [num]	;<-------Also here
	inc	[ecx]		;<-------And here
	mov	edx, numL
	int	80h

Here is the list of system calls. I only have experience with 1 and 4, so don't ask me for specifics

ArthurSittler · 07-08-2010, 12:45 AM

This little commentary involves transfer of control in assembly language. Transfer of control involves looping, branching, function call, and return. These correspond to the if, for, while, and function invocation in high level languages. Actually, for now, I will defer function calls.
First, a little note about the status flags. While it is true that cmp sets flags, most other arithmetic instructions affect the flags as well (lea is an exception, and inc and dec leave the carry flag alone). Logical instructions also affect some flags. These flags are what the conditional jump instructions test to decide whether to branch.
Looping for a fixed number of iterations, like the "for" keyword in high level languages, is usually done by setting the number of iterations in the C register (ecx, also called the count register on the x86), then decrementing it and branching back to the start of the loop if it has not reached zero. The x86 includes a loop instruction which does the decrement and branch in a single opcode, but it is commonly avoided on later processors because on many of them it runs slower than a separate dec and conditional jump. The following code snippet loops ten times.

Code:

        
        mov   ecx,10          ;iteration counter in C register
loop1:  ...                   ;code to execute in loop
        ...
        dec   ecx
        jne   loop1
        ...                   ;code following loop

"While" loops are sort of similar, except that they typically use two jumps. The condition is evaluated at the top of the loop. Then a conditional branch skips over the loop if the condition is false. An unconditional branch at the end of the loop returns to the condition evaluation. An improvement in efficiency is to put the condition evaluation at the bottom of the loop and jumping unconditionally into the condition evaluation.

Code:

        
        jmp     check
loop2:  ...                  ;instructions in loop
        ...            
check:  ...                  ;instructions to evaluate condition
        jxx     loop2        ;jump if condition true to repeat
        ...                  ;instructions following while loop

A simple if construct is surprisingly touchy. We typically evaluate the condition and skip across the conditional code if the condition is false! For arithmetic tests, remember that the opposite of a > b is a <= b, not a < b, etc.! Also, if there is an "else" clause, the code executed when the condition is true must end by jumping past the code in the else clause.

Code:

            ...                 ;evaluate condition
            jxx     if_false    ;jump over "then" clause
            ....                ;instructions executed if true
            jmp     if_exit     ;jump over "else" clause
if_false:   ...                 ;instructions in "else" clause
            ...
if_exit:    ...                 ;code following conditional
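
Applied to the program from the start of the thread, the "while" pattern might look roughly like the sketch below. It is only one possible reading of the original intent: it assumes 32-bit NASM on Linux, keeps num as a printable character, and stops after "9" because the next increment gives ":" rather than "10". The labels again and check are made up for this example.

```nasm
; Sketch only: print the characters "0".."9", then ":D", then exit.
        global  _start

section .data
m:      db      ":D", 10
mL:     equ     $ - m
num:    db      "0", 10         ; current digit followed by a newline
numL:   equ     $ - num

section .text
_start:
        jmp     check           ; "while" loop: evaluate the condition first
again:
        mov     eax, 4          ; sys_write
        mov     ebx, 1          ; stdout
        mov     ecx, num        ; address of the digit
        mov     edx, numL       ; digit plus newline
        int     80h
        add     byte [num], 1   ; next character: "0" -> "1" -> ... -> ":"
check:
        cmp     byte [num], "9"
        jbe     again           ; repeat while the character is still a digit

        mov     eax, 4          ; print the closing message
        mov     ebx, 1
        mov     ecx, m
        mov     edx, mL
        int     80h

        mov     eax, 1          ; sys_exit
        mov     ebx, 0
        int     80h
```

The address of num goes into ecx for the write call, while the increment and the comparison use byte [num] to work on the stored character itself.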