Programming
This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
If anyone could help? I am trying to go about learning "Machine Language" and I am at a loss.
Machine Language is "Machine code or machine language is a system of instructions and data executed directly by a computer's central processing unit" (wiki)
I am under the impression that at the very core of it all the computer works in binary: zeros and ones? So why is it that when programming in "Machine Language" you would end up with something like this?
Code:
Machine Language
169 1 160 0 153 0 128 153 0 129 153 130 153 0 131 200 208 241 96
BASIC
5 FOR I=1 TO 1000: PRINT "A";: NEXT I
These two programs both print the letter "A" 1000 times on the screen.
Is there a difference between programming in "Machine Language" and programming in binary? I would have thought the two were the same.
Any help would be great! Thanks in Advance!!!
BTW: Any location of good material or the way of good direction would be so very appreciated. Thank you so much!!!
Well, indeed actual machine code is binary, but for humans we usually use 'assembler' code with symbolic mnemonics; numeric listings like your example just show the same bytes in decimal or hexadecimal. However, it's also loosely referred to as machine code sometimes.
You may find this table relevant: http://www.asciitable.com/
Here's a good article: http://en.wikipedia.org/wiki/Assembler_code
Binary is just a representation of data. Programs are not written by typing in "0101010111011001". You might, however, if you wanted to write in machine language, write in decimal or hexadecimal like you have displayed.
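To make that concrete: the opening bytes of the machine-language listing in the original post are the same values whether you spell them in binary, decimal, or hex. A quick sketch (Python is used here purely for illustration, not as part of the original discussion):

```python
# The first four codes from the 6502 example program, as ordinary integers.
codes = [169, 1, 160, 0]

# Each one can be displayed in binary, decimal, or hexadecimal; the byte
# sitting in memory is the same either way.
for c in codes:
    print(f"binary {c:08b} = decimal {c:3d} = hex {c:02X}")
# binary 10101001 = decimal 169 = hex A9
# binary 00000001 = decimal   1 = hex 01
# binary 10100000 = decimal 160 = hex A0
# binary 00000000 = decimal   0 = hex 00
```

The base you choose changes only the printout, never the byte.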
I think that us old timers (pre-PC) had it a lot easier learning assembly language. The processors were simpler back in the Apple and Commodore days. Fewer registers in the 6502 (if I remember the chip name correctly), and we didn't have to mess with little-endianness. Today, you look at a debugger or hexdump, and you have to swap long words around, then words around, and then the bytes, to read the actual number. Back then you would load or save memory directly, instead of 2 or 3 levels of indirection.
But then having 1 GHz instead of 1 MHz clock speeds does have its advantages.
It may help us to know why you want to learn machine language.
If your interest is philosophical and any machine will do, you might check out some of the simple microcontrollers that have simulators. I was recently doing some assembly programming of an ATtiny25. This is a $2.00 computer in an 8-pin package.
There are free tools which allow you to assemble/compile and simulate. The simulation is very nice for following the results of each command.
Some general definitions (as I was taught):
Machine Language -- raw binary, usually shown in hexadecimal, which controls a computer.
Assembly Language -- a symbolic language which translates directly into the binary or hexadecimal. Example: "CLR A" might become 0x12.
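A minimal sketch of that "translates directly" idea, assuming nothing beyond a lookup table. The 6502 opcode values below are real; the Python wrapper around them is purely illustrative:

```python
# A toy "assembler" table: each mnemonic is just a name for one opcode byte.
# These are genuine 6502 values: LDA immediate is 0xA9 (decimal 169), INY is
# 0xC8 (200), RTS is 0x60 (96) -- all of which appear in the example program.
OPCODES = {
    "LDA #": 0xA9,  # load accumulator, immediate operand
    "INY":   0xC8,  # increment the Y register
    "RTS":   0x60,  # return from subroutine
}

def assemble(mnemonic):
    """Translate one mnemonic into its machine-code byte."""
    return OPCODES[mnemonic]

print(hex(assemble("LDA #")))   # 0xa9
```

Real assemblers add operand encoding, labels, and address fix-ups, but the heart of the job is exactly this symbolic-name-to-number translation.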
The x86 assembly language isn't my favorite, but there are worse ones.
And, if we go further/deeper, one may remember that modern CISC processors are internally RISC ones, so there is not only the outer machine code, but also the inner one.
Oh boy, the OP must be completely confused now :-).
You don't want to program using pure binary codes. Typing in hex codes is certainly possible, and some people can even do that, but there is little sense in doing so. If you want "machine programming", learn assembler instead - it will make things easier.
Well, somewhat expanding the subject.
Somebody also has to design the CPUs - while doing this the person has to deal with machine code, implementing the hardware in an HDL.
An HDL is, essentially, a parallel programming language having constructs which easily map onto hardware.
Well known examples of HDLs are Verilog and VHDL.
Last edited by Sergei Steshenko; 01-22-2009 at 05:51 AM.
Quote:
Originally Posted by Sergei Steshenko
An HDL language is, essentially, a parallel programming language having constructs which easily map onto HW.
Sorry, HDL is outside of my area of expertise.
I just meant that typing in hex codes for development makes little sense. People who have spent a lot of time debugging and disassembling will eventually remember some codes, but this will be more useful for hacking software than for making it.
Working with binary directly will make sense only when people want to program certain old/simple CPUs - various microcontrollers, the Z80, maybe ROM programming, etc.
I did some programming in this way on the old Russian BK-0010SH machine (typing in octal codes), but a modern CPU instruction set is much larger and more complicated than the BK-0010SH instruction set, and in protected mode, with a complicated system API, working with binary directly makes little sense.
Thank you! Everyone gave good info and I'm still digesting it.
My question then is: what is the bare minimum required to program in? Not for the sake of ease, or because doing it one way or another does not make sense for such-and-such a reason, but really: what is at the bottom of it all?
I want to, NOT just in theory, know how, but to be able to actually look at the machine instruction set for a CPU and then start programming using nothing but the native binary.
I think there is tremendous value in knowing how the computer really works, in more than an abstract way.
How do I think about programming in purely binary terms, for the sake of the math? That is, I hope to achieve the ability to read binary that a computer spits out and understand it.
I'm looking for MATERIAL that would bring me to namely this: how to comprehend and understand computer binary, or at the very least bring me closer to that goal.
Take a look at Jonathan Bartlett's excellent "Programming from the Ground Up"; the .pdf is publicly licensed as a free download, and it deals with *precisely* the kind of (very good!) questions you're asking:
If you want "machine programming", learn assembler instead - it will make things easier.
I understand how, from the point of view of someone who doesn't yet know assembler, the above advice may sound like an unnecessary side track.
But you should trust the experts on this one.
1) When you think you want to learn Machine language, what you actually want to learn is probably assembler.
2) If you really really want to learn machine language, it will take less total time and effort to first learn assembler and then learn machine language, than to try to learn machine language without learning assembler.
Quote:
Originally Posted by empcrono
I'm looking for MATERIAL that would bring me to namely this: How to comprehend and understand computer binary
If you understand even the basic concepts of assembler programming (even far short of knowing it well enough to write useful programs) you should be able to read the instruction set documentation for the CPU and from that understand the computer binary machine language.
Almost all cpu instruction set documentation includes the binary encoding of the instructions. That tells you in human readable form what the machine language is.
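As an illustration of such documented encodings: the 6502 (the CPU behind the decimal listing earlier in the thread) lays out its most common "group one" instructions in an aaabbbcc bit pattern, where aaa selects the operation and bbb the addressing mode. Pulling the fields out of an opcode byte is just shifting and masking (Python used only for illustration; the field values are from the published 6502 encoding):

```python
def decode_6502(opcode):
    """Split a 6502 'group one' opcode into its aaabbbcc bit fields."""
    aaa = (opcode >> 5) & 0b111   # operation selector (e.g. 0b101 = LDA)
    bbb = (opcode >> 2) & 0b111   # addressing mode (e.g. 0b010 = immediate)
    cc = opcode & 0b11            # instruction group (0b01 = group one)
    return aaa, bbb, cc

# 0xA9 (decimal 169, the first byte of the example program) is LDA immediate:
print(decode_6502(0xA9))   # (5, 2, 1)
```

Reading the instruction set manual alongside a decoder like this shows you exactly what each bit of the "computer binary" means.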
To understand what is going on, you should research how a CPU operates at the level of the machine instruction. Each CPU will have its own instruction set, timing, bus elements and behavior, etc.
In general, the CPU fetches an instruction (a byte or multi-byte word) from a memory address that is specified on the address bus (derived from a program counter register in the CPU). The byte/word that is fetched would be one created by you as a programmer, either through a high-level language compiler, an assembler, or manually by assembling the binary data somehow. The instruction opcode tells the CPU how to proceed, and is probably followed by more instruction fetches, and/or some data fetch that completes the instruction. The instruction is then executed, and the next bytes in the list are then fetched and executed, ad infinitum. The instructions can be classified into general categories, such as data moves, arithmetic & logic, branching & jumping, and some others. Most, if not all, CPUs will have instructions that fall into these categories, and once one learns something about the classes of instructions, it is not difficult to transfer that knowledge to other instruction sets.
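The fetch/decode/execute cycle described above can be sketched in a few lines. The three opcodes here are invented for illustration and do not belong to any real instruction set:

```python
# A minimal fetch-decode-execute loop for an imaginary CPU with one
# accumulator register. Opcodes (made up): 1 = LOAD immediate,
# 2 = ADD immediate, 0 = HALT.
def run(memory):
    acc = 0   # accumulator register
    pc = 0    # program counter
    while True:
        opcode = memory[pc]          # fetch the instruction byte
        if opcode == 0:              # decode + execute: HALT
            return acc
        operand = memory[pc + 1]     # fetch the operand byte
        if opcode == 1:              # LOAD immediate
            acc = operand
        elif opcode == 2:            # ADD immediate
            acc += operand
        pc += 2                      # advance to the next instruction

# "LOAD 5; ADD 7; HALT" as raw bytes sitting in memory:
program = [1, 5, 2, 7, 0]
print(run(program))   # 12
```

Everything a real CPU does is an elaboration of this loop: wider registers, more opcodes, pipelining, and interrupts, but the same fetch-decode-execute rhythm.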
I think a good way to understand what is going on would be to sit down with a source-level debugger that allows you to observe the disassembled object code to the machine instruction level, and single-step through a code fragment. While doing so, consulting the CPU instruction set documentation would be highly instructive. This can be followed up by writing some assembler code, assembling and debugging it in the same way. At some point you may reach a level of proficiency where you can modify the binary object code on the fly using the debugger, and perform bug fixes, add functionality, etc. Using the debugger to browse through memory, decomposing it in various ways will help reveal how a program is composed of code and data, and something about the relationship between them.
Modern CPUs running protected-mode OS's are less friendly for this kind of activity than, say, real-mode DOS PCs or other smaller hosts running smaller CPUs. MS-DOS running in real mode, using 'debug', can be instructive, although it is not a source-level debugger. gdb may also allow you to poke around in the CPU's workings in similar ways. If so, it would be a convenient tool.
Once you get to the point of being able to put a sequence of bytes into memory, and have the CPU execute them in a way that does what you want, you will have accomplished your goal. In the meantime, come back here for help with specific problems.
---- rod.
Here are some examples to help illustrate the concepts involved. First we have a little program written in assembler. This is in nasm syntax and may be assembled with nasm - the Netwide Assembler, which comes with Slackware. This is for an i386 architecture; other architectures have different dialects. Here I have called the file "hello.asm".
Code:
section .text
global _start
_start:
mov edx,len    ; length of the message
mov ecx,msg    ; address of the message
mov ebx,1      ; file descriptor 1 (stdout)
mov eax,4      ; syscall number 4 (sys_write)
int 0x80       ; call the kernel
mov ebx,0      ; exit status 0
mov eax,1      ; syscall number 1 (sys_exit)
int 0x80       ; call the kernel
section .data
msg db "Hello World!",0x0a
len equ $ - msg
Then you may run :
Quote:
nasm -f elf hello.asm
ld -o hello hello.o
to create an executable named hello (on a 64-bit system add -m elf_i386 to the ld command). If you create a listing with :
Code:
nasm -l hello.lst hello.asm
You will see the hexadecimal values of the generated machine instructions. Here is the file :
The first column is the source file line number, the second column the machine address offset, and the third is the hexadecimal representation of the machine code. The rest is the mnemonics that are called assembly language. Sometimes assembly language and machine language are used to refer to the same thing.
The next step is to understand what's going on when the processor executes an instruction. This is part of the process of learning assembler, and I think this is what you want to do. Working out hex values for instructions in your head is not practical and serves no purpose. The study of assembly language and its related fields (CPU/memory/IO architecture, etc.) is very valid in my opinion.
Machine Language -- raw binary, usually shown in hexadecimal, which controls a computer.
Assembly Language -- a symbolic language which translates directly into the binary or hexadecimal. Example: "CLR A" might become 0x12.
Nice! I think I may have started to get it now. Still, if the computer itself only understands binary, then where does the hex come in? But anyway, I'm beginning to see the usefulness of "Assembly Language" (I think). I must confess that I had this in my mind's eye: I would theoretically write programs by creating a "symbolic language" myself, then create a central file of some sort and somehow map the created "symbolic language" to its intended binary or "Machine Code" equivalents; in a manner of speaking, all the "symbolic language" would be is a very long list of aliases.
Is that a good idea of what "Assembly Language" is?
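That "long list of aliases" intuition is close to what the simplest assembler actually does. Here is a toy sketch (Python for illustration; the opcode values are real 6502 ones, everything else is made up):

```python
# The "long list of aliases" idea in miniature: a table of names for bytes,
# and a loop that swaps each name for its number. Real assemblers add labels,
# operand encoding, and address fix-ups on top of exactly this.
ALIASES = {
    "LDA#": 0xA9,   # real 6502 opcodes used as the example values
    "LDY#": 0xA0,
    "RTS":  0x60,
}

def translate(source):
    """Turn a list of symbolic tokens into machine-code bytes."""
    out = []
    for token in source:
        if token in ALIASES:
            out.append(ALIASES[token])   # a mnemonic: look up its opcode
        else:
            out.append(int(token))       # otherwise treat it as a literal byte
    return out

print(translate(["LDA#", "1", "LDY#", "0", "RTS"]))
# [169, 1, 160, 0, 96] -- compare with the machine-language listing above
```

Notice that the output bytes match the start and end of the decimal machine-language program from the original post.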
Quote:
Originally Posted by Sergei Steshenko
Well, somewhat expanding the subject.
Somebody also has to design the CPUs - while doing this the person has to deal with machine code implementing the HW in an HDL language.
An HDL language is, essentially, a parallel programming language having constructs which easily map onto HW.
Well known examples of HDLs are Verilog and VHDL.
This may be what I was aiming at or at least the right direction. Thanks
Quote:
Originally Posted by paulsm4
Hi, Empcrono -
Take a look at Jonathan Bartlett's excellent "Programming from the Ground Up"; the .pdf is publicly licensed as a free download, and it deals with *precisely* the kind of (very good!) questions you're asking:
Not what I had in mind, but nonetheless it may prove useful. I think maybe --------------------------->
Quote:
Originally Posted by johnsfine
I understand how, from the point of view of someone who doesn't yet know assembler, the above advice may sound like an unnecessary side track.
-------------------> right.
Thanks everyone. The "Programming from the Ground Up" suggestion and everyone's replies were really helpful. I'll keep at it!!
Quote:
Originally Posted by bgeddy
The next step is to understand what's going on when the processor executes an instruction. This is part of the process of learning assembler and I think this is what you want to do. Working out hex values for instructions in your head is not practical and serves no purpose. The study of assembly language and its related fields (CPU/memory/io architecture etc) are very valid in my opinion.
You posted just before I did. Anyway, that was really helpful. But I'm not sure that "Working out hex values for instructions in your head is not practical and serves no purpose" is true. But I take it that "CPU/memory/io architecture etc) are very valid" is true. I want to figure out the relationship between the two. After all, in the end the computer is just an over-glorified calculator, right? I mean, when Charles Babbage and others created their computers, it was just for calculation. Are modern computers just the same: just ways of calculating?
Last edited by khronosschoty; 01-22-2009 at 02:53 PM.