LinuxQuestions.org


chrstrbrts 11-09-2014 12:41 AM

Coding in Machine Code?
 
Hi,

Is it possible to code in machine code?

There are text editors that record 0's and 1's as 0 and 1 in binary and not in their ASCII binary forms.

Could you use those text editors to write a program in machine code?

If so, how would you run it? There wouldn't be a compiler needed at all.

But, you would somehow have to get it into RAM for execution. How would you do that?

Thanks.

genss 11-09-2014 05:41 AM

yes
binary code is the raw machine instructions
so it's the same as writing pure assembly, just without a translator
here is a bit on the logic behind x86 encoding
here is a bit more academic explanation

you need to write an ELF header to make it runnable in linux
(a.out or a similar format would work too)
on exec() the kernel maps the file's loadable segments into memory and, amongst other things, sets the instruction pointer (RIP on x86-64) to the entry point given in the ELF header
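
To make the ELF header point concrete, here is a minimal sketch (not the poster's own code) that builds a 64-bit ELF executable byte by byte in Python, with no assembler or linker involved. It assumes x86-64 Linux. The load address 0x400000, the exit status 42 and the helper names build_elf and write_executable are arbitrary choices for the illustration.

```python
import os
import struct

LOAD_ADDR = 0x400000   # assumed load address (illustrative choice)
EHDR_SIZE = 64         # size of the ELF64 file header
PHDR_SIZE = 56         # size of one ELF64 program header

# Raw x86-64 machine code: exit(42) through the Linux syscall interface.
CODE = bytes([
    0xBF, 0x2A, 0x00, 0x00, 0x00,  # mov edi, 42   (exit status)
    0xB8, 0x3C, 0x00, 0x00, 0x00,  # mov eax, 60   (syscall number of exit)
    0x0F, 0x05,                    # syscall
])


def build_elf(code: bytes) -> bytes:
    """Return a complete ELF64 image: file header, one PT_LOAD header, code."""
    entry = LOAD_ADDR + EHDR_SIZE + PHDR_SIZE   # code starts right after the headers
    total = EHDR_SIZE + PHDR_SIZE + len(code)
    # Magic, 64-bit class, little-endian, version 1, System V ABI, padding.
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    ehdr = struct.pack(
        "<16sHHIQQQIHHHHHH",
        ident,
        2,          # e_type: ET_EXEC
        0x3E,       # e_machine: EM_X86_64
        1,          # e_version
        entry,      # e_entry: where the instruction pointer starts
        EHDR_SIZE,  # e_phoff: program headers follow the file header
        0,          # e_shoff: no section headers
        0,          # e_flags
        EHDR_SIZE,  # e_ehsize
        PHDR_SIZE,  # e_phentsize
        1,          # e_phnum
        0, 0, 0,    # e_shentsize, e_shnum, e_shstrndx (unused)
    )
    phdr = struct.pack(
        "<IIQQQQQQ",
        1,          # p_type: PT_LOAD
        5,          # p_flags: readable + executable
        0,          # p_offset: map the file from its start
        LOAD_ADDR,  # p_vaddr
        LOAD_ADDR,  # p_paddr
        total,      # p_filesz
        total,      # p_memsz
        0x1000,     # p_align
    )
    return ehdr + phdr + code


def write_executable(path: str) -> None:
    """Write the image to disk and mark it executable."""
    with open(path, "wb") as f:
        f.write(build_elf(CODE))
    os.chmod(path, 0o755)
```

On an x86-64 Linux machine, calling write_executable("tiny") and then running ./tiny followed by echo $? should report the exit status chosen above. The whole file is 132 bytes: 64 bytes of ELF header, 56 bytes of program header and 12 bytes of machine code.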

chrstrbrts 11-09-2014 11:57 AM

Quote:

Originally Posted by genss (Post 5267082)
yes
binary code is the raw machine instructions
so it's the same as writing pure assembly, just without a translator
here is a bit on the logic behind x86 encoding
here is a bit more academic explanation

you need to write an ELF header to make it runnable in linux
(a.out or a similar format would work too)
on exec() the kernel maps the file's loadable segments into memory and, amongst other things, sets the instruction pointer (RIP on x86-64) to the entry point given in the ELF header

Thanks very much.

rtmistler 11-14-2014 12:33 PM

In some ways this is fundamental computer engineering, from the perspective of an electrical engineering curriculum. You learn Boolean logic, then learn how to make gates, then grow that to the point where you design a small CPU on paper, and finally write micro-instructions for it. FPGA people do this a lot too, though it depends on what design tools they have and whether they happen to be writing huge filters or just simple functional blocks.

I'm curious about the motivation for the question. Are you studying systems and thinking about the topic in general? Or were you thinking to get highly efficient execution, and that this is a neat idea because one would not need to go through all the steps of using an IDE, writing all kinds of high-level code, finding libraries to link with, and so on?

Here's an online reference I found: https://filebox.ece.vt.edu/~jgtront/introcomp/

jailbait 11-14-2014 03:17 PM

Quote:

Originally Posted by chrstrbrts (Post 5267030)
Hi,

Is it possible to code in machine code?

How would you run it? There wouldn't be a compiler needed at all.

But, you would somehow have to get it into RAM for execution. How would you do that?

Yes, I have coded in machine code. The reason I did so years ago was to fix software bugs in code that I did not have the source code for. I would run the bug fix by overwriting the buggy portions of the binary program on disk with my bug fix.
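
A minimal sketch of that kind of on-disk patch, in Python. The function name patch_bytes and the example bytes are hypothetical, not taken from any real program. The sketch checks that the bytes being replaced are the ones expected before overwriting them, and insists on a same-length replacement so that nothing after the patch shifts.

```python
def patch_bytes(path, offset, expected, replacement):
    """Overwrite bytes at `offset`, but only if the old bytes match `expected`."""
    if len(expected) != len(replacement):
        raise ValueError("patch must be the same length as the code it replaces")
    with open(path, "r+b") as f:          # read/write, binary, no truncation
        f.seek(offset)
        found = f.read(len(expected))
        if found != expected:
            raise ValueError(f"unexpected bytes at {offset:#x}: {found.hex()}")
        f.seek(offset)
        f.write(replacement)
```

For example, patch_bytes("prog", 3, bytes.fromhex("7405"), bytes.fromhex("eb05")) would turn an x86 conditional jump (je +5) at file offset 3 into an unconditional one (jmp +5), assuming those two bytes really are at that offset in the hypothetical file "prog".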

As to writing a complete program in machine code, that just isn't necessary. I have always used an assembler.

---------------------
Steve Stites

exvor 11-14-2014 04:04 PM

Quote:

There are text editors that record 0's and 1's as 0 and 1 in binary and not in their ASCII binary forms.

Most likely you would want to write and display the raw binary in hex, so any hex editor would let you write something like this. A simpler approach than writing something for an existing OS would be to cobble together enough bits to get a small boot loader OS going in a VM or something. I think I can bang out a bootable 512-byte OS in like 20 min. :P It's not hard, especially if you're relying on BIOS calls.
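
As a rough sketch of the 512-byte idea, the Python below builds a boot sector image from hex digits, much as one would type them into a hex editor. It assumes a legacy BIOS (or a VM emulating one), which loads the first sector and runs it in 16-bit real mode if the last two bytes are 55 AA. The names BOOT_CODE and build_boot_sector and the printed character are illustrative choices.

```python
SECTOR_SIZE = 512

# 16-bit real-mode machine code, written as hex exactly as in a hex editor.
BOOT_CODE = bytes.fromhex(
    "b40e"  # mov ah, 0x0e  ; BIOS teletype output function
    "b700"  # mov bh, 0     ; display page 0
    "b058"  # mov al, 'X'   ; character to print
    "cd10"  # int 0x10      ; call BIOS video services
    "f4"    # hlt           ; stop until the next interrupt
    "ebfd"  # jmp -3        ; jump back to the hlt
)


def build_boot_sector(code: bytes) -> bytes:
    """Pad the code with zeros to 510 bytes and append the 55 AA boot signature."""
    if len(code) > SECTOR_SIZE - 2:
        raise ValueError("code does not fit in one boot sector")
    padding = bytes(SECTOR_SIZE - 2 - len(code))
    return code + padding + b"\x55\xaa"
```

Writing build_boot_sector(BOOT_CODE) to a file such as boot.img and booting it in a VM, for instance with qemu-system-x86_64 -drive format=raw,file=boot.img, should print a single X and then halt.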

schneidz 11-14-2014 06:15 PM

when i was in college some of our pencil-and-paper exams gave us a few lines of Western Design Center 65C02 assembly to translate into the equivalent machine instructions. it's been many years, but i think an example would be:
Code:

assembly    machine
lda #05  ->  a905
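
The same pencil-and-paper translation can be sketched as a lookup table in Python. This is only an illustration covering a handful of instructions, and the names OPCODES and assemble are made up for it. On the 6502 family the opcode depends on the addressing mode: load-immediate is A9, while A5 is the zero-page form that loads from address $05.

```python
# Opcode bytes for a few 6502/65C02 instructions, keyed by (mnemonic, addressing mode).
OPCODES = {
    ("lda", "imm"): 0xA9,   # load accumulator with a constant
    ("lda", "zp"): 0xA5,    # load accumulator from a zero-page address
    ("lda", "abs"): 0xAD,   # load accumulator from a 16-bit address
    ("sta", "zp"): 0x85,    # store accumulator to a zero-page address
    ("sta", "abs"): 0x8D,   # store accumulator to a 16-bit address
    ("rts", "impl"): 0x60,  # return from subroutine
}


def assemble(line):
    """Translate one line such as 'lda #$05' into its machine-code bytes."""
    parts = line.lower().split()
    mnemonic = parts[0]
    if len(parts) == 1:                       # no operand: implied addressing
        return bytes([OPCODES[(mnemonic, "impl")]])
    operand = parts[1]
    if operand.startswith("#$"):              # immediate hex constant
        return bytes([OPCODES[(mnemonic, "imm")], int(operand[2:], 16)])
    digits = operand.lstrip("$")
    value = int(digits, 16)
    if len(digits) <= 2:                      # one-byte address: zero page
        return bytes([OPCODES[(mnemonic, "zp")], value])
    # Two-byte address: absolute, stored low byte first.
    return bytes([OPCODES[(mnemonic, "abs")], value & 0xFF, value >> 8])
```

With this table, assemble("lda #$05").hex() gives "a905" and assemble("lda $05").hex() gives "a505".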


chrstrbrts 11-15-2014 10:11 AM

Quote:

Originally Posted by rtmistler (Post 5269790)
I'm curious about the motivation for the question. Are you studying systems and thinking about the topic in general? Or were you thinking to get highly efficient execution, and that this is a neat idea because one would not need to go through all the steps of using an IDE, writing all kinds of high-level code, finding libraries to link with, and so on?

I'm simply trying to learn everything I can about computers, and I have a ton of questions.

NevemTeve 11-15-2014 11:19 AM

First try C language. Then Assembly. And after that machine code. The order is important

chrstrbrts 11-15-2014 01:38 PM

Quote:

Originally Posted by NevemTeve (Post 5270182)
First try C language. Then Assembly. And after that machine code. The order is important

Thanks for the advice.

I'm finishing Java right now. I decided to make Java my first language, but I think that that was a mistake.

I've read critiques regarding the common approach to learning programming by starting with Java. The argument is that Java and other high level languages do too much for you, that they obscure the inner workings of the machine and hide the software-hardware interaction.

I've certainly noticed this. I have the Java 8 runtime environment, and it's huge. There are over 1,000 predefined classes with countless methods and instance variables.

I feel like I have no idea how to really control the machine because all I'm doing is calling predefined methods to do my dirty work for me.

I've gone too far to stop here, though. When I finish, I'm moving over to C.

Why do you say that I should start with C and then move "down" to machine?

Why not start at the "bottom" with machine and then move up to C?

Thanks.

Pearlseattle 11-15-2014 02:49 PM

Quote:

Why not start at the "bottom" with machine and then move up to C?

I'm most probably not qualified to answer this question, but please let me give it a try: I think it's about the risk of wasting your time.
I would personally stick with NevemTeve's advice as the less risky option ("First try C language. Then Assembly. And after that machine code.")

First learn C and play around with memory addresses/pointers, bit operators, etc.
Already at this point you might get bored, or get excited and decide to take a different route. E.g. you might like buffer overflows and therefore take the "security" path, or you might like compiler optimizations and therefore take the "performance" path, etc.
There are a lot of temptations.

If, after learning C, you're still set on continuing your quest, you'll end up with an assembler (nasm, etc.).
At this point, looking at registers and optimized routines, you WILL start asking how processors/platforms are supposed to work, you will end up comparing things with other architectures like MIPS, SPARC, WHATEVER, and you will end up choosing your favourite one.
Only at this point will you know where your heart is really pointing :)

This way, if you stop at any point, you can still use the knowledge gathered so far for the above layers.

genss 11-16-2014 01:25 PM

some nice examples to start assembly
http://cs.lmu.edu/~ray/notes/nasmexamples/
google has more
fasm comes with similar examples, other assemblers probably do too

talking only to the kernel can be hard when coming from higher level languages
for example writing strings of text to stdout instead of calling printf()
so you can call C libraries to make it easier

http://www.tortall.net/projects/yasm...-registers.png as a reference for registers (it makes sense)

64-bit is better for beginners
http://www.logix.cz/michal/devel/amd64-regs/ for calling conventions (amd64 linux, C and kernel)
http://www.x86-64.org/documentation/abi.pdf as the proper documentation of the System V amd64 calling conventions
long story short, integers and pointers go in general purpose registers while floats go in xmm registers
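
A simplified sketch of that rule in Python, for illustration only. It covers just plain integer/pointer and floating-point arguments under the System V amd64 convention (six general purpose registers and eight xmm registers, then the stack) and ignores structs, varargs and return values. The name assign_registers is made up for the example.

```python
# Argument registers of the System V amd64 C calling convention, in order.
INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
FLOAT_REGS = [f"xmm{i}" for i in range(8)]


def assign_registers(args):
    """Pair each argument with the register it would be passed in."""
    ints, floats = iter(INT_REGS), iter(FLOAT_REGS)
    placement = []
    for arg in args:
        # Python floats stand in for C float/double; everything else is
        # treated as an integer or pointer.
        pool = floats if isinstance(arg, float) else ints
        # Once a register class is used up, further arguments go on the stack.
        placement.append((arg, next(pool, "stack")))
    return placement
```

So assign_registers([1, 2.5, 3]) gives [(1, 'rdi'), (2.5, 'xmm0'), (3, 'rsi')]. Kernel syscalls differ slightly: the syscall number goes in rax and the fourth argument in r10 instead of rcx.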

http://en.wikipedia.org/wiki/X86_instruction_listings for most of the instructions available

in a built kernel source tree the file arch/x86/include/generated/uapi/asm/unistd_64.h is a list of syscall numbers (and syscalls); it is generated during the kernel build
in the source, include/uapi/linux/ has a bunch of header files with numbers for various definitions

for 32-bit linux there is nice documentation of most of the calls
http://fresh.flatassembler.net/lscr/

the nasm, yasm and fasm assemblers have good documentation
yasm is a rewrite of nasm with different error output, which can be useful for beginners (they accept the same source files)
fasm has better macros (and is faster)


oh yeah
you don't have to follow anyone's coding style
for example i put data before code and don't indent much
look at the fasm source code for yet another style

chrstrbrts 11-16-2014 02:28 PM

Thanks, guys.

rtmistler 11-17-2014 12:28 PM

Quote:

Originally Posted by chrstrbrts (Post 5270154)
I'm simply trying to learn everything I can about computers, and I have a ton of questions.

I'd recommend you try the suggestions offered related to C and assembly language and then determine what areas interest you the most.

Someone mentioned that they coded in machine code to fix a bug in a program they didn't have the source for. My thinking is that unless you're doing something specialized (like that example), checking how the compiler translated a piece of code, or programming an embedded processor directly, there may be little need to use assembly.

Not saying it isn't something to be aware of and understand, but I am saying that it is a very involved topic. A person can write an assembly language program and get it to run, but I'm thinking that it takes a bit of a learning curve to get to a point where you're writing productive code that does more than simple experimentation.

I've long since moved away from assembly because there are plenty of compilers which work acceptably well. The only reasons I've used assembly in recent memory are to determine how some code was operating, because I could not discern why its behavior wasn't what I expected, or to add inline assembly because I needed to write to an exact memory address. And adding inline assembly was also very long ago; it was more that my peers and I didn't yet trust C well enough.

The last case would be some embedded micro where they have a lousy compiler or no compiler. That's rare for me, I guess because our company is not going to use a micro so unsupported as that. Funnily enough in reverse, we would insist that we got the instruction set (assembly instructions and register map) so that we could cross check questionable behaviors where we weren't sure what the compiler did.

Sorry for the length. I realize what I'm saying is that we debug in assembly; we don't write in assembly. At least that's true of the teams I work with and the projects we do.

jefro 11-17-2014 09:03 PM

MenuetOS

http://www.menuetos.net/


At one time, writing in machine code was all one could do.


All times are GMT -5.