Programming
This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
yes
binary code is just the raw machine instructions
so it's the same as writing pure assembly, only without a translator. here is a bit on the logic behind x86 encoding, and here is a slightly more academic explanation
you need to write an ELF header to make it runnable in Linux
(a.out or a similar format would work too)
on exec() the kernel maps the file into RAM and, among other things, sets the instruction pointer (RIP) to the entry point given in the ELF header
In some ways this is fundamental Computer Engineering, from the perspective of an Electrical Engineering curriculum. You learn boolean logic, then how to build gates, grow that to the point where you design a small CPU on paper, and finally micro-instructions. FPGA people do this a lot too, but it depends on what tools they have for design and whether they happen to be writing huge filters or just simple functional things.
I'm curious about the motivation for the question. Are you studying systems and thinking about the topic in general? Or were you thinking to get highly efficient execution, and that this is a neat idea because one would not need to go through all the steps of using an IDE, writing all kinds of high-level code, finding libraries to link with, etc.?
How would you run it? There wouldn't be a compiler needed at all.
But, you would somehow have to get it into RAM for execution. How would you do that?
Yes, I have coded in machine code. The reason I did so years ago was to fix software bugs in code that I did not have the source code for. I would run the bug fix by overwriting the buggy portions of the binary program on disk with my bug fix.
As to writing a complete program in machine code, that just isn't necessary. I have always used an assembler.
There are editors that store the 0s and 1s you type as actual bits, not as the ASCII characters '0' and '1'.
Most likely you would want to write and display the raw binary in hex, so any hex editor would let you write something like this. A simpler approach than writing something for an existing OS would be to cobble together enough bits to get a small bootloader OS going in a VM or something. I think I could bang out a bootable 512-byte OS in like 20 min. :P It's not hard, especially if you're relying on BIOS calls.
when i was in college, some of our pencil-and-paper exams gave us a few lines of Western Design Center 65C02 assembly to translate into the equivalent machine instructions. it's been many years, but i think an example would be:
I'm curious about the motivation for the question. Are you studying systems and thinking about the topic in general? Or were you thinking to get highly efficient execution, and that this is a neat idea because one would not need to go through all the steps of using an IDE, writing all kinds of high-level code, finding libraries to link with, etc.?
I'm simply trying to learn everything I can about computers, and I have a ton of questions.
First try the C language. Then assembly. And after that, machine code. The order is important.
Thanks for the advice.
I'm finishing Java right now. I decided to make Java my first language, but I think that was a mistake.
I've read critiques regarding the common approach to learning programming by starting with Java. The argument is that Java and other high level languages do too much for you, that they obscure the inner workings of the machine and hide the software-hardware interaction.
I've certainly noticed this. I have the Java 8 runtime environment, and it's huge. There are over 1,000 predefined classes with countless methods and instance variables.
I feel like I have no idea how to really control the machine because all I'm doing is calling predefined methods to do my dirty work for me.
I've gone too far to stop here, though. When I finish, I'm moving over to C.
Why do you say that I should start with C and then move "down" to machine?
Why not start at the "bottom" with machine and then move up to C?
Why not start at the "bottom" with machine and then move up to C?
I'm most probably not qualified to answer this question, but please let me give it a try: I think it's about the risk of wasting your time.
I would personally stick with NevemTeve's advice being the less-risky option ("First try C language. Then Assembly. And after that machine code.")
First learn C and play around with memory addresses/pointers, bit operators, etc.
Already at this point you might get bored or excited and decide to take a different route. E.g. you might like buffer overflows and therefore take the "security" path, or you might like compiler optimizations and take the "performance" path, etc.
There are a lot of temptations.
If, after learning C, you're still stubborn enough to continue your quest, you'll end up with an assembler (nasm, etc.).
At this point, looking at registers and optimized routines, you WILL start asking how processors/platforms actually work, and you'll end up comparing things with other architectures like MIPS, SPARC, whatever, and choosing your favourite one.
Only at this point will you know where your heart is really pointing.
This way, if you stop at any point, you can still use the knowledge gathered so far for the above layers.
talking directly to the kernel can be hard when coming from higher-level languages
for example writing strings of text to stdout instead of calling printf()
so you can call C libraries to make it easier
in the kernel source the file arch/x86/include/generated/uapi/asm/unistd_64.h is a list of syscall numbers (and syscalls)
in the source include/uapi/linux/ there is a bunch of header files with numbers for various definitions
nasm, yasm and fasm assemblers have good documentation
yasm is a clone of nasm with different error output, which can be useful for starters (they accept the same input files)
fasm has better macros (and is faster)
oh yeah
you don't have to follow anyone's coding style
for example i put data before code and don't indent much
look at the fasm source code for yet another style
I'm simply trying to learn everything I can about computers, and I have a ton of questions.
I'd recommend you try the suggestions offered related to C and assembly language and then determine what areas interest you the most.
Someone mentioned that they coded in machine code to fix a bug where they didn't have the source, but were able to fix the problem by those means. My thinking is that unless you're doing something specialized (like that example), checking how the compiler translated a set of instructions, or programming an embedded processor directly, there may be little need to use assembly.
Not saying it isn't something to be aware of and understand, but I am saying that it is a very involved topic. A person can write an assembly language program and get it to run, but there's a learning curve before you're writing productive code that does more than simple experimentation.
I've long since moved away from assembly because there are plenty of compilers that work acceptably well. The only reasons I've used assembly in recent memory are to determine how some code was operating when I couldn't discern why its behavior wasn't what I expected, or to add inline assembly because I needed to write to an exact memory address. And adding inline assembly was also very long ago; it was more that my peers and I didn't yet trust C well enough.
The last case would be some embedded micro with a lousy compiler or no compiler at all. That's rare for me, I guess because our company is not going to use a micro that unsupported. Funnily enough, in reverse, we would insist on getting the instruction set (assembly instructions and register map) so that we could cross-check questionable behaviors where we weren't sure what the compiler did.
Sorry for the length; I realize what I'm saying is that we debug in assembly, we don't write in assembly. At least on the teams I work with and the projects we do.