Programming
This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
If the computer itself only understands binary, then where does the * hex * come in?
You could describe the "machine language" for the instruction int 0x80 as "11001101 10000000", but that string of ASCII characters representing a sequence of ones and zeros isn't really the binary as it exists in the computer. It is a human-readable representation of the binary.
Machine language programmers discovered long ago that hex numbers make a better "human-readable binary" representation than ASCII ones and zeroes.
Once you have a little experience working with hex numbers, you know that C is 1100, that D is 1101, that 8 is 1000, etc., so when your eyes see CD80 your mind can instantly understand 1100110110000000. But when your eyes see 1100110110000000 your mind can't get it as quickly, because it runs together too much.
So an experienced machine language programmer would say CD80 is the machine language for "int 0x80".
Quote:
Originally Posted by empcrono
in a manner of speaking all the "symbolic Language" would be then is a very long list of aliases.
Is that a good idea of what "Assembly Language" is?
That is the most obvious, but not the most important, advantage of Assembly Language over Machine Language. Location labels and all the associated mechanism are more important.
This all brings me to a different thought I've had. When I was a very small child, the first computer I was introduced to only ran one program at a time. You had to stick one of those very big floppies in, boot the computer fresh, and the computer ran that program. Every time you wanted to use the computer for something else, you had to turn it off, stick in the other floppy disk for the other program, and boot anew.
The reason I bring all this up is that "Programming from the Ground Up" and so many other sources talk about Unix, Linux, Windows and so on. But back in the day, on those very first computers I'm speaking of, what did they use? I realize people are not mind readers, but I am sure many different computers ran the same way: as I've said, one program at a time, booted fresh from its own floppy.
My Question amounts to this: Was each program its own OS? If that is the case would Assembly give me the ability to do that, if I so chose to?
My Question amounts to this: Was each program its own OS? If that is the case would Assembly give me the ability to do that, if I so chose to?
I am going to lean towards "no".
If you're referring to systems like the Commodore 64, then I know where you are coming from. The operating system on those computers was extremely simple and, more than likely, was hard-coded to initialize the hardware for use and then boot a binary file off of the floppy. It is the same principle as the "old" cartridge video game consoles like the ColecoVision, Nintendo Entertainment System, Sega Genesis, TurboGrafx-16, etc.
I don't think it was considered an operating system at that time, though - it might better be described as firmware in an EEPROM, in which case the computer would start and the EEPROM code would run the binary, which would have been written in the assembly language for that processor and then run through an assembler.
At the core, digital electronics is really just two electrical logic levels. For we humans to be able to think about, document, and converse about this arrangement of logic levels, we assign names to them: '1' & '0'. These are the basic binary digits (bits, for short). A collection of eight ordered bits is called a byte. In order to describe a byte, we can describe each of its bits in order, such as '00110101'. Depending on the context, this might also be represented as a decimal, rather than binary, notation. The example byte would, by the usual conventions, have the decimal value '53'. Other ways of representing an ordered collection of 8 bits might be to use hexadecimal or octal notation. In these cases, our example byte would be '35' or '065' respectively. No matter what the representation, it is still the same value, and at the digital logic level, it is still the same collection of discrete logic states.
Why have I explained all of this? It is fundamental to the basis for the term 'binary machine code'. A program stored in memory is a series of addressable bytes, each one having eight ordered bits, and each bit is identified to the CPU as an electrical logic level on a bus conductor. A program is composed of a collection of such bytes, and the way we represent them depends on the context in which we wish to view them. At some level, it makes sense to see them as discrete '1's and '0's. At a slightly higher level, they are better represented as bytes, or multi-byte words. In the context of machine instructions, we can group the bytes in arrangements that we can correlate with the way the CPU will use them as instruction code and data. Eventually, we can abstract the numerical representation completely out of the picture, and assign names like 'mov', 'jump', 'shift', 'add', 'push', etc. to certain bytes or sequences of bytes. At the same time we can assign names to numeric addresses that are used as target addresses. We do this because it is easier for us to think about symbolic names than purely numeric values. This is the domain of the 'assembler', where we write code that is basically one level removed from the bare numeric machine opcodes and data bytes. An assembler can usually create a form of listing file that shows the numeric instruction code and data that it has generated from your assembler code. Studying this can be revealing in understanding how the CPU sees the binary data as sensible code.
These are concepts which I feel do not get enough explanation, and in fact are often explained wrongly. Understanding them should provide some insight into the original poster's question. Sorry for having gotten so long-winded about the whole thing.
--- rod.
Edit: Much of this message was explained or exemplified in other posts while I was composing this. Sorry for any duplication.
Nice! I think I may have started to get it now. Still, if the computer itself only understands binary, then where does the * hex * come in?
I think that you still haven't totally digested and understood the consequences of one of the earlier comments:
Quote:
Binary is just a representation of data.
If, just for the moment, you consider something abstract which I'll call 'the data' you can represent that in a number of different ways.
If, for example, you chose to represent that in binary, you would have a simple 'list' of the voltages that you would see on an oscilloscope if you could (easily) wire up an oscilloscope to look at the voltage on the bus. You might, for example, regard this as meaning that binary had a reality because of its close association to voltages (although note that this only applies with circuits that have two states and even then data is frequently inverted).
But you do have to be aware that this close mapping of binary to voltages might not be the most important thing to you in many contexts.
It may be more important to have something that you can work with easily; in this case, you might find that you prefer to represent 'the data' in hexadecimal; it's more compact, it's easier to remember, it's much easier to spot when one bit changes and it's much easier to do math with (when you get it... actually, this is an oversimplification; if you try to do operations like XOR, you might prefer to look at binary).
Now, at the danger of confusing you for a short while, consider the Z80 (and similar devices, like the 8080). This is an ancient 8/16 bit processor and everyone used to use hex when they wrote programs for it. However, in this case, if you look at the opcodes (instructions) in octal, which is not the conventional thing to do, you can see how the processor works. Because the instruction set uses two groups of bits to specify 'source register, or register pair' and 'destination register, or register pair', taking this unconventional approach to the instruction set lets the processor's selection of which register to use jump off the page at you in a way that just doesn't happen if you look at the instruction set in hexadecimal.
So, the point I'm trying to get across is this; there is the data. You choose how to represent it. If you don't choose to represent it in the most convenient way (whatever that is in a particular context, and that may not be clear to you before you start), you unnecessarily make your life harder. Why would you want to do that?
Quote:
But any way I'm beginning to see the usefulness of "Assembly Language", (i think).
Something I will leave as a more advanced exercise for the interested reader (I'm hoping that will be you, empcrono) is to create a simple C 'Hello World' program (see almost any introductory book on C programming). You can compile that with GCC (other compilers exist... although I don't know why) and look at the assembler generated by the compiler. You'll have to look at 'man gcc' to see how it works, which might be a distraction just right now.
Comparing that with the C is interesting, as would be comparing it with the code posted by bgeddy. You may not want to do that just now, but if this is part of a learning experience, it would be a good idea to try it sometime soon.
My Question amounts to this: Was each program its own OS? If that is the case would Assembly give me the ability to do that, if I so chose to?
In the sense that there was code with a standard API/ABI that provided system-specific services to access the hardware, there was an OS, but in ROM/EPROM. In today's PCs we call this a BIOS. Operating systems such as DOS made use of some BIOS services, but protected-mode OS's cannot. Computers of the sort that you describe did not have sophisticated, if any, filesystems, process models, peripheral expansion capability, networks, etc. A ROM based OS was adequate for such a system.
--- rod.
Just as an aside, as mentioned, it's just different representations of the same thing at different levels. This means that in some C compilers (certainly on the old Z80 and 808x systems) you could embed asm (assembly) in the middle of the C code, something like this:
Code:
if( i < 1 )
{
#asm        ; compiler directive: asm starts here
    <some asm code lines>
#endasm     ; compiler directive: asm ends here, and back to C
}
If you read the various Wiki articles, this stuff will become clear; see also the ASCII table I mentioned in my previous post.
BTW, these days normally only device drivers (and other embedded code) are written in assembly; otherwise we use C or similar.
Thank you! Every one gave good info and I'm still digesting it.
My question then is: what is the bare minimum required to program in? Not for the sake of ease, or because doing it one way or another doesn't make sense for such-and-such a reason, but really, what is at the bottom of it all?
I want to know how, NOT just in theory: I want to be able to actually look at the machine instruction set for a CPU and then start programming using nothing but the native binary.
I think there is tremendous value in knowing how the computer really works, in more than an abstract way.
How to think about programming in purely binary terms for the sake of the math: that is, I hope to achieve the ability to read binary that a computer spits out and understand it.
I'm looking for MATERIAL that would bring me to namely this: how to comprehend and understand computer binary, or at the very least bring me closer to that goal.
Fine, if you really want this, then try this: http://www.instructables.com/id/How-...o-World!%22-e/ http://blog.codecall.net/component/m...llo-World.html
It is applicable only for making *.COM files on Windows or MS-DOS, but you can do the same by running debug.exe in DOSBox. You might need the Windows or MS-DOS version of debug.exe for that. Programs made this way will work on Windows, MS-DOS or DOSBox only.
And here are Intel instruction sets with corresponding hex codes:
You can also try to find a Z80 emulator and program within that one. The Z80 instruction set is much simpler and smaller, and it will still give you some low-level practice. Or try to find an x86-only instruction set reference - it will be smaller still and easier to comprehend.
Ok, I will explain why you don't want to do it.
The "direct binary approach" makes sense in ROM programming, microcontroller programming, or a real-time OS, because in those situations you don't have to handle linking with OS-specific functions and remembering their addresses (which may change), arguments, etc.
Also, the format of an executable file is not simple. That is, an executable file doesn't contain "machine code" exactly as it is stored in memory. It has headers, a few tables used for finding the addresses of OS-specific functions, the sizes of segments, etc. You don't want to type all this by hand - it would be complicated, and there would be a high chance of error. For example, the MS-Windows "PE" executable format is not simple, and there are several versions. The same probably applies to Linux *.so files.
However, there is one exception:
The only executable format that CAN be used for typing binary directly is the *.com file used in MS-DOS. It has no headers (or nearly no headers), and it is loaded into memory exactly as it is stored. You can program in it using debug.exe (Windows/MS-DOS only; you could use DOSBox, though), or by entering codes in a hex editor.
The only executable format that CAN be used for typing binary directly is the *.com file used in MS-DOS.
You can write bootloader code this way, too. I forget the exact commands, but you can coerce debug to barf a chunk of memory to the boot sector of a floppy disk. Just use debug to write a few, okay a bunch, of lines of machine code, write it out to the boot sector of a floppy, and voilà: a 'bootable' floppy.
You could also do the same thing in Linux with a hex editor and dd. This actually sounds like what the original poster was trying to work toward. It's also the way real men used to do things (we've grown up since then).
Another thing that is possible (and I've done this in the absence of a sane method) is to hand-write, with a text editor, an Intel hex format file that has your machine code, and use that to burn an EPROM. Properly formatted, the EPROM can be launched at boot time if plugged in to a suitably equipped expansion card, such as a network card with an EPROM socket. That's pretty low-level, too.
Yet another down-and-dirty thing I've done is used Dallas Semiconductor battery-backed RAM chips wired to a couple or three printer ports, and a bit of bit-bashing code to turn the parallel ports into a poor-man's PROM burner (for DS RAMs, at least). Pretty sure I never hand-coded the hex file when I did that, though.
In all of these cases, one can use nothing more than primitive editor style tools and some other primitive tools to create code, and transfer the code to an executable media. Ahh, the nostalgia....
And are you sure that what you posted in the first message is true binary and not a string of BASIC tokens?
jlinkels
Who are you asking? If it's me: I'm not sure. I took my understanding, and the source of what I quoted, somewhat for granted, but not wholeheartedly: therefore I posted here and asked questions.
The thing that differentiates a calculator from a computer is the IF statement, or, speaking in assembler, jp z, 003e4F: i.e. the ability to make a decision, to go left or right, based ON something.
If you want to learn machine code, I highly recommend Michael Abrash's Graphics Programming Blackbook - I think it may be available as a free download on the net somewhere, take a look.
Q: Have you had a chance to look at "Programming from the Ground Up" yet? Has it answered any of your questions?
Q: Did you ever get an answer to your question about a "Monitor program" (like the ROM software on the computers you ran as a child, that only ran one program at a time) vs an "Operating System" (like Windows or Linux), and why, in both cases, "program" <> "OS"?
"A microprocessor," like the now-ubiquitous Intel x86, "is a machine that knows how to execute simple instructions very fast." (Millions per second.)
For any microprocessor, there is a specific set of instructions (the "instruction set") that this particular processor knows how to execute.
Each instruction is represented, in the memory of the computer, by a string of one-or-more numbers. A single instruction might be represented by one number, or by a string of 16 or more numbers. (It entirely depends upon the microprocessor.)
The name of the game, here, is: "there are lots and lots of 'em (who cares...), and the chip can execute them very, very fast."
When we're writing computer software, though, we don't want to work at that level of detail. "That's what computers are for." We want to express our thoughts in a way that is natural for us, and let the computer generate the corresponding list of instructions, for the computer to execute. It turns out that computers are very, very good at that.
"What is natural for us, and appropriate to the job at hand," of course depends upon who we are and precisely what we're doing. There are many programming-language tools (and many non-language tools, such as spreadsheets) which are designed to allow humans to harness the computer to a particular task and to accomplish different kinds of work. (Or for that matter, entertainment.)
For example, the "assembler" and "monitor" programs, that some folks have spoken of, are tools that are designed to help you work with "all those strings of numbers," at a machine-specific and instruction-specific level. You'd use these tools when you are, as they say, "riding the pony bareback... yee haw."
At pretty-much the opposite extreme, we've got tools like the programming-language Java, which is supposed to let you run an identical program on widely-different kinds of machines ... "take it from your PC and run it on your telephone ("plunk!") without changing a single line of code."
(Well, that's the idea, anyway ... )
When the dust settles, in every case "microprocessors are running through millions of instructions per second." The processor in your PC is nothing like the one in your telephone (maybe...) but conceptually it's the same sort of device ("a microprocessor") doing the same thing in the same way. Executing tiny instructions, very fast.
Last edited by sundialsvcs; 02-03-2009 at 09:58 AM.