08-17-2009, 08:46 AM
|
#1
|
Member
Registered: Jul 2009
Distribution: Ubuntu 9.04
Posts: 328
|
A question about big-endian little-endian and how it affects things
I'm missing something.
I've been reading "Programming from the Ground Up", and I got to this section of it ...
Here is what it says about little-endian:
Quote:
The byte-switching magic happens automatically behind the scenes during register-to-memory transfers. However, the byte order can cause problems in several instances:
• If you try to read in several bytes at a time using movl but deal with them on a byte-by-byte basis using the least significant byte (i.e. - by using %al and/or shifting of the register), this will be in a different order than they appear in memory.
• If you read or write files written for different architectures, you may have to account for whatever order they write their bytes in.
• If you read or write to network sockets, you may have to account for a different byte order in the protocol.
As long as you are aware of the issue, it usually isn’t a big deal.
|
Can someone explain in simple terms what this means to someone programming in assembly?
I've been reading this book, and I looked at a link that was in this book about it for a few minutes, and I found this here, at LinuxQuestions:
http://www.linuxquestions.org/questi...ghlight=endian
I know more than what is in this thread^
I know that the least significant byte is first in memory, the order of the bytes is reversed.
I just want a simple explanation. I'm not really grasping yet how this affects things.
Last edited by joeBuffer; 08-17-2009 at 08:59 AM.
|
|
|
08-17-2009, 09:15 AM
|
#2
|
Senior Member
Registered: May 2005
Posts: 4,481
|
Quote:
Originally Posted by joeBuffer
I'm missing something.
I've been reading "Programming from the Ground Up", and I got to this section of it ...
Here is what it says about little-endian:
Can someone explain in simple terms what this means to someone programming in assembly?
I've been reading this book, and I looked at a link that was in this book about it for a few minutes, and I found this here, at LinuxQuestions:
http://www.linuxquestions.org/questi...ghlight=endian
I know more than what is in this thread^
I know that the least significant byte is first in memory, the order of the bytes is reversed.
I just want a simple explanation. I'm not really grasping yet how this affects things.
|
Maybe this article helps: http://en.wikipedia.org/wiki/Endianness
|
|
|
08-17-2009, 09:21 AM
|
#3
|
Member
Registered: Jul 2009
Distribution: Ubuntu 9.04
Posts: 328
Original Poster
|
Quote:
• If you try to read in several bytes at a time using movl but deal with them on a byte-by-byte basis using the least significant byte (i.e. - by using %al and/or shifting of the register), this will be in a different order than they appear in memory.
|
I don't understand what this is saying, exactly.
Is this as simple as it seems?
Say a 32-bit register is put into memory. Then you use that memory a byte at a time. It's in the reverse order?
But if you use the 32 bits, it gets switched around automatically?
Also, the endianness is different for different things, so you have to take that into consideration?
I understand the fundamental idea here, I'm mainly wondering if this is as simple as it seems, or if this book is talking about something more complex.
Really it's almost a silly question, because I think I understand all of this, and it's very simple. I just wanted to make sure, and have a simple explanation.
The most confusing thing to me is, normally, you wouldn't do this, would you? You'd have a symbol or whatever (in your assembly code), and that would be the location of the data you want to retrieve from memory, and it all gets worked out automatically?
Last edited by joeBuffer; 08-17-2009 at 09:28 AM.
|
|
|
08-17-2009, 09:29 AM
|
#4
|
Senior Member
Registered: May 2005
Posts: 4,481
|
Quote:
Originally Posted by joeBuffer
I don't understand what this is saying, exactly.
Is this as simple as it seems?
Say a 32-bit register is put into memory. Then you use that memory a byte at a time. It's in the reverse order?
But if you use the 32 bits, it gets switched around automatically?
Also, the endianness is different for different things, so you have to take that into consideration?
I understand the fundamental idea here, I'm mainly wondering if this is as simple as it seems, or if this book is talking about something more complex.
Really it's almost a silly question, because I think I understand all of this, and it's very simple. I just wanted to make sure, and have a simple explanation.
|
It is not as simple as it seems because of bit endianness on top of byte endianness - see the Wiki article above.
|
|
|
08-17-2009, 09:35 AM
|
#5
|
Member
Registered: Jul 2009
Distribution: Ubuntu 9.04
Posts: 328
Original Poster
|
This entire book is on x86 assembly.
Like it says, say you have a word in a register, and you move it to memory ... it does this:
[byte0][byte1][byte2][byte3] <- register
[byte3][byte2][byte1][byte0] <- memory
if you read the whole word into a register, it will be worked out automatically. So when you read from the memory into the register, it ends up looking like this again:
[byte0][byte1][byte2][byte3] <- register
but if you read it in a byte at a time, you'll be reading it in from memory in this order:
[byte3][byte2][byte1][byte0] <- memory
Correct?
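A rough way to check this picture in C, as a sketch rather than anything taken from the book: `load_le32` below models what a 32-bit load does on a little-endian CPU such as x86, and the two `peel_*` helpers take the bytes back out of the register value from either end. All three function names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models a 32-bit load (movl) on a little-endian CPU such as x86:
   the byte at the lowest address becomes the least significant byte. */
uint32_t load_le32(const uint8_t *mem)
{
    assert(mem != NULL);
    return (uint32_t)mem[0]
         | ((uint32_t)mem[1] << 8)
         | ((uint32_t)mem[2] << 16)
         | ((uint32_t)mem[3] << 24);
}

/* Peel bytes off the low end, as with %al followed by a right shift by 8. */
void peel_from_low(uint32_t reg, uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(reg & 0xFF);
        reg >>= 8;
    }
}

/* Peel bytes off the high end, i.e. in the order the hex digits are written. */
void peel_from_high(uint32_t reg, uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(reg >> 24);
        reg <<= 8;
    }
}
```

With the memory bytes cd ab 23 01, `load_le32` gives 0x0123abcd. Peeling from the low end returns cd, ab, 23, 01 (the memory order), while peeling from the high end returns 01, 23, ab, cd (the order in which the hex digits are written).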
|
|
|
08-17-2009, 09:35 AM
|
#6
|
Member
Registered: Jun 2007
Location: Bavaria
Distribution: slackware, xubuntu
Posts: 143
|
I think they're just making it more difficult than it actually is. They could have simply written: if the byte order of the stored data differs from the CPU's (e.g. big-endian storage with an Intel CPU), you have to do the byte swapping yourself, and you can't rely on the bytes being addressable in the right order when you read them in chunks of more than one byte at a time.
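Doing the swap yourself might look like the following minimal sketch. The names `swap16`, `swap32` and `swap32_array` are hypothetical helpers, not functions from any library mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Reverse the four bytes of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}

/* Swap every element of an array in place, e.g. after reading a block of
   big-endian words from a file on a little-endian machine. */
void swap32_array(uint32_t *vals, size_t n)
{
    assert(vals != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        vals[i] = swap32(vals[i]);
}
```

Swapping twice gives back the original value, so the same helper works in both directions (reading foreign data and writing it).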
|
|
|
08-17-2009, 09:46 AM
|
#7
|
LQ Guru
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
|
Hi -
This might also help:
Code:
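#include <stdio.h>

/* Assumes a 16-bit short and a 32-bit long, as on 32-bit x86 Linux.
   On LP64 systems long is 64 bits and the DWORD store below would
   overrun buf; use the <stdint.h> fixed-width types there. */
typedef unsigned char BYTE;
typedef unsigned short WORD;
typedef unsigned long DWORD;

int
main (int argc, char *argv[])
{
  int i;
  BYTE buf[4];
  /* Three views of the same four bytes (the casts are for illustration only) */
  BYTE *b = &buf[0];
  WORD *w = (WORD *)&buf[0];
  DWORD *d = (DWORD *)&buf[0];

  *d = 0x0123abcd;                  /* store one 32-bit value */
  for (i=0; i < 4; i++)             /* read it back a byte at a time */
    printf ("b[%d]: 0x%02x...\n", i, b[i]);
  for (i=0; i < 2; i++)             /* then as two 16-bit words */
    printf ("w[%d]: 0x%04x...\n", i, w[i]);
  printf ("d: 0x%x...\n", d[0]);    /* and as one 32-bit value */
  return 0;
}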
Sample output:
Quote:
b[0]: 0xcd...
b[1]: 0xab...
b[2]: 0x23...
b[3]: 0x01...
w[0]: 0xabcd...
w[1]: 0x0123...
d: 0x123abcd...
|
|
|
|
08-17-2009, 10:06 AM
|
#8
|
Member
Registered: Jul 2009
Distribution: Ubuntu 9.04
Posts: 328
Original Poster
|
Right, exactly what I was thinking.
The way it's worded is just a little funny. Obviously understandable, or I wouldn't have gotten it right. I just wanted verification, mainly, and an opinion on whether or not it is more complex than it seems. I'm gonna mark this as solved.
P.S. Using -Wall, the code example you have gives a warning but not an error.
Code:
joebuffer@ubuntu:~/clang$ gcc -Wall test.c -o test
test.c: In function ‘main’:
test.c:22: warning: format ‘%x’ expects type ‘unsigned int’, but argument 2 has type ‘DWORD’
Using a typecast to unsigned int works:
Code:
printf("d: 0x%x...\n", (unsigned int)d[0]);
Thanks.
|
|
|
08-20-2009, 05:59 AM
|
#9
|
LQ Guru
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733
|
I often wish that Intel had never invented this problem. It becomes important when you are saving files or communicating with another system. With 8-bit ASCII there isn't a problem, but when characters are saved in a multi-byte encoding scheme, there is. A UTF-16 file typically starts with a BOM (byte order mark), the code point U+FEFF (0xFEFF). The order in which its two bytes appear reveals the endianness used by the system that saved the file. It also makes working with text utilities difficult, because now the file starts with a short binary blob.
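As a small sketch of how a reader could use the BOM, assuming the data is UTF-16 in the first place: the enum and the function name `detect_utf16_bom` are invented for this example and do not come from any particular library.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BOM_NONE, BOM_UTF16_LE, BOM_UTF16_BE } bom_kind;

/* Look at the first two bytes of a buffer and report which UTF-16 byte
   order mark, if any, is present. U+FEFF is stored as FF FE by a
   little-endian writer and as FE FF by a big-endian writer.
   Limitation: a UTF-32LE BOM (FF FE 00 00) also starts with FF FE,
   so this check alone cannot tell those two encodings apart. */
bom_kind detect_utf16_bom(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len < 2)
        return BOM_NONE;
    if (buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;
    if (buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;
    return BOM_NONE;
}
```

A caller would run this on the first bytes of the file and skip two bytes when a BOM is found.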
If one computer uses services of another computer, they both need to agree on the endianness used. Sometimes the protocol used enforces this.
If you play a video, the audio stream may use s16le or s16be samples. Your player needs to detect this so the two bytes of every sample don't get switched.
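To illustrate what goes wrong when the two bytes get switched, here is a minimal sketch of decoding one signed 16-bit sample in either byte order. The function names are hypothetical, and real players handle this inside their decoding libraries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Turn a 16-bit pattern into a signed value without relying on
   implementation-defined narrowing conversions. */
static int16_t to_signed16(uint16_t u)
{
    return (int16_t)(u >= 0x8000u ? (int32_t)u - 0x10000 : (int32_t)u);
}

/* s16le: the first byte in the stream is the least significant one. */
int16_t decode_s16le(const uint8_t *p)
{
    assert(p != NULL);
    return to_signed16((uint16_t)(p[0] | (p[1] << 8)));
}

/* s16be: the first byte in the stream is the most significant one. */
int16_t decode_s16be(const uint8_t *p)
{
    assert(p != NULL);
    return to_signed16((uint16_t)((p[0] << 8) | p[1]));
}
```

The same two bytes fe ff decode to -2 as s16le but to -257 as s16be, which is why a wrong guess turns audio into noise.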
|
|
|
08-20-2009, 06:03 AM
|
#10
|
LQ Newbie
Registered: Aug 2009
Posts: 1
|
How can I debug a C program built with gcc, the way the F7 key is used in Turbo C/C++? Are there any options for that?
kindly reply
|
|
|
08-20-2009, 06:12 AM
|
#11
|
Senior Member
Registered: May 2005
Posts: 4,481
|
Quote:
Originally Posted by ahshan m d
How can I debug a C program built with gcc, the way the F7 key is used in Turbo C/C++? Are there any options for that?
kindly reply
|
And what does your question have to do with the thread ?
|
|
|
08-20-2009, 08:06 AM
|
#12
|
LQ Guru
Registered: Dec 2007
Distribution: Centos
Posts: 5,286
|
Quote:
Originally Posted by jschiwal
I often wish that Intel had never invented this problem.
|
Intel certainly did not invent that problem. There were both computers using least significant first and computers using most significant first long before Intel existed.
I'm not sure, but I think the problem was really invented when someone copied the decimal numbering system from a language written right to left into a language written left to right, but kept the least significant digit (which had been first) on the right, so it became last.
Positional numbering was invented using base 60 with the least significant digit on the right, used with a language that was read right to left, so the least significant digit was originally first. Sometime later it was switched to base 10, and sometime after that it was adopted into languages that are read left to right. Somehow that got us to our belief that the most significant digit in a number is naturally first and that LSB-first is backwards.
So you get nonsense such as what the OP quoted from "Programming from the Ground Up" about byte swapping between memory and the CPU in ordinary operations. None of that happens. The Intel design is consistently LSB: Least significant bit first in each byte, Least significant byte first in each larger object, consistently in the CPU and in memory and in external interfaces.
Quote:
It becomes important when you are saving files or communicating with another system.
|
Correct. The byte swapping becomes an issue when you must communicate using a standard based on a different byte ordering.
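One common way to handle such a standard, sketched here under the assumption that the format calls for 32-bit big-endian fields: build and read the bytes with shifts, so the code behaves the same on any host. `put_be32` and `get_be32` are names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v into out[0..3] most significant byte first (big-endian),
   independent of the byte order of the machine running the code. */
void put_be32(uint8_t *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Read a big-endian 32-bit value back from in[0..3]. */
uint32_t get_be32(const uint8_t *in)
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24)
         | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)
         |  (uint32_t)in[3];
}
```

Because the shifts operate on values rather than on memory layout, no explicit "am I little-endian?" check is needed. On POSIX systems, `htonl` and `ntohl` from `<arpa/inet.h>` serve a similar purpose for whole 32-bit values.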
|
|
|
08-20-2009, 01:02 PM
|
#13
|
Member
Registered: May 2002
Location: new hampshire
Distribution: Fedora, RHEL
Posts: 600
|
Actually, little endian on some architectures with some programs is faster since low numbers are already "there" instead of the ALU having to parse the first 16 bits or so. Not a big deal with processors that are fast nowadays, but long ago there were systems that had small integers and didn't need to use the "top" bits of the storage unit.
|
|
|