08-17-2009, 08:46 AM   #1
joeBuffer
Member (Registered: Jul 2009; Distribution: Ubuntu 9.04; Posts: 328)

A question about big-endian, little-endian, and how it affects things


I'm missing something.
I've been reading "Programming from the Ground Up", and I got to this section of it ...
Here is what it says about little-endian:
Quote:
The byte-switching magic happens automatically behind the scenes during register-to-memory transfers. However, the byte order can cause problems in several instances:
• If you try to read in several bytes at a time using movl but deal with them on a byte-by-byte basis using the least significant byte (i.e. - by using %al and/or shifting of the register), this will be in a different order than they appear in memory.
• If you read or write files written for different architectures, you may have to account for whatever order they write their bytes in.
• If you read or write to network sockets, you may have to account for a different byte order in the protocol.

As long as you are aware of the issue, it usually isn’t a big deal.
Can someone explain in simple terms what this means to someone programming in assembly?
I've been reading this book, and I spent a few minutes on a link about this that the book gives, and I found this thread here at LinuxQuestions:
http://www.linuxquestions.org/questi...ghlight=endian
I already know more than what is in that thread.
I know that the least significant byte comes first in memory, so the order of the bytes is reversed.
I just want a simple explanation. I'm not really grasping yet how this affects things.
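To make the first bullet concrete, here is a minimal C sketch (not from the book; the buffer contents are made up, and it assumes a little-endian x86 host like the one the book targets). memcpy() stands in for movl:
Code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Four bytes laid out in memory in this order: 0x12 0x34 0x56 0x78. */
    unsigned char mem[4] = { 0x12, 0x34, 0x56, 0x78 };
    unsigned int reg;

    /* The C equivalent of "movl mem, %eax": copy all four bytes at once. */
    memcpy(&reg, mem, sizeof reg);

    /* On little-endian x86 this prints 0x78563412 - numerically, the
     * bytes read "backwards" compared to the memory dump above. */
    printf("register: 0x%08x\n", reg);

    /* The least significant byte (what %al would hold) is 0x12 - the
     * FIRST byte in memory, not the last. Shifting right by 8 walks
     * the bytes in memory order. */
    printf("low byte (%%al): 0x%02x\n", reg & 0xff);

    return 0;
}
For the network-socket bullet, the usual fix in C is htonl()/ntohl() from <arpa/inet.h>, which convert between host and network (big-endian) byte order.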

 
08-17-2009, 09:15 AM   #2
Sergei Steshenko
Senior Member (Registered: May 2005; Posts: 4,481)
Quote:
Originally Posted by joeBuffer
I just want a simple explanation. I'm not really grasping yet how this affects things.
Maybe this article will help: http://en.wikipedia.org/wiki/Endianness
 
08-17-2009, 09:21 AM   #3
joeBuffer (Original Poster)
Quote:
• If you try to read in several bytes at a time using movl but deal with them on a byte-by-byte basis using the least significant byte (i.e. - by using %al and/or shifting of the register), this will be in a different order than they appear in memory.
I don't understand what this is saying, exactly.
Is this as simple as it seems?
Say a 32-bit register is put into memory. Then you use that memory a byte at a time. It's in the reverse order?
But if you use the 32 bits, it gets switched around automatically?
Also, the endianness is different for different things, so you have to take that into consideration?
I understand the fundamental idea here, I'm mainly wondering if this is as simple as it seems, or if this book is talking about something more complex.
Really it's almost a silly question, because I think I understand all of this, and it's very simple. I just wanted to make sure, and have a simple explanation.
The most confusing thing to me is: normally you wouldn't do this, would you? You'd have a symbol (in your assembly code) giving the location of the data you want to retrieve from memory, and it would all get worked out automatically?

 
08-17-2009, 09:29 AM   #4
Sergei Steshenko
Quote:
Originally Posted by joeBuffer
I understand the fundamental idea here, I'm mainly wondering if this is as simple as it seems, or if this book is talking about something more complex.
It is not as simple as it seems because of bit endianness on top of byte endianness - see the Wiki article above.
 
08-17-2009, 09:35 AM   #5
joeBuffer (Original Poster)
This entire book is on x86 assembly.
Like it says, say you have a word in a register, and you move it to memory ... it does this:
[byte0][byte1][byte2][byte3] <- register
[byte3][byte2][byte1][byte0] <- memory

if you read the whole word into a register, it will be worked out automatically. So when you read from the memory into the register, it ends up looking like this again:
[byte0][byte1][byte2][byte3] <- register

but if you read it in a byte at a time, you'll be reading it in from memory in this order:
[byte3][byte2][byte1][byte0] <- memory

Correct?
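In C terms, that round trip can be sketched like this (just an illustration, with memcpy() standing in for the movl store and load; assumes a little-endian host):
Code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned int reg_out = 0x01020304;  /* "register" before the store */
    unsigned char mem[4];               /* the word in memory          */
    unsigned int reg_in;                /* "register" after the load   */
    int i;

    memcpy(mem, &reg_out, 4);           /* movl %eax, mem */
    memcpy(&reg_in, mem, 4);            /* movl mem, %eax */

    /* Full-width store followed by a full-width load: value unchanged. */
    printf("stored 0x%08x, loaded 0x%08x\n", reg_out, reg_in);

    /* Byte-at-a-time view of the same memory: on little-endian x86
     * the least significant byte (0x04) comes first. */
    for (i = 0; i < 4; i++)
        printf("mem[%d] = 0x%02x\n", i, mem[i]);

    return 0;
}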
 
08-17-2009, 09:35 AM   #6
fantas
Member (Registered: Jun 2007; Location: Bavaria; Distribution: slackware, xubuntu; Posts: 143)
I think they're just making it more difficult than it actually is. They could have simply written: if the byte order of the stored data differs from the CPU's (e.g. big-endian data on an Intel CPU), you have to do the byte swapping yourself, and you can't rely on bytes automatically being addressable correctly if you read them in chunks of more than one byte at a time.
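If you do end up having to swap, the usual shift-and-mask routine looks something like this (a sketch, not tied to any particular file format, assuming a 32-bit unsigned int; if I remember correctly, gcc also has a __builtin_bswap32() and glibc a bswap_32() in <byteswap.h> that do the same thing):
Code:
#include <stdio.h>

/* Reverse the four bytes of a 32-bit value, e.g. for big-endian data
 * read on a little-endian CPU (or vice versa). */
static unsigned int swap32(unsigned int x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) <<  8) |
           ((x & 0x00ff0000u) >>  8) |
           ((x & 0xff000000u) >> 24);
}

int main(void)
{
    /* Prints: 0x0123abcd -> 0xcdab2301 */
    printf("0x0123abcd -> 0x%08x\n", swap32(0x0123abcdu));
    return 0;
}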
 
08-17-2009, 09:46 AM   #7
paulsm4
LQ Guru (Registered: Mar 2004; Distribution: SusE 8.2; Posts: 5,863)
Hi -

This might also help:
Code:
#include <stdio.h>

typedef unsigned char BYTE;
typedef unsigned short WORD;
typedef unsigned long DWORD;    /* 4 bytes on 32-bit x86 */

int
main (int argc, char *argv[])
{
  int i;
  BYTE buf[4];
  BYTE *b = &buf[0];            /* view the buffer as 4 bytes    */
  WORD *w = (WORD *)&buf[0];    /* ... as 2 16-bit words         */
  DWORD *d = (DWORD *)&buf[0];  /* ... as 1 32-bit dword         */

  *d = 0x0123abcd;              /* store once, read it three ways */

  for (i=0; i < 4; i++)
    printf ("b[%d]: 0x%02x...\n", i, b[i]);
  for (i=0; i < 2; i++)
    printf ("w[%d]: 0x%04x...\n", i, w[i]);
  printf ("d: 0x%x...\n", d[0]);

  return 0;
}
Sample output:
Quote:
b[0]: 0xcd...
b[1]: 0xab...
b[2]: 0x23...
b[3]: 0x01...
w[0]: 0xabcd...
w[1]: 0x0123...
d: 0x123abcd...
 
08-17-2009, 10:06 AM   #8
joeBuffer (Original Poster)
Right, exactly what I was thinking.
The way it's worded is just a little funny, but obviously understandable, or I wouldn't have gotten it right. I mainly wanted verification, and an opinion on whether or not it is more complex than it seems. I'm gonna mark this as solved.

P.S. Using -Wall, the code example you have gives a warning but not an error.
Code:
joebuffer@ubuntu:~/clang$ gcc -Wall test.c -o test
test.c: In function ‘main’:
test.c:22: warning: format ‘%x’ expects type ‘unsigned int’, but argument 2 has type ‘DWORD’
Using a typecast to unsigned int works:
Code:
printf("d: 0x%x...\n", (unsigned int)d[0]);
Thanks.
 
08-20-2009, 05:59 AM   #9
jschiwal
LQ Guru (Registered: Aug 2001; Location: Fargo, ND; Distribution: SuSE AMD64; Posts: 15,733)
I often wish that Intel had never invented this problem. It becomes important when you are saving files or communicating with another system. With 8-bit ASCII there isn't a problem, but when characters are saved in a multi-byte encoding scheme, there is. A UTF-16 file will write an initial BOM (byte order mark) of 0xFEFF. This reveals the endianness of the system that saved the file. It also makes working with text utilities difficult, because now the file starts with a short binary blob.

If one computer uses services of another computer, they both need to agree on the endianness used. Sometimes the protocol enforces this.

If you play a video, the audio stream may use s16le or s16be sampling. Your player needs to detect this so the two bytes of every sample don't get switched.
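As a sketch of what a reader of such a file has to do (a toy example with made-up data; the same two-byte swap applies to s16 audio samples): check whether the BOM is FE FF (big-endian) or FF FE (little-endian), then assemble each 16-bit unit in the file's byte order:
Code:
#include <stdio.h>

int main(void)
{
    /* A tiny fake UTF-16 file in memory: BOM, then 'H', 'i'.
     * FE FF = big-endian; FF FE would mean little-endian. */
    unsigned char file[] = { 0xFE, 0xFF, 0x00, 'H', 0x00, 'i' };
    int file_is_be = (file[0] == 0xFE && file[1] == 0xFF);
    unsigned int unit, i;

    for (i = 2; i + 1 < sizeof file; i += 2) {
        /* Building the value arithmetically, in the file's byte
         * order, sidesteps the host's own endianness entirely. */
        if (file_is_be)
            unit = (file[i] << 8) | file[i + 1];
        else
            unit = file[i] | (file[i + 1] << 8);
        printf("U+%04X\n", unit);   /* prints U+0048 then U+0069 */
    }
    return 0;
}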
 
08-20-2009, 06:03 AM   #10
ahshan m d
LQ Newbie (Registered: Aug 2009; Posts: 1)
How can I debug C programs with gcc the way the F7 key works in Turbo C/C++? Are there any options...

Kindly reply
 
08-20-2009, 06:12 AM   #11
Sergei Steshenko
Quote:
Originally Posted by ahshan m d
How can I debug C programs with gcc the way the F7 key works in Turbo C/C++? Are there any options...
And what does your question have to do with this thread?
 
08-20-2009, 08:06 AM   #12
johnsfine
LQ Guru (Registered: Dec 2007; Distribution: Centos; Posts: 5,286)
Quote:
Originally Posted by jschiwal
I often wish that Intel had never invented this problem.
Intel certainly did not invent that problem. There were both computers using least significant first and computers using most significant first long before Intel existed.

I'm not sure, but I think the problem was really invented when someone copied the decimal numbering system from a language written right to left into a language written left to right, but kept the least significant digit (which had been first) on the right, so it became last.

Positional numbering was invented in base 60, with the least significant digit on the right, in a language that was read right to left, so the least significant digit originally came first. Sometime later it switched to base 10, and later still it was adopted into languages that read left to right, and somehow that left us with the belief that the most significant digit in a number is naturally first and least-significant-first is backwards.

So you get nonsense such as what the OP quoted from "Programming from the Ground Up" about byte swapping between memory and the CPU in ordinary operations. None of that happens. The Intel design is consistently LSB: least significant bit first in each byte, least significant byte first in each larger object, consistently in the CPU, in memory, and in external interfaces.

Quote:
It becomes important when you are saving files or communicating with another system.
Correct. The byte swapping becomes an issue when you must communicate using a standard based on a different byte ordering.
 
08-20-2009, 01:02 PM   #13
orgcandman
Member (Registered: May 2002; Location: new hampshire; Distribution: Fedora, RHEL; Posts: 600)
Actually, little endian on some architectures, with some programs, is faster, since low values are already "there" instead of the ALU having to parse the first 16 bits or so. It's not a big deal with processors as fast as today's, but long ago there were systems that had small integers and didn't need to use the "top" bits of the storage unit.
 
  

