LinuxQuestions.org
01-14-2012, 03:51 AM   #1
aspiring_stellar
host to network byte order for strings


Hi guys,
I want to convert a "string" from host byte order to network byte order. For shorts and integers I can use htons() and htonl(), but what should I do for strings?

Thanks.
 
01-14-2012, 04:43 AM   #2
Nominal Animal
In general, strings have no byte order. For variable-length character encodings like UTF-8, the order of the bytes is specified by the encoding itself and does not depend on endianness (also known as byte order). It is always the same.

Some encodings do use multi-byte units for every character, though. For example, in UTF-16 the "string" is actually a sequence of 16-bit unsigned integers (unsigned shorts; characters outside the Basic Multilingual Plane take two of them), and UCS-4 "strings" are actually sequences of 31-bit unsigned (32-bit signed) integers. Since these "strings" are actually sequences of shorts or ints, you can apply htons() or htonl() to each element to convert them to network byte order. However, Unicode users using those encodings should instead use a byte order mark (BOM, U+FEFF) as the first character to let the reader handle the byte order (if there is any risk of confusion); correspondingly, all readers should be prepared to understand any byte order for multibyte Unicode strings based on the initial byte order mark. (There is no reason to use or retain byte order marks when using UTF-8.)
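
As a rough sketch of both approaches for UTF-16, something like the following should work on a POSIX system. The function names utf16_hton() and utf16_bom_order() are made up for this example; only htons() comes from the system headers.

```c
#include <arpa/inet.h>  /* htons() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert n UTF-16 code units in place from host
 * byte order to network (big-endian) byte order, one unit at a time. */
void utf16_hton(uint16_t *units, size_t n)
{
    assert(units != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        units[i] = htons(units[i]);
}

/* Hypothetical helper: look at the first two raw bytes of received
 * UTF-16 data. Returns 1 for a big-endian BOM (FE FF), -1 for a
 * little-endian BOM (FF FE), and 0 if there is no BOM at all. */
int utf16_bom_order(const unsigned char *bytes, size_t len)
{
    if (len < 2)
        return 0;
    if (bytes[0] == 0xFE && bytes[1] == 0xFF)
        return 1;
    if (bytes[0] == 0xFF && bytes[1] == 0xFE)
        return -1;
    return 0;
}
```

A sender would call utf16_hton() on its buffer just before writing it to the socket; a reader that accepts either order would check utf16_bom_order() first and byte-swap each unit only when the result does not match its own byte order.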

For a large number of reasons, I recommend using UTF-8 for your strings. Each character (Unicode U+0000 to U+10FFFF) is encoded in one to four bytes (and therefore you should be aware of the difference between string length in bytes and in characters!), but the order of the bytes within each character is always the same and does not depend on the host's byte order. No byte order conversion is ever done for such strings. Furthermore, you can use standard C library functions to handle the strings. (UTF-16 and UCS-4 require wide character support, using special wide character types.)
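
To illustrate the difference between length in bytes and length in characters, here is a minimal sketch. The name utf8_length() is made up for this example, and it assumes the input is already valid UTF-8: it simply counts every byte that is not a continuation byte (bit pattern 10xxxxxx).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the characters (code points) in a valid,
 * NUL-terminated UTF-8 string by skipping continuation bytes. */
size_t utf8_length(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    return count;
}
```

For the string "na" "\xC3\xAF" "ve" (the word "naïve", with ï encoded as the two bytes C3 AF), strlen() reports 6 bytes while utf8_length() reports 5 characters. The bytes are sent over the network exactly as they sit in memory, with no conversion.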
 