Solved: why does fwprintf write chars instead of wchars?
Hello good programmers, I'm testing the wcs functions
on Linux (Slackware 12) and, strangely, my first test fails.
The code, when compiled and executed on Windows, fills
bigString.hex with 16-bit wchars, while on Linux
I find only 1-byte characters in the hex file
instead of the 4 bytes of a wchar_t.
I tried many combinations and also tried to use fwide.
Can you help me find what I am getting wrong, please?
Unfortunately not; I'm using khexedit to view the result.
I'm a bit astonished, because the test code seems
correct to me (and I get the same 8-bit behaviour at home).
This problem is causing me great trouble because
I'm the only one working on the port of a large
codebase and I find myself with no clues in the face
of such a 'simple' problem... and the Windows programmers
around here are obviously not much help.
I still hope that some missing define will switch on the
full wchar_t capabilities.
The output of the next test program is:
test1: fwprintf wrote 16 bytes
test1: fwrite error: return code is 0
test2: fwrite wrote 64 bytes
test2: fwprintf error: return code is -1
It looks like I cannot mix fwprintf and fwrite on the same stream.
If I try to use fwide, the fwide call itself doesn't fail, but it
always causes the fwrite to fail.
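The output above is consistent with stream orientation: the first I/O call makes a stream either wide-oriented or byte-oriented, and the other function family is then off limits. Here is a minimal sketch (not the original test program, which is not shown in this thread) that queries the orientation with fwide(fp, 0). The helper name orientation_after_first_write is made up for illustration.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: opens a temporary file, performs one write, and
   returns what fwide(fp, 0) reports afterwards:
   > 0 wide-oriented, < 0 byte-oriented, 0 not yet oriented
   (0 is also returned if the temporary file cannot be created). */
int orientation_after_first_write(int wide_first)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return 0;

    if (wide_first)
        fwprintf(fp, L"%ls", L"abcd");   /* first operation is wide */
    else
        fwrite("abcd", 1, 4, fp);        /* first operation is byte-based */

    int orientation = fwide(fp, 0);      /* mode 0 only queries, never changes */
    fclose(fp);
    return orientation;
}
```

Calling orientation_after_first_write(1) should give a positive value and orientation_after_first_write(0) a negative one. Once the orientation is set, only freopen (or closing and reopening the file) resets it.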
I'm really lost. I can only guess that
fwprintf does its work internally with wchar_t
as expected, but when it comes to writing to the file,
it performs a UTF-8 conversion.
This is expected behavior. When a wide character string (on Linux, a string of 4-byte wchar_t values) is written to a stream, it is converted to multibyte characters according to the current locale. On Windows, wchar_t is 16 bits and holds UTF-16, so you get 2 bytes for most characters (especially the ones you picked). On most Linux machines, the locale encoding is UTF-8, in which case you get one byte for ASCII characters (the ones you picked were all ASCII).
If you absolutely want UTF-16, change the locale (setlocale is not thread-safe) or convert the string with iconv.
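As a rough sketch of the iconv route, assuming an iconv implementation that accepts the encoding name "WCHAR_T" (glibc and GNU libiconv do): the function name wchars_to_utf16le is invented for this example, and the output buffer is assumed to be large enough.

```c
#include <iconv.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: converts `count` wide characters to UTF-16LE bytes
   in `out`. Returns the number of bytes produced, or (size_t)-1 on failure.
   Assumes the iconv implementation knows the "WCHAR_T" encoding name. */
size_t wchars_to_utf16le(const wchar_t *in, size_t count,
                         char *out, size_t out_size)
{
    iconv_t cd = iconv_open("UTF-16LE", "WCHAR_T");  /* (to, from) */
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *src = (char *)in;                  /* iconv works on byte pointers */
    size_t src_left = count * sizeof(wchar_t);
    char *dst = out;
    size_t dst_left = out_size;

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;                   /* invalid input or buffer too small */
    return out_size - dst_left;              /* bytes written to `out` */
}
```

The resulting bytes can then be written with fwrite to a stream that is byte-oriented, which keeps the wide functions out of the picture entirely.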
Summary for ignorant people like me who are approaching wide chars:
1) a file pointer [FP] cannot be used during the same lifetime by
both char functions [CF]: {fprintf vfprintf fputc fwrite!! fwide(fp,-1)}
and wide char functions [WCF]: {fwprintf vfwprintf fputwc fwide(fp,1)}
2) to mix char and wide char output, the code shall either use CF and write
wide chars with (%ls,%lc) or use WCF and write chars with (%s,%c).
3) this 'strange' but sane behaviour (which unfortunately is not present on
Windows) is due to the internal orientation state of the FP, which is set either
explicitly by fwide or implicitly by the first call to a CF (which inhibits WCF)
or to a WCF (which inhibits CF).
4) pay attention to fwrite when you want to mix binary and wide char
output in one stream, because it switches the FP to accept only CF
5) use swprintf, sprintf, iconv, open and write to start your experiments
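Point 2 of the summary can be sketched as follows: each stream sticks to one function family, and the format specifier does the char/wide conversion. The helper name mixed_output_result is hypothetical, and the text is plain ASCII so that it converts in any locale, including the default "C" locale.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: writes one wide and one narrow string to a temporary
   file using a single function family, and returns that call's result
   (negative on error). */
int mixed_output_result(int use_wide_functions)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    int rc;
    if (use_wide_functions)
        /* WCF: %ls takes the wide string, %s converts the narrow one */
        rc = fwprintf(fp, L"%ls %s", L"wide", "narrow");
    else
        /* CF: %s takes the narrow string, %ls converts the wide one */
        rc = fprintf(fp, "%ls %s", L"wide", "narrow");

    fclose(fp);
    return rc;
}
```

fwprintf returns the number of wide characters written and fprintf the number of bytes. For this ASCII text both are 11, the length of "wide narrow".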
Last edited by cicorino; 02-19-2008 at 03:10 AM.
Reason: typo
Interesting result to say the least as I haven't used those specific WC-related file functions myself yet. What I do/did instead is use a templated char vector (8, 16 or 32 bits) and use the "normal" fwrite etc. functions with write size relative to the char width.
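A plain C sketch of that raw approach, without the template part: the character units are written as-is with fwrite, with the element size taken from the character type. The name write_raw_units is made up for this example. Note that nothing is converted, so the file contents depend on the platform's unit width and byte order.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: writes `count` character units of `unit_size` bytes
   each, unconverted, and returns the number of bytes written.
   The stream must not be wide-oriented, since fwrite is a byte function. */
size_t write_raw_units(FILE *fp, const void *units,
                       size_t unit_size, size_t count)
{
    return fwrite(units, unit_size, count, fp) * unit_size;
}
```

For example, write_raw_units(fp, text, sizeof text[0], 3) with a wchar_t array writes 12 bytes where wchar_t is 4 bytes wide, as it is on Linux.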