What encoding do you use for your sources?

i92guboj · 04-15-2014, 10:51 AM

Hi.

For some time now, I've been doing multiplatform programming using Qt/C++. My IDE of choice is qtcreator, though I don't feel particularly tied to it since I don't use the UI creator part, whatever that was called.

I've been using utf-8 for quite a few years in Linux, which is my primary (and almost-only) platform.

My locale is es_ES.UTF-8.

The problem is that some of my customers like working in XP, which means problems when it comes to utf-8 in any language that's not plain English.

I discovered it the hard way, and now I am in a situation where I have to go converting the encoding of the source files every now and then between utf-8 and iso-8859-15, and then back, which can be quite tedious.
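For reference, the conversion step itself is small. Here is a minimal sketch of the iso-8859-15 to utf-8 direction in plain C++; the function names are made up for illustration, and in practice a tool such as iconv does the same job. iso-8859-15 matches iso-8859-1 except at eight byte values (the euro sign among them), so only those need a lookup.

```cpp
#include <cassert>
#include <string>

// Map one ISO-8859-15 byte to its Unicode code point.
// ISO-8859-15 equals ISO-8859-1 except at the eight positions below.
char32_t latin9_to_codepoint(unsigned char b) {
    switch (b) {
        case 0xA4: return 0x20AC; // euro sign
        case 0xA6: return 0x0160; // S with caron
        case 0xA8: return 0x0161; // s with caron
        case 0xB4: return 0x017D; // Z with caron
        case 0xB8: return 0x017E; // z with caron
        case 0xBC: return 0x0152; // OE ligature
        case 0xBD: return 0x0153; // oe ligature
        case 0xBE: return 0x0178; // Y with diaeresis
        default:   return b;      // identical to ISO-8859-1 / Unicode
    }
}

// Append a code point below U+10000 as UTF-8.
// That range is enough for everything ISO-8859-15 can represent.
void append_utf8(std::string& out, char32_t cp) {
    assert(cp < 0x10000);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert a whole ISO-8859-15 byte string to UTF-8.
std::string latin9_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char b : in) {
        append_utf8(out, latin9_to_codepoint(b));
    }
    return out;
}
```

The single byte 0xE9 (é) becomes the two bytes 0xC3 0xA9, and 0xA4 (€) becomes the three bytes 0xE2 0x82 0xAC, which is why a file read with the wrong assumption shows garbage.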

How do people handle this kind of stuff? It seems a trivial thing, but I can't find anything about it. After more than a year of developing this way it has become quite time-consuming, even with automation scripts, and it's also error-prone.

Thanks for any idea or insight into the issue.

mina86 · 04-15-2014, 11:50 AM

You could play around with BOM but it has its own set of problems.

pan64 · 04-15-2014, 12:19 PM

For sources I usually use the standard ASCII character set, nothing more. I do not care about language-specific characters; I just replace them with a "normal" one. There is no way to avoid the problems caused by those characters.
(I have been building software for almost 20 years, containing parts in Spanish, Hungarian, German, Swedish..., using Java, C, C++, scripts and other languages, on Windows/Solaris/Linux. There is no other way but removing strange characters.)

i92guboj · 04-15-2014, 03:09 PM

Well, in the sources, at some point you will need to write a string or something that will end in the interface.

Intentionally exposing typos to the final users is not something I am willing to do. I'd rather continue this maintenance burden that I have right now or implement something myself when/if I get to the point where I can't stand this anymore.

It just seems odd to me that by now there's no sane way to handle this...

In any case, thanks for the pointer.

i92guboj · 04-15-2014, 03:14 PM

Quote:

Originally Posted by mina86

You could play around with BOM but it has its own set of problems.

I am not sure how this can help me, care to elaborate?

Or maybe you didn't understand my problem. The issue is with the encoding of the source files. If I save them as utf-8 and then compile them, the strings won't show properly when you run the .exe in XP. If the source files are saved as iso-8859-15, the resulting binary works as intended.

Thanks.

mina86 · 04-16-2014, 06:33 AM

Quote:

Originally Posted by i92guboj

I am not sure how this can help me, care to elaborate?

If you add a UTF-8 BOM to the file, some Windows applications should read the file as UTF-8. But as I mentioned, this solution may introduce other problems, and it may not work for some other applications.
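To make the BOM idea concrete: the UTF-8 byte order mark is just the three bytes EF BB BF at the very start of the file. A rough sketch of checking for, adding and removing it on file contents already held in memory could look like this (the helper names are invented for the example, and whether a given tool honours the BOM is a separate question):

```cpp
#include <cassert>
#include <string>

// The UTF-8 byte order mark: the bytes EF BB BF.
const std::string kUtf8Bom = "\xEF\xBB\xBF";

// True if the data starts with the UTF-8 BOM.
bool has_utf8_bom(const std::string& data) {
    return data.compare(0, kUtf8Bom.size(), kUtf8Bom) == 0;
}

// Return the data with a BOM in front; data that already has one is unchanged.
std::string add_utf8_bom(const std::string& data) {
    std::string result = has_utf8_bom(data) ? data : kUtf8Bom + data;
    assert(has_utf8_bom(result));
    return result;
}

// Return the data without a leading BOM, if there was one.
std::string strip_utf8_bom(const std::string& data) {
    return has_utf8_bom(data) ? data.substr(kUtf8Bom.size()) : data;
}
```

Being able to strip it again matters because tools that do not expect a BOM treat those three bytes as ordinary content.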

Quote:

Originally Posted by i92guboj

Well, in the sources, at some point you will need to write a string or something that will end in the interface.

Depending on the language, you can escape Unicode characters with “\u####” or a similar sequence. This would be rather cumbersome to do manually, so perhaps your editor could assist by automatically converting non-ASCII characters in strings to escape sequences.
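If the editor can't do it, a small filter can. Below is a sketch in C++ of such a conversion, assuming the input is well-formed UTF-8; escape_non_ascii is a made-up name, not an existing tool. It leaves ASCII alone and rewrites everything else as \uXXXX (or \UXXXXXXXX above U+FFFF), the universal character name syntax C++ accepts in string literals. It is meant to be run on the text of string literals, not blindly on a whole source file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Rewrite a UTF-8 string so that every non-ASCII character becomes a
// \uXXXX or \UXXXXXXXX escape. Overlong forms and surrogates are not
// rejected; this is a sketch, not a validating decoder.
std::string escape_non_ascii(const std::string& utf8) {
    std::string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        if (lead < 0x80) {  // plain ASCII: copy unchanged
            out += static_cast<char>(lead);
            ++i;
            continue;
        }
        // The lead byte tells how many continuation bytes follow.
        std::size_t extra = 0;
        char32_t cp = 0;
        if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + extra >= utf8.size()) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        char buf[11];  // "\UXXXXXXXX" plus the terminating NUL
        int n;
        if (cp <= 0xFFFF) {
            n = std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(cp));
        } else {
            n = std::snprintf(buf, sizeof buf, "\\U%08X", static_cast<unsigned>(cp));
        }
        assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
        out += buf;
        i += extra + 1;
    }
    return out;
}
```

Note that in C++ the bytes a narrow literal like "\u00E9" ends up as still depend on the compiler's execution character set; a u8"..." literal is the form that is specified to be UTF-8.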

i92guboj · 04-16-2014, 06:39 AM

That I figured. But it wouldn't help with binaries, would it?

I am not interested in opening source files in Windows. I develop fully in Linux. What I want is that, when the time comes to launch the exe in Windows, I can see the accented characters and euro symbols that are hardcoded in the sources, and not some kind of random crap.

pan64 · 04-16-2014, 06:41 AM

Quote:

Originally Posted by i92guboj

Well, in the sources, at some point you will need to write a string or something that will end in the interface.

No, you must not do that. All the displayed texts should be collected and put into some "external" files, and you only need to use indices in the sources.
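A bare-bones sketch of that idea in C++, with an invented "key=value" file format and invented function names: the sources contain only ASCII keys, and the accented text lives in a catalog that is read at run time in one known encoding (UTF-8 here). Qt has its own mechanism for this (tr() with Qt Linguist translation files); the code below is not that API, just the principle.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Key -> display text. Keys are plain ASCII; values are UTF-8.
using Catalog = std::map<std::string, std::string>;

// Parse catalog text made of "key=value" lines.
// Empty lines and lines starting with '#' are skipped, as are lines
// without '='. In a real program the text would come from a file.
Catalog parse_catalog(const std::string& text) {
    Catalog catalog;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') continue;
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;
        catalog[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return catalog;
}

// Look up the display text for a key; fall back to the key itself
// so that a missing entry is visible instead of silently empty.
std::string text_for(const Catalog& catalog, const std::string& key) {
    assert(!key.empty());
    const Catalog::const_iterator it = catalog.find(key);
    return it != catalog.end() ? it->second : key;
}
```

The source code then calls something like text_for(catalog, "greeting") and never contains a non-ASCII byte, so the encoding of the .cpp files stops mattering.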