LinuxQuestions.org


George2 04-27-2006 09:50 PM

about CP_UTF8 macro on Windows/Linux
 
Hello everyone,


Windows defines a macro named CP_UTF8. What is the equivalent macro on Linux, and which header file should I include if I want to use it?


thanks in advance,
George

addy86 04-28-2006 12:55 AM

To what value is this macro defined?

George2 04-28-2006 01:54 AM

Hi addy86,


Quote:

Originally Posted by addy86
To what value is this macro defined?

It is defined as a value related to code page information. You can find a brief description here:

http://msdn.microsoft.com/library/de...l/nls_0ctr.asp


regards,
George

jschiwal 04-28-2006 01:59 AM

As I understand that page, CP_UTF8 is a value that the variable CodePage can take, and not a macro.

addy86 04-28-2006 03:40 AM

What are you trying to achieve? I.e. what do you need this macro for (since there will hardly be a GetCPInfo function on Linux).

George2 05-03-2006 02:00 AM

Thanks jschiwal,


Quote:

Originally Posted by jschiwal
As I understand that page, CP_UTF8 is a value that the variable CodePage can take, and not a macro.

Do you have any idea how to port this code page variable to Linux? Does Linux have the same concept of a code page?


regards,
George

George2 05-03-2006 02:02 AM

Thanks addy86,


Quote:

Originally Posted by addy86
What are you trying to achieve? I.e. what do you need this macro for (since there will hardly be a GetCPInfo function on Linux).

I am porting a program from Windows to Linux. One method of this program uses this variable, and I am wondering how to port that method to Linux.


regards,
George

paulsm4 05-03-2006 10:50 AM

Hi -

I was going to respond to this a few days ago, but I thought you would have had things squared away by now. For whatever it's worth, here's my $0.02:

1. The fix is probably something like this:
http://www.nanobit.net/putty/doxy/PUTTY_8H-source.html
Code:

/*
 * Exports from unicode.c.
 */
#ifndef CP_UTF8
#define CP_UTF8 65001
#endif

2. As you already noted, "CP_UTF8" is indeed a Windows/Microsoft thing. It specifies the "Code Page" (Microsoft's term for a numbered character encoding) for "UTF-8" (an encoding of Unicode, an international standard, that is backward compatible with plain ASCII text).

You can find more about UTF here:
http://www.unicode.org/faq/utf_bom.html
http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
http://en.wikipedia.org/wiki/UTF-8

3. "Classic" Unix (including most parts of the standard C and C++ runtime libraries in current versions of Linux) and "classic" Windows (Win16 and the Win9x OS, plus the default output from most Visual Studio C++ compilers) have native 8-bit text.

Java, .NET (and the Windows NT/win2k/XP kernel) have native Unicode (typically 16-bit) text.

"Code pages" are numbered tables that map byte values to characters. They predate Unicode and are mostly 8-bit, although some (UTF-8 itself, for example) use several bytes per character. The numeric IDs are a Microsoft-specific thing. On Win9x, "MSLU", the "Microsoft Layer for Unicode", sits on top of them to provide the Unicode APIs:
http://msdn.microsoft.com/library/de...or_unicode.asp
http://msdn.microsoft.com/library/de...nalization.asp

Things are done differently now in Microsoft .Net:
http://msdn.microsoft.com/msdnmag/is.../03/bugslayer/

Soooooooo.....

4. The real answer depends on exactly what program you're trying to port.

In general, you can probably do something like the macro PuTTY uses above.

If that doesn't work for you, please let us know what package you are trying to port (give us a link to the project, if possible).
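
For what it's worth, here is a minimal sketch of how the PuTTY-style fallback from item 1 might be used in a port. It assumes the Windows code passes code page IDs around as plain integers and that the Linux side will hand the text to iconv. The helper name codepage_to_charset is hypothetical (it is not part of any library), and the charset names are only the ones an iconv implementation is likely to accept, so check them against your system.

```c
#include <stddef.h>

/* Same fallback PuTTY uses: define CP_UTF8 only where <windows.h> has not. */
#ifndef CP_UTF8
#define CP_UTF8 65001
#endif

/*
 * Hypothetical helper (not part of any library): translate a Windows
 * code page ID into a charset name that iconv_open() is likely to accept.
 * Returns NULL for IDs this sketch does not know about.
 */
static const char *codepage_to_charset(unsigned int codepage)
{
    switch (codepage) {
    case CP_UTF8: return "UTF-8";
    case 1252:    return "CP1252";     /* Windows Western European */
    case 1251:    return "CP1251";     /* Windows Cyrillic */
    case 932:     return "CP932";      /* Windows Japanese */
    case 28591:   return "ISO-8859-1"; /* Latin-1 */
    default:      return NULL;         /* unknown: let the caller decide */
    }
}
```

With this in place, code that was written against CP_UTF8 keeps compiling on Linux, and codepage_to_charset(CP_UTF8) gives "UTF-8" for whatever conversion routine replaces the Windows call.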

'Hope that helps .. PSM

George2 05-04-2006 01:37 AM

Thanks paulsm4!


It is a great answer! Maybe you could write an article for LinuxQuestions.org showing how to deal with Windows code pages on Linux when porting programs.

I have a question: what does "65001" mean in PuTTY's source code?


regards,
George

paulsm4 05-04-2006 05:30 PM

Hi, George2 -

That "65001" in "#define CP_UTF8 65001" is ... you guessed it! A "Code Page" ID!

Specifically, the code page ID for UTF-8.

Here are two more links (in addition to the ones I already posted):
http://en.wikipedia.org/wiki/Code_page
http://en.wikipedia.org/wiki/Code_page_65001

Please keep in mind: legacy "Code Pages" are mostly limited to 8-bit characters; they are *not* the way you should be doing internationalization for any new code on any contemporary platform. You should be using Unicode, which is fully supported by Unix and Windows, and by C, C++, Java, C# and most other popular languages.
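
If the method being ported uses CP_UTF8 to convert UTF-8 text to Windows wide characters (an assumption, since the original code was not posted), a rough Linux counterpart could be built on iconv. The sketch below is only that: utf8_to_utf16le is a hypothetical helper name, and it assumes the local iconv knows the "UTF-8" and "UTF-16LE" charset names, as glibc does.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert UTF-8 bytes to UTF-16LE, roughly the kind
 * of result MultiByteToWideChar(CP_UTF8, ...) gives on Windows.
 * Returns the number of bytes written to out, or (size_t)-1 on error.
 */
static size_t utf8_to_utf16le(const char *in, size_t in_len,
                              char *out, size_t out_cap)
{
    iconv_t cd = iconv_open("UTF-16LE", "UTF-8");  /* (to, from) */
    if (cd == (iconv_t)-1)
        return (size_t)-1;                         /* conversion not supported */

    char *src = (char *)in;   /* iconv() wants char **, it does not modify the input */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap;

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;    /* invalid input, or out buffer too small */
    return out_cap - dst_left;
}
```

On glibc, iconv is part of the C library; other systems may need to link with -liconv. Note that wchar_t is normally 32 bits on Linux, so code that stores the result in wchar_t would need a different target charset than the UTF-16 used here.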

