python in 14.1 is compiled with Unicode set to UCS-2, whilst other distros use UCS-4
Slackware (LinuxQuestions.org forum). This forum is for the discussion of Slackware Linux.
After debugging a problem with posting Unicode emoticons in Gajim, other Gajim users told me that Python must be built with UCS-4 for this to work. This is done with --enable-unicode=ucs4. They say using UCS-2 is very antiquated.
Is there a reason Slackware uses UCS-2 for Python?
Will replacing python with my own package, after adding --enable-unicode=ucs4 to the SlackBuild, break anything?
I notice that in -current there is still no --enable-unicode=ucs4. I really hope this doesn't mean that in 14.2 I still won't be able to use Unicode emoticons. It looks like every other distro is using UCS-4.
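For anyone who wants to check which kind of build is installed, here is a minimal sketch. On Python 2, sys.maxunicode is 0xFFFF for a narrow (UCS-2) build and 0x10FFFF for a wide (UCS-4) build. The helper name unicode_build is made up for illustration and is not part of Python or Gajim. Note that Python 3.3 and later always report 0x10FFFF, so the check is only meaningful on Python 2.

```python
import sys

def unicode_build(maxunicode=sys.maxunicode):
    """Classify a build from its sys.maxunicode value (hypothetical helper)."""
    # 0xFFFF   -> narrow build, configured with --enable-unicode=ucs2
    # 0x10FFFF -> wide build, configured with --enable-unicode=ucs4
    return "UCS-2 (narrow)" if maxunicode == 0xFFFF else "UCS-4 (wide)"

# Report the build of the interpreter running this script.
print(unicode_build())
```

Running this with the stock Slackware 14.1 python should report the narrow build, if the description in this thread is accurate.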
Will replacing python with my own package, after adding --enable-unicode=ucs4 to the SlackBuild, break anything?
Try it and see. You could use a virtual machine to test, if you don't feel like potentially breaking your main/only system.
Is there a reason Slackware uses UCS-2 for python?
Well, I can imagine one. In UCS-4, aka UTF-32, every character is encoded using four bytes, whereas UCS-2 needs only two. Many users would probably consider that too costly just to get a few emoticons: any English text, for instance, takes twice as much space in UCS-4 as it does in UCS-2, be it in RAM or on disk. Moreover, if at all possible I would recommend using UTF-8 instead. It is a variable-length encoding and uses only one byte for any character in the ASCII subset, which covers English text apart from a few typographic quotes and the like.
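The size comparison can be checked with a short sketch that encodes the same text three ways and counts the bytes. The helper name encoded_sizes and the sample strings are illustrative assumptions. One caveat: the --enable-unicode option governs how Python 2 holds unicode objects in memory, while the size of a file on disk depends on the encoding chosen when the text is written out.

```python
def encoded_sizes(text):
    """Return the number of bytes the text occupies in each encoding."""
    # The -le variants are used so that no byte order mark is counted.
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }

english = u"hello"
emoticon = u"\U0001F600"  # GRINNING FACE, outside the Basic Multilingual Plane

print(encoded_sizes(english))   # ASCII text: 1, 2 and 4 bytes per character
print(encoded_sizes(emoticon))  # the emoticon needs 4 bytes in all three
```

For plain English text UTF-8 is the most compact of the three, while a single emoticon above U+FFFF costs four bytes whichever encoding is used.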
How do I go about asking the devs to include this compile option? Is there a developers mailing list?
There is only one dev that really matters, and that's Pat. You can try the stickied "Requests for -current" thread at the top of the forum. Include with your request the reasoning for the switch and some evidence that it would have minimal impact on other users.
Is there a reason Slackware uses UCS-2 for python?
Because it is the Python default.
Quote:
Will replacing python with my own package, after adding --enable-unicode=ucs4 to the SlackBuild, break anything?
It will break any binary modules that use Unicode and were compiled using a ucs2 Python. For this reason, it is too late to consider this change for 14.2, but it would be worth revisiting in the next devel cycle.
OK, thanks volkerdi for clarifying. I wish I had noticed it sooner in the development cycle.
I'd really like to get good emoticon support in Gajim for our corporate rollout (a few hundred users), and for myself. I may end up maintaining my own python packages with UCS-4 support. By my estimate there are about 10 packages (pygobject, pygtk, pycairo, pyqt4 etc.) that I'd need to rebuild. However, I'm reluctant because updating via slackpkg in future may require doing this repeatedly. The other option would be to hack the Gajim plugin for emoticons, but I'm not very experienced with Unicode and its encodings. Is it possible to use UTF-8 instead of UTF-32 for those emoticons? The strange thing is that I can see the ones posted by other chat users, but can't post any myself. So it seems it's just an encoding issue: decoding the ones that Conversations (an Android XMPP client) sends works fine, and they appear as they should in the chat window. I'd much rather hack the Gajim plugin than the Slackware python package.
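As background to the encoding question: a narrow (UCS-2) Python 2 build holds any character above U+FFFF as two 16-bit code units, a UTF-16 surrogate pair, so len() reports 2 for a single emoticon. The sketch below shows how such a pair is derived. The function name to_surrogate_pair is made up for illustration, and whether this is what trips up the Gajim plugin is an assumption that this thread does not confirm.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)   # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits of the offset
    return high, low

high, low = to_surrogate_pair(0x1F600)
print("U+1F600 -> %04X %04X" % (high, low))  # U+1F600 -> D83D DE00
```

Code that indexes or slices strings character by character can split such a pair on a narrow build, which is one way that emoticon handling can go wrong there.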
However I'm reluctant because updating via slackpkg in future may require doing this repeatedly.
Just maintain your own Slackware mirror that your users would use for updates. Then, if Pat puts out any python-related updates, you just recompile them before moving everything to your mirror. (It is probably better to maintain your own mirror anyway; this way you're able to verify that updates won't break anything before pushing them out to your users.)
Your other option is to set up a local slackpkg+ repo and list your repo as a higher priority than the default. This is what many people do with Eric's multilib and ktown repos. This way, slackpkg won't attempt to downgrade/revert them to the stock packages.
I actually have my own update (repo) system. The client machines update every day via a cron job. It's just a simple shell script, a PostgreSQL DB and an HTTP download folder. It also logs which clients have installed and uninstalled packages. I'm able to push new and updated packages to my client machines, and to remove packages from them. Mwahahaha!
That's interesting, but it's not a file, it's a Unicode character. I guess I could paste it into kwrite, save it and try it.
I'm not sure that will work, as I don't see UTF-32 among the encodings offered in kwrite or kate. I would try geany instead.
But how can you paste it if it's not already in a file or a file by itself? Do you have an example of some text where I can see one of these emoticons?
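If no editor offers the wanted encoding, a test file can also be produced from Python itself. This is a minimal sketch: the helper name round_trip and the choice of U+1F600 as the sample emoticon are assumptions made for illustration. It writes the character to a temporary file in a given encoding and reads it back.

```python
import io
import os
import tempfile

emoticon = u"\U0001F600"  # example character; any emoticon above U+FFFF would do

def round_trip(text, encoding):
    """Write text to a temporary file in the given encoding and read it back."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with io.open(path, "w", encoding=encoding) as handle:
            handle.write(text)
        with io.open(path, "r", encoding=encoding) as handle:
            return handle.read()
    finally:
        os.remove(path)  # clean up the temporary file

for name in ("utf-8", "utf-32"):
    # Report whether the text survived the round trip in this encoding.
    print("%s: %s" % (name, round_trip(emoticon, name) == emoticon))
```

To keep the file for opening in an editor, write to a fixed path with io.open instead of using the temporary file.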