LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

Old 04-09-2016, 04:44 PM   #1
ethoms
Member
 
Registered: Nov 2011
Posts: 113

Rep: Reputation: Disabled
python in 14.1 is compiled with Unicode set to UCS-2, whilst other distros use UCS-4


After debugging an issue with posting Unicode emoticons in Gajim, other Gajim users say we must have Python built with UCS-4 for it to work. This is done using --enable-unicode=ucs4. They say it is very antiquated to use UCS-2.

Is there a reason Slackware uses UCS-2 for python?

Will replacing python with my own package after adding --enable-unicode=ucs4 to SlackBuild break anything?
 
Old 04-09-2016, 04:58 PM   #2
ethoms
Member
 
Registered: Nov 2011
Posts: 113

Original Poster
Rep: Reputation: Disabled
I notice in current, there is still no --enable-unicode=ucs4. I really hope that this doesn't mean in 14.2 I still can't use unicode emoticons. Looks like every other distro is using UCS-4.
 
Old 04-09-2016, 07:21 PM   #3
ethoms
Member
 
Registered: Nov 2011
Posts: 113

Original Poster
Rep: Reputation: Disabled
How do I go about asking the devs to include this compile option? Is there a developers mailing list?
 
Old 04-09-2016, 07:30 PM   #4
Richard Cranium
Senior Member
 
Registered: Apr 2009
Location: Carrollton, Texas
Distribution: Slackware64 14.2
Posts: 3,091

Rep: Reputation: 1469
Quote:
Originally Posted by ethoms View Post
After debugging an issue with posting Unicode emoticons in Gajim, other Gajim users say we must have Python built with UCS-4 for it to work. This is done using --enable-unicode=ucs4. They say it is very antiquated to use UCS-2.

Is there a reason Slackware uses UCS-2 for python?

Will replacing python with my own package after adding --enable-unicode=ucs4 to SlackBuild break anything?
Try it and see. You could use a virtual machine to test, if you don't feel like potentially breaking your main/only system.
 
Old 04-09-2016, 09:15 PM   #5
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-14.2.1 on Lenovo Thinkpad W520
Posts: 8,612

Rep: Reputation: Disabled
Quote:
Originally Posted by ethoms View Post
Is there a reason Slackware uses UCS-2 for python?
Well, I can imagine one. In UCS-4, aka UTF-32, every character is encoded using four bytes, whereas UCS-2 needs only two. Many users would probably find that too costly just to get a few emoticons: any English text, for instance, would take twice as much space in UCS-4 as in UCS-2, be it in RAM or on disk. Moreover, if at all possible I would recommend using UTF-8 instead. It is a variable-length encoding that uses only one byte for any character in the ASCII subset, which covers English text apart from a few typographic quotes and the like.

To know more, see https://en.wikipedia.org/wiki/Compar...code_encodings, https://en.wikipedia.org/wiki/UTF-32 and the FAQ of the Unicode consortium http://www.unicode.org/faq/

Last edited by Didier Spaier; 04-10-2016 at 12:44 AM. Reason: Typo fix.
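
For anyone who wants to check the size argument above, here is a minimal sketch in Python (an illustration only, not something from the post). It encodes a short English string and one emoticon, U+1F600, in UTF-8, UTF-16 and UTF-32 and counts the bytes. The helper name `encoded_sizes` is made up for this example, and the `-le` codec variants are used so that no byte-order mark is counted.

```python
# Compare how many bytes the same text needs in three Unicode encodings.
# The "-le" codecs are used so that no byte-order mark is added to the count.

def encoded_sizes(text):
    """Return {encoding name: size in bytes} for the given text."""
    return {name: len(text.encode(name))
            for name in ("utf-8", "utf-16-le", "utf-32-le")}

english = u"hello"
emoticon = u"\U0001F600"  # a code point above U+FFFF

print(encoded_sizes(english))
print(encoded_sizes(emoticon))
```

The English string takes 5, 10 and 20 bytes respectively, while the emoticon takes 4 bytes in all three, because UTF-16 has to spend two 16-bit units (a surrogate pair) on any character above U+FFFF.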
 
1 members found this post helpful.
Old 04-09-2016, 11:12 PM   #6
bassmadrigal
LQ Guru
 
Registered: Nov 2003
Location: West Jordan, UT, USA
Distribution: Slackware
Posts: 5,508

Rep: Reputation: 3255
Quote:
Originally Posted by ethoms View Post
How do I go about asking the devs to include this compile option? Is there a developers mailing list?
There is only one dev that really matters, and that's Pat. You can try the stickied "Requests for -current" thread at the top of the forum. Include your reasoning for the switch and some evidence that it would have minimal impact on others.
 
Old 04-10-2016, 02:34 AM   #7
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: Slackware
Posts: 8,545

Rep: Reputation: 3568
Quote:
Originally Posted by ethoms View Post
Will replacing python with my own package after adding --enable-unicode=ucs4 to SlackBuild break anything?
I cannot imagine it breaking anything, but try it and see.
 
Old 04-10-2016, 02:10 PM   #8
volkerdi
Slackware Maintainer
 
Registered: Dec 2002
Location: Minnesota
Distribution: Slackware! :-)
Posts: 1,609

Rep: Reputation: 4911
Quote:
Originally Posted by ethoms View Post
Is there a reason Slackware uses UCS-2 for python?
Because it is the Python default.

Quote:
Will replacing python with my own package after adding --enable-unicode=ucs4 to SlackBuild break anything?
It will break any binary modules that use Unicode and were compiled using a ucs2 Python. For this reason, it is too late to consider this change for 14.2, but it would be worth revisiting in the next devel cycle.
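
A quick way to see which kind of build you are running is to look at `sys.maxunicode`, which is 0xFFFF on a UCS-2 (narrow) build and 0x10FFFF on a UCS-4 (wide) build. The sketch below is only an illustration, and `describe_build` is a name invented here. Note that Python 3.3 and later always report 0x10FFFF, so the distinction only matters for Python 2 and early Python 3.

```python
import sys

def describe_build(maxunicode):
    """Classify an interpreter by its sys.maxunicode value."""
    if maxunicode == 0xFFFF:
        return "narrow (UCS-2)"
    if maxunicode == 0x10FFFF:
        return "wide (UCS-4)"
    return "unknown"

print(describe_build(sys.maxunicode))

# On a narrow build a character above U+FFFF is stored as a surrogate
# pair, so len() reports 2 for it; on a wide build it reports 1.
print(len(u"\U0001F600"))
```

Running this with the stock interpreter and again with a rebuilt one should show whether the --enable-unicode=ucs4 option actually took effect.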
 
2 members found this post helpful.
Old 04-11-2016, 04:47 PM   #9
ethoms
Member
 
Registered: Nov 2011
Posts: 113

Original Poster
Rep: Reputation: Disabled
OK, thanks volkerdi for clarifying. I wish I had noticed it sooner in the development cycle.

I'd really like to get good emoticon support in Gajim for our corporate rollout (a few hundred users), and for myself. I may end up maintaining my own Python packages with UCS-4 support. By my estimate there are about 10 packages (pygobject, pygtk, pycairo, pyqt4 etc.) that I'd need to rebuild. However, I'm reluctant because updating via slackpkg in future may require doing this repeatedly. The other option would be to hack the Gajim plugin for emoticons, but I'm not very experienced with Unicode and its encodings. Is it possible to use UTF-8 instead of UTF-32 for those emoticons? The strange thing is, I can see ones posted by other chat users, but can't post them myself. So it seems it's just an encoding issue; decoding the ones that Conversations (Android XMPP client) sends is fine. They appear as they should in the chat window. I'd much rather hack the Gajim plugin than the Slackware python package.
 
Old 04-11-2016, 04:56 PM   #10
bassmadrigal
LQ Guru
 
Registered: Nov 2003
Location: West Jordan, UT, USA
Distribution: Slackware
Posts: 5,508

Rep: Reputation: 3255
Quote:
Originally Posted by ethoms View Post
However I'm reluctant because updating via slackpkg in future may require doing this repeatedly.
Just maintain your own Slackware mirror that your users would use for updates. Then, if Pat puts out any python-related updates, you just recompile them before moving everything to your mirror. (It is probably better to maintain your own mirror anyway; this way you're able to verify updates won't break anything before pushing them out to your users.)

Your other option is to set up a local slackpkg+ repo and list your repo as a higher priority than the default. This is what many people do with Eric's multilib and ktown repos. This way, slackpkg won't attempt to downgrade/revert them to the stock packages.
 
1 members found this post helpful.
Old 04-11-2016, 04:58 PM   #11
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-14.2.1 on Lenovo Thinkpad W520
Posts: 8,612

Rep: Reputation: Disabled
Quote:
Originally Posted by ethoms View Post
Is it possible to use UTF-8 instead of UTF-32 for those emoticons?
Try one of these commands:
Code:
iconv -f UTF-32 -t UTF-8 [emoticon-file] > [new-emoticon-file]
iconv -f UCS-4 -t UTF-8 [emoticon-file] > [new-emoticon-file]
 
1 members found this post helpful.
Old 04-11-2016, 05:06 PM   #12
ethoms
Member
 
Registered: Nov 2011
Posts: 113

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by bassmadrigal View Post
Just maintain your own Slackware mirror that your users would use for updates. Then, if Pat puts out any python-related updates, you just recompile them before moving everything to your mirror. (It is probably better to maintain your own mirror anyway; this way you're able to verify updates won't break anything before pushing them out to your users.)

Your other option is to set up a local slackpkg+ repo and list your repo as a higher priority than the default. This is what many people do with Eric's multilib and ktown repos. This way, slackpkg won't attempt to downgrade/revert them to the stock packages.
I actually have my own update (repo) system. The client machines update every day from a cron job. It's just a simple shell script, a PostgreSQL DB and an HTTP download folder. It also logs which clients have installed and uninstalled packages. I'm able to push new and updated packages to my client machines, and remove packages from them. Mwahahaha!
 
Old 04-11-2016, 05:17 PM   #13
ethoms
Member
 
Registered: Nov 2011
Posts: 113

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Didier Spaier View Post
Try one of these commands:
Code:
iconv -f UTF-32 -t UTF-8 [emoticon-file] > [new-emoticon-file]
iconv -f UCS-4 -t UTF-8 [emoticon-file] > [new-emoticon-file]
That's interesting, but it's not a file, it's a Unicode character. I guess I could paste it into kwrite, save and try it.
 
Old 04-11-2016, 05:32 PM   #14
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-14.2.1 on Lenovo Thinkpad W520
Posts: 8,612

Rep: Reputation: Disabled
Quote:
Originally Posted by ethoms View Post
That's interesting, but it's not a file, it's a Unicode character. I guess I could paste it into kwrite, save and try it.
I'm not sure that will work, as I don't see UTF-32 among the encodings offered in kwrite or kate. I would try in geany instead.

But how can you paste it if it's not already in a file or a file by itself? Do you have an example of some text where I can see one of these emoticons?
 
Old 04-11-2016, 05:42 PM   #15
ethoms
Member
 
Registered: Nov 2011
Posts: 113

Original Poster
Rep: Reputation: Disabled
It works. Your iconv suggestion was a great idea for solving this puzzle, thanks. I copied and pasted into Kwrite, which uses UTF-8 by default. Then using:

Code:
iconv -f UTF-8 -t UTF-32 test-emote.txt > test-emote-32.txt
it works, and I can open the result in Kwrite as well.

Code:
-rw-r--r--  1 euan.thoms users    72 Apr 12 05:34 test-emote-32.txt
-rw-r--r--  1 euan.thoms users    23 Apr 12 05:33 test-emote.txt
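
The two file sizes fit the encodings. As far as I know, glibc's iconv writes a 4-byte byte-order mark for plain `UTF-32`, and every code point after it takes 4 bytes, so 72 bytes means 17 code points. The sketch below reproduces that arithmetic in Python with a made-up sample string (two emoticons plus 15 ASCII characters, newline included). The real contents of test-emote.txt are not shown in the thread, so this is only one combination that happens to give the same sizes.

```python
# Hypothetical sample: two emoticons followed by 15 ASCII characters.
# This is NOT the actual file from the thread, just a string with matching sizes.
sample = u"\U0001F600\U0001F601" + u"x" * 14 + u"\n"

utf8_bytes = sample.encode("utf-8")
utf32_bytes = sample.encode("utf-32")  # Python's "utf-32" codec also prepends a BOM

# Count code points via UTF-32 so the result is the same on narrow builds.
code_points = len(sample.encode("utf-32-le")) // 4

print("UTF-8: %d bytes, UTF-32: %d bytes, code points: %d"
      % (len(utf8_bytes), len(utf32_bytes), code_points))
```

The same approach converts a single pasted character in memory, with no intermediate file: `encode` to get the bytes in the target encoding and `decode` to go back.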
 
  


All times are GMT -5. The time now is 04:42 PM.
