Slackware This Forum is for the discussion of Slackware Linux.
View Poll Results: What locale/codeset do you use?
Option | Votes | Percentage
UTF-8 | 73 | 85.88%
ISO8859-1 | 9 | 10.59%
Other ISO8859-* | 2 | 2.35%
Other | 3 | 3.53%
|
|
08-04-2014, 04:57 PM
|
#1
|
LQ Veteran
Registered: May 2008
Posts: 7,010
|
What locale/codeset do you run your Slackware box on?
I know most distros tend to be pre-configured for UTF-8 these days, but I've been giving this some thought of late and was curious how many Slackers have made the jump to Unicode and, if so, whether you have encountered any incompatible programs.
|
|
|
08-04-2014, 05:01 PM
|
#2
|
LQ Veteran
Registered: May 2008
Posts: 7,010
Original Poster
|
As for myself, I'm still using ISO8859-15.
|
|
|
08-04-2014, 05:25 PM
|
#3
|
Senior Member
Registered: Dec 2013
Distribution: Slackware
Posts: 1,982
|
I use whatever the default one is, not UTF-8. I suppose UTF-8 will eventually become the default, but it doesn't concern me too much.
|
|
|
08-04-2014, 07:02 PM
|
#4
|
LQ 5k Club
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,448
|
I just change to en_AU in /etc/profile.d/lang.sh and /etc/profile.d/lang.csh which suits my purposes. I do not have a need for UTF-8.
Code:
bash-4.3$ locale
LANG=en_AU
LC_CTYPE="en_AU"
LC_NUMERIC="en_AU"
LC_TIME="en_AU"
LC_COLLATE=C
LC_MONETARY="en_AU"
LC_MESSAGES="en_AU"
LC_PAPER="en_AU"
LC_NAME="en_AU"
LC_ADDRESS="en_AU"
LC_TELEPHONE="en_AU"
LC_MEASUREMENT="en_AU"
LC_IDENTIFICATION="en_AU"
LC_ALL=
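A note on reading that output: locale prints values inherited from LANG in double quotes and explicitly set variables without them, so LC_COLLATE=C above was set on its own. A minimal sketch reproducing this, assuming the two exports below stand in for what /etc/profile.d/lang.sh sets (the actual file contents are not shown in this thread):

```shell
# Assumed stand-in for the settings made in /etc/profile.d/lang.sh.
unset LC_ALL LC_CTYPE          # make sure nothing else overrides LANG
export LANG=en_AU
export LC_COLLATE=C

# Explicitly set categories are printed bare; categories implied by LANG
# are printed in double quotes. Warnings (e.g. locale not installed) go
# to stderr and are discarded here.
collate_line=$(locale 2>/dev/null | grep '^LC_COLLATE=')
ctype_line=$(locale 2>/dev/null | grep '^LC_CTYPE=')
printf '%s\n%s\n' "$collate_line" "$ctype_line"
```

The csh counterpart (lang.csh) would use setenv instead of export.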
|
|
|
08-04-2014, 07:22 PM
|
#5
|
LQ Guru
Registered: Mar 2004
Distribution: Slackware
Posts: 6,552
|
I prefer to set LANG in ~/.bash_profile. I use ISO8859-1.
|
|
|
08-04-2014, 11:46 PM
|
#6
|
Senior Member
Registered: May 2012
Location: Sebastopol, CA
Distribution: Slackware64
Posts: 1,038
|
ASCII4EVAR
Also, I improve the performance of all my text-parsing utilities (sort, grep, etc) by setting LANG=C and LC_ALL=C. It's like the modern equivalent to the old PC's "Turbo" switch.
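A side effect worth knowing when using the C locale as a speed switch: it also changes collation to plain byte order, so uppercase sorts before lowercase. A small sketch (the sample input is made up, and how much speed-up you see depends on the tool and the data):

```shell
# Sort four sample lines under the C locale: comparison is byte-wise,
# so 'A' (0x41) and 'B' (0x42) come before 'a' (0x61) and 'b' (0x62).
c_sorted=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | tr '\n' ' ')
echo "$c_sorted"
```

LC_ALL is used here because it takes precedence over both LANG and the individual LC_* variables.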
|
|
|
08-05-2014, 02:30 AM
|
#7
|
Moderator
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,297
|
All UTF8 now, since about when Slackware 14 was released.
|
|
|
08-05-2014, 03:09 AM
|
#8
|
Senior Member
Registered: Feb 2009
Posts: 1,727
|
UTF-8. It should be the default, especially if you have to deal with multiple languages, even if the system language is en_??.
|
|
|
08-05-2014, 04:17 AM
|
#9
|
LQ Veteran
Registered: May 2008
Posts: 7,010
Original Poster
|
BTW, one program I've found that breaks in UTF-8 is vi (elvis). vim is fine though.
|
|
|
08-05-2014, 04:21 AM
|
#10
|
LQ Addict
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-15.0
Posts: 11,172
|
fr_FR.utf8. This doesn't prevent me from writing "LANG=C <something>", and maybe LC_COLLATE=C [1], when <something> is happier or faster with that, of course. To properly display man pages encoded in UTF-8, I have this in ~/.bashrc:
Code:
alias uman="GROFF_ENCODING=utf8 man"
There still remain a few non-English man pages in legacy encodings, but what can I do?
Also, I can understand that people who speak and read only English are not much interested in UTF-8. Still, apart from a very small performance cost and issues with a few legacy utilities, I hardly see any drawback for them in using UTF-8, since ASCII is functionally a subset of UTF-8.
[1] I'll add LC_CTYPE if you insist, though I rarely need to set LANG to anything other than fr_FR.utf8, and practically never need to set the other internationalization variables defined in the POSIX XBD volume.
Last edited by Didier Spaier; 08-05-2014 at 04:34 AM.
|
|
|
08-05-2014, 04:48 AM
|
#11
|
LQ Veteran
Registered: May 2008
Posts: 7,010
Original Poster
|
Running in UTF-8 and then overriding to LANG=C for performance is fine as long as you know there are no multibyte characters in the input data, or that you are doing no character-specific operations on it. But, as the following shows, it can break things:
Code:
gazl@ws1:/tmp$ echo -n "€x€" | wc -m
3
gazl@ws1:/tmp$ echo -n "€x€" | LANG=C wc -m
7
I don't think I'd be inclined to do this very often, if at all.
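To make the difference visible without depending on the terminal's encoding, the same test can be written with explicit UTF-8 bytes and with LC_ALL, which takes precedence over LANG. A sketch, assuming GNU coreutils wc (the variable names are made up for illustration):

```shell
# "€x€" spelled out as UTF-8 bytes (octal escapes): 3 + 1 + 3 bytes.
euro_x_euro='\342\202\254x\342\202\254'

# wc -c always counts bytes.
bytes=$(printf "$euro_x_euro" | wc -c | tr -d ' ')

# wc -m counts characters as defined by the current locale; in the C
# locale every byte is treated as one character.
c_chars=$(printf "$euro_x_euro" | LC_ALL=C wc -m | tr -d ' ')

echo "bytes=$bytes chars_in_C=$c_chars"
```

Under a UTF-8 locale, wc -m on the same input reports 3, as in the first command above.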
|
|
|
08-05-2014, 05:01 AM
|
#12
|
LQ 5k Club
Registered: Jan 2006
Location: Oldham, Lancs, England
Distribution: Slackware64 15; SlackwareARM-current (aarch64); Debian 12
Posts: 8,307
|
I've got the locale set to en_GB. On my laptop, anyway (where I am now). But I'm pretty sure I've got en_GB.UTF-8 on my desktop - I'll check later.
I'm using a Unicode font in the console (Lat2-Terminus16), because it looks better than the default. Nothing bad has happened yet. But it probably will now that I've mentioned it.
|
|
|
08-05-2014, 05:06 AM
|
#13
|
Senior Member
Registered: Dec 2005
Location: Philippines
Distribution: Slackware64-current
Posts: 3,098
|
Code:
~$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
|
|
|
08-05-2014, 05:19 AM
|
#14
|
LQ Addict
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-15.0
Posts: 11,172
|
Quote:
Originally Posted by GazL
Running in utf-8 and then overriding to LANG=C for performance is fine as long as you know there are no multibyte characters
|
Of course! I do that only when I *know* that the input is encoded in ASCII.
Quote:
But, as the following shows, it can break things:
Code:
gazl@ws1:/tmp$ echo -n "€x€" | wc -m
3
gazl@ws1:/tmp$ echo -n "€x€" | LANG=C wc -m
7
I don't think I'd be inclined to do this very often, if at all.
|
Nothing is broken IMO. You tell wc that you are feeding it single-byte characters, give it 7 bytes, and it answers that it found 7 characters. I don't see anything wrong here.
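One way to check the "input is ASCII" precondition before overriding the locale is to count the bytes outside the 7-bit range. A sketch only; the function name non_ascii_bytes is made up for illustration:

```shell
# Print the number of bytes on stdin that are outside the ASCII range.
# tr deletes bytes 0-127 (octal \000-\177); whatever is left is non-ASCII.
non_ascii_bytes() {
    LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' '
}

ascii_count=$(printf 'plain text\n' | non_ascii_bytes)
euro_count=$(printf '\342\202\254x\342\202\254' | non_ascii_bytes)
echo "ascii=$ascii_count euro=$euro_count"
```

A result of 0 means the data is pure ASCII, so LANG=C or LC_ALL=C will not change how its characters are counted or classified.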
Last edited by Didier Spaier; 08-05-2014 at 05:24 AM.
|
|
|
08-05-2014, 05:32 AM
|
#15
|
LQ Veteran
Registered: May 2008
Posts: 7,010
Original Poster
|
The breakage I was referring to is in the usage: the inappropriate override of LANG=C. I thought that was obvious from the context of what I posted, but I guess not. As you say, the 'wc' utility is clearly not broken, working as designed, and doing exactly what I told it to.
|
|
|
All times are GMT -5. The time now is 09:53 PM.