LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

View Poll Results: What locale/codeset do you use?
Option            Votes   Percent
UTF-8             73      85.88%
ISO8859-1         9       10.59%
Other ISO8859-*   2       2.35%
Other             3       3.53%
Multiple Choice Poll. Voters: 85.

Old 08-04-2014, 04:57 PM   #1
GazL
LQ Veteran
 
Registered: May 2008
Posts: 7,010

Rep: Reputation: 5142
What locale/codeset do you run your Slackware box on?


I know most distros tend to be pre-configured for UTF-8 these days, but I've been giving this some thought of late and was curious how many Slackers have made the jump to Unicode and, if so, whether you have encountered any incompatible programs.
 
Old 08-04-2014, 05:01 PM   #2
GazL
LQ Veteran
 
Registered: May 2008
Posts: 7,010

Original Poster
Rep: Reputation: 5142
As for myself, I'm still using ISO8859-15.
 
Old 08-04-2014, 05:25 PM   #3
metaschima
Senior Member
 
Registered: Dec 2013
Distribution: Slackware
Posts: 1,982

Rep: Reputation: 492
I use whatever the default one is, not UTF-8. I suppose UTF-8 will eventually become the default, but it doesn't concern me too much.
 
Old 08-04-2014, 07:02 PM   #4
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,448

Rep: Reputation: 2787
I just change to en_AU in /etc/profile.d/lang.sh and /etc/profile.d/lang.csh which suits my purposes. I do not have a need for UTF-8.
Code:
bash-4.3$ locale
LANG=en_AU
LC_CTYPE="en_AU"
LC_NUMERIC="en_AU"
LC_TIME="en_AU"
LC_COLLATE=C
LC_MONETARY="en_AU"
LC_MESSAGES="en_AU"
LC_PAPER="en_AU"
LC_NAME="en_AU"
LC_ADDRESS="en_AU"
LC_TELEPHONE="en_AU"
LC_MEASUREMENT="en_AU"
LC_IDENTIFICATION="en_AU"
LC_ALL=
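For anyone wanting to try the same change, here is a minimal sketch of the two lines involved for Bourne-style shells. It assumes the en_AU locale is installed (`locale -a` lists what is available) and that LC_COLLATE is exported separately from LANG, which is what the unquoted LC_COLLATE=C in the output above suggests: `locale` prints explicitly set variables without quotes and values implied by LANG in quotes.

```shell
# Sketch of the relevant lines in /etc/profile.d/lang.sh (sh/bash/ksh).
# Assumption: en_AU is installed; check with: locale -a | grep en_AU
export LANG=en_AU

# Keep plain byte-order sorting for ls, sort, etc., whatever LANG says.
export LC_COLLATE=C

# Afterwards, running `locale` in a new login shell shows the result.
```

The csh counterpart in lang.csh uses `setenv LANG en_AU` rather than `export`.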
 
Old 08-04-2014, 07:22 PM   #5
keefaz
LQ Guru
 
Registered: Mar 2004
Distribution: Slackware
Posts: 6,552

Rep: Reputation: 872
I prefer to set the language in ~/.bash_profile. I use ISO8859-1.
 
Old 08-04-2014, 11:46 PM   #6
ttk
Senior Member
 
Registered: May 2012
Location: Sebastopol, CA
Distribution: Slackware64
Posts: 1,038
Blog Entries: 27

Rep: Reputation: 1484
ASCII4EVAR

Also, I improve the performance of all my text-parsing utilities (sort, grep, etc.) by setting LANG=C and LC_ALL=C. It's like the modern equivalent of the old PC "Turbo" switch.
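A small sketch of the per-command version of this trick, for those who would rather not change the whole session. `c_locale` is a hypothetical helper name, not a standard command. LC_ALL overrides LANG and every individual LC_* variable, so setting it alone is enough. How much faster a tool gets depends on its version and on the data, so it is worth measuring rather than assuming.

```shell
# Run a single command in the C locale without touching the session's locale.
# LC_ALL takes precedence over LANG and over every individual LC_* variable.
c_locale() {
    LC_ALL=C "$@"
}

# In the C locale, sort compares raw bytes, so uppercase sorts before lowercase.
printf 'b\nA\na\nB\n' | c_locale sort

# To measure the effect in bash (bigfile is a placeholder name):
#   time sort bigfile > /dev/null
#   time c_locale sort bigfile > /dev/null
```

The sort example also shows the side effect to keep in mind: the output order is byte order, which may differ from what the session locale would produce.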
 
Old 08-05-2014, 02:30 AM   #7
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,297
Blog Entries: 24

Rep: Reputation: 4255
All UTF8 now, since about when Slackware 14 was released.
 
Old 08-05-2014, 03:09 AM   #8
a4z
Senior Member
 
Registered: Feb 2009
Posts: 1,727

Rep: Reputation: 742
UTF-8. It should be the default, especially if you have to deal with multiple languages, even if the system language is en_??.
 
Old 08-05-2014, 04:17 AM   #9
GazL
LQ Veteran
 
Registered: May 2008
Posts: 7,010

Original Poster
Rep: Reputation: 5142
BTW, one program I've found that breaks in UTF-8 is vi (elvis). vim is fine, though.
 
Old 08-05-2014, 04:21 AM   #10
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-15.0
Posts: 11,172

Rep: Reputation: Disabled
fr_FR.utf8. This doesn't prevent me from writing "LANG=C <something>" and maybe LC_COLLATE=C [1] when <something> is happier or faster with that, of course. To properly display the man pages encoded in UTF-8, I have this in ~/.bashrc:
Code:
alias uman="GROFF_ENCODING=utf8 man"
There still remain a few non-English man pages in legacy encodings, but what can I do?

Also, I can understand that people who speak and read only English are not that interested in UTF-8. Still, apart from a very small performance cost and issues with a few legacy utilities, I hardly see any drawback for them in using UTF-8, since ASCII is functionally a subset of UTF-8.

[1] I'll add LC_CTYPE if you insist, though I rarely need to set LANG to anything other than fr_FR.utf8, and practically never find the need to set the other internationalization variables defined in the XBD (Base Definitions) volume of POSIX.

Last edited by Didier Spaier; 08-05-2014 at 04:34 AM.
 
Old 08-05-2014, 04:48 AM   #11
GazL
LQ Veteran
 
Registered: May 2008
Posts: 7,010

Original Poster
Rep: Reputation: 5142
Running in UTF-8 and then overriding to LANG=C for performance is fine as long as you know there are no multibyte characters in the input data, or that you are doing no character-specific operations on it. But, as the following shows, it can break things:
Code:
gazl@ws1:/tmp$ echo -n "€x€" | wc -m
3
gazl@ws1:/tmp$ echo -n "€x€" | LANG=C wc -m
7
I don't think I'd be inclined to do this very often, if at all.
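A variation on the above that does not depend on the terminal's encoding, offered as a sketch: the euro sign is written out as its three UTF-8 bytes with printf octal escapes, and the byte count is compared with the C-locale character count. `euro_x_euro` is a made-up helper name. LC_ALL is used instead of LANG because LANG=C has no effect if LC_ALL or LC_CTYPE is already set in the environment.

```shell
# "€x€" spelled out as UTF-8 bytes: the euro sign is E2 82 AC (octal 342 202 254).
euro_x_euro() {
    printf '\342\202\254x\342\202\254'
}

# Byte count, independent of locale (tr strips padding some wc versions add).
bytes=$(euro_x_euro | wc -c | tr -d ' ')

# "Character" count in the C locale: every byte counts as one character.
c_chars=$(euro_x_euro | LC_ALL=C wc -m | tr -d ' ')

echo "bytes=$bytes C-locale-chars=$c_chars"

# In a UTF-8 locale, `euro_x_euro | wc -m` reports 3, as in the post above.
```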
 
Old 08-05-2014, 05:01 AM   #12
brianL
LQ 5k Club
 
Registered: Jan 2006
Location: Oldham, Lancs, England
Distribution: Slackware64 15; SlackwareARM-current (aarch64); Debian 12
Posts: 8,307
Blog Entries: 61

Rep: Reputation: Disabled
I've got the locale set to en_GB. On my laptop, anyway (where I am now). But I'm pretty sure I've got en_GB.UTF-8 on my desktop - I'll check later.
I'm using a Unicode font in the console (Lat2-Terminus16), because it looks better than the default. Nothing bad has happened yet. But it probably will now that I've mentioned it.
 
Old 08-05-2014, 05:06 AM   #13
chrisretusn
Senior Member
 
Registered: Dec 2005
Location: Philippines
Distribution: Slackware64-current
Posts: 3,098

Rep: Reputation: 1632
~$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
 
Old 08-05-2014, 05:19 AM   #14
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-15.0
Posts: 11,172

Rep: Reputation: Disabled
Quote:
Originally Posted by GazL
Running in utf-8 and then overriding to LANG=C for performance is fine as long as you know there are no multibyte characters
Of course! I do that only when I *know* that the input is encoded in ASCII.
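One way to check that assumption before overriding the locale, as a rough sketch: delete every 7-bit byte from the input and see whether anything is left. `is_ascii` is a hypothetical helper name, not an existing utility.

```shell
# Succeeds (exit status 0) if stdin contains only 7-bit ASCII bytes.
is_ascii() {
    # Delete every byte in the range 0-127; any byte that survives is non-ASCII.
    [ "$(LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' ')" -eq 0 ]
}

# Usage sketch (data.txt is a placeholder file name):
#   if is_ascii < data.txt; then LC_ALL=C wc -m data.txt; else wc -m data.txt; fi
```

This reads the whole input once, so for a single run it can cost more than the override saves; it pays off mainly when the same file is processed many times.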

Quote:
But, as the following shows, it can break things:
Code:
gazl@ws1:/tmp$ echo -n "€x€" | wc -m
3
gazl@ws1:/tmp$ echo -n "€x€" | LANG=C wc -m
7
I don't think I'd be inclined to do this very often, if at all.
Nothing is broken, IMO. You tell wc that you are feeding it single-byte characters and give it 7 bytes, so it answers that it found 7 characters. I don't see anything wrong here.

Last edited by Didier Spaier; 08-05-2014 at 05:24 AM.
 
1 members found this post helpful.
Old 08-05-2014, 05:32 AM   #15
GazL
LQ Veteran
 
Registered: May 2008
Posts: 7,010

Original Poster
Rep: Reputation: 5142
The breakage I was referring to is in the usage: the inappropriate override of LANG=C. I thought that was obvious from the context of what I posted, but I guess not. As you say, the 'wc' utility is clearly not broken, working as designed, and doing exactly what I told it to.
 
  

