LinuxQuestions.org
Linux - Desktop This forum is for the discussion of all Linux Software used in a desktop context.

Old 08-08-2022, 12:29 PM   #1
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,564

Rep: Reputation: 177
UTF-8 characters not showing in Konsole


I am using Konsole for a terminal session. It has "Default character encoding" set to UTF-8. In the session, $LANG=en_US.UTF-8. Yet when I cat a file with UTF-8 characters I get the question-mark inside an inverse diamond. Why don't I see the UTF-8 character? I have the Konsole font set to DejaVu Sans Mono 13pt.

Last edited by mfoley; 08-08-2022 at 12:37 PM.
 
Old 08-09-2022, 06:29 AM   #2
DavidMcCann
LQ Veteran
 
Registered: Jul 2006
Location: London
Distribution: PCLinuxOS, Debian
Posts: 6,142

Rep: Reputation: 2314
I use xfce4-terminal, but I too set it to DejaVu Sans Mono. A quick check shows that it can cat a file with runes in it, even though that font doesn't include runes.

The problem could be either specific to Konsole or it could be a wider KDE problem — not seeking an alternative font when the default is unsuitable. I can't solve either problem, but I can suggest an alternative terminal: mlterm is specifically designed to cope with any language. Of course that may not help if it's a KDE problem.
 
Old 08-09-2022, 01:41 PM   #3
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,564

Original Poster
Rep: Reputation: 177
DavidMcCann: Thanks. I just tried mlterm. As installed, it didn't show UTF-8 characters with 'cat' either. I didn't see any way to change settings in mlterm, such as BG/FG colors, font or charset. Is there a way to do that? Maybe I need to configure a charset setting or something.

Last edited by mfoley; 08-09-2022 at 01:49 PM.
 
Old 08-10-2022, 08:31 AM   #4
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,564

Original Poster
Rep: Reputation: 177
Ah ha! Problem solved. Being new to the bewildering world of UTF-8 notation, I had the wrong hex bytes. My file had 00 B0, which is the "Hex Code Point", written as U+00B0. But those are not the hex byte values of the UTF-8 encoding. Those are C2 B0. When I changed the file to contain C2 B0, Konsole displayed the UTF-8 character (the degree symbol) correctly. The browser, however, displayed the degree symbol correctly whether the hex bytes were 00 B0 or C2 B0.

I'm sure there are all kinds of good reasons for writing the character stored as the hex sequence C2 B0 as U+00B0, but it confused me for a while.

ref: https://www.cogsci.ed.ac.uk/~richard...=00B0&mode=hex
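
To make the distinction above concrete, here is a minimal Python sketch (not from the thread) that separates the code point U+00B0 from its UTF-8 byte sequence C2 B0, and shows what a strict UTF-8 decoder does with the bytes 00 B0. The variable names are made up for illustration. The last line is only one possible explanation for the browser's behaviour, and it is an assumption: a fallback to a legacy single-byte encoding such as Latin-1, in which the lone byte B0 is the degree sign.

```python
# The code point of the degree sign is U+00B0; UTF-8 stores it as two bytes.
degree = "\u00b0"
utf8_bytes = degree.encode("utf-8")
print(f"U+{ord(degree):04X} ->", " ".join(f"{b:02X}" for b in utf8_bytes))

# The bytes 00 B0 are not a valid UTF-8 encoding of anything useful:
# 00 is NUL, and a lone B0 is a continuation byte with no lead byte.
wrong = bytes([0x00, 0xB0])
shown = wrong.decode("utf-8", errors="replace")
print([f"U+{ord(ch):04X}" for ch in shown])  # U+FFFD is the replacement character

# Assumption: a viewer that falls back to Latin-1 would read B0 as the degree sign.
latin1_view = bytes([0xB0]).decode("latin-1")
print(latin1_view == degree)
```

The U+FFFD produced in the second step is the question mark in a diamond that the terminal showed.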
 
Old 08-10-2022, 09:17 AM   #5
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,612

Rep: Reputation: 2553
Quote:
Originally Posted by mfoley
The browser, however, displayed the degree symbol correctly whether the hex bytes were 00 B0 or C2 B0.
A feature of browsers is that they try to be helpful and correct mistakes, which can make them a poor tool for debugging.


It may have helped to use cat -A and/or od to see the actual bytes involved.
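
As a rough illustration of that kind of byte-level check, the sketch below does in Python roughly what a command such as od -An -tx1 on the file would show. The helper name hex_dump and the two sample byte strings are invented for this example; they stand in for the "good" and "bad" versions of the file described earlier in the thread.

```python
def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)

# Hypothetical file contents: "72°F" stored correctly as UTF-8 ...
good = "72\u00b0F\n".encode("utf-8")
# ... and the same text with the code point digits 00 B0 written as raw bytes.
bad = b"72\x00\xb0F\n"

print(hex_dump(good))  # 37 32 c2 b0 46 0a
print(hex_dump(bad))   # 37 32 00 b0 46 0a
```

Seeing c2 b0 versus 00 b0 side by side makes the problem visible without relying on how any terminal or browser chooses to render it.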

Also, Wikipedia's UTF-8 article has an example of how multi-byte encoding works.
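
For the two-byte case relevant to this thread, the multi-byte scheme can be sketched by hand as below. This is a simplified illustration, not a full encoder: the function name encode_two_byte is hypothetical, and it only covers code points U+0080 to U+07FF, which UTF-8 stores as 110xxxxx 10xxxxxx.

```python
def encode_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only the two-byte range U+0080..U+07FF is handled")
    lead = 0b11000000 | (code_point >> 6)           # 110xxxxx: the upper 5 bits
    trail = 0b10000000 | (code_point & 0b00111111)  # 10xxxxxx: the lower 6 bits
    return bytes([lead, trail])

encoded = encode_two_byte(0x00B0)
print(" ".join(f"{b:02X}" for b in encoded))  # C2 B0
print(encoded == "\u00b0".encode("utf-8"))    # matches Python's own encoder
```

For U+00B0 (binary 10110000), the upper bits 10 go into the lead byte, giving C2, and the lower six bits 110000 go into the trailing byte, giving B0.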

 
  


Tags
konsole, utf-8



All times are GMT -5.