LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

07-26-2012, 09:20 AM   #1
lukester
LQ Newbie
 
Registered: Mar 2012
Posts: 11

Rep: Reputation: Disabled
special characters not displaying


Hi All,

I'm wondering if someone could help me out with a weird little problem I'm having.

Middots (&middot;) are not displaying correctly, and neither are some other special characters. I get a ? in a black diamond. This happens both when looking at file names on the server through SSH and when browsing the page in a web browser.

System setup if it helps:
4-core CPU
8 GB RAM
250 GB HDD
CentOS 6.2 basic install
Apache
PHP
ColdFusion 8

Stuff I've tried:
Checked that this line is in the Apache conf file:
IndexOptions FancyIndexing VersionSort HTMLTable NameWidth=* Charset=UTF-8

Added MS TrueType fonts to the OS.

Checked ColdFusion to make sure the fonts are loaded, and they are. This is happening on flat HTML pages as well as with file names on the OS, so I don't think it's a web issue but a system-wide problem.

I've searched Google and the forums here relentlessly but am having little luck. I have found some posts that are close to my problem so have followed their suggestions but with no luck so far.

In /etc/sysconfig/i18n I have this:
LANG="en_US.UTF-8"
SYSFONT="latarcyrheb-sun16"
I am not sure the SYSFONT value is correct, but I don't know what it should be changed to.

Also, I found this in a forum post: "ensure this line is present on each page" <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />. I have done this and it has not helped either. I thought it could be because I had entered charset=utf-8 in the Apache httpd.conf file, so I commented that out and restarted Apache, and still no joy.
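For anyone following along, here is a minimal Python sketch of what a mismatch between the declared charset and the actual bytes looks like. It is not taken from the server in question: the `render` helper is a hypothetical stand-in for a browser, and Python's codecs are assumed to behave closely enough to a browser's decoder for a single middot.

```python
MIDDOT = "\u00b7"  # the middle dot character

def render(raw: bytes, declared_charset: str) -> str:
    """Decode bytes using the declared charset, substituting U+FFFD
    (the question mark in a black diamond) for undecodable bytes."""
    return raw.decode(declared_charset, errors="replace")

utf8_bytes = MIDDOT.encode("utf-8")         # two bytes: c2 b7
latin1_bytes = MIDDOT.encode("iso-8859-1")  # one byte: b7

# Declared charset matches the bytes: the middot comes back intact.
print(repr(render(utf8_bytes, "utf-8")))

# UTF-8 bytes declared as iso-8859-1: two characters, the first an accented A.
print(repr(render(utf8_bytes, "iso-8859-1")))

# iso-8859-1 bytes declared as UTF-8: a lone b7 is not valid UTF-8,
# so the decoder substitutes U+FFFD.
print(repr(render(latin1_bytes, "utf-8")))
```

If this sketch reflects the situation, a black diamond would point at non-UTF-8 bytes being read as UTF-8, while the opposite mismatch would show up as a stray accented A in front of the middot.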

This is a live production server, so if anything you suggest might kill it, please let me know first!

Thanks for any help.
Luke

refs:
http://www.linuxquestions.org/questi...up-4175417996/
http://www.linuxquestions.org/questi...lp-4175411995/
http://www.howtoforge.com/forums/showthread.php?t=27230
http://lists.centos.org/pipermail/ce...er/082397.html
http://blog.salientdigital.com/2009/...black-diamond/
http://www.joelonsoftware.com/articles/Unicode.html
 
07-26-2012, 10:17 AM   #2
eSelix
Senior Member
 
Registered: Oct 2009
Location: Wroclaw, Poland
Distribution: Arch, Kubuntu
Posts: 1,281

Rep: Reputation: 320
About the web browser (and server): displaying entities, for example &middot;, depends only on the client side (the web browser). The server just sends these characters: ampersand,m,i,d,d,o,t,; and not a single character. So this is a problem with your browser, or with the system fonts on the machine where the browser is installed, not with the web server. I assume that they are written correctly, as entity, not incorrectly as coded character in UTF8 or other charset. And if the web browser displays question marks in place of the middot character, it means that the font currently in use does not define that character. In the browser you can usually change the display font.
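A small Python sketch of the distinction drawn above between an entity and a coded character. It uses the standard-library `html.unescape` in place of a browser's entity handling, which is an assumption made for illustration only.

```python
import html

entity = "&middot;"

# Written as an entity, the page carries eight plain ASCII bytes,
# so the charset of the file does not affect how they are transmitted.
wire_bytes = entity.encode("ascii")

# The client turns the entity into the actual character U+00B7.
decoded = html.unescape(entity)

print(wire_bytes)          # the ASCII bytes of the entity
print(hex(ord(decoded)))   # the code point the client ends up with
```

A page written this way stays pure ASCII, whereas a page containing the character itself depends on the file encoding and the declared charset agreeing.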
 
07-26-2012, 10:49 AM   #3
lukester
LQ Newbie
 
Registered: Mar 2012
Posts: 11

Original Poster
Rep: Reputation: Disabled
Hi,
It's not a problem with the browser: Chrome, FF and IE can't possibly all fail to display the same thing. My colleagues also cannot see the special characters, and they are using Win 7 whereas I am on XP, so I know it's not an issue with my computer. The customer cannot see these special characters on his computer either, which is a Mac using Safari as far as I am aware. We have also tested using an iPad.

Also, I'm 99% sure that the server doesn't send m,i,d,d,o,t,; to the browser. It uses a character map to map the special character: for example, a middot under UTF-8 is sent as 'c2 b7', and the browser then decodes this based on the meta information of the web page, using the character map on the local machine. I know it's not a web browser issue because we see the same thing on the Linux filesystem: the filenames have ?'s in them rather than long dashes (-).
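The byte values mentioned above can be checked with a short Python sketch. The `hex_bytes` helper is a hypothetical name used only for this illustration, and U+2014 is assumed here to be the "long dash" in the filenames.

```python
def hex_bytes(text: str, charset: str) -> str:
    """Return the bytes of text in the given charset as spaced hex."""
    return text.encode(charset).hex(" ")

MIDDOT = "\u00b7"
LONG_DASH = "\u2014"  # assumption: the long dash meant above

print(hex_bytes(MIDDOT, "utf-8"))       # c2 b7
print(hex_bytes(MIDDOT, "iso-8859-1"))  # b7
print(hex_bytes(LONG_DASH, "utf-8"))    # e2 80 94
```

The same character is a different byte sequence in each charset, which is why the bytes on disk and the declared charset have to agree.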

I would be interested in getting some more information about
"I assume that they are written correctly, as entity, not incorrectly as coded character in UTF8 or other charset"
Could you give me an example?


Cheers
Luke

Last edited by lukester; 07-26-2012 at 10:52 AM.
 
07-26-2012, 11:39 AM   #4
eSelix
Senior Member
 
Registered: Oct 2009
Location: Wroclaw, Poland
Distribution: Arch, Kubuntu
Posts: 1,281

Rep: Reputation: 320
Quote:
Originally Posted by lukester
Also, I'm 99% sure that the server doesn't send m,i,d,d,o,t,; to the browser. It uses a character map to map the special character
Then make sure that the served files are saved in UTF-8 and that the Content-Type sent to the browser is also UTF-8; do not change it to "iso-8859-1". There are two places to check: the server headers (the Tamper Data extension for Firefox can be helpful, or just use a sniffer) and the HTML headers (those inside <HEAD>).
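The two places to check can be compared with a rough Python sketch. The helper names `header_charset` and `meta_charset` are hypothetical, the header value and page below are made-up samples rather than output from the server in this thread, and the regular expression is a simplification, not a full HTML parser.

```python
import re
from email.message import Message

def header_charset(content_type: str):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased, or None if absent

# Simplified pattern: first charset=... found inside a <meta ...> tag.
META_RE = re.compile(rb"""<meta[^>]+charset\s*=\s*["']?([\w-]+)""", re.IGNORECASE)

def meta_charset(html_bytes: bytes):
    """Extract the charset declared in the page's <meta> tag, if any."""
    match = META_RE.search(html_bytes)
    return match.group(1).decode("ascii").lower() if match else None

# Sample values for illustration only.
sample_header = "text/html; charset=UTF-8"
sample_page = (b'<html><head><meta http-equiv="Content-Type" '
               b'content="text/html; charset=iso-8859-1" /></head></html>')

from_header = header_charset(sample_header)
from_meta = meta_charset(sample_page)
if from_header != from_meta:
    print("charset mismatch:", from_header, "vs", from_meta)
```

With the sample values above the two declarations disagree, which is the situation to rule out before looking at fonts.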

Characters on the filesystem are the same story: font and console settings. For example, on Kubuntu, to display UTF-8 text in a virtual console I need to use "setfont" with a proper font file such as "/usr/share/consolefonts/Uni3-Terminus16.psf.gz". But I don't know where these settings are for your operating system. You did not mention whether this happens in a terminal emulator on the X server or in "text mode" (after switching with CTRL+ALT+F1).
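Separately from the font and console settings, it may be worth checking whether the file names themselves are valid UTF-8, since the system locale is en_US.UTF-8. A minimal Python sketch follows; both function names are hypothetical, and the sample names in the usage lines are invented for illustration.

```python
import os

def undecodable_names(raw_names):
    """Return the byte-string names that are not valid UTF-8."""
    bad = []
    for name in raw_names:
        try:
            name.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(name)
    return bad

def scan_directory(path="."):
    """List a directory as raw bytes and report names that fail UTF-8 decoding."""
    # Passing a bytes path makes os.listdir return undecoded bytes names.
    return undecodable_names(os.listdir(os.fsencode(path)))

# Invented sample: 0x96 is an en dash in Windows-1252, not valid UTF-8.
samples = [b"report.txt", b"a\x96b.txt", "caf\u00e9.txt".encode("utf-8")]
print(undecodable_names(samples))
```

If `scan_directory` reports names on the real server, the names were probably created in another encoding, and no console font will display them correctly until they are renamed or converted.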

Last edited by eSelix; 07-26-2012 at 11:43 AM.
 
07-26-2012, 12:44 PM   #5
DavidMcCann
LQ Veteran
 
Registered: Jul 2006
Location: London
Distribution: PCLinuxOS, Debian
Posts: 6,139

Rep: Reputation: 2314
The character "? in a black diamond" is Unicode U+FFFD: "used to replace an incoming character whose value is unknown or unrepresentable in Unicode". Non-Unicode encodings will fall in the 0-FF range (except the old Sino-Japanese ones) and generate the wrong character (or perhaps a box for a control code). It looks like you're getting a number greater than FF that is not a valid encoding format in UTF-8. Could it be that you're getting UTF-16, or something weird like CESU-8? Can you get the offending symbol into a hex editor and find out what it is?
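If no hex editor is handy, a short Python sketch can locate the offending bytes. The function name `invalid_utf8_spans` is hypothetical, and the sample data is invented: a lone b7 byte, which is what an ISO-8859-1 middot looks like, placed inside otherwise ASCII text.

```python
def invalid_utf8_spans(data: bytes):
    """Return (offset, hex) pairs for each byte run that is not valid UTF-8."""
    spans = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as err:
            start = pos + err.start
            end = pos + err.end
            spans.append((start, data[start:end].hex()))
            pos = end  # resume after the bad bytes
    return spans

# Invented sample: a single ISO-8859-1 middot byte between ASCII characters.
sample = b"a \xb7 b"
print(invalid_utf8_spans(sample))  # [(2, 'b7')]
```

On a real page, reading the file in binary mode and passing the bytes to this function would show which byte values are producing the replacement character, and from those the source encoding can usually be guessed.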
 
07-27-2012, 05:25 AM   #6
eSelix
Senior Member
 
Registered: Oct 2009
Location: Wroclaw, Poland
Distribution: Arch, Kubuntu
Posts: 1,281

Rep: Reputation: 320
Also check whether you have a BOM in your HTML files; maybe it is incorrect. Can you post a link to one of the web pages affected by this?
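A BOM check can be done with the constants in Python's standard `codecs` module. This is only a sketch: `detect_bom` is a hypothetical helper, and it looks for UTF-8 and UTF-16 marks only.

```python
import codecs

def detect_bom(data: bytes):
    """Return the name of the byte order mark at the start of data, or None."""
    known = (
        ("UTF-8", codecs.BOM_UTF8),         # ef bb bf
        ("UTF-16 LE", codecs.BOM_UTF16_LE), # ff fe
        ("UTF-16 BE", codecs.BOM_UTF16_BE), # fe ff
    )
    for name, bom in known:
        if data.startswith(bom):
            return name
    return None

print(detect_bom(codecs.BOM_UTF8 + b"<html>"))  # UTF-8
print(detect_bom(b"<html>"))                    # None
```

A UTF-16 mark on a page that is declared as UTF-8 or iso-8859-1 would be one example of an incorrect BOM.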
 
  


All times are GMT -5. The time now is 08:08 PM.
