10-01-2009, 12:19 PM   #1
Registered: Nov 2005
Distribution: CentOS
character set problem

I have a CentOS 5.x server on which I am running into character encoding problems. The server's $LANG variable is en_US.UTF-8, so I should be good for Unicode there. But I am writing content scraped from the web to a MySQL db using the PHP CLI, and when the data gets to the db, the non-English letters are garbled. The MySQL db uses the latin1 character set with the Swedish collation (latin1_swedish_ci), as this is the MySQL default. I understand that pumping UTF-8 characters into a latin1 db is going to cause problems, but I have also read that PHP itself has a default character set, and so does Apache. So before I go mucking around on the server too much, does anyone know what I have to do to resolve this? Here is the scenario:


CentOS 5.x running en_US.UTF-8
MySQL db running latin1
PHP CLI character set: not defined in php.ini
PHP CGI character set: not defined in php.ini
Shell character set: should be en_US.UTF-8 (from echo $LANG)
Apache character set: not defined in httpd.conf


PHP CLI scrapes the web for text --> PHP CLI writes the scraped text to the MySQL db.


Non-English letters and characters such as apostrophes end up garbled in the db.

Side note: PHP functions such as preg_replace (to replace the apostrophes), addslashes, and mysql_real_escape_string don't work on them, so PHP is apparently not able to see the apostrophe characters.
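
One possible explanation for the side note above, assuming the scraped pages use the typographic apostrophe (U+2019) rather than the ASCII one (U+0027): a replace or escape that targets the ASCII apostrophe never matches, and the character's three UTF-8 bytes show up as three separate characters once they are read as latin1. The sketch below is in Python purely for illustration (the scripts in this thread are PHP), and the sample string is made up:

```python
# Assumption: the scraped text contains a typographic apostrophe (U+2019),
# not the ASCII apostrophe (U+0027). The sample string is made up.
scraped = "It\u2019s caf\u00e9 time"

# A replacement that targets the ASCII apostrophe finds nothing to replace.
ascii_replaced = scraped.replace("'", "")

# The UTF-8 bytes of the same text, misread as a single-byte charset.
# cp1252 is used because MySQL's latin1 is based on it.
utf8_bytes = scraped.encode("utf-8")
garbled = utf8_bytes.decode("cp1252")

print(ascii_replaced == scraped)  # the string is unchanged
print(utf8_bytes)                 # raw bytes as sent to the database
print(garbled)                    # what a latin1 reader would display
```

If the garbled output resembles what is in the table, the data is probably intact UTF-8 being interpreted with the wrong character set rather than being corrupted.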
10-01-2009, 05:31 PM   #2
Registered: Mar 2009
Distribution: CentOS - Ubuntu - Debian

Usually the best approach is to define one encoding for all parts of the system
and stick to it. If you are going to use many special characters, then
use UTF-8.

To configure MySQL to use utf8 for storage, edit /etc/my.cnf and,
in the [mysqld] and [mysql.server] sections, add the following lines:

# Enable UTF-8 for Server Storage

Also, take into account that the PHP script must set the connection to use
utf8 as well. This is usually done with a 'SET NAMES utf8' query right after you
open your database connection or, if you use mysqli, with set_charset('utf8').
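
To see why the connection charset matters even when the tables are utf8, here is a rough simulation in Python with no real database involved. The function `server_store` is a hypothetical, simplified model: the server interprets the incoming bytes using the connection charset and then re-encodes them for the column.

```python
def server_store(client_bytes: bytes, connection_charset: str,
                 column_charset: str) -> bytes:
    """Hypothetical model of a server: decode incoming bytes using the
    connection charset, then encode the text for the column's charset."""
    text = client_bytes.decode(connection_charset)
    return text.encode(column_charset)

# What a script working in UTF-8 actually sends over the wire.
payload = "Malm\u00f6".encode("utf-8")

# Connection left at latin1 (cp1252 in Python terms), column is utf8:
# each UTF-8 byte is treated as its own character and encoded again.
wrong = server_store(payload, "cp1252", "utf-8")

# Connection declared as utf8, roughly what SET NAMES utf8 achieves.
right = server_store(payload, "utf-8", "utf-8")

print(wrong.decode("utf-8"))  # double-encoded text
print(right.decode("utf-8"))  # original text
```

This is only a model of the conversion step, but it shows how text can be stored double-encoded when the script sends UTF-8 over a connection that is still declared as latin1.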

Finally, even if your script and your database are correctly
configured, you may still have trouble when you view your data because of issues like
the charset/encoding used in your terminal, for example. If you are serving the data
via Apache/web, then the encoding of your page also needs to be set.

Best regards.
10-01-2009, 05:33 PM   #3
LQ Guru
Registered: Jun 2004
Location: Piraeus
Distribution: Slackware
You can use mysqldump to make a backup of your database. Make another copy of it so you can experiment safely, and then use iconv to convert the backup.sql file from latin1 (that is ISO-8859-1, I think) to UTF-8:
iconv -f iso-8859-1 -t utf8 backup.sql > backup-iconv.sql
Drop the database and restore it using the modified .sql.
You can find more details here
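
For illustration, the iconv step can be approximated in Python as below. This is a sketch only: `convert_dump` is a hypothetical helper, and the file contents are made up to stand in for backup.sql.

```python
import os
import tempfile

def convert_dump(src_path, dst_path, src_enc="iso-8859-1", dst_enc="utf-8"):
    """Re-encode a text file, roughly like `iconv -f src_enc -t dst_enc`."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=src_enc, newline="") as src, \
         open(dst_path, "w", encoding=dst_enc, newline="") as dst:
        for line in src:
            dst.write(line)

# Demo on a throwaway file standing in for backup.sql (contents are made up).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "backup.sql")
dst = os.path.join(workdir, "backup-iconv.sql")
with open(src, "wb") as f:
    f.write("INSERT INTO t VALUES ('caf\u00e9');\n".encode("iso-8859-1"))

convert_dump(src, dst)
with open(dst, "rb") as f:
    converted = f.read()
print(converted)
```

Two things are worth checking before converting a real dump. mysqldump may already write its output as UTF-8 by default, in which case converting again would double-encode the text. The dump also usually carries charset declarations (for example in the CREATE TABLE statements) that would need to say utf8 before restoring.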



All times are GMT -5.
