I think the answer can be found in Chapter 9 of the MySQL Reference Manual. When you create the database, you need to specify the character set to be used (one that contains both the accented and unaccented letters), not just utf-8. There are several character sets listed, such as German, Swedish, English, etc.
You may be able to redefine your database to treat accented letters as different from unaccented, or you may be able to redefine your entries/queries to treat them as different.
I hope that this helps solve the problem, or at least gets the cognitive juices flowing which leads to the solution.
Quote:
Originally Posted by bigrigdriver
I think the answer can be found in Chapter 9 of the MySQL Reference Manual. When you create the database, you need to specify the character set to be used (one that contains both the accented and unaccented letters), not just utf-8. There are several character sets listed, such as German, Swedish, English, etc.
I guess I'll have a read of chapter 9, even though the concept of
having to specify TWO character sets (e.g. UTF-8 and Swedish)
appears rather moronic considering that UTF was conceived to get
rid of such idiosyncratic distinctions. But then, this is MySQL,
and I've never held it in very high esteem.
Quote:
Originally Posted by bigrigdriver
You may be able to redefine your database to treat accented letters as different from unaccented, or you may be able to redefine your entries/queries to treat them as different.
I hope that this helps solve the problem, or at least gets the cognitive juices flowing which leads to the solution.
Heh. Kind of. Thanks for the hint - wherever it may lead.
A brief update ... after 4 days of ongoing pain I still haven't managed to
import my data ... whether I use "INSERT" statements, "LOAD DATA", or the
command line tool mysqlimport, I can't get accented characters into the database.
Short of setting my system locale to cp1252, which is MySQL's (odd) default,
and reinstalling everything, I've tried pretty much every option and combination
available, using latin1 (otherwise known as ISO-8859-1), utf8 and ASCII with
en_NZ and en_US from the command line (and in the database server and client
config files). The best I can achieve is either names with accents being
truncated in the database (skipping ~50000 people because of 'duplicate key
violations'), or a "funny" character that looks something like a y with a
circumflex for an accented e and a y with a grave for an i with a grave - which
would help if one was willing to memorize all the alternatives and use them in
searches .... hahaha. And that's with collation being set to either utf8_general_ci
or utf8_bin.
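Both symptoms look like what an encoding mismatch between the file and the connection produces. Here is a minimal Python sketch of the byte-level effect, not of MySQL itself; the sample name is borrowed from later in the thread, and the idea that the server cuts a value at the first byte it can't read as UTF-8 is my assumption about the truncation, not something confirmed here.

```python
# Illustration only: what happens to the bytes of an accented name when
# the declared encoding does not match the real one.
name = "Sérgio"

utf8_bytes = name.encode("utf-8")      # é is stored as two bytes
latin1_bytes = name.encode("latin-1")  # é is stored as one byte

# Case 1: a UTF-8 file read as Latin-1. Each accented letter turns
# into two "funny" characters.
garbled = utf8_bytes.decode("latin-1")

# Case 2: a Latin-1 file read as UTF-8. The accented byte is not valid
# UTF-8, so only the text before it can be decoded. If a server keeps
# just that prefix (assumption), many names collapse to the same value,
# which would explain duplicate key violations.
try:
    valid_prefix = latin1_bytes.decode("utf-8")
except UnicodeDecodeError as err:
    valid_prefix = latin1_bytes[:err.start].decode("utf-8")

print(repr(garbled), repr(valid_prefix))
```

If the data file shows the first pattern, it is probably UTF-8 being read as a single-byte set; if the second, it is probably a single-byte file being read as UTF-8.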
People in the #mysql channel on freenode didn't have any ideas, and quite frankly
I'm pretty fed up with the product and its community.
Funnily enough PostgreSQL on the same machine imported the same file w/o a
hiccup, albeit somewhat slower than MySQL's pathetic workings.
Anyway ... should anyone have any other ideas I'd be grateful to hear of them.
mysql> create table tinkster (name varchar(12) character set utf8 collate utf8_bin unique);
Query OK, 0 rows affected (0.06 sec)

mysql> insert into tinkster values ('Sergio'),('Sérgio'),('ì'),('ý');
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from tinkster;
+---------+
| name    |
+---------+
| Sergio  |
| Sérgio  |
| ì       |
| ý       |
+---------+
4 rows in set (0.00 sec)
Yah, fine, good and well ... I can reproduce that. Now stick
those in a file, and import it.
[edit]
And I'm not being facetious - it's that I have ~1.8 million
rows that I'd like to import into the database, and no, I
won't be copying & pasting them into an interactive session.
[/edit]
Oh, and another update. MySQL 5.1.37 on WinXP has the same problem
that 5.0.84 on Slackware has. Importing accented characters from a
file appears (to me at least) impossible with MySQL, and it makes no
difference whether that's a delimited text file, or whether I wrap
the data into properly escaped insert statements.
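One thing that might be worth ruling out before the import is the encoding of the file itself. The sketch below is a rough Python helper, assuming the file is either valid UTF-8 already or Latin-1; `to_utf8` and the file names in the usage note are made up for illustration. The import would still have to be told the file is UTF-8, for example via mysqlimport's `--default-character-set` option or the `CHARACTER SET` clause of `LOAD DATA`.

```python
def to_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Return raw as UTF-8 bytes.

    Assumption: the input is either valid UTF-8 already, or it is in the
    single-byte `fallback` encoding. Other encodings are not detected.
    """
    try:
        raw.decode("utf-8")  # already valid UTF-8: leave untouched
        return raw
    except UnicodeDecodeError:
        # Not UTF-8: reinterpret as the fallback and re-encode.
        return raw.decode(fallback).encode("utf-8")


def convert_file(src: str, dst: str, fallback: str = "latin-1") -> None:
    """Write a UTF-8 copy of src to dst (paths are placeholders)."""
    with open(src, "rb") as f_in:
        data = f_in.read()
    with open(dst, "wb") as f_out:
        f_out.write(to_utf8(data, fallback))
```

Usage would be something like `convert_file("names.txt", "names.utf8.txt")`, then importing the converted copy. For ~1.8 million rows, reading the whole file into memory should still be manageable, but that is a guess about the row size.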
my.cnf is the 'default' as shipped by Arch Linux: http://repos.archlinux.org/wsvn/pack...ra-i686/my.cnf (I don't usually run MySQL on this machine, I installed it just to see if I could help you fix your problem - that's why I'm using the default configuration file)