Hi,
I have problems with special characters in Postgres.
I am inserting filenames that contain accented characters such as
é (e with acute), and also unicode characters such as
€ (euro sign).
I am using modern and recent systems, they all use and are set to UTF-8 by default.
On client:
Fedora 28 5.0.16-100.fc28.x86_64
psql 10.7
Perl v5.26.3
perl-DBI-1.641-1.fc28.x86_64
perl-DBD-Pg-3.7.4-3.fc28.x86_64
On server
CentOS 3.10.0-957.27.2.el7.x86_64
postgresql-server-9.2.24-1.el7_5.x86_64
All the actions are initiated from the client side.
Basically, the filenames contains few special characters, which are displayed correctly on my terminal (
lxterminal-0.3.2-1.fc28.x86_64).
Code:
$ ls -1 test/
'1 file.txt'
'2 file.txt'
'3 file € with euro.txt'
'4 file é with acute e.txt'
'5 file foobar.txt'
On the database server, I have a test database created with UTF-8 encoding.
Code:
=> \l test
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-------+--------+----------+-------------+-------------+-------------------
test | user_t | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
And in that database is a table named test.
Code:
=> \d test
Table "public.test"
Column | Type | Collation | Nullable | Default
----------+-------------------+-----------+----------+---------
id | integer | | not null |
comments | character varying | | |
Indexes:
"test_id_key" UNIQUE CONSTRAINT, btree (id)
Check constraints:
"test_id_check" CHECK (id > 0)
The Perl script below (named
script.pl) will connect to the database test, re-create the test table, insert into it the filenames (taken from the standard input), and later retrieve them back from the database and display them on the standard output.
Code:
$ cat script.pl
#! /usr/bin/perl
use strict;
use DBI;
BEGIN {
our ($database, $servername, $username, $password) = qw( test localhost user_t);
our ($dbh, $sql_drop, $sql_create, $sql_insert, $sth_insert, $results_insert, $sql_select, $sth_select, $results_select, @values);
our ($id, $filename);
$dbh = DBI->connect("DBI:Pg:dbname=$database;host=$servername", $username, $password, { AutoCommit => 1, RaiseError => 0, PrintError => 0 }) || die "connection to database '$database' on server '$servername' unsuccessful: $DBI::errstr\n";
$sql_drop = q{DROP TABLE IF EXISTS test};
$dbh->do($sql_drop) || warn ("removal of table test with '$sql_drop' in database '$database' on server '$servername' unsuccessful: ", $dbh->errstr, "\n");
$sql_create = q{CREATE TABLE test (id integer NOT NULL UNIQUE CHECK(id > 0), filename character varying)};
$dbh->do($sql_create) || warn ("creation of table test with '$sql_create' in database '$database' on server '$servername' unsuccessful: ", $dbh->errstr, "\n");
$sql_insert = q{INSERT INTO test (id, filename) VALUES (?, ?)};
$sth_insert = $dbh->prepare($sql_insert) || die ("preparation of statement for files insert '$sql_insert' in database '$database' on server '$servername' unsuccessful: ", $dbh->errstr, "\n");
$sql_select = q{SELECT id, filename FROM test WHERE id = ?};
$sth_select = $dbh->prepare($sql_select) || die ("preparation of statement for files select '$sql_select' in database '$database' on server '$servername' unsuccessful: ", $dbh->errstr, "\n");
}
our ($database, $servername, $username, $password) = qw( test localhost user_t);
our ($dbh, $sql_drop, $sql_create, $sql_insert, $sth_insert, $results_insert, $sql_select, $sth_select, $results_select, @values);
our ($id, $filename);
if ($sth_insert and $sth_select) {
$filename = $_;
$filename =~ s/\0$//;
$id++;
print "id=$id\n";
print "filename=$filename\n";
if ($results_insert = $sth_insert->execute($id, $filename)) {
if ($results_insert == 1) {
print (" => insert successful with values\n");
print"\tid=$id\n\tfilename=$filename\n";
if ($sth_select->execute($id)) {
while (@values = $sth_select->fetchrow_array) {
print (" => select successful with values\n");
print"\tid=$values[0]\n\tfilename=$values[1]\n";
}
warn (" -> retrieval for id '$id' from statement for select '$sql_select' in database '$database' on server '$servername' unsuccessful: ", $sth_select->errstr, "\n") if ($sth_select->errstr);
$results_select = $sth_select->rows;
warn (" -> retrieval of select rows for id '$id' from statement for select '$sql_select' in database '$database' on server '$servername' unsuccessful: ", $sth_select->errstr, "\n") if ($sth_select->errstr);
if ($results_select == 1) {
# as expected
}
elsif ($results_select != 0) {
warn (" -> execution for id '$id' of statement for select '$sql_select' in database '$database' on server '$servername' should have affected 1 or 0 record instead of $results_select\n");
}
}
else {
warn (" -> execution for id '$id' of statement for select '$sql_select' in database '$database' on server '$servername' unsuccessful:", $sth_select->errstr, "\n");
}
}
else {
warn (" -> execution for id '$id' of statement for insert '$sql_insert' in database '$database' on server '$servername' should have affected 1 record instead of $results_select\n");
}
}
else {
warn (" -> execution for id '$id' of statement for files insert '$sql_insert' in database '$database' on server '$servername' unsuccessful: ", $sth_insert->errstr, "\n");
}
print "\n";
}
else {
warn " -> sql statement not correct ready\n";
}
END {
#
unless ($sth_select and $sth_insert) {
print (" => files not processed because of statement failure\n");
warn (" -> failure on statement select '$sql_select' in database '$database' on server '$servername'\n") unless ($sth_select);
warn (" -> failure on statement insert '$sql_insert' in database '$database' on server '$servername'\n") unless ($sth_insert);
}
#
if ($dbh->{Active}) {
#
$dbh->disconnect || warn (" -> cannot disconnect, in database '$database' on server '$servername': ", $dbh->errstr, "\n");
}
print $id + 0, " files processed\n";
}
Code:
$ find test -ignore_readdir_race -print0 | perl -n0 script.pl
id=1
filename=test
=> insert successful with values
id=1
filename=test
=> select successful with values
id=1
filename=test
id=2
filename=test/4 file é with acute e.txt
=> insert successful with values
id=2
filename=test/4 file é with acute e.txt
=> select successful with values
id=2
filename=test/4 file é with acute e.txt
id=3
filename=test/3 file € with euro.txt
=> insert successful with values
id=3
filename=test/3 file € with euro.txt
=> select successful with values
id=3
filename=test/3 file € with euro.txt
id=4
filename=test/1 file.txt
=> insert successful with values
id=4
filename=test/1 file.txt
=> select successful with values
id=4
filename=test/1 file.txt
id=5
filename=test/5 file foobar.txt
=> insert successful with values
id=5
filename=test/5 file foobar.txt
=> select successful with values
id=5
filename=test/5 file foobar.txt
id=6
filename=test/2 file.txt
=> insert successful with values
id=6
filename=test/2 file.txt
=> select successful with values
id=6
filename=test/2 file.txt
6 files processed
All that works fine, and as expected.
So I know that the client side terminal, the client side Perl and the server side Postgres work correctly.
I had not to modify, neither set, any of the various available encoding related properties and settings at my disposal.
It works out of the box, which is really the expected behaviour nowadays.
The problems occur when I use the client side psql (launched from the client side lxterminal) to retreive and display data from the database; the filenames are not displayed correctly.
For an unknown reason to me, the special characters are not recognized as UTF-8, and so are interpreted as ISO-8859-1 (aka LATIN1) and wrongly displayed like that.
Code:
$ sql -A -P fieldsep=$'\n' -P recordsep=$'\n\n' -P footer=no -h centos -U applis -c "SELECT id, filename FROM test ORDER BY id" files
id
filename
1
test
2
test/4 file é with acute e.txt
3
test/3 file € with euro.txt
4
test/1 file.txt
5
test/5 file foobar.txt
6
test/2 file.txt
My question, what am I supposed to do in order to have
psql nicely display the special characters?