The problem that most people face when setting up a UTF-8 database in mySQL is that without calling ‘SET NAMES’ in the mySQL client prior to issuing any queries (PHP, C++ etc …) that the client connection will actually in most cases default to latin-1.
However as of mySQL 5.x or higher you can issue a statement in the my.cnf file calling init_connect.
This will trigger a series of defined commands / queries every time a non super user connects (So if you are using root to connect to your mySQL database, stop reading now and slap yourself HARD).
i.e.
1 2 3 4 5 6 7 | [mysqld] init_connect='SET collation_connection = utf8_general_ci' init_connect='SET NAMES utf8' default-character-set=utf8 character-set-server=utf8 collation-server=utf8_general_ci skip-character-set-client-handshake |
UPDATE 04/09/09
my mySQL version 5.0.45 x64 only picks up the last entry of init_connect
Use this example in this case:
1 2 3 4 5 | [mysqld] init_connect='SET collation_connection = utf8_general_ci; SET NAMES utf8;' default-character-set=utf8 character-set-server=utf8 collation-server=utf8_general_ci |
Restart mySQL and check the mysqld.log has not returned any errors (Or your event viewer if you are using windows).
Every client connection will now default to utf-8 encoding and not latin-1, removing the need to add a SET NAMES call on every connection.
This will work for PHP, C++, ruby etc… as the client encoding is now handeled server side, rather that waiting on the client to issue a SET NAMES command.
UPDATE 30/03/09: Added “skip-character-set-client-handshake” this ignores the clients request to set the connection charset, this info courtesy of “wardo” http://word.wardosworld.com/?p=164
UPDATE 10/09/09
Been having some issues with this working the workaround is to add this config as a single line:
1 | init_connect='SET collation_connection = utf8_general_ci; SET NAMES utf8;' |
Entries (RSS)
Thank you!
I have been looking everywhere for hints on how to get UTF-8 working on my new server.
This installation is a fully updated FreeBSD 7.0 with Apache 2.2.9 and MySQL 5.0.67.
When importing my old utf8_general_ci database everything looked fine… but reading it from my website showed characters wrong. Writing new entries into the database (from web) made something unreadable to the human eye for special chars like æøåöëü etc.
I have tried SO many options for both Apache, MySQL and my own (perl) code.
Page bookmarked! (and I will copy some info to my soon to come wiki http://www.confighell.com)
Thanks again
– Nicolai
[...] existing literature on the topic including possible fixes [here, here, here, here, here, here, horror story here, here] encourage the configuration of the MySQL daemon (mysqld) to [...]
Thank you! Works great
[...] Force mySQL utf8 connections [...]
[...] [via Saiweb] [...]
Thanks Buzz,
you saves my life! i was stuck since 9.00 AM…
Quick question. Looking through `show variables;` I see the following, seemingly character/collation related variables:
character_set_client
character_set_connection
character_set_database
character_set_results
character_set_server
character_set_system
collation_connection
collation_database
collation_server
The reason I say related, is because by default they all have a latin based character/collation value on a fresh mysql install. Your command switches all but the following to utf8/utf8_general_ci:
character_set_client
character_set_connection
character_set_results
collation_connection
Those remain latin1/latin1_swedish_ci
—-
My question is, do these not need to be set manually as all databases will now be created accordingly in utf8/general? Is there a reason them being latin based doesn’t cause issues? I would assume making them the appropriate utf8 type would make it more consistent, but am curious what they would affect. I assumed (based on their names alone) that the connection ones (character_set_connection, collation_connection) might be of interest.
Any thoughts?
“When a client connects to the server, it sends the name of the character set that it wants to use. The server uses the name to set the character_set_client, character_set_results, and character_set_connection system variables. In effect, the server performs a SET NAMES operation using the character set name.”
–
found that in mysql’s docs. is that saying that since we have the init_connect setting, when we connect to the db, from say a php script, it’ll automatically set those three settings accordingly on it’s own?
i’ve just slapped myself.
good doc btw. ty.