
Special characters in mySQL (and php) - THE BASICS

I am confused! My web host recently updated PHP, and now my old tables render special characters differently (wrongly). Both my tables and my input/output PHP pages are set to utf-8. Since the update, input from PHP is also treated differently: special characters are now utf-8-encoded as they enter the database. So when I review tables in phpMyAdmin, the old inserts have the original (non-encoded) special characters, while the new posts have utf-8-encoded characters.

So what I would like to do is rewrite input and output to insert and show non-encoded characters, but I am not sure whether this is possible without dropping utf-8 entirely (in PHP and MySQL). Is there a utf-8 way to submit non-encoded characters?

AND - perhaps more fundamentally - I need to understand what the possible downsides are. I am using Danish characters in and out and I'm not going to use any other language (for this project). So if it IS possible to insert and output non-encoded characters using utf-8 - am I then going to have unexpected/destructive issues?

I have read a lot of posts regarding PHP/MySQL/special characters, but I haven't seen this angle on the issue yet. I hope I am not duplicating an existing question. Everything had been working very nicely until the update.

Even if you are using only Danish characters, you may as well go utf8 all the way.

There are many places where the encoding needs to be stated:

  • The <meta charset="UTF-8"> tag at the top of the html
  • The columns in the database (column CHARACTER SET defaults from table, which defaults from database)
  • The encoding in your PHP code.

When you CREATE TABLE , tack on DEFAULT CHARACTER SET utf8 . If you have existing tables without that, speak up; we may need to deal with them. If you want Danish collation, then specify COLLATE utf8_danish_ci , too. Then (if I recall correctly), aa will sort after z . (The default is utf8_general_ci , which won't do that sorting.)

Figure out what encoding you have (or can get) in your PHP code. If you have some text with accents in it, do this:

$hex = unpack('H*', $text);   // one-element array holding the bytes of $text as hex
echo implode('', $hex);

If you have utf8, å will be C3A5 ; for latin1 it will be E5 . (PHP prints the hex digits in lowercase.)
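As a rough sketch of that check, the snippet below wraps the hex dump in a small helper and adds a UTF-8 validity test. The function name describe_bytes is made up for illustration, and the "latin1" label is only a guess: every byte string that is not valid UTF-8 gets it. Pure ASCII text is valid in both encodings, so test with a string that contains å, æ or ø.

```php
<?php
// Hypothetical helper (not part of PHP or MySQL): show the raw bytes of a
// string as hex and report whether they form valid UTF-8.
function describe_bytes(string $text): string
{
    $hex = bin2hex($text); // same digits as unpack('H*', $text), lowercase

    // preg_match() with the /u modifier returns 1 only when the subject
    // is valid UTF-8; on invalid bytes it returns false.
    $isUtf8 = preg_match('//u', $text) === 1;

    return $hex . ($isUtf8 ? ' (valid UTF-8)' : ' (not UTF-8, possibly latin1)');
}

echo describe_bytes("\xC3\xA5"), "\n"; // å encoded as utf8
echo describe_bytes("\xE5"), "\n";     // å encoded as latin1
```

Run it on a value taken straight from a form submission and on a value read back from the database; if the two differ, the conversion is happening on the connection.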

Regardless of what encoding is in the tables, you must call set_charset('utf8') or set_charset('latin1') , depending on which encoding the data in PHP uses. MySQL will gladly transcode between latin1 and utf8 as things are passed between PHP and MySQL. For different APIs:

⚈  mysql (deprecated; removed in PHP 7): mysql_set_charset('utf8');
⚈  mysqli: $mysqli_obj->set_charset('utf8');
⚈  PDO: $db = new PDO('mysql:host=host;dbname=db;charset=utf8', $user, $pwd);
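A minimal sketch of the PDO variant, assuming a MySQL server and the pdo_mysql driver. The helper names mysql_dsn and connect_utf8, the host 'localhost' and the database name 'mydb' are placeholders invented for this example. The connect function is only defined, not called, because it needs a live server.

```php
<?php
// Hypothetical helper: build a MySQL DSN that states the connection charset.
function mysql_dsn(string $host, string $db, string $charset = 'utf8'): string
{
    return "mysql:host={$host};dbname={$db};charset={$charset}";
}

// Hypothetical helper: open the connection with that DSN.
// Exceptions are enabled so a failed connection or query is not silent.
function connect_utf8(string $host, string $db, string $user, string $pwd): PDO
{
    return new PDO(mysql_dsn($host, $db), $user, $pwd, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

// Placeholder values; replace them with your own.
echo mysql_dsn('localhost', 'mydb'), "\n";
```

If the bytes in PHP turn out to be latin1, pass 'latin1' as the third argument to mysql_dsn instead, and MySQL will transcode to whatever the column uses.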

For much more info, see http://mysql.rjweb.org/doc.php/charcoll .


 