
How to convert special Unicode characters to their closest ASCII equivalents in PHP

I have a problem when a user enters a string in special Unicode characters like 𝘁𝘂𝘆𝗲𝗻𝗱𝘂𝗻𝗴: my system cannot match it against the string "tuyendung" written in ASCII. How can I normalize the input string to ASCII before storing it in the database?

Sample Input:

𝘁𝘂𝘆𝗲𝗻𝗱𝘂𝗻𝗴

(Char code, as UTF-16 code units: 0xd835, 0xde01, 0xd835, 0xde02, 0xd835, 0xde06, 0xd835, 0xddf2, 0xd835, 0xddfb, 0xd835, 0xddf1, 0xd835, 0xde02, 0xd835, 0xddfb, 0xd835, 0xddf4)

Expected output:

tuyendung

(Char code: 0x74, 0x75, 0x79, 0x65, 0x6e, 0x64, 0x75, 0x6e, 0x67)

It looks like the //TRANSLIT option can do the trick here.

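<?php

// The input uses mathematical sans-serif bold letters, not ASCII.
$input = '𝘁𝘂𝘆𝗲𝗻𝗱𝘂𝗻𝗴';
// //TRANSLIT asks iconv to approximate characters that have no ASCII form.
// The result depends on the iconv implementation PHP was built against.
echo iconv('UTF-8', 'US-ASCII//TRANSLIT', $input);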

This turns (what I think are?) math symbols like 𝘁 into t.
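
If iconv on your platform does not transliterate these characters, Unicode compatibility normalization (NFKC) may be worth trying instead. The following is only a sketch: it assumes the intl extension is loaded (it provides the Normalizer class), and the function name foldCompatibilityChars is made up for illustration.

```php
<?php

// Hypothetical helper, not part of PHP: folds "styled" Unicode letters
// (such as mathematical sans-serif bold) to their plain equivalents.
// Assumes the intl extension is available.
function foldCompatibilityChars(string $input): string
{
    // NFKC applies compatibility decomposition, then canonical composition.
    $normalized = Normalizer::normalize($input, Normalizer::FORM_KC);

    // normalize() returns false on failure (for example invalid UTF-8);
    // in that case the input is returned unchanged.
    return $normalized === false ? $input : $normalized;
}

echo foldCompatibilityChars('𝘁𝘂𝘆𝗲𝗻𝗱𝘂𝗻𝗴'), PHP_EOL;
```

Note that NFKC does not guarantee pure ASCII output: accented letters such as é stay as they are, so you may still want to pass the result through iconv afterwards.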

I don't know what "tuyendung" is.

But in PHP, you can convert between character sets with the iconv function, or you can keep the original form in a BLOB field in the database and apply whatever transformation you want when displaying it.

Maybe this gives you an idea.
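
A rough sketch of that idea: keep the original input untouched and store an ASCII version next to it for comparisons. The function name prepareForStorage and the array keys are assumptions for illustration, not an existing API, and how iconv transliterates a given character depends on the platform.

```php
<?php

// Hypothetical helper: returns both forms so the caller can store them
// in separate columns (for example a BLOB for the original and a
// VARCHAR for the ASCII lookup key).
function prepareForStorage(string $input): array
{
    // iconv() returns false if the input cannot be converted.
    $ascii = iconv('UTF-8', 'US-ASCII//TRANSLIT', $input);

    return [
        'original' => $input,                          // exactly what the user typed
        'ascii'    => $ascii === false ? null : $ascii, // null when conversion failed
    ];
}

$row = prepareForStorage('tuyendung');
echo $row['ascii'], PHP_EOL;
```

Comparing on the ascii value while displaying the original keeps the user's text intact and still lets the system match look-alike strings.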

