Let's say we have a random string like this:
$str_test = "faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f";
and we run preg_replace on it:
preg_replace("/[^\da-z ]/i", "_", $str_test);
And the result I get is:
faaf__ __ ______ ____ ______ ______ ____ fa fssfa af__ af__sa f
So if we compare both, output and input:
faaf__ __ ______ ____ ______ ______ ____ fa fssfa af__ af__sa f
faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f
we can see that every special character is replaced with two "_" characters instead of one. The result should be:
faaf_ _ ___ __ ___ ___ __ fa fssfa af_ af_sa f
faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f
I have already tried changing encodings, with no success. I also thought of writing a function that does multiple preg_match calls and then replaces "_ " with " ", but that would be slow on large texts.
Any ideas?
$str=preg_replace("/[^0-9a-zA-Z ]/u", "_", $str_test);
Note the 'u' modifier! Explanation: http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php#107498
If the _subject_ contains utf-8 sequences the 'u' modifier should be set, otherwise a pattern such as /./ could match a utf-8 *sequence as two to four individual ASCII characters*.
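To make the difference concrete, here is a minimal sketch that runs the same pattern with and without the 'u' modifier on the string from the question. It assumes the script file itself is saved as UTF-8; the variable names $without_u and $with_u are only illustrative.

```php
<?php
// Assumption: this source file is saved as UTF-8, so each accented
// letter below is stored as a two-byte sequence.
$str_test = "faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f";

// Without 'u' the pattern works byte by byte, so each two-byte
// character produces two underscores.
$without_u = preg_replace('/[^0-9a-zA-Z ]/', '_', $str_test);

// With 'u' the subject is treated as UTF-8, so each character
// produces exactly one underscore.
$with_u = preg_replace('/[^0-9a-zA-Z ]/u', '_', $str_test);

echo $without_u, "\n";
echo $with_u, "\n";
```

Note that with the 'u' modifier preg_replace returns NULL if the subject is not valid UTF-8, so it may be worth checking the return value when the input encoding is not guaranteed.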
Why not use the built-in PHP multibyte functions?
mb_ereg_replace
is the one to use here (see the PHP manual).
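A minimal sketch of that approach, assuming the mbstring extension is loaded and the input is UTF-8. Unlike preg_replace, mb_ereg_replace takes the pattern without delimiters; the explicit mb_regex_encoding call is there only to make the assumed encoding visible, and $result is an illustrative name.

```php
<?php
// Assumption: mbstring is available and the string is UTF-8 encoded.
mb_regex_encoding('UTF-8');

$str_test = "faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f";

// The pattern has no delimiters; every character that is not an
// ASCII letter, digit, or space becomes a single underscore.
$result = mb_ereg_replace('[^0-9a-zA-Z ]', '_', $str_test);

echo $result, "\n";
```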