
PHP regex: how to match numbers + letters + Chinese

I am trying to clean a string that contains letters, numbers, Chinese characters and some punctuation, keeping only the numbers, letters and Chinese characters, as shown below.

raw string

a>b%%c##1@23测$$试??\\:.##,,??!!

result

abc123测试

For Chinese, preg_replace('/\P{Han}+/u', '', $text) works perfectly:

测试 // result

And for numbers and letters, preg_replace('/[^0-9a-zA-Z]/', '', $text) works too:

abc123 // result

But how can I combine them?

Why doesn't preg_replace('/[\P{Han}]|[0-9a-zA-Z]/u', '', $raw); work as expected?

Thanks a lot to anyone who can help!

You need

preg_replace('~[^0-9a-zA-Z\p{Han}]+~u', '', $raw)

See the regex demo.

The [^0-9a-zA-Z\p{Han}]+ is a negated character class that matches one or more consecutive characters that are not ASCII digits, ASCII letters or Han (Chinese) characters.
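As for why the attempt from the question fails, here is a minimal sketch that runs both patterns on the sample string. The variable names $attempt and $fixed are only illustrative. The alternation removes anything that is non-Han or an ASCII letter/digit, and since ASCII letters and digits are themselves non-Han, the first branch already strips them:

```php
<?php
// Sample input from the question. Single quotes keep $$ literal,
// and '\\' collapses to one backslash.
$raw = 'a>b%%c##1@23测$$试??\\:.##,,??!!';

// Pattern from the question: remove "non-Han" OR "ASCII alphanumeric".
// Every ASCII letter and digit is also non-Han, so everything except Han goes.
$attempt = preg_replace('/[\P{Han}]|[0-9a-zA-Z]/u', '', $raw);

// Negated class: remove only characters that belong to none of the kept sets.
$fixed = preg_replace('~[^0-9a-zA-Z\p{Han}]+~u', '', $raw);

echo $attempt, PHP_EOL; // 测试
echo $fixed, PHP_EOL;   // abc123测试
```

In other words, the sets to keep have to be combined inside one negated class, not joined with | as separate "remove" rules.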

It is important to use the u flag with this pattern, because your input is a Unicode (UTF-8) string and \p{Han} must be tested against whole code points rather than single bytes.
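A rough way to see the effect of the flag is to run the same pattern with and without it. This is only a sketch, and $pattern, $withU and $withoutU are names made up for the comparison. Without u, the subject is processed byte by byte, so the individual bytes of a multi-byte Chinese character are not recognised as Han:

```php
<?php
$raw = 'a>b%%c##1@23测$$试??\\:.##,,??!!';

// Same negated class, delimiters and flags added below.
$pattern = '[^0-9a-zA-Z\p{Han}]+';

// With u: pattern and subject are treated as UTF-8, one code point at a time.
$withU = preg_replace('~' . $pattern . '~u', '', $raw);

// Without u: the subject is treated as raw bytes.
$withoutU = preg_replace('~' . $pattern . '~', '', $raw);

echo $withU, PHP_EOL;           // abc123测试
var_dump($withoutU === $withU); // bool(false): the Chinese characters are not preserved
```

Note that with the u flag, preg_replace returns null if the subject is not valid UTF-8, so it may be worth checking the return value when the input comes from an untrusted source.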

See the PHP demo:

$raw = 'a>b%%c##1@23测$$试??\\:.##,,??!!';
echo preg_replace('~[^0-9a-zA-Z\p{Han}]+~u', '', $raw);
// => abc123测试
