
PHP remove all non UTF-8 characters from string

I need to remove symbols like ",./! and so on from the beginning and the end of a string, but keep numbers and letters such as ąčęėįšųž and many other UTF-8 characters. For example:

  1. the result of string &g&g should be g&g;
  2. the result of string ąčęėį should be ąčęėį;
  3. the result of string "name" should be name;
  4. the result of string 69 should be 69;
  5. the result of string --abc--- should be abc.

I believe it can be done with preg_replace, but I can't work out how.

If I understand correctly, this should do what you want:

$result = preg_replace('/(?:^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$)/u', '', $input);

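where:

\p{L} matches any Unicode letter
\p{N} matches any Unicode numeric character (not only decimal digits; use \p{Nd} if you want decimal digits only)
[^\p{L}\p{N}] is a negated character class that matches any character that is neither a letter nor a number
^ and $ anchor the two alternatives to the start and the end of the string, so characters in the middle are left alone
the u modifier makes PCRE treat both the pattern and the subject as UTF-8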
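
To check the pattern against the five examples from the question, here is a minimal sketch. The function name trim_non_alnum is hypothetical (it is not part of PHP), and returning the input unchanged when preg_replace fails is an assumption about the behaviour you would want.

```php
<?php
// Hypothetical helper: strips leading and trailing characters that are
// neither Unicode letters nor Unicode numbers. Uses the pattern from the answer.
function trim_non_alnum(string $input): string
{
    $result = preg_replace('/(?:^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$)/u', '', $input);

    // With the u modifier, preg_replace returns null when the subject is not
    // valid UTF-8. Assumption: in that case we hand back the input untouched.
    return $result ?? $input;
}

// The five inputs listed in the question.
foreach (['&g&g', 'ąčęėį', '"name"', '69', '--abc---'] as $s) {
    echo $s, ' => ', trim_non_alnum($s), PHP_EOL;
}
```

Because both alternatives are anchored, the & in the middle of g&g survives and only the leading one is removed.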


 