PHP: extract words that contain special characters from a string
I have a string:
$str = " Côte-d'azure ! (3000) limousin - limousine ";
I need to extract some words and put them in an array, to finally get:
array (
0 => "Côte-d'azure",
1 => "limousin",
2 => "limousine"
);
So I tried:
preg_match_all("/[a-zA-Z]+/", $str, $all);
but this ignores the special characters ô, ' and -.
Any advice, please?
Use Unicode mode (the u modifier) and Unicode character properties:
preg_match_all('/\p{L}[\p{L}\\\\\'-]+/u', mysql_real_escape_string($str), $all);
This requires one (Unicode) letter, followed by at least one more character from the class, and then matches as many Unicode letters, backslashes, hyphens and apostrophes as possible. If you want other punctuation characters not to separate a word, include them in the character class.
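As a rough illustration, here is the same idea run against the string from the question. This is only a sketch: it assumes the source file is saved as UTF-8, and it drops both the mysql_real_escape_string() call and the backslash from the character class, on the assumption that the input has not been escaped.

```php
<?php
// Sample string from the question (the file must be saved as UTF-8).
$str = " Côte-d'azure ! (3000) limousin - limousine ";

// \p{L} matches any Unicode letter; the u modifier makes PCRE treat
// both the pattern and the subject as UTF-8.
// Assumption: the input is not escaped, so no backslash is needed in the class.
preg_match_all('/\p{L}[\p{L}\'-]+/u', $str, $all);

// $all[0] holds the full matches:
// "Côte-d'azure", "limousin", "limousine"
print_r($all[0]);
```

The standalone " - " between the last two words is skipped because every match has to start with a letter.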
Note the 5 backslashes. Three of them are removed when PHP compiles the string literal: two escape the backslash that follows them, and the last one escapes the '. So the regex engine receives only 2 backslashes, which it interprets as one literal backslash. Unfortunately, in a PHP string literal it generally takes 4 backslashes to represent one literal backslash in a regex.
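To see the two layers of escaping at work, one can print the pattern after PHP has parsed the string literal. The snippet below is a small sketch; the $subject value is a made-up example of an escaped input, not something taken from the question.

```php
<?php
// The pattern from the answer, written as a single-quoted PHP literal.
$pattern = '/\p{L}[\p{L}\\\\\'-]+/u';

// What the regex engine actually receives: two backslashes before the quote.
echo $pattern, "\n";   // prints /\p{L}[\p{L}\\'-]+/u

// Hypothetical escaped input: d, one backslash, apostrophe, "azure".
$subject = "d\\'azure";

// The single literal backslash is matched as part of the word.
preg_match($pattern, $subject, $m);
echo $m[0], "\n";      // prints d\'azure
```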
Try:
if (preg_match('/[^a-zA-Z0-9]+/', $your_string, $matches))
{
echo ' symbol encountered !!';
}