PHP regex to match all special characters, including accented characters

Question

I am looking for a way to match all possible special characters in a string. I have a list of cities of the world, and many of the names contain special characters and accented characters. So I am looking for a regular expression that will return TRUE for any kind of special character. All the ones I found only match some, but I need one that covers every possible special character out there, including spaces at the beginning of the string. Is this possible?

This is the one I found, but it does not match all the different characters I may encounter in the name of a city:

preg_match('/[#$%^&*()+=\-\[\]\';,.\/{}|":<>?~\\\\]/', $string);

Answer 1

You're going to need the UTF-8 mode "#pattern#u": http://nl3.php.net/manual/en/reference.pcre.pattern.modifiers.php

Then you can use the Unicode escape sequences: http://nl3.php.net/manual/en/regexp.reference.unicode.php

So that preg_match("#\\p{L}*#u", "København", $match) will match.
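One caveat: because `*` also accepts the empty string, that pattern succeeds on any valid UTF-8 input, even one without letters. The following is only a minimal sketch, assuming the goal is to detect whether a string contains a letter outside plain ASCII; the helper name `hasNonAsciiLetter` is hypothetical and not part of the answer.

```php
<?php
// Hypothetical helper: true if $s contains a letter outside ASCII (for example ø).
function hasNonAsciiLetter(string $s): bool
{
    // (?![a-zA-Z]) rejects ASCII letters; \p{L} then accepts any other Unicode letter.
    // The u modifier makes PCRE treat both pattern and subject as UTF-8.
    return preg_match('/(?![a-zA-Z])\p{L}/u', $s) === 1;
}

var_dump(hasNonAsciiLetter("København"));  // bool(true)
var_dump(hasNonAsciiLetter("Copenhagen")); // bool(false)
```

Comparing the return value with `=== 1` also separates "no match" (0) from a failure such as invalid UTF-8 in the subject (false).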

Answer 2

Use Unicode properties:

\pL stands for any letter

To match a city name, I'd do (I suppose - and space are valid characters):

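// The hyphen goes last in the class so it is a literal; the anchors and + make the whole name match.
preg_match('/^\s*[\pL\s-]+$/u', $string);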
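As a rough sketch of how this idea could be wrapped up to validate whole city names, assuming letters, whitespace and hyphens are the only allowed characters (names containing `.` or `'` would need those added to the class). The function name `isCityName` is hypothetical.

```php
<?php
// Hypothetical helper: true if $name consists only of letters, whitespace and hyphens.
function isCityName(string $name): bool
{
    // \pL = any Unicode letter; the hyphen is placed last so it is a literal, not a range.
    // ^ and $ force the whole string to match instead of a single character.
    return preg_match('/^\s*[\pL\s-]+$/u', $name) === 1;
}

var_dump(isCityName("København"));       // bool(true)
var_dump(isCityName(" København"));      // bool(true), leading space allowed
var_dump(isCityName("Aix-en-Provence")); // bool(true)
var_dump(isCityName("Area 51"));         // bool(false), digits are not in the class
```

Keeping the hyphen at the end of the class matters on recent PHP versions: PCRE2 rejects a hyphen placed between two class escapes such as `\pL` and `\s` as an invalid range.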

Answer 3

You can just reverse your pattern: to match everything that is not in "a-z", "0-9", "-", "_" or "." you would use

preg_match('/[^-_a-z0-9.]/iu', $string);

The ^ at the start of the character class negates it.
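To see what this negated class actually flags, here is a small sketch that collects every matching character. The helper name `specialChars` is hypothetical; note that spaces and accented letters both count as "special" under this definition.

```php
<?php
// Hypothetical helper: returns each character of $s that is not an
// ASCII letter, a digit, '-', '_' or '.'.
function specialChars(string $s): array
{
    // The leading ^ negates the class; i lets a-z cover A-Z as well;
    // u reads the subject as UTF-8, so ø comes back as one character, not two bytes.
    preg_match_all('/[^-_a-z0-9.]/iu', $s, $m);
    return $m[0];
}

print_r(specialChars(" København!")); // the space, "ø" and "!"
```

A plain `preg_match` with the same pattern is enough when only a TRUE/FALSE answer is needed.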

Answer 4

I had the same problem when I wanted to split name parts that also contained special characters.

For example, if you want to split a bunch of names of the form:

<lastname>,<forename(s)> <initial(s)> <suffix(es)>

forenames and suffixes are separated by whitespace
initials are each followed by a . with a maximum of 6 initials

you could use

// -1 means "no limit"; PREG_SPLIT_DELIM_CAPTURE returns the captured groups as well
$nameparts = preg_split("/(\w*),((?:\w+[\s\-]*)*)((?:\w\.){1,6})(?:\s*)(.*)/u", $displayname, -1, PREG_SPLIT_DELIM_CAPTURE);
// first and last part are always empty
array_splice($nameparts, 5, 1);
array_splice($nameparts, 0, 1);
print_r($nameparts);

Input:
Powers,Björn BA van der
Output:
Array ( [0] => Powers [1] => Björn [2] => BA [3] => van der)

Tip: the regular expression looks like it came from outer space, but regex101.com comes to the rescue!
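If the splicing feels fragile, the same pattern can also be used with preg_match and named groups. This is only a sketch: the anchors, the group names (`last`, `fore`, `initials`, `suffix`) and the sample input are assumptions added for illustration, and the input assumes the initials carry their dots, as `(?:\w\.){1,6}` requires.

```php
<?php
// The pattern from the preg_split call, anchored and with named groups (names are assumptions).
$pattern = '/^(?<last>\w*),(?<fore>(?:\w+[\s\-]*)*)(?<initials>(?:\w\.){1,6})\s*(?<suffix>.*)$/u';

// Assumed sample input; \u{F6} is the letter ö.
$displayname = "Powers,Bj\u{F6}rn B.A. van der";

$nameparts = [];
if (preg_match($pattern, $displayname, $m) === 1) {
    // trim() drops the whitespace the forename group captures before the initials.
    $nameparts = [
        'last'     => $m['last'],
        'fore'     => trim($m['fore']),
        'initials' => $m['initials'],
        'suffix'   => $m['suffix'],
    ];
}
print_r($nameparts);
```

Named keys avoid depending on the positions of the empty first and last elements that preg_split produces.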

4 answers

Answer 1
1 2013-09-17 13:37:25

Answer 2
0 2013-09-17 13:40:19

Answer 3
0 2013-09-17 13:41:14

Answer 4
0 2017-03-27 21:32:56
