PHP Regex for different languages

I want to use a regex like the following:

[a-z' ]*[a-z]

This won't work with other languages such as Chinese. Is it possible to create an inverse version of this regex that does the following:

Capture a word, or several words connected by spaces:

"Hey, july 2010"
=> hey
=> july

"hey what's up"
=> hey what's up

"汉漢字, 汉漢字 3004303"
=> 汉漢字
=> 汉漢字

First define your set of word characters: [\pL'-] (\pL matches any Unicode letter; the single quote and hyphen are added literally).

Within word boundaries, \b[\pL'-]+\b matches one word. Follow it with any number of further words, each preceded by one or more horizontal whitespace characters (\h+), and you get the final pattern for use with preg_match_all:

/\b[\pL'-]+(?:\h+[\pL'-]+)*\b/u

The pattern is already wrapped in delimiters, and the u modifier is set for Unicode support.
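
To show how the pattern behaves on the inputs from the question, here is a minimal sketch. The helper name extractPhrases() is hypothetical and not part of the original answer; the pattern itself is the one given above, written as a single-quoted PHP string so the apostrophes have to be escaped. It assumes the source file and the input strings are UTF-8.

```php
<?php
// Sketch only: extractPhrases() is a hypothetical helper name, not from the original answer.
function extractPhrases(string $text): array
{
    // Single-quoted PHP string, so the apostrophes inside the character class are escaped as \'.
    // The u modifier makes PCRE treat the pattern and the subject as UTF-8.
    $pattern = '/\b[\pL\'-]+(?:\h+[\pL\'-]+)*\b/u';

    // preg_match_all() returns false on error (for example, invalid UTF-8 in $text).
    if (preg_match_all($pattern, $text, $matches) === false) {
        return [];
    }

    // $matches[0] holds the full-pattern matches in order of appearance.
    return $matches[0];
}

foreach (['Hey, july 2010', "hey what's up", '汉漢字, 汉漢字 3004303'] as $sample) {
    echo $sample, ' => ', implode(' | ', extractPhrases($sample)), PHP_EOL;
}
```

Matches keep their original case, so the first sample yields "Hey" rather than "hey". If lowercase output is needed, passing each match through mb_strtolower() would be one option.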

Demo at regex101.com
