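Regex: Matching words with special characters
I'm trying to find a regular expression that matches a word (the exact word) in a string. The problem arises when the word contains special characters such as "#" or anything else. The special characters can be any UTF-8 characters, e.g. "áéíóúñ#@", and punctuation must be ignored.
Here are some examples of what I'm looking for: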
Searching:#myword
Sentence: "I like the elephants when they say #myword" <- MATCH
Sentence: "I like the elephants when they say #mywords" <- NO MATCH
Sentence: "I like the elephants when they say myword" <- NO MATCH
Sentence: "I don't like #mywords. its silly" <- NO MATCH
Sentence: "I like #myword!! It's awesome" <- MATCH
Sentence: "I like #myword It's awesome" <- MATCH
PHP sample code:
$regexp= "#myword";
if (preg_match("/(\w$regexp)/", "I like #myword!! It's awesome")) {
echo "YES YES YES";
} else {
echo "NO NO NO ";
}
Thanks!
Update: If I search for "myword", the match must start at the "m" itself, with no other character (such as "#") directly in front of it.
Sentence: "I like myword!! It's awesome" <- MATCH
Sentence: "I like #myword It's awesome" <- NO MATCH
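The following solution comes from treating the characters and the boundaries separately. There may also be a workable approach that uses word boundaries directly.
Code: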
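function search($strings,$search) {
    // Escape regex metacharacters (and the "/" delimiter) in the search term.
    $quoted = preg_quote($search, '/');
    // Left side: whitespace or start of string. Right side: non-word character or end of string.
    // i = case-insensitive, u = UTF-8 mode for characters such as "áéíóúñ".
    $regexp = "/(?:[[:space:]]|^)".$quoted."(?:[^\w]|$)/iu";
    foreach ($strings as $string) {
        echo "Sentence: \"$string\" <- " .
            (preg_match($regexp,$string) ? "MATCH" : "NO MATCH") ."\n";
    }
}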
$strings = array(
"I like the elephants when they say #myword",
"I like the elephants when they say #mywords",
"I like the elephants when they say myword",
"I don't like #mywords. its silly",
"I like #myword!! It's awesome",
"I like #mywOrd It's awesome",
);
echo "Example 1:\n";
search($strings,"#myword");
$strings = array(
"I like myword!! It's awesome",
"I like #myword It's awesome",
);
echo "Example 2:\n";
search($strings,"myword");
Output:
Example 1:
Sentence: "I like the elephants when they say #myword" <- MATCH
Sentence: "I like the elephants when they say #mywords" <- NO MATCH
Sentence: "I like the elephants when they say myword" <- NO MATCH
Sentence: "I don't like #mywords. its silly" <- NO MATCH
Sentence: "I like #myword!! It's awesome" <- MATCH
Sentence: "I like #mywOrd It's awesome" <- MATCH
Example 2:
Sentence: "I like myword!! It's awesome" <- MATCH
Sentence: "I like #myword It's awesome" <- NO MATCH
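As a possible variation on the solution above, the same two boundary checks can be written as lookarounds, so the match itself covers only the search term. This is only a sketch: the helper name `containsExactWord` is made up for illustration, and it assumes the script and the input strings are UTF-8.

```php
<?php
// Hypothetical helper (not part of the original answer): returns true when
// $search occurs in $string as a standalone token.
function containsExactWord(string $string, string $search): bool
{
    // Escape regex metacharacters and the "/" delimiter in the search term.
    $quoted = preg_quote($search, '/');
    // (?<!\S) : not preceded by a non-whitespace character
    //           (so either start of string or whitespace comes before).
    // (?!\w)  : not followed by a word character
    //           (punctuation, whitespace or end of string comes after).
    // i = case-insensitive, u = treat pattern and subject as UTF-8.
    $pattern = '/(?<!\S)' . $quoted . '(?!\w)/iu';
    return preg_match($pattern, $string) === 1;
}

var_dump(containsExactWord("I like #myword!! It's awesome", "#myword")); // bool(true)
var_dump(containsExactWord("I like #myword It's awesome", "myword"));    // bool(false)
```

Because lookarounds consume no characters, this form could also be reused with `preg_match_all` or `preg_replace` without picking up the surrounding whitespace or punctuation.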
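You should search for myword with word boundaries, like this: /\bmyword\b/.
# is itself a non-word character, so /\b#myword\b/ does not work: the leading \b only matches when a word character comes directly before the #.
One idea was to escape the Unicode character with \X, but this creates other problems.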
/ #myword\b/
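To make the boundary behaviour concrete, here is a small sketch that runs the patterns from this answer against one of the sentences from the question. The variable names are made up for illustration.

```php
<?php
$sentence = "I like #myword!! It's awesome";

// \b in front of "#" needs a word character directly before the "#".
// Here "#" follows a space, so the leading \b cannot match.
$boundaryBeforeHash = preg_match('/\b#myword\b/', $sentence);        // 0

// Requiring a literal space instead of the leading \b matches here...
$spaceBeforeHash = preg_match('/ #myword\b/', $sentence);            // 1

// ...but not when "#myword" is the very first thing in the string.
$hashAtStart = preg_match('/ #myword\b/', "#myword is first");       // 0

// The position between "#" and "m" is a word boundary, so a search for
// plain "myword" also succeeds inside "#myword".
$plainWord = preg_match('/\bmyword\b/', $sentence);                  // 1

var_dump($boundaryBeforeHash, $spaceBeforeHash, $hashAtStart, $plainWord);
```

The last result is worth noting: with plain \b on both sides, searching for "myword" also matches "#myword", which the update in the question asks to avoid.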
This should do the trick (replace "myword" with whatever you are looking for):
^.*#myword[^\w].*$
If the match succeeds, you have found your word; otherwise it was not found.
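A quick check of this pattern against the sentences from the question, as a sketch. Note that `[^\w]` has to consume a character, so a word at the very end of the string is not matched. The "relaxed" pattern below is an assumed variation, not part of the answer, and the variable names are made up.

```php
<?php
// Pattern from the answer, wrapped in "/" delimiters.
$pattern = '/^.*#myword[^\w].*$/';

// Punctuation follows the word, so [^\w] has something to consume.
$followedByPunctuation = preg_match($pattern, "I like #myword!! It's awesome");           // 1

// Nothing follows the word, so [^\w] fails and so does the whole match.
$atEndOfString = preg_match($pattern, "I like the elephants when they say #myword");      // 0

// Assumed variation: accept either a non-word character or the end of the string.
$relaxed = '/^.*#myword(?:[^\w]|$).*$/';
$atEndRelaxed = preg_match($relaxed, "I like the elephants when they say #myword");       // 1
$longerWord = preg_match($relaxed, "I don't like #mywords. its silly");                   // 0

var_dump($followedByPunctuation, $atEndOfString, $atEndRelaxed, $longerWord);
```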