Find substring with special characters
I have the pattern 'šalotka 29%'
and I need to know whether the string 'something something šalotka 29% something'
contains it, but not when the pattern is part of a longer word, as in 'something something šalotka 29%something'.
I have this:
mb_eregi('\b' . $pattern . '\b', $string)
but it isn't working, because regex word boundaries don't work with special characters. Any suggestions?
A word boundary matches only between a word character (a character from the \w character class) and a non-word character or a limit of the string.
If your searched string starts or ends with a non-word character, you can't use a word boundary.
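To make that concrete, here is a minimal sketch using the ASCII fragment '29%'. It uses preg_match rather than mb_eregi (my assumption, purely so it runs without the mbstring extension); the reasoning about \b is the same.

```php
<?php
// '%' is not a word character, so the "\b" placed after it can only
// match when the NEXT character is a word character.
$pattern = '~\b29%\b~';

// '%' followed by a space: non-word next to non-word, so no boundary.
$standalone = preg_match($pattern, 'something 29% something');

// '%' followed by 's': non-word next to word, so a boundary exists.
$glued = preg_match($pattern, 'something 29%something');

var_dump($standalone, $glued); // int(0) and int(1)
```

So the word boundary does the opposite of what the question wants: it rejects the standalone occurrence and accepts the one glued to a longer word.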
The difficulty is to define precisely what separates the desired string from the rest. In other words, it is your choice. Whatever you choose, you can use the same technique: lookarounds before and after your string that define what you don't want around it, that is, a negative lookbehind (?<!...) and a negative lookahead (?!...).
Example:
mb_eregi('(?<!\S)' . $item . '(?!\S)', $string, $match);
mb_eregi('(?<!\w)' . $item . '(?!\w)', $string, $match);
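The two lines above make different choices about what may surround the item. As a rough sketch of the difference (written with preg_match and the ~ui modifiers instead of mb_eregi, which is my assumption for portability; the sample strings are made up):

```php
<?php
$item = 'šalotka 29%'; // no regex metacharacters here, so no escaping needed

// Strict: the neighbours must be whitespace or the string limits.
$strict = '~(?<!\S)' . $item . '(?!\S)~ui';
// Loose: the neighbours only have to be non-word characters.
$loose  = '~(?<!\w)' . $item . '(?!\w)~ui';

$samples = [
    'buy šalotka 29% now',      // surrounded by spaces
    'buy (šalotka 29%), now',   // surrounded by punctuation
    'buy šalotka 29%something', // glued to a longer word
];

$results = [];
foreach ($samples as $s) {
    $results[$s] = [preg_match($strict, $s), preg_match($loose, $s)];
    printf("%s => strict=%d loose=%d\n", $s, $results[$s][0], $results[$s][1]);
}
```

Both variants accept the first sample and reject the last one. Only the \w variant accepts the item when it is wrapped in punctuation, so the choice depends on whether you count '(šalotka 29%)' as a hit.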
Full example:
$item = 'šalotka 29%';
$string = 'something something šalotka 29% something';
mb_regex_encoding('UTF-8'); // be sure to use the correct encoding
// if needed escape regex special characters
$item = mb_eregi_replace('[\[\](){}.\\\\|$^?+*#-]', '\\\0', $item);
mb_eregi('(?<!\S)' . $item . '(?!\S)', $string, $matches);
print_r($matches);
Notes:
Although the ereg functions are obsolete and have been removed from recent PHP versions, the mb_ereg functions, based on the Oniguruma regex engine, still exist and offer features not available in the preg_ functions (PCRE).
Obviously, for this question you can do the same with preg_match:
preg_match('~(?<!\S)' . $item . '(?!\S)~ui', $string, $match);
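A runnable version of that line could look like the sketch below. The preg_quote call with '~' as its second argument is my addition, as a precaution in case the item ever contains regex metacharacters or the delimiter.

```php
<?php
$item   = 'šalotka 29%';
$string = 'something something šalotka 29% something';

// preg_quote escapes PCRE metacharacters; the second argument also
// escapes the "~" delimiter used below. This item contains neither,
// so it comes back unchanged.
$quoted = preg_quote($item, '~');

// u: treat pattern and subject as UTF-8, i: case-insensitive
$found = preg_match('~(?<!\S)' . $quoted . '(?!\S)~ui', $string, $match);

var_dump($found);
print_r($match);
```

Here $found is 1 and $match[0] holds the matched item. With 'šalotka 29%something' as the subject, $found would be 0.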
To escape regex special characters with the preg_ functions you can use preg_quote, but it's also possible to "do it yourself" with
$item = mb_ereg_replace('[\[\](){}.\\\\|$^?+*#-]', '\\\0', $item);
which suffices for most of the syntaxes available in the mb_ereg functions (note that escaping all non-word characters does the job too). Feel free to write your own if you want to deal with the Emacs or BRE syntaxes.