
How to check the preceding character of a RegEx search pattern using PHP?

I would like to check whether the character preceding a search pattern is alphanumeric.

If true, do nothing.

If false, remove the leading space from the search pattern.

For example:

$string1 = "This is a test XYZ something else";

$string2 = "This is a test? XYZ something else";

$pattern = " XYZ";

In the $string1 scenario, the character preceding the search pattern is t, which is alphanumeric, so it counts as a match and nothing is changed.

In the $string2 scenario, the character preceding the search pattern is ?, which is not alphanumeric, so it counts as a non-match and the extra space at the start of the search pattern is removed.

The result is:

$string2 = "This is a test?XYZ something else";

How can this be accomplished in PHP?

You can use a \B XYZ pattern with preg_replace_callback to trim the matched value and insert it back:

$string1 = "This is a test XYZ something else";
$string2 = "This is a test? XYZ something else";
$pattern = " XYZ";
echo preg_replace_callback('~\B'.$pattern.'~', function($m) { return trim($m[0]); }, $string1) . PHP_EOL;
// => This is a test XYZ something else
echo preg_replace_callback('~\B'.$pattern.'~', function($m) { return trim($m[0]); }, $string2);
// => This is a test?XYZ something else

See the PHP demo.

Since \B matches at every position that is not a word boundary, the pattern \B XYZ only matches when the space is preceded by a non-word character (or sits at the very start of the string).
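To make the \B behaviour concrete, here is a minimal sketch that tests a few inputs against the same pattern. The helper name spaceXyzAfterNonWord and the last two sample strings are assumptions added for illustration; they are not part of the original question.

```php
<?php
// Hypothetical helper for illustration: returns true when " XYZ" occurs at a
// non-word boundary, i.e. the space is NOT preceded by a word character.
function spaceXyzAfterNonWord(string $s): bool
{
    return preg_match('~\B XYZ~', $s) === 1;
}

$samples = [
    "This is a test XYZ something else",   // preceded by "t" (word char): no match
    "This is a test? XYZ something else",  // preceded by "?" (non-word char): match
    "This is a test_ XYZ something else",  // "_" is a word char: no match
    " XYZ at the very start",              // nothing before the space: match
];

foreach ($samples as $s) {
    echo (spaceXyzAfterNonWord($s) ? 'match   ' : 'no match') . ' | ' . $s . PHP_EOL;
}
```

Note the third case: the underscore is treated as a word character, so \B does not match there even though the underscore is not alphanumeric.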

More details: your pattern starts with a space, which is a non-word character. Adding \B before it requires the character before the space to be a non-word character as well; otherwise, there is no match. A word character is any character from the [a-zA-Z0-9_] set. If you need to customize the boundary, use a lookbehind such as (?<![a-zA-Z0-9]), which no longer treats the underscore as a word character.
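The lookbehind variant could be sketched as follows. The function name tightenAfterNonAlnum is hypothetical, and the sketch assumes the search string starts with a single space; preg_quote is added so that a search string containing regex metacharacters is matched literally, and ltrim is used because only the leading space needs to go.

```php
<?php
// Hypothetical helper: removes the leading space of $needle wherever the
// character before it is not an ASCII letter or digit.
// Assumption: $needle begins with a single space, as in the question.
function tightenAfterNonAlnum(string $subject, string $needle): string
{
    // preg_quote() escapes regex metacharacters in $needle; "~" is the delimiter.
    $regex = '~(?<![a-zA-Z0-9])' . preg_quote($needle, '~') . '~';

    $result = preg_replace_callback(
        $regex,
        function (array $m): string {
            return ltrim($m[0]); // drop only the leading space of the match
        },
        $subject
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $subject;
}

echo tightenAfterNonAlnum("This is a test XYZ something else", " XYZ") . PHP_EOL;
// => This is a test XYZ something else
echo tightenAfterNonAlnum("This is a test? XYZ something else", " XYZ") . PHP_EOL;
// => This is a test?XYZ something else
echo tightenAfterNonAlnum("This is a test_ XYZ something else", " XYZ") . PHP_EOL;
// => This is a test_XYZ something else
```

Unlike \B, this version also removes the space after an underscore, because the lookbehind only rules out letters and digits.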

For more information on non-word boundaries, see the SO thread "What are non-word boundary in regex (\B), compared to word-boundary?".

