
Regex: preg_match_all to match all (overlapping) patterns

Here is my concern: I have a string and I need to extract characters two by two.

$str = "abcdef" should return array('ab', 'bc', 'cd', 'de', 'ef'). I want to use preg_match_all instead of loops. Here is the pattern I am using:

$str = "abcdef";
preg_match_all('/[\w]{2}/', $str);

The thing is, it returns Array('ab', 'cd', 'ef'). It misses 'bc' and 'de'.

I have the same problem if I want to extract a certain number of words:

$str = "ab cd ef gh ij";
preg_match_all('/([\w]+ ){2}/', $str); // returns array('ab cd', 'ef gh'); the last part is also missing

What am I missing? Or is it simply not possible to do this with preg_match_all?

For the first problem, what you want is to match overlapping strings. This requires a zero-width look-around (one that does not consume text) to grab the characters:

/(?=(\w{2}))/

The regex above captures each match in the first capturing group. The overall match itself is empty, because the look-ahead consumes no text.
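As a minimal sketch of how this pattern might be used from PHP (the variable names $matches and $pairs are illustrative, not from the original answer), the pairs can be read from index 1 of the matches array:

```php
<?php
// Sketch: collect every overlapping two-character window of a string.
$str = "abcdef";

// The look-ahead matches at each offset without consuming text,
// so the engine retries at the very next character.
// Group 1 holds the two characters seen by the look-ahead.
preg_match_all('/(?=(\w{2}))/', $str, $matches);

// $matches[0] contains only empty strings (the zero-width matches);
// the captured pairs are in $matches[1].
$pairs = $matches[1];

print_r($pairs); // ab, bc, cd, de, ef
```

Note that preg_match_all itself returns the number of matches; the matched text is only available through the third argument.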


For the second problem, it seems that you also want overlapping strings. Using the same trick:

/(?=(\b\w+ \w+\b))/

Note that \b is added to check the word boundary. Since the match does not consume text, the next match is attempted at the next index (which is in the middle of the first word) rather than at the end of the second word. We don't want to capture from the middle of a word, so we need the boundary check.
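To illustrate why the boundary check matters, here is a small sketch that runs the word-pair pattern with and without \b on the string from the question (the variable names are illustrative):

```php
<?php
$str = "ab cd ef gh ij";

// With \b: a pair may only start at the beginning of a word.
preg_match_all('/(?=(\b\w+ \w+\b))/', $str, $matches);
$wordPairs = $matches[1];

// Without \b: the look-ahead also succeeds in the middle of a word,
// producing unwanted fragments such as "b cd".
preg_match_all('/(?=(\w+ \w+))/', $str, $loose);
$loosePairs = $loose[1];

print_r($wordPairs);  // ab cd, cd ef, ef gh, gh ij
print_r($loosePairs); // ab cd, b cd, cd ef, d ef, ...
```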

Note that the definition of \b is based on \w, so if you ever change the definition of a word, you need to emulate the word boundary using look-ahead and look-behind with the corresponding character set.
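As a rough sketch of that emulation, assume (purely for illustration) that a word is a run of lowercase letters and hyphens, [a-z-]+. The leading \b becomes a negative look-behind and the trailing \b a negative look-ahead over the same character set:

```php
<?php
// Assumption for this example only: a "word" is [a-z-]+,
// so hyphenated terms count as a single word.
$str = "well-known regex look-around trick";

// (?<![a-z-]) : no word character directly before the start of the pair
// (?![a-z-])  : no word character directly after the end of the pair
$pattern = '/(?=((?<![a-z-])[a-z-]+ [a-z-]+(?![a-z-])))/';

preg_match_all($pattern, $str, $matches);
$hyphenPairs = $matches[1];

print_r($hyphenPairs);
// well-known regex, regex look-around, look-around trick
```

With plain \b and \w the hyphen would count as a boundary, so the pairs would be split at "well-known" and "look-around".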


If you need a non-regex solution, try this:

<?php

$str = "abcdef";
$len = strlen($str);

$arr = array();
for($count = 0; $count < ($len - 1); $count++)
{
    $arr[] = $str[$count].$str[$count+1];
}

print_r($arr);

?>
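The same loop could be generalised to windows of any length. The following is only a sketch: the function name char_windows is hypothetical, it assumes $size is at least 1, and like the loop above it works on bytes rather than multibyte characters.

```php
<?php
// Hypothetical helper: return every overlapping window of $size bytes.
// Assumes $size >= 1; returns an empty array if the string is too short.
function char_windows(string $str, int $size): array
{
    $windows = [];
    $last = strlen($str) - $size; // last valid start offset

    for ($i = 0; $i <= $last; $i++) {
        $windows[] = substr($str, $i, $size);
    }

    return $windows;
}

print_r(char_windows("abcdef", 2)); // ab, bc, cd, de, ef
```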

