
Regex Preg_match_all match all pattern

Here is my concern: I have a string and I need to extract characters two by two.

$str = "abcdef" should return array('ab', 'bc', 'cd', 'de', 'ef'). I want to use preg_match_all instead of loops. Here is the pattern I am using:

$str = "abcdef";
preg_match_all('/[\w]{2}/', $str, $matches); // full matches end up in $matches[0]

The thing is, it returns array('ab', 'cd', 'ef'). It misses 'bc' and 'de'.

I have the same problem if I want to extract a certain number of words:

$str = "ab cd ef gh ij";
preg_match_all('/([\w]+ ){2}/', $str, $matches); // $matches[0] is array('ab cd ', 'ef gh '), so the last part is missing too

What am I missing? Or is it simply not possible to do this with preg_match_all?

For the first problem, what you want is overlapping matches. This requires a zero-width look-around (one that does not consume text), with a capturing group inside it to grab the characters:

/(?=(\w{2}))/

The regex above will capture the match in the first capturing group.
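
As a minimal sketch of how this could be called from PHP (the variable names $matches and $pairs are just for illustration): the overall match is empty at every position, so the pairs have to be read from index 1 of the result array, not index 0.

```php
<?php
$str = "abcdef";

// The look-ahead consumes nothing, so the engine retries at every offset;
// the group inside it records the two characters visible from that offset.
preg_match_all('/(?=(\w{2}))/', $str, $matches);

// $matches[0] holds the empty overall matches,
// $matches[1] holds what the first capturing group saw.
$pairs = $matches[1];

print_r($pairs); // ab, bc, cd, de, ef
```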


For the second problem, it seems that you also want overlapping matches. Using the same trick:

/(?=(\b\w+ \w+\b))/

Note that \b is added to check for word boundaries. Since the match does not consume text, the next match is attempted at the next index (which is in the middle of the first word) instead of at the end of the second word. We don't want to capture from the middle of a word, so we need the boundary check.
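
A short sketch of the word-pair pattern applied to the string from the question (again, $matches and $wordPairs are illustrative names), to show which offsets survive the boundary check:

```php
<?php
$str = "ab cd ef gh ij";

// \b at the start rejects offsets in the middle of a word; an offset that
// sits on a space is rejected because \w+ cannot match there.
preg_match_all('/(?=(\b\w+ \w+\b))/', $str, $matches);

// As before, the captured text lives in group 1.
$wordPairs = $matches[1];

print_r($wordPairs); // ab cd, cd ef, ef gh, gh ij
```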

Note that the definition of \b is based on \w, so if you ever change the definition of a word, you need to emulate the word boundary with a look-behind and a look-ahead over the corresponding character set.
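
To illustrate, here is a sketch under an assumed, purely hypothetical word definition of [\w-] (letters, digits, underscore and hyphen). The \b checks are swapped for a negative look-behind and a negative look-ahead on that same set. The sample string is made up for the example.

```php
<?php
// Assumption for illustration only: a "word" may also contain hyphens.
$str = "foo-bar baz qux-1";

// (?<![\w-]) : no word character directly before (emulates the leading \b)
// (?![\w-])  : no word character directly after  (emulates the trailing \b)
$pattern = '/(?=((?<![\w-])[\w-]+ [\w-]+(?![\w-])))/';

preg_match_all($pattern, $str, $matches);
$wordPairs = $matches[1];

print_r($wordPairs); // foo-bar baz, baz qux-1
```

With plain \b the hyphen would count as a boundary, so "bar baz" would also be captured starting from the middle of "foo-bar".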


In case you need a non-regex solution, try this:

<?php

$str = "abcdef";
$len = strlen($str);

$arr = array();
// Pair each character with the one that follows it.
for ($count = 0; $count < $len - 1; $count++)
{
    $arr[] = $str[$count] . $str[$count + 1];
}

print_r($arr);

?>

