
Regex for getting strings that contain only the words from the pattern list?

Consider the following array elements:

 1. benclinton
 2. clintonharry
 3. harryben
 4. benwill
 5. jasonsmith
 6. smithclinton

Assume the pattern list is ben, harry, clinton. Then the result I should get is:

1. benclinton
2. clintonharry
3. harryben

So, essentially, the result should contain only the strings that are made up entirely of words from the pattern list. Order is not important.

Also, no string will have more than two words, i.e. bensmithwill will never occur.

Since all my strings are in an array, I thought of using preg_grep in PHP to do this, but I am stuck on framing the correct regex.

What regex can achieve this? Is there an efficient way other than regex matching that will do the job?

Thanks in advance!

Something like this

$names_list = ['benclinton','clintonharry','harryben','benwill','jasonsmith','smithclinton'];
$names = ['ben','harry','clinton'];  

$matches = preg_grep('/('.implode('|',$names).')(?1)/', $names_list);
// Resulting pattern: /(ben|harry|clinton)(?1)/ where (?1) re-applies the subpattern of capture group 1

print_r($matches);

Output

Array
(
    [0] => benclinton
    [1] => clintonharry
    [2] => harryben
)


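This requires that two of the names (possibly the same one twice) appear back to back. That is more or less a given in this case, or everything would match. Note that the pattern is not anchored, so it also matches longer strings that merely contain two adjacent names; wrap it in ^ and $ if the whole string must consist of exactly two names.

To be extra careful, if $names can contain characters that are special in a regex, such as +, * or \, you can escape them with preg_quote: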

$matches = preg_grep('/('.implode('|',array_map(function($name){return preg_quote($name,'/');},$names)).')(?1)/', $names_list);
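The escaping and the (?1) recursion can also be wrapped into one small function. The following is only a sketch: the function name filterTwoNameStrings and the extra test string xbenclintony are made up for illustration, and the ^ and $ anchors are an addition that assumes the whole string must consist of exactly two names.

```php
<?php
// Hypothetical helper (not from the original answer): keeps only the strings
// that consist of exactly two names from $names, in any order.
function filterTwoNameStrings(array $names, array $strings): array
{
    // Escape each name so characters such as + or * are matched literally.
    $quoted = array_map(function ($name) {
        return preg_quote($name, '/');
    }, $names);

    // ^ and $ anchor the match to the whole string (assumption, see above);
    // (?1) re-applies the subpattern of capture group 1.
    $pattern = '/^(' . implode('|', $quoted) . ')(?1)$/';

    return preg_grep($pattern, $strings);
}

$names  = ['ben', 'harry', 'clinton'];
$list   = ['benclinton', 'xbenclintony', 'harryben', 'benwill'];
$result = filterTwoNameStrings($names, $list);
print_r($result);
```

preg_grep keeps the original array keys, so the surviving elements keep their positions from $list. Without the anchors, xbenclintony would also be returned.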

It appears that you want to match array elements which are exact combinations of two keywords. For a regex approach, we can take the cross product of the list of keywords with itself and generate an alternation of all the concatenated pairs. Then we can use preg_grep against your input array to find all matching elements.

$array = array("benclinton", "clintonharry", "harryben", "benwill", "jasonsmith", "smithclinton");
$input = array("ben", "harry", "clinton");
$regex = "";
foreach ($input as $term1)  {
    foreach ($input as $term2)  {
        if ($regex != "") $regex .= "|";
        $regex .= $term1.$term2;
    }
}
$regex = "/^(" . $regex . ")$/";
$matches = preg_grep($regex, $array);
print_r($matches);

Array
(
    [0] => benclinton
    [1] => clintonharry
    [2] => harryben
)

Here is the regex alternation generated by the above script:

(benben|benharry|benclinton|harryben|harryharry|harryclinton|clintonben|
    clintonharry|clintonclinton)
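The explicit cross product describes "exactly two keywords in a row", which a {2} quantifier on the keyword alternation should express as well. The sketch below compares the two forms on the sample data; the extra element benben is added only to show that a repeated keyword is accepted by both, and the variable names $pairs, $explicit and $compact are illustrative.

```php
<?php
$array = ['benclinton', 'clintonharry', 'harryben', 'benwill',
          'jasonsmith', 'smithclinton', 'benben'];
$input = ['ben', 'harry', 'clinton'];

// Explicit cross product: every keyword concatenated with every keyword.
$pairs = [];
foreach ($input as $term1) {
    foreach ($input as $term2) {
        $pairs[] = $term1 . $term2;
    }
}
$explicit = preg_grep('/^(' . implode('|', $pairs) . ')$/', $array);

// Compact form: the keyword alternation repeated exactly twice.
$compact = preg_grep('/^(?:' . implode('|', $input) . '){2}$/', $array);

// Both patterns select the same elements (with the same keys).
var_dump($explicit === $compact);
```

The compact pattern grows linearly with the number of keywords, while the explicit alternation grows quadratically, which may matter for long keyword lists.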

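Without regex: use array_filter and strpos.

  1. Filter the array, keeping each string in which more than one word from the second array is found. Note that this only checks that at least two distinct words occur somewhere in the string; it does not verify that the string contains nothing else.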


<?php
$a = ['benclinton','clintonharry','harryben','benwill','jasonsmith','smithclinton'];
$a2 = ['ben','clinton','harry'];
$res = array_filter($a,function($str="") use($a2){
    $r =array_filter($a2,function($a2str) use($str){
        return strpos($str,$a2str) !== FALSE;
    });
    return count($r) > 1;
});
print_r($res);
?>
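If the string has to consist of exactly two words and nothing else, a stricter regex-free check is to strip one word from the front and test whether the remainder is itself a word. This is a minimal sketch under that assumption; the helper name isTwoWordCombo and the extra element benxharry are invented for illustration.

```php
<?php
// Hypothetical helper: true when $str is exactly two words from $words,
// in any order (the same word twice is also accepted).
function isTwoWordCombo(string $str, array $words): bool
{
    foreach ($words as $first) {
        // Does the string start with this word?
        if ($first !== '' && strncmp($str, $first, strlen($first)) === 0) {
            // The remainder must itself be one of the words.
            $rest = substr($str, strlen($first));
            if (in_array($rest, $words, true)) {
                return true;
            }
        }
    }
    return false;
}

$a  = ['benclinton', 'clintonharry', 'harryben', 'benwill',
       'jasonsmith', 'smithclinton', 'benxharry'];
$a2 = ['ben', 'clinton', 'harry'];

$res = array_filter($a, function ($str) use ($a2) {
    return isTwoWordCombo($str, $a2);
});
print_r($res);
```

Here benxharry is rejected even though it contains two of the words, because the x between them leaves a remainder that is not in the list.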


 