How to check for two possible regular expressions?

Let's say I would like to parse someone's home address into street, house number, city, and so on.

In my case there are two (very different) ways the data can be formatted, so I have two very long regular expressions to check. If one of them matches, I would like to extract the data it captured.

1:

Long Square
25
London
...

2:

London
Living: Long Square, 25
....

How should I check for both of these? Should I use just two if clauses and check them one by one like:

if (preg_match(@$match_regex, file_get_contents($tag->getAttribute("src")), $matches) == true)
{
  //regex 1 matched
}
else if (preg_match(@$match_regex_2, file_get_contents($tag->getAttribute("src")), $matches) == true)
{
  //regex 2 matched
}
else
{
  //no match
}

Or should I somehow check both in a single regex?

Like:

[regex_1|regex_2]

Which method is preferred, and which will be faster in terms of CPU time?

The fastest way would likely be to search for the literal text Living: first and then run only the regex that fits:

$string = file_get_contents($tag->getAttribute("src"));
$matched = false;
$matches = array();

if (false === strpos($string, 'Living:')) {
    $matched = preg_match(@$match_regex, $string, $matches);
} else {
    $matched = preg_match(@$match_regex_2, $string, $matches);
}

if (!$matched) {
    // no match
} else {
    // print matches
}

Notice that I separated the two pieces of logic. The first if block determines the type of the address string and runs the appropriate regex. The second if block checks whether a match occurred, regardless of which regex was run.

Don't make assumptions about performance - measure it.

The single combined regex would be

(regex1)|(regex2)
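As a rough sketch of how the combined form could be used: the two patterns below are short, hypothetical stand-ins for the real (much longer) regexes, written only to fit the two sample layouts in the question. The group names and the parseAddress helper are likewise made up for illustration. Each alternative gets its own group names, and PREG_UNMATCHED_AS_NULL (PHP 7.2+) makes it possible to tell which side matched.

```php
<?php
// Hypothetical stand-ins for the two long patterns (not the asker's real regexes).
// Format 1: street / number / city on separate lines.
$regex1 = '(?<street1>[^\r\n]+)\R(?<number1>\d+)\R(?<city1>[^\r\n]+)';
// Format 2: city on the first line, then "Living: street, number".
$regex2 = '(?<city2>[^\r\n]+)\RLiving:\s*(?<street2>[^,\r\n]+),\s*(?<number2>\d+)';

// Wrap each alternative in its own non-capturing group so the | applies to whole patterns.
$combined = '/^(?:(?:' . $regex1 . ')|(?:' . $regex2 . '))/';

// Hypothetical helper: returns a normalized address array, or null when nothing matches.
function parseAddress(string $text, string $combined): ?array
{
    // With PREG_UNMATCHED_AS_NULL, groups of the alternative that did not
    // participate are null instead of empty strings.
    if (preg_match($combined, $text, $m, PREG_UNMATCHED_AS_NULL) !== 1) {
        return null; // no match, or a regex error
    }
    if (isset($m['street1'])) {
        // The first alternative matched.
        return ['street' => $m['street1'], 'number' => $m['number1'], 'city' => $m['city1']];
    }
    // Otherwise the second alternative matched.
    return ['street' => $m['street2'], 'number' => $m['number2'], 'city' => $m['city2']];
}

var_dump(parseAddress("London\nLiving: Long Square, 25\n", $combined));
```

Both layouts come back in the same array shape, so the calling code does not need to know which pattern matched.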

Once you have both versions, run them against your data and measure the time.
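One possible way to do that measurement is sketched below. The timeIt helper, the placeholder patterns, and the sample string are all assumptions made for illustration; substitute the real regexes and a representative set of inputs. It uses hrtime(), which requires PHP 7.3 or later.

```php
<?php
// Hypothetical helper: runs $fn $iterations times and returns the elapsed seconds.
function timeIt(callable $fn, int $iterations = 100000): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e9;
}

// Placeholder patterns and input; replace with the real regexes and real data.
$regex1   = '/^([^\r\n]+)\R(\d+)\R([^\r\n]+)/';
$regex2   = '/^([^\r\n]+)\RLiving:\s*([^,\r\n]+),\s*(\d+)/';
$combined = '/^(?:([^\r\n]+)\R(\d+)\R([^\r\n]+)|([^\r\n]+)\RLiving:\s*([^,\r\n]+),\s*(\d+))/';
$sample   = "London\nLiving: Long Square, 25\n";

// Variant A: try the first regex, fall back to the second only if it fails.
$sequential = timeIt(function () use ($regex1, $regex2, $sample) {
    preg_match($regex1, $sample, $m) || preg_match($regex2, $sample, $m);
});

// Variant B: one combined regex with alternation.
$single = timeIt(function () use ($combined, $sample) {
    preg_match($combined, $sample, $m);
});

printf("two regexes: %.4fs, one combined regex: %.4fs\n", $sequential, $single);
```

The numbers depend heavily on the actual patterns and on how the inputs are distributed between the two formats, so the loop should be run over real data rather than a single sample.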
