
PHP Regexp: ignoring everything before a defined substring

I'm trying to parse a web page. Basically it gets stored in a string that will look like this:

"[HTML CODE ...]world:[HTML CODE ...]my_number[REST OF HTML_CODE ...]"

Of course "world:" and "my_number" are part of the HTML code, but I would like to ignore everything before the first occurrence of "world:". What I need is the first number that appears after the first occurrence of "world:", keeping in mind that a bunch of HTML code will sit between the two. I could take a substring of the HTML first, but I would like to do it all with a single regex if possible.

This is the regular expression I tried:

'/(?<=world:)\D+?[0-9]+/'

But this returns all the HTML between "world:" and my number, along with the number itself.

Thanks!

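I think you were close. Instead of a lookbehind, match "world:" and the non-digits that follow it normally, and put the digits in a named capture group. This worked for me on the string you provided: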

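$subject = "[HTML CODE ...]world:[HTML CODE ...]3334[REST OF HTML_CODE ...]";
// Match "world:", lazily skip non-digits, then capture the first run of digits.
$pattern = "/world:\D+?(?<my_number>[0-9]+)/";
$matches = array();

// preg_match_all() already takes $matches by reference; a call-time "&" is a fatal error since PHP 5.4.
$result = preg_match_all($pattern, $subject, $matches);

print_r($matches);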

Results in:

Array
(
    [0] => Array
        (
            [0] => world:[HTML CODE ...]3334
        )

    [my_number] => Array
        (
            [0] => 3334
        )

    [1] => Array
        (
            [0] => 3334
        )

)
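
If the goal is for the match itself to be only the number, rather than reading it from a capture group, one option is PCRE's \K, which drops everything matched so far from the reported match. Below is a minimal sketch, assuming PHP 7.1 or later for the type hints. firstNumberAfter is a hypothetical helper name used for illustration, not something from the answer above. It uses preg_match() instead of preg_match_all() because only the first occurrence is wanted.

```php
<?php
// Hypothetical helper (name is an assumption, not from the original answer).
// Returns the first run of digits after the first occurrence of $marker, or null.
function firstNumberAfter(string $marker, string $html): ?string
{
    // preg_quote() escapes any regex metacharacters in the marker.
    // \D* skips non-digits; \K discards what was matched so far,
    // so $m[0] holds only the digits.
    $pattern = '/' . preg_quote($marker, '/') . '\D*\K[0-9]+/';

    return preg_match($pattern, $html, $m) === 1 ? $m[0] : null;
}

$subject = "[HTML CODE ...]world:[HTML CODE ...]3334[REST OF HTML_CODE ...]";
echo firstNumberAfter('world:', $subject), PHP_EOL;
```

On the sample string this prints 3334. Because \D* allows zero characters, it also handles a number that follows "world:" immediately, which the \D+? in the patterns above would not match.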
