I'm trying to parse a web page. Basically it gets stored in a string that will look like this:
"[HTML CODE ...]world:[HTML CODE ...]my_number[REST OF HTML_CODE ...]"
Of course "world:" and "my_number" are part of the HTML code, but I would like to ignore everything before the first occurrence of "world:". What I need is the first number that appears after the first occurrence of "world:", keeping in mind that a bunch of HTML code will sit between the two. I could take a substring of the HTML, but I would like to do this with a single regex if possible.
This is the regular expression I tried to match:
'/(?<=world:)\D+?[0-9]+/'
But this returns all the HTML between "world:" and my number, not just the number.
Thanks!
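I think you were close to getting it. The lookbehind only keeps "world:" out of the match; the \D+? that skips the HTML is still part of the match, so it comes back along with the digits. If you put the digits in a capture group (named my_number here), you can read just that group. I was able to use this on the string you provided.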
$subject = "[HTML CODE ...]world:[HTML CODE ...]3334[REST OF HTML_CODE ...]";
$pattern = "/world:\D+?(?<my_number>[0-9]+)/";
$matches = array();
// Pass $matches directly: call-time pass-by-reference (&$matches) is a fatal error since PHP 5.4.
$result = preg_match_all($pattern, $subject, $matches);
print_r($matches);
Results in:
Array
(
    [0] => Array
        (
            [0] => world:[HTML CODE ...]3334
        )
    [my_number] => Array
        (
            [0] => 3334
        )
    [1] => Array
        (
            [0] => 3334
        )
)
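Since only the first number after the first "world:" is wanted, preg_match (which stops at the first match) may be a closer fit than preg_match_all. The following is a minimal sketch of that idea, not a definitive implementation: the helper name first_number_after_marker and its $marker parameter are hypothetical, and it uses \D* rather than \D+? so that a number directly after "world:" is also found.

```php
<?php
// Hypothetical helper (not from the original post): returns the first run of
// digits that follows the first occurrence of $marker, or null if none exists.
function first_number_after_marker(string $html, string $marker = 'world:'): ?string
{
    // preg_quote() escapes any regex metacharacters in the marker.
    // \D* skips every non-digit character (including newlines) after the marker;
    // the named group my_number captures only the digits.
    $pattern = '/' . preg_quote($marker, '/') . '\D*(?<my_number>[0-9]+)/';

    // preg_match() stops at the first (leftmost) match.
    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches['my_number'];
    }
    return null;
}

$subject = "[HTML CODE ...]world:[HTML CODE ...]3334[REST OF HTML_CODE ...]";
var_dump(first_number_after_marker($subject)); // string(4) "3334"
```

If the full match itself should contain only the digits, PCRE's \K can be used instead of a capture group: with '/world:\D*\K[0-9]+/', everything matched before \K is dropped and $matches[0] holds just the number.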