First of all: sorry, I'm just learning to code, so this might be an easy question :).
What I want to achieve is getting the values of all
<option value="123"></option>
<option value="412"></option>
in an HTML document into an array. For the example above, that means only "123", "412", etc. The array will then be filtered so that only purely numeric values remain.
This is what I have so far:
$html = file_get_contents('url');
preg_match_all('/value="(\w+)"/', $html, $result);
var_dump($result);
$digits = array_filter($result, 'ctype_digit');
This leaves $digits empty, because $result contains entries like:
value="123"
I know that I messed up the regular expression, but I can't get it right.
I'm also not sure whether it would be better to use XPath to select the values, but I did not get that working either :(.
Any help is highly appreciated! :)
Thanks to the hints by CD001 and Kisaragi, I managed it. It's pretty simple with DOMDocument... sometimes one just overcomplicates things.
$html = file_get_contents('url');
$dom = new DOMDocument;
$dom->loadHTML($html);
$options = $dom->getElementsByTagName('option');
$digits = array();
foreach ($options as $option) {
    $valueID = $option->getAttribute('value');
    array_push($digits, $valueID);
}
var_dump($digits);
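Since the question also mentions XPath and the digit check, here is a minimal sketch that combines both with DOMDocument. The inline $html string stands in for the fetched page, and the `//option/@value` query, the extra "abc" option, and the variable names are assumptions for illustration, not part of the solution above.

```php
<?php
// Sample markup standing in for file_get_contents('url') (assumption).
$html = '<select><option value="123"></option><option value="412"></option>'
      . '<option value="abc"></option></select>';

// Collect libxml warnings internally instead of printing them;
// real-world HTML is often not perfectly valid.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

// Select the value attribute of every <option> element.
$xpath = new DOMXPath($dom);
$values = [];
foreach ($xpath->query('//option/@value') as $attr) {
    $values[] = $attr->nodeValue;
}

// Keep only values consisting entirely of digits.
// array_filter preserves keys, so array_values reindexes the result.
$digits = array_values(array_filter($values, 'ctype_digit'));
var_dump($digits);
```

With this sample input, the "abc" value should be dropped and only the numeric strings kept.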
My advice would be to use a DOM parser rather than a regex.
For your provided data, $result is an array which contains 2 arrays: the full matches in $result[0] and the captured groups in $result[1]. Your values are in the second array, $result[1].
You could update your code to:
preg_match_all('/value="(\w+)/', $html, $result);
$digits = array_filter($result[1], 'ctype_digit');
var_dump($digits);
That would give you:
array(2) {
  [0]=>
  string(3) "123"
  [1]=>
  string(3) "412"
}
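To make the shape of $result easier to see, here is a small sketch using the two sample options from the question as an inline string (an assumption; the real input comes from the fetched page). It also shows PREG_SET_ORDER, an optional flag of preg_match_all that groups the results per match instead of per capture group. The names $sets and $fromSets are illustrative.

```php
<?php
$html = '<option value="123"></option><option value="412"></option>';

// Default ordering (PREG_PATTERN_ORDER):
// $result[0] holds the full matches, $result[1] the first capture group.
preg_match_all('/value="(\w+)"/', $html, $result);
print_r($result);

// PREG_SET_ORDER: one sub-array per match, each holding
// [full match, capture group 1].
preg_match_all('/value="(\w+)"/', $html, $sets, PREG_SET_ORDER);

// Pull capture group 1 out of every match.
$fromSets = array_column($sets, 1);
print_r($fromSets);
```

Either way, the filter has to run on the flat list of captured strings. Passing $result itself to array_filter hands ctype_digit whole arrays rather than strings, which is why the original attempt came back empty.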
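An alternative regex:
value="\K\d+(?=")
which matches one or more digits (\d+) after value=". The \K drops the value=" prefix from the reported match, and the lookahead (?=") requires a closing quote without including it.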
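As a minimal sketch of how the alternative pattern could be used, the following runs it against an inline sample string (an assumption, including the extra "abc" option). Inside a single-quoted PHP string the pattern is written with single backslashes. Because the pattern has no capture group, the values end up in $matches[0].

```php
<?php
$html = '<option value="123"></option><option value="412"></option>'
      . '<option value="abc"></option>';

// \K resets the start of the reported match, so value=" is not part of it.
// \d+ matches only digits, and (?=") asserts a closing quote follows.
preg_match_all('/value="\K\d+(?=")/', $html, $matches);

// No capture groups are used, so the full matches are the values.
$digits = $matches[0];
var_dump($digits);
```

Since the pattern only accepts digits followed by a closing quote, a separate ctype_digit filter is not needed in this variant.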