Say I have an HTML page like this:
<input type="text" name="name" />
<input type="hidden" name="test" value="test_result1" />
<input type="hidden" name="test2" value="test_result2" />
I want to parse that HTML page (from a URL, using file_get_contents?), then get the names of the input elements that have the type "hidden".
Basically, what I want parsed from that page is:
test
test2
Is there a way to do this? I looked into some parsing libraries (like SimpleHTMLDom), but I just couldn't do it.
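Using SimpleHTMLDom:
// file_get_html() comes from the library, so it has to be included first
include 'simple_html_dom.php';
$html = file_get_html('http://www.google.com/');
// Find all inputs
foreach ($html->find('input') as $element) {
    $type = $element->type;
    // Print the name only for hidden inputs
    if ($type == 'hidden') {
        echo $element->name . '<br/>';
    }
}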
Use PHP's DOMDocument:
$html = "<html>
<body>
<form>
<input type='text' name='not_in_results'/>
<input type='hidden' name='test1'/>
<input type='hidden' name='test2'/>
</form>
</body>
</html>";
$dom = new DOMDocument;
$dom->loadHTML($html);
$inputs = $dom->getElementsByTagName('input');
$hiddenInputNames = [];
foreach ($inputs as $input) {
    if ($input->getAttribute('type') == "hidden") {
        $hiddenInputNames[] = $input->getAttribute('name');
    }
}
var_dump($hiddenInputNames);
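If you would rather not loop over every input and check its type by hand, DOMXPath can do the filtering for you. The following is only a sketch: the function name hiddenInputNames is made up for this example, and the sample markup is the one from the question. It also silences the libxml warnings that real-world pages tend to trigger.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the answers above):
// returns the name attribute of every <input type="hidden"> in an HTML string.
function hiddenInputNames(string $html): array
{
    // loadHTML() rejects an empty string, so bail out early.
    if ($html === '') {
        return [];
    }

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $names = [];
    // Only hidden inputs that actually carry a name attribute.
    // Note: this compares the type value case-sensitively.
    foreach ($xpath->query('//input[@type="hidden"][@name]') as $input) {
        $names[] = $input->getAttribute('name');
    }

    return $names;
}

// Sample markup taken from the question.
$html = '<form>
<input type="text" name="name" />
<input type="hidden" name="test" value="test_result1" />
<input type="hidden" name="test2" value="test_result2" />
</form>';

echo implode("\n", hiddenInputNames($html)), "\n";
```

For a remote page you would pass in the result of file_get_contents($url) instead of the inline string. Keep in mind that file_get_contents returns false on failure, so check for that before calling the helper.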
Try this:
include 'simple_html_dom.php';
$data = file_get_html('index.html');
$nodes = $data->find("input[type=hidden]");
foreach ($nodes as $node) {
    $val = $node->name;
    echo $val . "<br />";
}
Output:
test
test2
In this case, you have to include the PHP Simple HTML DOM Parser (simple_html_dom.php).
simple_html_dom is easy to use, but it is also very old and not very fast. PHP's DOMDocument is much faster, but not as easy to use as simple_html_dom. An alternative would be something like DomQuery, which basically gives you jQuery-like access to PHP's DOMDocument. It would be as simple as:
// DomQuery takes the HTML as a string, so read the file with file_get_contents()
$dom = new DomQuery(file_get_contents('index.html'));
foreach ($dom->find('input[type=hidden]') as $elm) {
    echo $elm->name;
}