Here is an example of what I want to do:
<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>
From the above example I would like to scrape the tags and their text into arrays. The result should be one array of tag names, arr = [h1, p, h2], and another array of contents, arr2 = [This is a h1, This is a Paragraph, This is h2].
Assuming the elements are known in advance, you could use DOMDocument's getElementsByTagName like this:
$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";
$doc = new DOMDocument();
$doc->loadHTML($html);
$elements = array();
$content = array();
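// Collects each requested tag name found in $doc, plus the text of every match.
function iterate_elements($array, $doc){
    global $elements, $content;
    foreach($array as $element){
        $the_element = $doc->getElementsByTagName($element);
        foreach($the_element as $target){
            $content[] = $target->textContent;
            // $target->tagName holds the tag name if you need it here
        }
        // Record the tag name only if at least one such element exists.
        if(!empty($the_element->length)) {
            $elements[] = $element;
        }
    }
}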
iterate_elements(array('h1','p', 'h2'), $doc);
print_r($elements);
print_r($content);
Demo: https://eval.in/825860
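If the tag names are not known in advance, the same idea can be sketched with DOMXPath by walking the element children of the div instead of asking for each tag by name. This is only a rough sketch: the variable names $tags and $texts are my own, and it assumes the wrapper is a div with class "room" as in the question. Unlike the function above, it keeps the elements in document order.

```php
<?php
$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$tags = array();   // hypothetical name for the question's "arr"
$texts = array();  // hypothetical name for the question's "arr2"

// "*" matches element nodes only, so whitespace text nodes are skipped.
foreach ($xpath->query("//div[@class='room']/*") as $node) {
    $tags[] = $node->nodeName;
    $texts[] = trim($node->textContent);
}

print_r($tags);
print_r($texts);
```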
Try this:
$str = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";
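$arr = explode(PHP_EOL, $str);
$res = array();
foreach ($arr as $row) {
    // Skip the opening and closing div rows; every other row is one child element.
    if (strpos($row, "div") === false) {
        // Key: the tag name between "<" and ">"; value: the row with its tags stripped.
        $res[substr($row, 1, strpos($row, ">") - 1)] = strip_tags($row);
    }
}
var_dump($res);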
It loops through the string one line at a time and builds an array keyed by tag name.
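To get the two separate arrays the question asks for, array_keys and array_values can be applied to that keyed array. A small sketch, with $res hard-coded to what the loop should produce for the single-room example (that content is my assumption, not output copied from a run):

```php
<?php
// Assumed result of the line-by-line loop for the single-room example.
$res = array(
    'h1' => 'This is a h1',
    'p'  => 'This is a Paragraph',
    'h2' => 'This is h2',
);

$arr  = array_keys($res);   // tag names
$arr2 = array_values($res); // text contents

print_r($arr);
print_r($arr2);
```

Keep in mind that a keyed array holds only one entry per tag name, so a room with two p elements would keep just the last one.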
Edit: if there is more than one room, you can make the array multidimensional like this:
https://3v4l.org/DdXVd
$str = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>
<div class='room2'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";
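$arr = explode(PHP_EOL, $str);
$res = array();
foreach ($arr as $row) {
    if (strpos($row, "<div") !== false) {
        // Opening div: the class name between the single quotes becomes the room key.
        $pos1 = strpos($row, "'") + 1;
        $room = substr($row, $pos1, strpos($row, "'", $pos1) - $pos1);
    } elseif (strpos($row, "</div") === false) {
        // Child element: tag name as key, stripped text as value.
        $pos1 = strpos($row, "<") + 1;
        $res[$room][substr($row, $pos1, strpos($row, ">") - $pos1)] = trim(strip_tags($row));
    }
}
var_dump($res);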
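The line-by-line approach depends on each element sitting on its own line. If that cannot be guaranteed, the same multidimensional result could be built with DOMDocument and DOMXPath instead. A rough sketch, assuming every room is a div whose class attribute is the room name, as in the example:

```php
<?php
$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>
<div class='room2'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$res = array();
foreach ($xpath->query("//div[@class]") as $div) {
    // Assumption: the class attribute identifies the room.
    $room = $div->getAttribute('class');
    // Relative query: element children of the current div only.
    foreach ($xpath->query("*", $div) as $child) {
        $res[$room][$child->nodeName] = trim($child->textContent);
    }
}
var_dump($res);
```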
Try the code below.
$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";
$dom = new SimpleXMLElement( $html );
$values = array_filter( array_values( (array) $dom ), function ( $i ) { return ! is_array( $i ); } );
$keys = array_filter( array_keys( (array) $dom ), function ( $i ) { return $i != '@attributes'; } );
print_r( $values ); // This is a h1, This is a Paragraph, This is h2
print_r( $keys ); // h1, p, h2
I used array_filter to remove the div's attributes (the '@attributes' entry) from the result.
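As a possible alternative to casting and filtering, SimpleXMLElement's children() method iterates over child elements only, so there is nothing to filter out, and repeated tags stay as separate entries. A minimal sketch, assuming the markup is well-formed XML as in the example:

```php
<?php
$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

$dom = new SimpleXMLElement($html);

$keys = array();
$values = array();

// children() yields only child elements; the div's attributes never appear.
foreach ($dom->children() as $child) {
    $keys[] = $child->getName();
    $values[] = (string) $child;
}

print_r($values);
print_r($keys);
```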
$str = <<<EOF
<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>
EOF;
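// str_get_html() comes from the PHP Simple HTML DOM Parser library (simple_html_dom.php).
$html = str_get_html($str);
$arr = array();
$arr2 = array();
// '.room *' selects every element inside the element with class "room".
foreach($html->find('.room *') as $el){
    $arr[] = $el->tag;
    $arr2[] = $el->text();
}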