I wrote the following code to display, on an otherwise blank page, one piece of an external site. To do that I had to remove several nodes, and each node needed its own block of code, which would make maintenance almost unfeasible in a big project.
My questions:
Is there a way to list everything we want to remove (footer, header, headerContent, etc.) in a single place?
Is there a smarter way to clean up the page: instead of deleting elements, just show what I want (TABELA1)?
# Create a DOM parser object
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTMLFile('http://www.sptrans.com.br/sac/solicitacoes.aspx');
$data = $dom->getElementById('TABELA1');

$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::id, "novidadeDestaque")]') as $e ) {
    // Delete this node
    $e->parentNode->removeChild($e);
}

$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::id, "headerLvl1")]') as $e ) {
    // Delete this node
    $e->parentNode->removeChild($e);
}

$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::id, "headerContent")]') as $e ) {
    // Delete this node
    $e->parentNode->removeChild($e);
}

$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::id, "novo_menu")]') as $e ) {
    // Delete this node
    $e->parentNode->removeChild($e);
}

$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::id, "footer")]') as $e ) {
    // Delete this node
    $e->parentNode->removeChild($e);
}

$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::id, "header")]') as $e ) {
    // Delete this node
    $e->parentNode->removeChild($e);
}

$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::id, "pageNovidades")]') as $e ) {
    // Delete this node
    $e->parentNode->removeChild($e);
}

echo $dom->saveHTML();
?>
</body>
To remove the desired elements with a short routine, you can put their ids in an array and loop over it:
$xpath = new DOMXPath($dom);
$idToDelete = [ 'novidadeDestaque', 'headerLvl1', ... ];
foreach( $idToDelete as $id )
{
    // Remove every div whose id contains the current string
    foreach($xpath->query('//div[contains(attribute::id, "'.$id.'")]') as $e ) {
        $e->parentNode->removeChild($e);
    }
}
Please note that you don't need to create a new DOMXPath object for each search: you can create it once per DOMDocument object.
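The loop above can also be collapsed into a single XPath query by joining one contains() test per id with "or". The following is only a sketch under assumptions: the inline $html string is hypothetical sample markup standing in for the downloaded page, and the id list is shortened to two entries for illustration.

```php
<?php
// Hypothetical sample markup standing in for the real page.
$html = '<html><body>'
      . '<div id="header"><div id="headerContent">top</div></div>'
      . '<table id="TABELA1"><tr><td>data</td></tr></table>'
      . '<div id="footer">bottom</div>'
      . '</body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);

// Shortened, illustrative list of id fragments to remove.
$idsToDelete = ['header', 'footer'];

// Build one contains() test per id fragment and join them with "or".
$conditions = array_map(
    function ($id) { return 'contains(@id, "' . $id . '")'; },
    $idsToDelete
);
$query = '//div[' . implode(' or ', $conditions) . ']';

$xpath = new DOMXPath($dom);

// Copy the matches into an array before removing anything.
foreach (iterator_to_array($xpath->query($query)) as $e) {
    if ($e->parentNode !== null) {
        $e->parentNode->removeChild($e);
    }
}

$result = $dom->saveHTML();
echo $result;
```

Because contains() matches substrings, 'header' also covers ids such as headerLvl1 and headerContent, so the list can usually be shorter than the one in the question.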
To show only what you want, you can use this syntax:
$table = $dom->getElementById( 'MyTable' );
echo $dom->saveHTML( $table );
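Since getElementById returns null when no element has the requested id, it may be worth checking the result before printing it. Here is a minimal sketch of that idea; extractById is a hypothetical helper name, and the inline markup is sample data, not the real page.

```php
<?php
// Hypothetical helper: returns the outer HTML of the element
// with the given id, or null when no such element exists.
function extractById(string $html, string $id): ?string
{
    $dom = new DOMDocument();
    // Silence warnings caused by imperfect real-world markup.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $node = $dom->getElementById($id);
    if ($node === null) {
        return null;
    }

    // Passing a node to saveHTML() serializes only that subtree.
    $out = $dom->saveHTML($node);
    return $out === false ? null : $out;
}

// Sample markup (an assumption for illustration).
$html = '<html><body><div id="header">top</div>'
      . '<table id="TABELA1"><tr><td>data</td></tr></table></body></html>';

$table   = extractById($html, 'TABELA1');
$missing = extractById($html, 'doesNotExist');

echo $table === null ? 'Table not found' : $table;
```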
To get a complete HTML document containing only the desired table, you can create a new DOMDocument and use importNode to add your table:
$src = new DOMDocument();
$dst = new DOMDocument();
$src->loadHTML( $html );
$dst->loadHTML( '<html><head><title>Untitled</title></head><body></body></html>' );
$table = $src->getElementById( 'MyTable' );
// Pass true as second argument to import the table together with its children
$imported = $dst->importNode( $table, true );
$dst->getElementsByTagName( 'body' )->item(0)->appendChild( $imported );
echo $dst->saveHTML();