
Finding and removing html tags with PHP Simple HTML DOM Parser

This is the code I am using:

include 'simple_html_dom.php';
$html = file_get_html('index.html');
echo $html->find('tr', 15);

This finds the table row at index 15 (the index is zero-based). What I want to do is remove that row completely.

I have already tried:

$html->find('tr', 15)=null; 

But that does not seem to work. I have looked through the SimpleHTMLDom documentation, but it does not contain much information on this.

simple_html_dom does not seem to allow deleting an element this way.

Try this instead, using PHP's built-in DOMDocument:

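$html = new DOMDocument();
$html->loadHTMLFile('index.html');
// item() takes a zero-based index, so 15 is the sixteenth <tr>
$element = $html->getElementsByTagName('tr')->item(15);
// Detach the row from its parent node to delete it
$element->parentNode->removeChild($element);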

Here is a working example (it runs as-is on Linux and is easily adapted to other systems).

File dom_test.php:

#!/usr/bin/php
<?php
    $html = new DOMDocument();
    $html->loadHTMLFile('index.html');
    // Index 1 is the second row (the "HELLO there" row)
    $element = $html->getElementsByTagName('tr')->item(1);
    $element->parentNode->removeChild($element);

    // Print the document without the removed row
    echo $html->saveHTML();
?>

Where the index.html contains:

<html>
    <head></head>
    <body>
        <table>
            <tr><td> hi </td><td>there</td></tr>
            <tr>
                <td> HELLO </td>
                <td> there </td>
            </tr>
            <tr><td> hi </td><td>there</td></tr>
        </table>
    </body>
</html>

Put both files in the same directory and execute this in the console:

php dom_test.php

The output appears without the "HELLO there" row.
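
One thing the example above does not cover is an out-of-range index: item() returns null when the requested row does not exist, and calling parentNode on null fails. Below is a minimal sketch of a guarded variant. The helper name removeRowAt is hypothetical (not part of DOMDocument), and the markup is loaded from an inline string with loadHTML() instead of index.html so the snippet can run on its own.

```php
<?php
// Hypothetical helper: removes the <tr> at a zero-based index, if it exists.
function removeRowAt(DOMDocument $doc, int $index): bool
{
    $row = $doc->getElementsByTagName('tr')->item($index);
    if ($row === null || $row->parentNode === null) {
        return false; // index out of range: nothing to remove
    }
    $row->parentNode->removeChild($row);
    return true;
}

// Same three rows as index.html, inlined for a self-contained run.
$source = '<table>'
    . '<tr><td>hi</td><td>there</td></tr>'
    . '<tr><td>HELLO</td><td>there</td></tr>'
    . '<tr><td>hi</td><td>there</td></tr>'
    . '</table>';

$doc = new DOMDocument();
$doc->loadHTML($source);

$removed = removeRowAt($doc, 1);   // second row exists, so it is removed
$missing = removeRowAt($doc, 15);  // no such row, so nothing happens
$result = $doc->saveHTML();
echo $result;
```

Returning a boolean rather than throwing is just one possible design; it lets the caller decide whether a missing row is an error.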

I hope that helps you.

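You can do this with simple_html_dom. To remove an element together with its contents, set its outertext to an empty string. To strip only the tags and keep the contents, set outertext to the value of innertext: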

foreach($html->find('div') as $div) {
    $div->outertext = $div->innertext;
}
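
For comparison, the same "strip the tag, keep its contents" idea can be sketched with the built-in DOMDocument instead of simple_html_dom. This is an illustration under assumptions, not the library's method: the sample markup is made up, and the child-moving loop is one possible way to unwrap an element.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<div><p>keep <b>this</b></p></div>');

// Copy the live node list first; removing nodes while iterating it can skip elements.
$divs = iterator_to_array($doc->getElementsByTagName('div'));
foreach ($divs as $div) {
    // Move each child in front of the <div>, then drop the now-empty <div>.
    while ($div->firstChild !== null) {
        $div->parentNode->insertBefore($div->firstChild, $div);
    }
    $div->parentNode->removeChild($div);
}

$unwrapped = $doc->saveHTML();
echo $unwrapped;
```

Note that DOMDocument wraps the fragment in html and body elements when saving, so the output contains more than the original snippet.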
