Editing HTML using simple_html_dom.php

I am using simple_html_dom.php to scrape a page and then edit the scraped HTML. This is what I have so far:

<?php
include('simple_html_dom.php');
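// Read the "name" query parameter (empty string if it is missing)
$name = isset($_GET["name"]) ? $_GET["name"] : "";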

$html_code="https://hwb.wales.gov.uk/Home/Pages/Home.aspx";
$html_code= $html_code.$name."/?lang=en";

echo $html_code;


$html = file_get_html($html_code);

echo "<html>";
echo "<head>";
echo "<meta charset='UTF-8'>";
echo  "<title>PHP Test</title>";
echo " </head>";
echo " <body>";


foreach($html->find('#LatestNewsArts') as $e)
   // Code here to prepend hwb.wales.gov.uk to <img src="/   >
  echo $e->innertext . '<br>';

echo " </body>";
echo "</html>";

?>

I can extract the <div> that I'm looking for and echo it; that works fine.

Where I hit a wall (my PHP-fu is letting me down) is this: how do I intercept and edit the HTML inside the $e that I have scraped?

What I am looking to do is replace each <img src="/...."> tag with <img src="https://hwb.wales.gov.uk/....">

You can set a new value on an attribute by assigning to it: $elmt->attribute = NewValue

Here's working code that answers your question:

// includes Simple HTML DOM Parser
include "simple_html_dom.php";

$html_code="https://hwb.wales.gov.uk/Home/Pages/Home.aspx";

// => I don't know what $name stands for. It's up to you to change this code to suit your needs
//$html_code= $html_code.$name."/?lang=en";

echo $html_code;

$html = file_get_html($html_code);

echo "<html>";
echo "<head>";
echo "<meta charset='UTF-8'>";
echo  "<title>PHP Test</title>";
echo " </head>";
echo " <body>";

// Loop through every element with id="Article" inside #LatestNewsArts
foreach($html->find('#LatestNewsArts #Article') as $e){
    $img = $e->find("img",0);

    // find() returns null when the article has no <img>, so guard against that
    if ($img) {
        // Set src to the absolute URL
        $img->src = "https://hwb.wales.gov.uk" . $img->src;
    }

    // Print the outertext
    echo $e->outertext . '<br>';
}

echo " </body>";
echo "</html>";


// Clear dom object
$html->clear(); 
unset($html);
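
The loop above prepends the host to every src it finds, which is fine as long as all the images use site-relative paths like "/...". If some of them might already be absolute, a small helper can decide per image. This is only a sketch: absolutize_src() is a hypothetical function name, not part of Simple HTML DOM, and the default base URL is taken from the question.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): make a site-relative
// path absolute and leave every other kind of URL untouched.
function absolutize_src($src, $base = "https://hwb.wales.gov.uk")
{
    // Missing or empty attribute: nothing to rewrite.
    if (!is_string($src) || $src === "") {
        return $src;
    }
    // Protocol-relative URLs ("//host/a.png") already name a host.
    if (strpos($src, "//") === 0) {
        return $src;
    }
    // Site-relative paths ("/images/a.png") get the base prepended.
    if (strpos($src, "/") === 0) {
        return rtrim($base, "/") . $src;
    }
    // Absolute URLs, data: URIs and document-relative paths are returned as-is.
    return $src;
}

// Assumed usage inside the loop, with $e as in the code above:
// foreach ($e->find("img") as $img) {
//     $img->src = absolutize_src($img->src);
// }
```

Iterating over $e->find("img") without an index also covers articles that contain several images, or none at all.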

