
Unable to use the PHP Simple HTML DOM Parser's find() function

I am trying to scrape a remote website and edit parts of the result before updating a couple of tables in the database and then echoing the final document.

Here's a redacted snippet of the code in question for reference:

<?php

require_once 'backend/connector.php';
require_once 'table_access/simplehtmldom_1_5/simple_html_dom.php';
require_once 'pronunciation1.php';

// retrieve lookup term
if(isset($_POST["lookup_term"])){ $term = trim($_POST["lookup_term"]); }
else { $term = "hombre"; }

$html = file_get_html("http://www.somesite.com/translate/" . rawurlencode($term));
$coll_temp = $html->find('div[id=translate-en]');
$announce = $coll_temp[0]->find('.announcement');
$quickdef = $coll_temp[0]->find('.quickdef');
$meaning = $announce[0] . $quickdef[0];

$html->clear(); // release scraper variable to prevent memory leak issues
unset($html); // release scraper variable to prevent memory leak issues
$meaning = '<?xml version="1.0" encoding="ISO-8859-15"?>' . $meaning;

// process the newly-created DOM
$dom = new DOMDocument;
$dom->loadHTML($meaning);
// various DOM-manipulation code snippets

// extract the quick definition section
foreach ($dom->find('div[class=quickdef]') as $qdd) {
    $qdh1 = $qdd->find('.source')[0]->find('h1.source-text');
    $qdterm = $qdh1[0]->plaintext;
    $qdlang = $qdh1[0]->getAttribute('source-lang');
    add2qd($qdterm, $qdd, $qdlang);
    unset($qdterm);
    unset($qdlang);
    unset($qdh1);
}

$finalmeaning = $dom->saveHTML(); // store processed DOM in $finalmeaning
push2db($term, $finalmeaning); // add processed DOM to database
echo $finalmeaning; // output processed DOM

// release variables
unset($dom);
unset($html);
unset($finalmeaning);
function add2qd($lookupterm, $finalqd, $lang){
    $connect = dbconn(PROJHOST, CONTEXTDB, PEPPYUSR, PEPPYPWD);
    $sql = 'INSERT IGNORE INTO tblquickdef (word, quickdef, lang) VALUES (:word, :quickdef, :lang)';
    $query = $connect->prepare($sql);
    $query->bindParam(':word', $lookupterm);
    $query->bindParam(':quickdef', $finalqd);
    $query->bindParam(':lang', $lang);
    $query->execute();
    $connect = null;
}
function push2db($lookupword, $finalmean) {
    $connect = dbconn(PROJHOST, DICTDB, PEPPYUSR, PEPPYPWD);
    $sql = 'INSERT IGNORE INTO tbldict (word, mean) VALUES (:word, :mean)';
    $query = $connect->prepare($sql);
    $query->bindParam(':word', $lookupword);
    $query->bindParam(':mean', $finalmean);
    $query->execute();
    $connect = null;
}

?>

The code works fine except for the foreach loop under the "// extract the quick definition section" comment. The function called inside this loop is add2qd(), which accepts three string values as input.

Every time this loop runs, PHP throws a fatal error because it thinks find() is undefined. I know find() is a legitimate function in the PHP Simple HTML DOM Parser library because I have used it several times in the same code without any problem (right after the "// retrieve lookup term" section). What am I doing wrong?

But you are not using PHP Simple HTML DOM at that point. $dom is a standard PHP DOMDocument, which does not have a find() method:

$dom = new DOMDocument;
$dom->loadHTML($meaning);

foreach ($dom->find('div[class=quickdef]') as $qdd) {

http://php.net/manual/en/class.domdocument.php
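One way to keep the DOMDocument and still select those nodes is DOMXPath. The following is only a minimal sketch under stated assumptions: the markup in $meaning is a made-up stand-in for the scraped fragment, the $results array is a hypothetical placeholder for the add2qd() calls, and the class names and the source-lang attribute are taken from the question's selectors rather than from the real site.

```php
<?php
// Hypothetical markup standing in for the scraped fragment ($meaning in the question).
$meaning = '<div class="quickdef"><div class="source">'
         . '<h1 class="source-text" source-lang="es">hombre</h1>'
         . '</div></div>';

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($meaning);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// XPath has no CSS class selector, so match the class name as a whole word.
$hasClass = function ($name) {
    return 'contains(concat(" ", normalize-space(@class), " "), " ' . $name . ' ")';
};

$results = [];   // stand-in for the add2qd() calls in the question
foreach ($xpath->query('//div[' . $hasClass('quickdef') . ']') as $qdd) {
    // The second argument makes the query relative to the current quickdef div.
    $h1 = $xpath->query(
        './/*[' . $hasClass('source') . ']//h1[' . $hasClass('source-text') . ']',
        $qdd
    )->item(0);
    if ($h1 === null) {
        continue;   // no matching heading in this block
    }

    $qdterm = trim($h1->textContent);             // DOM counterpart of ->plaintext
    $qdlang = $h1->getAttribute('source-lang');
    $qdhtml = $dom->saveHTML($qdd);               // serialize the node to a string

    $results[] = [$qdterm, $qdhtml, $qdlang];     // in the question: add2qd($qdterm, $qdhtml, $qdlang);
}
```

Note that the node is serialized with saveHTML() before it is handed on, since add2qd() binds its arguments as strings. If you would rather keep calling find(), another option is to parse the string with Simple HTML DOM's str_get_html($meaning) instead of DOMDocument, but then the DOMDocument-based manipulation code would have to be rewritten against that library.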
