
Text searching with DOMXPath PHP

HTML

<td class="one">
  <div>
    <b>
      <span>item</span>
    </b>
    <div>
      <c>text</c>
    </div>
  </div>
</td>


How do you select and echo item by searching for text?

I'm having difficulty with the XPath line in PHP:

$c = $xpath->query("*/c");


PHP

<?php
$keyword = "String";
$search = strtolower($keyword);

$target_url = "http://www.example.com/";

//USER AGENT
//$userAgent = 'spider';
$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';

$ch = curl_init();
$options = array(CURLOPT_USERAGENT   => $userAgent,
                CURLOPT_URL             => $target_url,
                CURLOPT_HEADER          => false,
                CURLOPT_FAILONERROR     => true,
                CURLOPT_FOLLOWLOCATION  => true,
                CURLOPT_AUTOREFERER     => true,
                CURLOPT_RETURNTRANSFER  => true,
                CURLOPT_TIMEOUT         => 20
                );

curl_setopt_array($ch, $options);
$html= curl_exec($ch);

if (!$html)
{
    echo "ERROR NUMBER: ".curl_errno($ch);
    echo "ERROR: ".curl_error($ch);
    exit;
}
curl_close($ch);


$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$c = $xpath->query("*/c");


foreach($c as $a) { 
    $text = $a->nodeValue;
    echo($text . '<br />');
}


//echo '<pre>';
//print_r($c);
//echo '</pre>';    
?>

Since HTML defines no c element, libxml treats it as an invalid tag when you call DOMDocument::loadHTML. Try supplying the LIBXML_HTML_NOIMPLIED constant as well, like so:

$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED);

This sets libxml's HTML_PARSE_NOIMPLIED flag, which stops the parser from wrapping your markup in implied html and body elements.
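Separately from the parser flag, note that a query such as `*/c` is evaluated relative to the document node, so it only matches c elements that are direct children of the root element. A descendant query (`//c`) is more likely what you want. Below is a minimal sketch of searching by text and echoing the matching item. It is not the only way to do it, and it makes a few assumptions: the sample markup is wrapped in `<table><tr>` so the `<td>` sits in a valid context, `$keyword` is a hypothetical search term containing no double quotes, and the case folding only covers ASCII letters.

```php
<?php
// Sample markup from the question, wrapped in <table><tr> (an assumption)
// so that the <td> appears in a valid context.
$html = <<<'HTML'
<table><tr>
<td class="one">
  <div>
    <b>
      <span>item</span>
    </b>
    <div>
      <c>text</c>
    </div>
  </div>
</td>
</tr></table>
HTML;

$keyword = 'Text';                // hypothetical search term
$search  = strtolower($keyword);  // assumed to contain no double quotes

// Collect parser warnings (such as the one for the unknown <c> tag)
// instead of printing them or silencing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// XPath 1.0 has no lower-case(), so translate() folds ASCII letters.
$upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower = 'abcdefghijklmnopqrstuvwxyz';

// Find every <c> whose text contains the keyword, climb to its enclosing
// <td>, then pick the <span> elements inside that cell.
$query = sprintf(
    '//c[contains(translate(., "%s", "%s"), "%s")]/ancestor::td[1]//span',
    $upper,
    $lower,
    $search
);

$items = [];
foreach ($xpath->query($query) as $span) {
    $items[] = trim($span->nodeValue);
}

echo implode('<br />', $items), PHP_EOL;
```

With the sample markup, the keyword matches the `<c>` element and the loop collects the text of the `<span>` in the same cell. If your real page nests things differently, adjust the `ancestor::td[1]//span` part to match.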

