PHP XPath: get innerHTML with inner tags

Question

I have an HTML file formatted like this:

<p class="p1">subject</p>
<p class="p2">detail <span>important</span></p>

<p class="p1">subject</p>
<p class="p2">detail<span>important</span></p>

I wrote some PHP code to automatically get each p1 and its detail, and insert them into my MySQL table.

This is my code:

$doc = new DOMDocument();

$doc->loadHTMLFile("file.html");

$xpath = new DomXpath($doc);

$subject = $xpath->query('//p');


for ($i = 0 ; $i < $subject->length-1 ; $i ++) {

if ($subject->item($i)->getAttribute("class") == "p1")
    echo $subject->item($i)->nodeValue;
}
...

This is not my full code, but the problem is:

echo $subject->item($i)->nodeValue;

which gives me "detail important", without the <span> tag.

It is very important to keep the span tags around the "important" part of the detail. Is there any function that can do that without a headache?

Thanks in advance.

Answer 1

I found the answer to my question :) Thanks to SimpleHTMLDOM.

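// $html is the SimpleHTMLDOM object for the page (its creation is not shown here)
$detail = '';
foreach ($html->find('p') as $element) {
    switch ($element->class) {
        case 'p1':
            $subject = $element;
            break;
        case 'p2':
            // casting the element to string yields its markup
            $detail .= html_entity_decode($element);
            break;
    }
}

The trick is in: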

html_entity_decode($element);
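
To see what html_entity_decode() does in isolation, here is a minimal sketch that uses only core PHP. The sample string is made up for illustration and SimpleHTMLDOM is not involved, so it shows only the decoding step, not the parsing in the answer.

```php
<?php
// Made-up sample: markup whose angle brackets were escaped as entities.
$escaped = 'detail &lt;span&gt;important&lt;/span&gt;';

// html_entity_decode() turns the entities back into literal characters.
$decoded = html_entity_decode($escaped);

echo $decoded, "\n"; // detail <span>important</span>
```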

Answer 2

Old question, but there is a one-liner. The OP should use:

$subject = $xpath->query('//p/*');

and then:

echo $doc->saveHtml($subject->item($i));

With the * you select the child elements of each paragraph, so the wrapping paragraph tag is left out (and so is any bare text that sits directly inside the paragraph); without the * you get the HTML including the wrapping paragraph.

Full example:
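
If the text directly inside the paragraph is needed as well, one option is to serialize every child node in turn. The following is a minimal sketch under that assumption; the helper name innerHTML() is hypothetical and does not come from the answers here.

```php
<?php
// Hypothetical helper (not from the original answers): returns the markup
// of all children of $node, text nodes included.
function innerHTML(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // saveHTML() with a node argument serializes just that node
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

$doc = new DOMDocument();
$doc->loadHTML('<p class="p1">subject</p><p class="p2">detail <span>important</span></p>');
$xpath = new DOMXPath($doc);

// Select the detail paragraphs by class directly in the XPath expression
foreach ($xpath->query('//p[@class="p2"]') as $p) {
    echo innerHTML($p), "\n"; // detail <span>important</span>
}
```

Because the helper walks childNodes rather than child elements, the leading "detail " text is kept alongside the span.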

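$html = '<div><p>ciao questa è una <b>prova</b>.</p></div>';
$dom = new DOMDocument(); // the constructor takes a version and an encoding, not the HTML
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// './/div/*' selects the child elements of the div (here the <p>), without the surrounding div tag;
// './/div' selects the div itself, so its tag is part of the output
$nodes = $xpath->query('.//div/*');
// saveHTML() accepts a single DOMNode, not a DOMNodeList, so pass one item
$innerHtml = $dom->saveHTML($nodes->item(0));
var_dump($innerHtml);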

Output: ciao questa è una prova.
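
Applied to the markup from the question, the same idea can pair each p1 with the p2 that follows it. This is only a sketch: the sample HTML is inlined, the subjects are renamed "subject A" and "subject B" to tell them apart, and the $rows array stands in for whatever would be inserted into the database.

```php
<?php
// Sample markup modeled on the question (subject names are made up)
$html = '<p class="p1">subject A</p><p class="p2">detail <span>important</span></p>'
      . '<p class="p1">subject B</p><p class="p2">detail<span>important</span></p>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$rows = [];
foreach ($xpath->query('//p[@class="p1"]') as $subject) {
    // The nearest following <p> sibling, kept only if it has class p2
    $detail = $xpath->query('following-sibling::p[1][@class="p2"]', $subject)->item(0);
    if ($detail === null) {
        continue; // no detail paragraph for this subject
    }

    // Serialize the children of the detail paragraph, tags included
    $inner = '';
    foreach ($detail->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }

    $rows[] = ['subject' => $subject->textContent, 'detail' => $inner];
}

print_r($rows);
```

Filtering on the class inside the XPath expression replaces the getAttribute() check in the loop from the question.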

Answer 3

Whenever I need to parse HTML, I run it through SimpleHTMLDOM:

http://simplehtmldom.sourceforge.net/

I recommend using version 1.11. For various reasons, 1.5 is rather broken.

Answer 1
1 2011-10-29 20:27:27

Answer 2
0 2020-03-13 15:45:40

Answer 3
0 2011-10-22 18:28:12
