How to handle multiple sub-elements in PHP DOMXPath?

I'd like to import an HTML document into a MySQL database using PHP.

The structure of the document looks like this:

<p class="word">
<span class="word-text">word1</span>
<span class="grammatical-type">noun</span>
</p>
...
<p class="word">
<span class="word-text">word128</span>
<span class="grammatical-type">adjective</span>
</p>

Each word has exactly one word-text and one grammatical-type.

I'm able to find each word node, but I'd like to perform a MySQL query for each of its children, word-text and grammatical-type:

$dom = new DOMDocument();
$dom->loadHTMLFile($location);
$xpath = new DOMXPath($dom);
$res = $xpath->query("//p[@class='word']");
foreach ($res as $textNode) {
    // do something here for each *word-text*->nodeValue
    // do something here for each *grammatical-type*->nodeValue
}

Inside the foreach loop, I tried passing $textNode, which is a DOMNode, as the $contextNode:

$wordText = $xpath->query("span[@class='word-text']", $textNode);
$myWord = $wordText->nodeValue;

But $wordText only holds a DOMNodeList, and its nodeValue is NULL.

Starting from the word node, how can I work with its child nodes?

Thanks

Solved.

query() returns a DOMNodeList, and since you know it contains only a single element here, you just need to select that element with item(0):

$dom = new DOMDocument();
$dom->loadHTMLFile($location);
$xpath = new DOMXPath($dom);
$res = $xpath->query("//p[@class='word']");
foreach ($res as $textNode) {
    // The query is relative to $textNode, so it matches only this word's span
    $wordTextNode = $xpath->query("span[@class='word-text']", $textNode);
    $word = $wordTextNode->item(0)->nodeValue;

    // do the same thing here for *grammatical-type*
}
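
To make the item(0) approach concrete for both children, here is a minimal sketch. It assumes the markup from the question is supplied as an inline string rather than loaded from $location, and it collects the values into a hypothetical $rows array instead of running the MySQL query; the null check is an added precaution for word nodes that lack one of the spans.

```php
<?php
// Inline sample standing in for the file at $location (assumption for this sketch).
$html = <<<'HTML'
<p class="word">
<span class="word-text">word1</span>
<span class="grammatical-type">noun</span>
</p>
<p class="word">
<span class="word-text">word128</span>
<span class="grammatical-type">adjective</span>
</p>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$rows = []; // hypothetical container; replace with your own MySQL insert
foreach ($xpath->query("//p[@class='word']") as $wordNode) {
    // Relative queries: each one is evaluated against the current <p> only
    $text = $xpath->query("span[@class='word-text']", $wordNode)->item(0);
    $type = $xpath->query("span[@class='grammatical-type']", $wordNode)->item(0);

    // item(0) returns null when nothing matched, so skip incomplete entries
    if ($text === null || $type === null) {
        continue;
    }
    $rows[] = [trim($text->nodeValue), trim($type->nodeValue)];
}

print_r($rows);
```

Each entry in $rows is then a (word, grammatical type) pair that could be bound to a prepared statement.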

You can pass a different node as the context in your $xpath->query calls:

<?php

$location = 'so-dom.html';
$dom = new DOMDocument();
$dom->loadHTMLFile($location);
$xpath = new DOMXPath($dom);
$res = $xpath->query("//p[@class='word']");
foreach ($res as $textNode) {
    echo $xpath->query('./a/text()', $textNode)[0]->nodeValue;
    //                              ^^^^^^^^^
}

?>

Given this document:

<head></head>
<body>
  <p class="word"><a>one</a></p>
  <p class="word"><a>two</a></p>
</body>

the script will print "onetwo".
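
As a variation on the context-node idea (not taken from the answers above), DOMXPath::evaluate() with the XPath string() function returns the text of the first matching node directly, so no item(0) or [0] step is needed. The sketch below assumes the question's markup as an inline string; the $pairs array is a hypothetical name used only for illustration.

```php
<?php
// Inline sample using the question's structure (assumption for this sketch).
$html = '<p class="word"><span class="word-text">word1</span>'
      . '<span class="grammatical-type">noun</span></p>'
      . '<p class="word"><span class="word-text">word128</span>'
      . '<span class="grammatical-type">adjective</span></p>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$pairs = []; // hypothetical: word => grammatical type
foreach ($xpath->query("//p[@class='word']") as $wordNode) {
    // string(...) yields the first match's text, or '' if nothing matches
    $word = $xpath->evaluate("string(span[@class='word-text'])", $wordNode);
    $type = $xpath->evaluate("string(span[@class='grammatical-type'])", $wordNode);
    $pairs[$word] = $type;
}

print_r($pairs);
```

The trade-off is that a missing span yields an empty string rather than null, so check for '' if incomplete entries need to be skipped.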
