How to parse an XML node with a colon tag using PHP

Question

I am trying to get the values of the following nodes from [this URL (it takes quite a while to load)][1]. The elements I am interested in are:

title, g:price and g:gtin

The XML starts like this:

<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0">
  <channel>
    <title>PhotoSpecialist.de</title>
    <link>http://www.photospecialist.de</link>
    <description/>
    <item>
      <g:id>BEN107C</g:id>
      <title>Benbo Trekker Mk3 + Kugelkopf + Tasche</title>
      <description>
        Benbo Trekker Mk3 + Kugelkopf + Tasche Das Benbo Trekker Mk3 ist eine leichte Variante des beliebten Benbo 1. Sein geringes Gewicht macht das Trekker Mk3 zum idealen Stativ, wenn Sie viel draußen fotografieren und viel unterwegs sind. Sollten Sie in eine Situation kommen, in der maximale Stabilität zählt, verfügt das Benbo Trekker Mk3 über einen Haken an der Mittelsäule. An diesem können Sie das Stativ mit zusätzlichem Gewicht bei Bedarf beschweren. Dank der zwei besonderen Kamera-Befestigungsschrauben können Sie mit dem Benbo Trekker Mk3 sehr nah am Boden fotografieren. So nah, dass in vielen Fällen die einzige Einschränkung die Größe Ihrer Kamera darstellt. In diesem Set erhalten Sie das Benbo Trekker Mk3 zusammen mit einem Kugelkopf, Socket und einer Tasche für den sicheren und komfortablen Transport.
      </description>
      <link>
        http://www.photospecialist.de/benbo-trekker-mk3-kugelkopf-tasche?dfw_tracker=2469-16
      </link>
      <g:image_link>http://static.fotokonijnenberg.nl/media/catalog/product/b/e/benbo_trekker_mk3_tripod_kit_with_b__s_head__bag_ben107c1.jpg</g:image_link>
      <g:price>199.00 EUR</g:price>
      <g:condition>new</g:condition>
      <g:availability>in stock</g:availability>
      <g:identifier_exists>TRUE</g:identifier_exists>
      <g:brand>Benbo</g:brand>
      <g:gtin>5022361100576</g:gtin>
      <g:item_group_id>0</g:item_group_id>
      <g:product_type>Tripod</g:product_type>
      <g:mpn/>
      <g:google_product_category>Kameras &amp; Optik</g:google_product_category>
    </item>
  ...
  </channel>
</rss>

To do this, I wrote the following code:

$z = new XMLReader;
$z->open('https://my.datafeedwatch.com/static/files/1248/8222ebd3847fbfdc119abc9ba9d562b2cdb95818.xml');

$doc = new DOMDocument;

while ($z->read() && $z->name !== 'item')
    ;

while ($z->name === 'item')
{
    $node = new SimpleXMLElement($z->readOuterXML());
    $a = $node->title;
    $b = $node->price;
    $c = $node->gtin;
    echo $a . $b . $c . "<br />";
    $z->next('item');
}

This only returns the title; price and gtin are not shown.

Answer 1

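The elements you are asking about are not part of the default namespace; they live in a different one. You can tell because their names carry a prefix, separated by a colon: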

  ...
  <channel>
    <title>PhotoSpecialist.de</title>
    <!-- title is in the default namespace, no colon in the name -->
    ...
    <g:price>199.00 EUR</g:price>
    ...
    <g:gtin>5022361100576</g:gtin>
    <!-- price and gtin are in a different namespace, colon in the name and prefixed by "g" -->
  ...

A namespace is referred to by a prefix, in your case "g". The namespace that the prefix stands for is declared on the document element here:

<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0">

So the namespace is "http://base.google.com/ns/1.0".
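If you would rather look the mapping up than read it off the source, SimpleXMLElement can list the namespaces declared on the document element. This is a minimal sketch; the XML string is a trimmed-down stand-in for the real feed, not the feed itself.

```php
<?php
// Trimmed-down sample of the feed; only the root declaration matters here.
$xml = '<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0">'
     . '<channel><title>PhotoSpecialist.de</title></channel></rss>';

$rss = new SimpleXMLElement($xml);

// Maps each prefix declared on the document element to its namespace URI.
$namespaces = $rss->getDocNamespaces();

foreach ($namespaces as $prefix => $uri) {
    echo $prefix, ' => ', $uri, "\n";
}
```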

When you access child elements by name with SimpleXMLElement, as you currently do:

$a = $node->title;
$b = $node->price;
$c = $node->gtin;

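you are only looking in the default namespace. So only the first element actually contains text; the other two are created on the fly and are still empty.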
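You can observe this on a single item. This is a small sketch; the item is a shortened copy of the one in the question, with the namespace declared directly on it so the string parses on its own.

```php
<?php
$item = new SimpleXMLElement(
    '<item xmlns:g="http://base.google.com/ns/1.0">'
    . '<title>Benbo Trekker Mk3 + Kugelkopf + Tasche</title>'
    . '<g:price>199.00 EUR</g:price>'
    . '</item>'
);

// title has no prefix, so plain property access finds it.
$titleCount = $item->title->count();

// price lives in the "g" namespace, so plain property access matches nothing
// and hands back an empty placeholder element.
$priceCount = $item->price->count();
$priceText  = (string) $item->price;

var_dump($titleCount, $priceCount, $priceText);
```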

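To access the namespaced child elements, you need to tell SimpleXMLElement explicitly by using the children() method. It creates a new SimpleXMLElement that holds all child elements in that namespace instead of the default one: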

$google = $node->children("http://base.google.com/ns/1.0");

$a = $node->title;
$b = $google->price;
$c = $google->gtin;
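As a side note, children() also accepts the prefix instead of the URI when its second argument is true. The sketch below shows both forms on a shortened copy of the item from the question. The URI form is probably the safer choice, since a prefix is only a document-local alias and a feed could in principle change it.

```php
<?php
$item = new SimpleXMLElement(
    '<item xmlns:g="http://base.google.com/ns/1.0">'
    . '<title>Benbo Trekker Mk3 + Kugelkopf + Tasche</title>'
    . '<g:price>199.00 EUR</g:price>'
    . '<g:gtin>5022361100576</g:gtin>'
    . '</item>'
);

// Select children by namespace URI (independent of the prefix the feed uses).
$byUri = $item->children('http://base.google.com/ns/1.0');

// Select children by prefix; the second argument marks "g" as a prefix.
$byPrefix = $item->children('g', true);

$price = (string) $byUri->price;
$gtin  = (string) $byPrefix->gtin;

echo $price, ' / ', $gtin, "\n";
```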

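So much for the isolated example (yes, that is all there is to it).

A full example could look like this (including node expansion on the reader; the code you had was a bit rusty):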

<?php
/**
 * How to parse an XML node with a colon tag using PHP
 *
 * @link http://stackoverflow.com/q/29876898/367456
 */
const HTTP_BASE_GOOGLE_COM_NS_1_0 = "http://base.google.com/ns/1.0";

$url = 'https://my.datafeedwatch.com/static/files/1248/8222ebd3847fbfdc119abc9ba9d562b2cdb95818.xml';

$reader = new XMLReader;
$reader->open($url);

$doc = new DOMDocument;

// move to first item element
while (($valid = $reader->read()) && $reader->name !== 'item') ;

while ($valid) {
    $default    = simplexml_import_dom($reader->expand($doc));
    $googleBase = $default->children(HTTP_BASE_GOOGLE_COM_NS_1_0);
    printf(
        "%s - %s - %s<br />\n"
        , htmlspecialchars($default->title)
        , htmlspecialchars($googleBase->price)
        , htmlspecialchars($googleBase->gtin)
    );

    // move to next item element
    $valid = $reader->next('item');
}
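Because the real feed takes a long time to load, it can help to try the same loop against a small in-memory document first. The following is a sketch under that assumption: the two items and their values are made-up sample data, GOOGLE_BASE_NS is just a name chosen for this example, and the rows are collected in an array instead of being printed as HTML.

```php
<?php
const GOOGLE_BASE_NS = 'http://base.google.com/ns/1.0';

// Made-up sample data shaped like the feed in the question.
$xml = <<<'XML'
<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First product</title>
      <g:price>199.00 EUR</g:price>
      <g:gtin>5022361100576</g:gtin>
    </item>
    <item>
      <title>Second product</title>
      <g:price>10.00 EUR</g:price>
      <g:gtin>0000000000000</g:gtin>
    </item>
  </channel>
</rss>
XML;

$reader = new XMLReader;
$reader->XML($xml); // parse from a string instead of opening a URL
$doc = new DOMDocument;

// move to first item element
while (($valid = $reader->read()) && $reader->name !== 'item') ;

$rows = [];
while ($valid) {
    $item   = simplexml_import_dom($reader->expand($doc));
    $google = $item->children(GOOGLE_BASE_NS);
    $rows[] = [(string) $item->title, (string) $google->price, (string) $google->gtin];

    // move to next item element
    $valid = $reader->next('item');
}

foreach ($rows as $row) {
    echo implode(' - ', $row), "\n";
}
```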

I hope this both explains the problem and broadens the view on how XMLReader can be used a little.

Answer 2

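If the main tag itself is a name with a colon (that is, it carries a namespace prefix), you have to use

$xml->next($xml->localName);

to move to the next item element.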
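A small sketch of that situation follows. It assumes a hypothetical feed whose repeated element is itself prefixed (g:item); the document is made up for illustration. As far as I understand it, $reader->name holds the qualified name ("g:item") while next() is matched against the local name ("item"), which is why the local name is passed.

```php
<?php
// Hypothetical document in which the repeated element carries a prefix.
$xml = '<feed xmlns:g="http://base.google.com/ns/1.0">'
     . '<g:item>a</g:item><g:item>b</g:item>'
     . '</feed>';

$reader = new XMLReader;
$reader->XML($xml);

// The qualified name (prefix included) is what $reader->name reports.
while (($valid = $reader->read()) && $reader->name !== 'g:item') ;

$values = [];
while ($valid) {
    $values[] = $reader->readString();

    // Pass the local name ("item"), not the qualified name ("g:item").
    $valid = $reader->next($reader->localName);
}

print_r($values);
```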
