
How to get XML namespace attributes with PHP SimpleXML

I'm pretty new to this, and I've followed several tutorials (including other SO questions), but I can't seem to get this to work.

I'm working with a library's EAD file (the Library of Congress XML standard for describing library collections, http://www.loc.gov/ead/index.html), and I'm having trouble with the namespaces.

A simplified example of the XML:

<?xml version="1.0"?>
<ead xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd" xmlns:ns2="http://www.w3.org/1999/xlink" xmlns="urn:isbn:1-931666-22-9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <c02 id="ref24" level="item">
    <did>
      <unittitle>"Lepidoptera and seas(on) of appearance"</unittitle>
      <unitid>1</unitid>
      <container id="cid71717" type="Box" label="Mixed materials">1</container>
      <physdesc>
        <extent>Pencil</extent>
      </physdesc>
      <unitdate>[1817]</unitdate>
    </did>
    <dao id="ref001" ns2:actuate="onRequest" ns2:show="embed" ns2:role="" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:92"/>
  </c02>
  <c02 id="ref25" level="item">
    <did>
      <unittitle>Argus carryntas (Butterfly)</unittitle>
      <unitid>2</unitid>
      <container id="cid71715" type="Box" label="Mixed materials">1</container>
      <physdesc>
        <extent>Watercolor</extent>
      </physdesc>
      <unitdate>[1817]</unitdate>
    </did>
    <dao ns2:actuate="onRequest" ns2:show="embed" ns2:role="" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:87"/>
  </c02>
</ead>

Following advice I found elsewhere, I was trying this (and variations on this theme):

<?php
$entries = simplexml_load_file('test.xml');
foreach ($entries->c02->children('http://www.w3.org/1999/xlink') as $entry) {
    echo 'link: ', $entry->children('dao', true)->href, "\n";
}
?>

Which, of course, isn't working.

You have to understand the difference between a namespace and a namespace prefix. The namespace is the value inside the xmlns attributes. The xmlns attributes define the prefix, which is an alias for the actual namespace, valid for that node and its descendants.

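In your example there are three namespaces:

- http://www.w3.org/1999/xlink, with the prefix "ns2"
- urn:isbn:1-931666-22-9, the default namespace (no prefix)
- http://www.w3.org/2001/XMLSchema-instance, with the prefix "xsi"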

So elements and attributes starting with "ns2:" are in the xlink namespace, and elements and attributes starting with "xsi:" are in the XML Schema instance namespace. All elements without a namespace prefix are in the ISBN-specific namespace. Attributes without a namespace prefix are never in any namespace.
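To make these rules concrete, here is a minimal sketch that asks DOM which namespace each node belongs to. The XML is a trimmed copy of the question's sample, and the `$namespaces` array and its labels are only illustrative names, not part of the original answer.

```php
<?php

// Trimmed copy of the question's XML, inlined so the sketch is self-contained.
$xml = <<<'XML'
<?xml version="1.0"?>
<ead xmlns:ns2="http://www.w3.org/1999/xlink" xmlns="urn:isbn:1-931666-22-9">
  <c02 id="ref24" level="item">
    <dao id="ref001" ns2:show="embed" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:92"/>
  </c02>
</ead>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// Look up the first <dao> by namespace URI and local name, not by prefix.
$dao = $dom->getElementsByTagNameNS('urn:isbn:1-931666-22-9', 'dao')->item(0);

$namespaces = [
    // Unprefixed element: inherits the default namespace.
    'dao element'        => $dao->namespaceURI,
    // Prefixed attribute: belongs to the namespace bound to "ns2".
    'ns2:href attribute' => $dao->getAttributeNodeNS('http://www.w3.org/1999/xlink', 'href')->namespaceURI,
    // Unprefixed attribute: no namespace at all, reported as NULL.
    'id attribute'       => $dao->getAttributeNode('id')->namespaceURI,
];

var_dump($namespaces);
```

The unprefixed `id` attribute reports NULL even though its element sits in the default namespace, which is why the `ns2:*` attributes need separate handling.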

If you query the XML DOM, you need to define your own namespace prefixes. The namespace prefixes in XML documents can change, especially if the documents come from external resources.

I don't use "SimpleXML", so here is a DOM example:

<?php

$xml = <<<'XML'
<?xml version="1.0"?>
<ead 
  xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd"
  xmlns:ns2="http://www.w3.org/1999/xlink" 
  xmlns="urn:isbn:1-931666-22-9" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <c02 id="ref24" level="item">
    <did>
       <unittitle>"Lepidoptera and seas(on) of appearance"</unittitle>
    </did>
  </c02>
</ead>
XML;

// create the DOM document and load the XML
$dom = new DOMDocument();
$dom->loadXML($xml);
// create an XPath object
$xpath = new DOMXPath($dom);
// register your own namespace prefix
$xpath->registerNamespace('isbn', 'urn:isbn:1-931666-22-9');

foreach ($xpath->evaluate('//isbn:unittitle', NULL, FALSE) as $node) {
  var_dump($node->textContent);
}

Output:

string(40) ""Lepidoptera and seas(on) of appearance""
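The same technique should extend to the namespaced attributes the question asks about. The following is a sketch rather than part of the original answer: the prefixes `ead` and `xlink` are arbitrary choices registered on the XPath object, and the XML is a trimmed copy of the question's sample.

```php
<?php

// Trimmed copy of the question's XML, inlined so the sketch is self-contained.
$xml = <<<'XML'
<?xml version="1.0"?>
<ead xmlns:ns2="http://www.w3.org/1999/xlink" xmlns="urn:isbn:1-931666-22-9">
  <c02 id="ref24" level="item">
    <dao id="ref001" ns2:show="embed" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:92"/>
  </c02>
  <c02 id="ref25" level="item">
    <dao ns2:show="embed" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:87"/>
  </c02>
</ead>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// "ead" and "xlink" are our own prefixes; only the URIs have to match the document.
$xpath->registerNamespace('ead', 'urn:isbn:1-931666-22-9');
$xpath->registerNamespace('xlink', 'http://www.w3.org/1999/xlink');

// Select the href attributes in the xlink namespace on every <dao> inside a <c02>.
$links = [];
foreach ($xpath->evaluate('//ead:c02/ead:dao/@xlink:href') as $attribute) {
    $links[] = $attribute->value;
}

print_r($links);
```

Because the expression uses the registered prefixes, it keeps working if the document later switches from "ns2" to some other prefix for the xlink namespace.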

XPath is quite powerful and is the most convenient way to extract data from XML.
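For readers who want to stay with SimpleXML as in the question, a rough equivalent might look like the sketch below. It is not from the original answer: it assumes the same trimmed XML, passes the xlink namespace URI to `attributes()`, and also shows the XPath route via `registerXPathNamespace()`. The variable names and the `xlink` prefix are illustrative.

```php
<?php

// Trimmed copy of the question's XML, inlined so the sketch is self-contained.
$xml = <<<'XML'
<?xml version="1.0"?>
<ead xmlns:ns2="http://www.w3.org/1999/xlink" xmlns="urn:isbn:1-931666-22-9">
  <c02 id="ref24" level="item">
    <dao id="ref001" ns2:show="embed" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:92"/>
  </c02>
  <c02 id="ref25" level="item">
    <dao ns2:show="embed" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:87"/>
  </c02>
</ead>
XML;

$ead = simplexml_load_string($xml);
$xlink = 'http://www.w3.org/1999/xlink';

// Variant 1: walk the tree. The unprefixed <c02> and <dao> elements are reachable
// as plain properties; the ns2:* attributes need the namespace URI.
$hrefs = [];
foreach ($ead->c02 as $c02) {
    $hrefs[] = (string) $c02->dao->attributes($xlink)->href;
}

// Variant 2: XPath with a prefix of our own choosing.
$ead->registerXPathNamespace('xlink', $xlink);
$hrefsViaXpath = array_map('strval', $ead->xpath('//@xlink:href'));

print_r($hrefs);
print_r($hrefsViaXpath);
```

Passing the namespace URI instead of the literal prefix "ns2" keeps the code independent of whatever prefix the document happens to use.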

The default namespace in your case is weird. It looks like it is dynamic, so you might need a way to read it. Here is the XPath for that:

$defaultNamespace = $xpath->evaluate('string(/*/namespace::*[name() = ""])');

It reads the namespace without a prefix from the document element.
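One way to put that value to use, sketched under the assumption that the document element does declare a default namespace, is to register it under a prefix of your own and query with that prefix. The `ead` prefix and the shortened XML here are illustrative only.

```php
<?php

// Shortened sample with only a default namespace, inlined for self-containment.
$xml = <<<'XML'
<?xml version="1.0"?>
<ead xmlns="urn:isbn:1-931666-22-9">
  <c02 id="ref25" level="item">
    <did>
      <unittitle>Argus carryntas (Butterfly)</unittitle>
    </did>
  </c02>
</ead>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Read the unprefixed (default) namespace from the document element.
$defaultNamespace = $xpath->evaluate('string(/*/namespace::*[name() = ""])');

// Bind it to a prefix of our own so XPath expressions can address it.
$xpath->registerNamespace('ead', $defaultNamespace);

$titles = [];
foreach ($xpath->evaluate('//ead:unittitle') as $node) {
    $titles[] = $node->textContent;
}

print_r($titles);
```

If the document element itself is in the default namespace, as it is here, `$dom->documentElement->namespaceURI` should give the same value without an XPath expression.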
