Just when I thought I understood XPath! I must be missing something really simple, but I can't select the value of the node "citedby-count" in the following:
library(XML)
xml <- "<?xml version='1.0' encoding='UTF-8'?>
<search-results xmlns='http://www.w3.org/2005/Atom' xmlns:cto='http://www.elsevier.com/xml/cto/dtd' xmlns:atom='http://www.w3.org/2005/Atom' xmlns:prism='http://prismstandard.org/namespaces/basic/2.0/' xmlns:opensearch='http://a9.com/-/spec/opensearch/1.1/' xmlns:dc='http://purl.org/dc/elements/1.1/'>
<entry>
<prism:url>http://api.elsevier.com/content/abstract/scopus_id/111111</prism:url>
<dc:title>Paper Title</dc:title>
<citedby-count>1</citedby-count>
</entry>
</search-results>"
doc <- xmlParse(xml)
I've tried
doc["//citedby-count"]
and
doc["//{'citedby-count'}"]
and
doc["//entry"]
but all return
list()
attr(,"class")
[1] "XMLNodeSet"
however,
doc["//dc:title"]
works just fine.
Have I just been looking at this too long? Please help!
**Edit:** I thought this was because of the hyphen, but it can't be, because
doc["//entry"]
doesn't work either.
A namespace prefix is normally declared as xmlns:foo="...", where foo is the prefix, and it is used explicitly in element names, as in <foo:bar>, where bar is the element's local name. Apart from that, there is the default namespace. It is a namespace declared without a prefix, like xmlns="...", and it applies implicitly to the element where it is declared and to all of that element's descendants, unless something overrides the inheritance, i.e. a descendant declares its own default namespace or uses an explicit prefix in its name.
That's the first part of the story, which is about namespaces in XML. XPath 1.0 (the version libxml2 implements), on the other hand, has no notion of a default namespace: an element name without a prefix is always taken to be in no namespace. To bridge this difference, when you need to query an element in the default namespace you usually have to define a prefix that maps to the XML's default namespace URI and use that prefix in the XPath expression. That's basically what @hrbrmstr suggested in the first comment, something like the following (the prefix can be anything, as long as it is mapped to the correct namespace URI):
doc["//d:citedby-count", namespaces=c(d="http://www.w3.org/2005/Atom")]
But it turns out that your XML already declares an explicit prefix, atom, which points to the same namespace URI and can be used directly.
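To make the prefix mapping concrete, here is a minimal sketch using a trimmed-down copy of the XML from the question. It assumes the XML package is installed; the variable names (ns, with_prefix, no_prefix, by_local_name) are only illustrative, and the local-name() query is an additional namespace-agnostic alternative that the answer above does not mention.

```r
library(XML)

# Trimmed-down version of the document from the question
xml <- "<?xml version='1.0' encoding='UTF-8'?>
<search-results xmlns='http://www.w3.org/2005/Atom' xmlns:atom='http://www.w3.org/2005/Atom' xmlns:dc='http://purl.org/dc/elements/1.1/'>
<entry>
<dc:title>Paper Title</dc:title>
<citedby-count>1</citedby-count>
</entry>
</search-results>"
doc <- xmlParse(xml, asText = TRUE)

# Map an arbitrary prefix ("d") to the default namespace URI and
# use that prefix on every step that names an element in that namespace.
ns <- c(d = "http://www.w3.org/2005/Atom")
with_prefix <- xpathSApply(doc, "//d:entry/d:citedby-count", xmlValue,
                           namespaces = ns)

# Without a prefix, XPath looks for the element in no namespace,
# so nothing matches.
no_prefix <- getNodeSet(doc, "//citedby-count")

# Alternative: ignore namespaces entirely and match on the local name.
by_local_name <- xpathSApply(doc, "//*[local-name()='citedby-count']",
                             xmlValue)

print(with_prefix)
print(length(no_prefix))
print(by_local_name)
```

Note that every step of the path needs the prefix (d:entry as well as d:citedby-count), since both elements inherit the default namespace from the root. The local-name() form is convenient for quick exploration, but it will also match same-named elements from any other namespace.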
You can also use doc["//x:citedby-count", namespace = "x"] to handle the default namespace (this comes from the examples for xpathApply).