In R, how do I combine two XML documents into one document?
I'm querying data from an XML-based API. The API responses are paginated, so I have to make a series of queries to get the full data set.
Using read_xml from the xml2 package, I can easily make each request and save it as an XML document, but I've been having trouble figuring out how to use the library to combine them into one document. (I would like to do this so I can run the XPath queries I need once instead of 50 times.)
I've tried creating a new blank document and adding the nodes of the others as elements, but neither the xml_add_child nor the xml_add_sibling function will take a second document as an argument, and neither seems to like being passed the result of an xml_find_all query. (They complain about not being able to work with references.)
So, I'm stumped.
(Note: I've also not had any success in discovering how to do this with the original XML package.)
Consider using the XML package: initialize an empty document with a <root> node, then iteratively append the other XML content with the addChildren() method, starting from the root of each XML document.
library(XML)

doc = newXMLDoc()
root = newXMLNode("root", doc = doc)

# LOOP THROUGH 50 REQUESTS
lapply(seq(50), function(i) {
  # PARSE ALL CONTENT
  tmp <- xmlParse("/path/to/API/call")
  # APPEND FROM API XML ROOT
  addChildren(root, getNodeSet(tmp, '/apixmlroot'))
})

# SAVE TO FILE OR USE doc FOR FURTHER WORK
saveXML(doc, file="/path/to/output.xml")
I cannot find a counterpart method in xml2, as its xml_add_child requires a character string, not node(s).
After some trial and error, I've figured out how to do this with the xml2 package.
Let us consider the simple case of two very simple XML documents that we'd like to combine.
library(xml2)

doc1 <- read_xml("<items><item>1</item><item>2</item></items>")
doc2 <- read_xml("<items><item>3</item><item>4</item></items>")
(Note: where the documents come from doesn't matter; the argument to read_xml can be anything it can read.)
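As a small illustration of that note, here is a sketch of three input types that read_xml accepts: a literal string, a file path, and a raw vector. The sample XML, the temporary file, and the variable names are made up for the example.

```r
library(xml2)

xml_string <- "<items><item>1</item><item>2</item></items>"

# From a literal XML string
from_string <- read_xml(xml_string)

# From a file path (a temporary file stands in for a saved response)
path <- tempfile(fileext = ".xml")
writeLines(xml_string, path)
from_file <- read_xml(path)

# From a raw vector, such as the raw body of an HTTP response
from_raw <- read_xml(charToRaw(xml_string))
```

All three objects are ordinary xml_document values, so the steps that follow apply to any of them.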
To combine them, simply do the following:
doc2children <- xml_children(doc2)

for (child in doc2children) {
  xml_add_child(doc1, child)
}
Now when you look at doc1, you should see this:
> doc1
{xml_document}
<items>
[1] <item>\n 1</item>
[2] <item>\n 2</item>
[3] <item>\n 3</item>
[4] <item>\n 4</item>
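The same loop should extend to the paginated case from the question. Below is a minimal sketch, assuming every page shares the same root element and uses no namespaces. The helper name combine_xml_docs and the sample pages are hypothetical, not part of xml2. It copies the children into a fresh document created with xml_new_root, so none of the input documents is modified.

```r
library(xml2)

# Hypothetical helper (not part of xml2): copy the children of every
# document's root element into one new document.
combine_xml_docs <- function(docs) {
  # Assumption: all pages share the root element name of the first page
  combined <- xml_new_root(xml_name(docs[[1]]))
  for (doc in docs) {
    for (child in xml_children(doc)) {
      xml_add_child(combined, child)
    }
  }
  combined
}

# Stand-ins for the paginated API responses
pages <- list(
  read_xml("<items><item>1</item><item>2</item></items>"),
  read_xml("<items><item>3</item><item>4</item></items>"),
  read_xml("<items><item>5</item></items>")
)

combined <- combine_xml_docs(pages)

# One XPath query over the combined document instead of one per page
values <- xml_text(xml_find_all(combined, "//item"))
```

With real API responses, the pages list would instead be built by calling read_xml on each response.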