Convert specific node of XML to CSV with R
I have an XML file structured like this:
<Item>
<Tab1>
<Info1>x<Info1>
<Info2>y<Info2>
<Info3>z<Info3>
</Tab1>
<Tab2>
<Info1>foo<Info1>
<Info2>bar<Info2>
<Info3>foobar<Info3>
</Tab2>
</Item>
<Item>
<Tab1>
<Info1>x<Info1>
<Info2>y<Info2>
<Info3>z<Info3>
</Tab1>
<Tab2>
<Info1>foo<Info1>
<Info2>bar<Info2>
<Info3>foobar<Info3>
</Tab2>
</Item>
Using this code:
file <- "file.xml"
doc <- xmlParse(file, useInternalNodes = TRUE)
xmldataframe <- xmlToDataFrame(doc)
I get a data frame that looks like this:
Tab1 Tab2
xyz foobarfoobar
But I need only the Tab2 info, in separate columns. How can I get the following result?
Info1 Info2 Info3
foo bar foobar
You need to fix up your example XML first (it is currently invalid; for instance, <Info1>x<Info1> should be <Info1>x</Info1>). Given a valid XML document, you can use XPath to pull out specific nodes. You appear to be using the XML package (it helps to state this in a question), so XML::getNodeSet is the function you need:
file <- "file.xml"
doc <- XML::xmlParse(file, useInternalNodes = TRUE)
tab2 <- XML::getNodeSet(doc, "//Tab2")
xmldataframe <- XML::xmlToDataFrame(tab2)
This gives only the nodes you are looking for.
> xmldataframe
Info1 Info2 Info3
1 foo bar foobar
2 foo bar foobar
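Since the title asks for CSV, here is a minimal end-to-end sketch of the same approach. It is only an illustration under some assumptions: the inline XML is the question's sample with the closing tags corrected, the <Items> root element is a hypothetical wrapper added so the document is well-formed, and the output file name tab2.csv is made up.

```r
library(XML)

# Assumed input: the question's sample with closing tags fixed.
# The <Items> root is a hypothetical wrapper, not part of the original file.
xml_text <- "<Items>
  <Item>
    <Tab1><Info1>x</Info1><Info2>y</Info2><Info3>z</Info3></Tab1>
    <Tab2><Info1>foo</Info1><Info2>bar</Info2><Info3>foobar</Info3></Tab2>
  </Item>
  <Item>
    <Tab1><Info1>x</Info1><Info2>y</Info2><Info3>z</Info3></Tab1>
    <Tab2><Info1>foo</Info1><Info2>bar</Info2><Info3>foobar</Info3></Tab2>
  </Item>
</Items>"

# Parse from a string instead of a file (asText = TRUE).
doc <- xmlParse(xml_text, asText = TRUE)

# Select every Tab2 node anywhere in the document.
tab2 <- getNodeSet(doc, "//Tab2")

# One row per Tab2 node, one column per child element.
tab2_df <- xmlToDataFrame(tab2)

# Write the result as CSV; the path is a hypothetical choice.
csv_path <- file.path(tempdir(), "tab2.csv")
write.csv(tab2_df, csv_path, row.names = FALSE)
```

With a real file, replace the xmlParse call with xmlParse("file.xml") as in the code above; the XPath expression and the write.csv step stay the same.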