Nested XML to data frame in R
Hello, I'm new to R and XML files.
I'm trying to get this XML SOAP response into a data frame:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<PrepareDataByClientResponse xmlns="urn:HM-schema">
<PrepareDataByClientResult>
<READOUT>
<SerialNumber>1728527</SerialNumber>
<Date>1510505992000</Date>
<Type>1</Type>
<Value>78.2</Value>
<Status>OK</Status>
</READOUT>
<READOUT>
<SerialNumber>1728527</SerialNumber>
<Date>1510509592000</Date>
<Type>1</Type>
<Value>76.87</Value>
<Status>OK</Status>
</READOUT>
<READOUT>
<SerialNumber>1728527</SerialNumber>
<Date>1510513192000</Date>
<Type>1</Type>
<Value>75.61</Value>
<Status>OK</Status>
</READOUT>
<READOUT>
<SerialNumber>e2ddeed13b4cc4d132f8c6a67d67eed3</SerialNumber>
<Date>4531528776000</Date>
<Type>3</Type>
<Value>230.68</Value>
<Status>OK</Status>
</READOUT>
</PrepareDataByClientResult>
</PrepareDataByClientResponse>
</soap:Body>
</soap:Envelope>
I have tried several options, such as:
xmlout <- do.call(rbind, xpathApply(xmldoc,'//soap:Envelope/soap:Body/PrepareDataByClientResponse', xmlToDataFrame))
xmlout <- as.data.frame(t(xpathSApply(xmldoc,"//readout",function(x) xmlSApply(x,xmlValue))))
xmlout <- as.data.frame(t(xmlSApply(xmldoc["/PrepareDataByClientResponse/PrepareDataByClientResult/READOUT"],xmlAttrs)),stringsAsFactors=FALSE)
xmlout <- ldply(xmlToList(xmldoc), data.frame)
After extensive searching on SO and Google, I have been unable to produce the desired result. All I can get is a data frame with a single row, with every observation in its own column.
I'm trying to get a table of READOUT records like:
SerialNumber Date Type Value Status
1 1728527 1510505992000 1 78.2 OK
2 1728527 1510509592000 1 76.87 OK
3 1728527 1510513192000 1 75.61 OK
Is there any way to get this sort of table to work?
Thanks in advance.
Because you have a default namespace at the <PrepareDataByClientResponse> tag (i.e., xmlns without a colon-separated prefix), all of its children fall under this default namespace.
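The effect of the default namespace can be checked directly. The following is a minimal sketch, assuming the XML package is installed; the inline xml_text string is a shortened copy of the response from the question (two READOUT records, envelope attributes trimmed), and the variable names are illustrative only. An unprefixed XPath name matches only elements in no namespace, so it finds nothing, while a local-name() test ignores the namespace and finds the records.

```r
library(XML)

# Shortened copy of the SOAP response from the question (assumption: two records suffice).
xml_text <- '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <PrepareDataByClientResponse xmlns="urn:HM-schema">
      <PrepareDataByClientResult>
        <READOUT>
          <SerialNumber>1728527</SerialNumber>
          <Date>1510505992000</Date>
          <Type>1</Type>
          <Value>78.2</Value>
          <Status>OK</Status>
        </READOUT>
        <READOUT>
          <SerialNumber>1728527</SerialNumber>
          <Date>1510509592000</Date>
          <Type>1</Type>
          <Value>76.87</Value>
          <Status>OK</Status>
        </READOUT>
      </PrepareDataByClientResult>
    </PrepareDataByClientResponse>
  </soap:Body>
</soap:Envelope>'

doc <- xmlParse(xml_text, asText = TRUE)

# Unprefixed name: XPath looks for READOUT in *no* namespace, so nothing matches.
unprefixed <- getNodeSet(doc, "//READOUT")
length(unprefixed)

# local-name() compares only the element name and ignores the namespace.
by_local_name <- getNodeSet(doc, "//*[local-name()='READOUT']")
length(by_local_name)
```

The local-name() form is a workaround for quick inspection; binding a prefix to the namespace URI is usually the more precise choice when a document mixes several namespaces.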
To parse the <READOUT> tags, consider declaring a prefix to use in a getNodeSet() call. The code below uses nm. Such a call can then be passed to the convenience function xmlToDataFrame, which can easily convert relatively flat XML like yours into a data frame:
library(XML)
doc <- xmlParse('/path/to/SOAP/Response.xml')
df <- xmlToDataFrame(doc, nodes = getNodeSet(doc, "//nm:READOUT",
                                             namespaces = c(nm = "urn:HM-schema")))
df
# SerialNumber Date Type Value Status
# 1 1728527 1510505992000 1 78.2 OK
# 2 1728527 1510509592000 1 76.87 OK
# 3 1728527 1510513192000 1 75.61 OK
# 4 e2ddeed13b4cc4d132f8c6a67d67eed3 4531528776000 3 230.68 OK
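The columns produced this way hold text rather than numbers or dates, so a conversion step is often useful afterwards. The sketch below assumes that Date holds Unix epoch time in milliseconds (the question does not say so explicitly, but the values are consistent with it) and that Value and Type are numeric. The inline xml_text is a shortened copy of the response, and SerialNumber is left as text because one of the serial numbers in the question is a hex string.

```r
library(XML)

# Shortened copy of the SOAP response from the question (two records only).
xml_text <- '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <PrepareDataByClientResponse xmlns="urn:HM-schema">
      <PrepareDataByClientResult>
        <READOUT>
          <SerialNumber>1728527</SerialNumber>
          <Date>1510505992000</Date>
          <Type>1</Type>
          <Value>78.2</Value>
          <Status>OK</Status>
        </READOUT>
        <READOUT>
          <SerialNumber>1728527</SerialNumber>
          <Date>1510509592000</Date>
          <Type>1</Type>
          <Value>76.87</Value>
          <Status>OK</Status>
        </READOUT>
      </PrepareDataByClientResult>
    </PrepareDataByClientResponse>
  </soap:Body>
</soap:Envelope>'

doc <- xmlParse(xml_text, asText = TRUE)
df <- xmlToDataFrame(doc, nodes = getNodeSet(doc, "//nm:READOUT",
                                             namespaces = c(nm = "urn:HM-schema")))

# as.character() first, so the conversion is safe whether the columns
# arrived as character vectors or as factors.
df$Type  <- as.integer(as.character(df$Type))
df$Value <- as.numeric(as.character(df$Value))

# Assumption: Date is milliseconds since 1970-01-01 UTC.
df$Date <- as.POSIXct(as.numeric(as.character(df$Date)) / 1000,
                      origin = "1970-01-01", tz = "UTC")

str(df)
```

If the timestamps turn out to be in seconds or in a local time zone, only the divisor and the tz argument would need to change.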