Convert XML into data frame in R
In an XML response I get from the web, one part of the data is structured as follows:
<votos>
<Deputado Nome="Roberto Britto" ideCadastro="141529" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="Luiz Argôlo" ideCadastro="160547" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="José Carlos Araújo" ideCadastro="74140" Partido="PSD " UF="BA" Voto="Sim "/>
</votos>
I parsed the file as follows:
doc <- xmlTreeParse(raw_result, useInternalNodes = TRUE)
rootNode <- xmlRoot(doc)
I then tried to create a data frame from the node shown above, as follows:
rootvotacao <- rootNode[[4]][[1]][[2]]
votacao2 <- xmlSApply(rootvotacao, function(x) xmlSApply(x, xmlValue))
votacao2_df <- data.frame(t(votacao2),row.names=NULL)
However, I only get a table with two columns for each Deputado and one row containing list().
What I want is a table with one row for each Deputado and 5 columns: Nome, ideCadastro, Partido, UF, Voto.
Any thoughts? Thanks!
You can use the XML package and xmlToList to get this done.
library(XML)
raw_result <- '<votos>
<Deputado Nome="Roberto Britto" ideCadastro="141529" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="Luiz Argôlo" ideCadastro="160547" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="José Carlos Araújo" ideCadastro="74140" Partido="PSD " UF="BA" Voto="Sim "/>
</votos>'
# I ran into an encoding issue, hence the conversion from Latin-1 to UTF-8
raw_result <- iconv(raw_result,'latin1','utf-8')
do.call(rbind,xmlToList(raw_result))
Output:
> do.call(rbind,xmlToList(raw_result))
         Nome                 ideCadastro Partido UF   Voto
Deputado "Roberto Britto"     "141529"    "PP "   "BA" "Sim "
Deputado "Luiz Argôlo"        "160547"    "PP "   "BA" "Sim "
Deputado "José Carlos Araújo" "74140"     "PSD "  "BA" "Sim "
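Note that the result above is a character matrix, not a data frame, and the values still carry trailing spaces. The original attempt came back empty because each Deputado element stores its data in attributes, whereas xmlValue reads element text. As a possible alternative, here is a minimal sketch that reads the attributes directly with xmlAttrs and builds a cleaned data frame. The sample string and the name votos_df are assumptions for illustration, and the encoding conversion is left out on the assumption that the input is already UTF-8.

```r
library(XML)

# Sample input mirroring the structure from the question (assumed to be UTF-8)
raw_result <- '<votos>
<Deputado Nome="Roberto Britto" ideCadastro="141529" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="Luiz Argôlo" ideCadastro="160547" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="José Carlos Araújo" ideCadastro="74140" Partido="PSD " UF="BA" Voto="Sim "/>
</votos>'

# Parse the string into an internal document
doc <- xmlParse(raw_result, asText = TRUE)

# Select every Deputado node and extract its attributes as a named character vector
nodes <- getNodeSet(doc, "//Deputado")
attrs <- lapply(nodes, xmlAttrs)

# Stack the vectors into a matrix, then convert it to a data frame
votos_df <- as.data.frame(do.call(rbind, attrs), stringsAsFactors = FALSE)

# Strip the trailing spaces and make the ID column numeric
votos_df[] <- lapply(votos_df, trimws)
votos_df$ideCadastro <- as.integer(votos_df$ideCadastro)

str(votos_df)
```

Selecting nodes with an XPath expression also avoids positional indexing such as rootNode[[4]][[1]][[2]], which breaks if the layout of the response changes.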