Convert XML into data frame in R

Part of an XML response I get from the web is structured as follows:

<votos>
<Deputado Nome="Roberto Britto" ideCadastro="141529" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="Luiz Argôlo" ideCadastro="160547" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="José Carlos Araújo" ideCadastro="74140" Partido="PSD " UF="BA" Voto="Sim "/>
</votos>

I parsed the file as follows:

doc <- xmlTreeParse(raw_result, useInternalNodes = TRUE)
rootNode <- xmlRoot(doc)

And then tried to create a data frame of the node I showed in the beginning as follows:

rootvotacao <- rootNode[[4]][[1]][[2]]
votacao2 <- xmlSApply(rootvotacao, function(x) xmlSApply(x, xmlValue))
votacao2_df <- data.frame(t(votacao2),row.names=NULL)

However, I only get a table with two columns for each Deputado and one row containing list().

What I want is a table with one row per Deputado and five columns: Nome, ideCadastro, Partido, UF, Voto.

Any thoughts? Thanks!

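You can do this with xmlToList from the XML package. The values are stored as attributes of each Deputado element rather than as node text, which is why xmlValue returns nothing for them.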

library(XML)

raw_result <- '<votos>
<Deputado Nome="Roberto Britto" ideCadastro="141529" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="Luiz Argôlo" ideCadastro="160547" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="José Carlos Araújo" ideCadastro="74140" Partido="PSD " UF="BA" Voto="Sim "/>
</votos>'

# I ran into an encoding issue, hence this conversion (skip it if your input is already UTF-8)

raw_result <- iconv(raw_result, 'latin1', 'utf-8')

do.call(rbind,xmlToList(raw_result))

Output:

> do.call(rbind,xmlToList(raw_result))
         Nome                 ideCadastro Partido UF   Voto  
Deputado "Roberto Britto"     "141529"    "PP "   "BA" "Sim "
Deputado "Luiz Argôlo"        "160547"    "PP "   "BA" "Sim "
Deputado "José Carlos Araújo" "74140"     "PSD "  "BA" "Sim "
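
Note that the result above is a character matrix, not a data frame, and the values keep their trailing spaces ("PP ", "Sim "). If you need an actual data frame, one possible variation is sketched below. It reads the attributes directly with xpathApply and xmlAttrs rather than going through xmlToList; the "//Deputado" XPath and the name votos_df are my own choices for illustration, and the sketch assumes every Deputado element carries the same five attributes in the same order.

```r
library(XML)

# Same sample data as in the question.
raw_result <- '<votos>
<Deputado Nome="Roberto Britto" ideCadastro="141529" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="Luiz Argôlo" ideCadastro="160547" Partido="PP " UF="BA" Voto="Sim "/>
<Deputado Nome="José Carlos Araújo" ideCadastro="74140" Partido="PSD " UF="BA" Voto="Sim "/>
</votos>'

# asText = TRUE marks the string as XML content rather than a file name.
doc <- xmlParse(raw_result, asText = TRUE)

# Each Deputado element is empty; its data are attributes, so collect
# them with xmlAttrs() (one named character vector per element).
attrs <- xpathApply(doc, "//Deputado", xmlAttrs)

# Stack the vectors into a matrix, then convert it to a data frame.
votos_df <- as.data.frame(do.call(rbind, attrs), stringsAsFactors = FALSE)

# Remove the padding around values such as "PP " and "Sim ".
votos_df[] <- lapply(votos_df, trimws)

str(votos_df)
```

All columns come back as character, so ideCadastro would still need as.integer() if you want it numeric. If some elements could lack an attribute, rbind would no longer line the columns up and each vector would have to be matched by name first.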
