Hello guys, I need to load an XML file into a data frame in R. The XML format is shown below. How do I achieve this?
<?xml version="1.0" encoding="utf-8"?><posts> <row Id="1" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2010-07-26T19:14:18.907" Score="6"/></posts>
I tried the code below, but it does not give the desired output. I am expecting tabular output, with the attribute names as columns and their values listed beneath them.
library(XML)
xml.url ="test.xml"
xmlfile = xmlTreeParse(xml.url)
class(xmlfile)
xmltop=xmlRoot(xmlfile)
print(xmltop)[1:2]
plantcat <- xmlSApply(xmltop, function(x) xmlSApply(x, xmlValue))
plantcat_df <- data.frame(t(plantcat))
xml.text <-
'<?xml version="1.0" encoding="utf-8"?>
<posts>
<row Id="1" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2010-07-26T19:14:18.907" Score="6"/>
<row Id="2" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2010-07-26T19:14:18.907" Score="6"/>
<row Id="3" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2010-07-26T19:14:18.907" Score="6"/>
<row Id="4" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2010-07-26T19:14:18.907" Score="6"/>
</posts>'
library(XML)
xml <- xmlParse(xml.text)
result <- as.data.frame(t(xmlSApply(xml["/posts/row"], xmlAttrs)),
                        stringsAsFactors = FALSE)
#   Id PostTypeId AcceptedAnswerId            CreationDate Score
# 1  1          1               17 2010-07-26T19:14:18.907     6
# 2  2          1               17 2010-07-26T19:14:18.907     6
# 3  3          1               17 2010-07-26T19:14:18.907     6
# 4  4          1               17 2010-07-26T19:14:18.907     6
This is a bit trickier than usual because the data is in attributes, not in node content (the nodes are empty), so unfortunately we can't use xmlToDataFrame(...).
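To see the difference between node content and attributes, here is a minimal sketch. It uses a shortened, made-up document with a single two-attribute row (not the full sample), and the variable names are only illustrative. It compares what xmlValue and xmlAttrs return for an empty element:

```r
library(XML)

# Shortened, illustrative document: one empty <row/> element with two attributes
doc  <- xmlParse('<posts><row Id="1" Score="6"/></posts>', asText = TRUE)
node <- getNodeSet(doc, "/posts/row")[[1]]

node_text  <- xmlValue(node)  # text content of the element; nothing useful here
node_attrs <- xmlAttrs(node)  # named character vector holding the actual data

print(node_text)
print(node_attrs)
```

Functions that build a data frame from node content have nothing to work with here, which is why the attributes are collected with xmlAttrs instead.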
All the data above is still character, so you still need to convert the columns to whatever class is appropriate.
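One possible way to do that conversion is sketched below. It rebuilds a two-row stand-in for the character-only data frame so that it runs on its own. Treating the four numeric-looking columns as integers and the timestamps as UTC are assumptions, since the XML itself does not state a time zone.

```r
# Stand-in for the all-character data frame produced by the XML step
result <- data.frame(
  Id               = c("1", "2"),
  PostTypeId       = c("1", "1"),
  AcceptedAnswerId = c("17", "17"),
  CreationDate     = c("2010-07-26T19:14:18.907", "2010-07-26T19:14:18.907"),
  Score            = c("6", "6"),
  stringsAsFactors = FALSE
)

# Assumption: these columns hold whole numbers
int_cols <- c("Id", "PostTypeId", "AcceptedAnswerId", "Score")
result[int_cols] <- lapply(result[int_cols], as.integer)

# Assumption: timestamps are UTC; %OS accepts fractional seconds on input
result$CreationDate <- as.POSIXct(result$CreationDate,
                                  format = "%Y-%m-%dT%H:%M:%OS",
                                  tz = "UTC")

str(result)
```

If the column types are not known in advance, type.convert(result, as.is = TRUE) is an alternative for the numeric columns, though it leaves the timestamp column as character.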