I'm having trouble reading an .xml file into a data.frame in R. I know this question already has great answers here and here, but I have not been able to adapt them to my case, so sorry if this is a duplicate.
I have an .xml file like this:
<?xml version='1.0' encoding='UTF-8'?>
<LexicalResource>
<GlobalInformation label="Created with the standard propagation algorithm"/>
<Lexicon languageCoding="UTF-8" label="sentiment" language="-">
<LexicalEntry id="id_0" partOfSpeech="adj">
<Lemma writtenForm="word"/>
<Sense>
<Confidence score="0.333333333333" method="automatic"/>
<Sentiment polarity="negative"/>
<Domain/>
</Sense>
</LexicalEntry>
</Lexicon>
</LexicalResource>
The file is stored locally. I first tried this:
library(XML)
doc <- xmlParse("...\\test2.xml")
xmldf <- xmlToDataFrame(nodes=getNodeSet(doc,"//LexicalEntry/Lemma/Sense/Confidence/Sentiment"))
but the result is this:
> xmldf
data frame with 0 columns and 0 rows
So I tried the xml2 package:
library(xml2)
pg <- read_xml("...test2.xml")
recs <- xml_find_all(pg, "LexicalEntry")
> recs
{xml_nodeset (0)}
I don't have much experience manipulating .xml files, so I think I'm missing the point. What am I doing wrong?
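You need the attributes, not the node values; that is why the methods you used do not work. Your XPath also treats Sense as a child of Lemma and Sentiment as a child of Confidence, but in your file they are siblings, so the expression matches nothing. Try something like this: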
data.frame(as.list(xpathApply(doc, "//Lemma", fun = xmlAttrs)[[1]]),
           as.list(xpathApply(doc, "//Confidence", fun = xmlAttrs)[[1]]),
           as.list(xpathApply(doc, "//Sentiment", fun = xmlAttrs)[[1]]))
  writtenForm          score    method polarity
1        word 0.333333333333 automatic negative
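Since the question also tried xml2, here is a minimal sketch of the same attribute-based idea with that package, assuming you want one row per LexicalEntry. The XML is inlined so the example runs on its own; the second entry (id_1) and the helper name attr_of are made up for illustration and are not part of the original file.

```r
library(xml2)

# Inline copy of the sample document from the question, plus one
# hypothetical extra entry (id_1) to show the one-row-per-entry result.
xml_string <- '<?xml version="1.0" encoding="UTF-8"?>
<LexicalResource>
<GlobalInformation label="Created with the standard propagation algorithm"/>
<Lexicon languageCoding="UTF-8" label="sentiment" language="-">
<LexicalEntry id="id_0" partOfSpeech="adj">
<Lemma writtenForm="word"/>
<Sense>
<Confidence score="0.333333333333" method="automatic"/>
<Sentiment polarity="negative"/>
<Domain/>
</Sense>
</LexicalEntry>
<LexicalEntry id="id_1" partOfSpeech="noun">
<Lemma writtenForm="other"/>
<Sense>
<Confidence score="0.5" method="automatic"/>
<Sentiment polarity="positive"/>
<Domain/>
</Sense>
</LexicalEntry>
</Lexicon>
</LexicalResource>'

pg <- read_xml(xml_string)

# ".//LexicalEntry" searches all descendants. A bare "LexicalEntry" only
# looks at direct children of the root, which is why it returned 0 nodes.
entries <- xml_find_all(pg, ".//LexicalEntry")

# Hypothetical helper: read one attribute from the first node matching
# `path` under each entry (NA when the node or attribute is absent).
attr_of <- function(nodes, path, attr) {
  xml_attr(xml_find_first(nodes, path), attr)
}

lexicon <- data.frame(
  id           = xml_attr(entries, "id"),
  partOfSpeech = xml_attr(entries, "partOfSpeech"),
  writtenForm  = attr_of(entries, "./Lemma", "writtenForm"),
  score        = as.numeric(attr_of(entries, "./Sense/Confidence", "score")),
  polarity     = attr_of(entries, "./Sense/Sentiment", "polarity"),
  stringsAsFactors = FALSE
)
print(lexicon)
```

Because xml_find_first() returns one result per input node, the columns stay aligned even if some entry lacks a Sentiment or Confidence element.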
Another option is to collect all the attributes in the XML and build a data.frame from them:
df <- data.frame(as.list(unlist(xmlToList(doc, addAttributes = TRUE, simplify = TRUE))))
colnames(df) <- unlist(lapply(strsplit(colnames(df), "\\."), function(x) x[length(x)]))
df
                                            label writtenForm          score    method
1 Created with the standard propagation algorithm        word 0.333333333333 automatic
  polarity   id partOfSpeech languageCoding     label language
1 negative id_0          adj          UTF-8 sentiment        -
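The colnames line above keeps only the part of each flattened name after the last dot. A small base-R sketch of that step follows; the names in flat_names are illustrative guesses, and the exact names xmlToList() produces for your file may differ.

```r
# Illustrative flattened names (assumed, not taken from real xmlToList() output).
flat_names <- c("GlobalInformation.label",
                "Lexicon.LexicalEntry.Lemma.writtenForm",
                "Lexicon.LexicalEntry.Sense.Confidence.score")

# Same approach as in the answer: split on dots and keep the last piece.
via_strsplit <- unlist(lapply(strsplit(flat_names, "\\."),
                              function(x) x[length(x)]))

# Equivalent one-liner: drop everything up to and including the last dot.
via_sub <- sub(".*\\.", "", flat_names)

print(via_sub)
```

Note that stripping the prefixes can leave duplicate column names (label appears twice in the output above); wrapping the result in make.unique() is one way to keep them distinguishable.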