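Convert XML to JSON in R
The R packages I have tried do not seem to work properly when converting XML to JSON. I have tried RJSONIO, rjson and jsonlite together with the XML package. I first parse the XML, then convert it to a list with XML::xmlToList(), and then convert that list to JSON with toJSON() from each of the three packages.
My XML file: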
<?xml version="1.0" encoding="utf-8"?>
<votes>
<row Id="1" PostId="1" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
<row Id="2" PostId="1" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
<row Id="3" PostId="3" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
</votes>
My source code:
library(XML)
library(RJSONIO)
library(rjson)
library(jsonlite)
xml_parse <- xmlTreeParse("~/Downloads/test.xml", useInternalNodes=TRUE)
xml_root <- xmlRoot(xml_parse)
xml_list <- xmlToList(xml_root, simplify = TRUE)
#jsonlite package
xml_jsonlite <- jsonlite::toJSON(xml_list)
write(xml_jsonlite, "test_jsonlite.json")
#RJSONIO package
xml_rjsonio <- RJSONIO::toJSON(xml_list)
write(xml_rjsonio, "test_rjsonio.json")
#rjson package
xml_rjson <- rjson::toJSON(xml_list)
write(xml_rjson, "test_rjson.json")
JSON file produced by RJSONIO:
{
"row": {
"Id": "98",
"PostId": "10",
"VoteTypeId": "2",
"CreationDate": "2014-05-14T00:00:00.000"
},
"row": {
"Id": "99",
"PostId": "7",
"VoteTypeId": "5",
"UserId": "111",
"CreationDate": "2014-05-14T00:00:00.000"
}
}
This is obviously wrong, because the field name "row" is duplicated.
JSON file produced by jsonlite:
{"row":["1","1","2","2014-05-13T00:00:00.000"],
"row.1":["2","1","2","2014-05-13T00:00:00.000"],
"row.2":["3","3","2","2014-05-13T00:00:00.000"]}
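This is odd: there should be a single field named "row" holding an array of sub-documents, not a series of incrementing "row" keys. The values do not even keep their field names.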
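A possible explanation for the missing field names, offered as a sketch and not verified against the internals of the XML package: if each row reaches toJSON() as a named character vector rather than a list, jsonlite writes it as a plain JSON array and drops the names. The vector `row_attrs` below is a hand-made stand-in for one row, not output from xmlToList().

```r
library(jsonlite)

# Hypothetical stand-in for one <row>: a named character vector
row_attrs <- c(Id = "1", PostId = "1", VoteTypeId = "2")

# A named atomic vector is written as a JSON array, so the names are lost
as_array <- toJSON(row_attrs)

# The same data as a named list is written as a JSON object
as_object <- toJSON(as.list(row_attrs), auto_unbox = TRUE)

cat(as_array, "\n")
cat(as_object, "\n")
```

If this is what happens, converting each row to a list before calling toJSON() should keep the field names.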
JSON file produced by rjson:
{
"row": {
"Id": "1",
"PostId": "1",
"VoteTypeId": "2",
"CreationDate": "2014-05-13T00:00:00.000"
},
"row": {
"Id": "2",
"PostId": "1",
"VoteTypeId": "2",
"CreationDate": "2014-05-13T00:00:00.000"
}
}
The ideal JSON file would look like this:
{"votes" : {
"row" : [
{
"Id" : "1",
"PostId" : "1",
"VoteTypeId" : "2",
"CreationDate" : "2014-05-13T00:00:00.000"
},
{
"Id" : "2",
"PostId" : "1",
"VoteTypeId" : "2",
"CreationDate" : "2014-05-13T00:00:00.000"
}
]
}
}
I am looking for a solution. Any help is appreciated.
xml2 and jsonlite can get you most of the way there, but you have not shown us R code that really attempts a solution, so here is a partial solution posted in case it helps others:
library(xml2)
library(jsonlite)
read_xml('<?xml version="1.0" encoding="utf-8"?>
<votes>
<row Id="1" PostId="1" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
<row Id="2" PostId="1" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
<row Id="3" PostId="3" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
</votes>') -> doc
x <- xml2::as_list(doc)
xl <- lapply(x, attributes)
toJSON(xl, pretty = TRUE, auto_unbox = TRUE)
## {
## "row": {
## "Id": "1",
## "PostId": "1",
## "VoteTypeId": "2",
## "CreationDate": "2014-05-13T00:00:00.000"
## },
## "row.1": {
## "Id": "2",
## "PostId": "1",
## "VoteTypeId": "2",
## "CreationDate": "2014-05-13T00:00:00.000"
## },
## "row.2": {
## "Id": "3",
## "PostId": "3",
## "VoteTypeId": "2",
## "CreationDate": "2014-05-13T00:00:00.000"
## }
## }
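A note on xml2 versions, as far as I recall: newer releases of xml2 keep the root node in the result of as_list(), so `x` may be a list with a single `votes` element rather than the rows themselves. A hedged sketch that handles both layouts follows; the names `root_name` and `rows` are mine, and the XML is shortened to two rows.

```r
library(xml2)
library(jsonlite)

doc <- read_xml('<votes>
  <row Id="1" PostId="1" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
  <row Id="2" PostId="1" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
</votes>')

x <- as_list(doc)

# If as_list() kept the root node, step into it; otherwise x already holds the rows
root_name <- xml_name(xml_root(doc))
rows <- if (root_name %in% names(x)) x[[root_name]] else x

# Each row is an empty list whose XML attributes are stored as R attributes
xl <- lapply(rows, attributes)

json <- toJSON(xl, pretty = TRUE, auto_unbox = TRUE)
cat(json, "\n")
```

Either way the keys still come out as row, row.1 and so on, because jsonlite makes duplicate names unique.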
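Per your comment:
What you want is not how the data is structured. That means that if you want something custom, you cannot use canned, vanilla utilities.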
library(purrr)  # provides map_chr() and the %>% pipe
xml_find_all(doc, "//votes/row") %>%
  map_chr(~{
    # serialize the attributes of each row as one JSON object
    toJSON(as.list(xml_attrs(.x)), auto_unbox = TRUE, pretty = TRUE)
  }) %>%
  paste0(collapse=",\n") %>%
  gsub("[\n]", "\n ", .) %>%
  sprintf('{ "votes" : {\n "row" : [\n %s]\n }\n}', .) %>%
  cat()
## { "votes" : {
## "row" : [
## {
## "Id": "1",
## "PostId": "1",
## "VoteTypeId": "2",
## "CreationDate": "2014-05-13T00:00:00.000"
## },
## {
## "Id": "2",
## "PostId": "1",
## "VoteTypeId": "2",
## "CreationDate": "2014-05-13T00:00:00.000"
## },
## {
## "Id": "3",
## "PostId": "3",
## "VoteTypeId": "2",
## "CreationDate": "2014-05-13T00:00:00.000"
## }]
## }
## }
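An alternative sketch, not part of the approach above: rather than pasting JSON strings together, build the nested list that mirrors the target document and let jsonlite serialize it. It relies on xml_attrs() returning one named character vector per node of a node set; the variable names `rows`, `row_list` and `out` are illustrative.

```r
library(xml2)
library(jsonlite)

doc <- read_xml('<?xml version="1.0" encoding="utf-8"?>
<votes>
  <row Id="1" PostId="1" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
  <row Id="2" PostId="1" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
  <row Id="3" PostId="3" VoteTypeId="2" CreationDate="2014-05-13T00:00:00.000" />
</votes>')

rows <- xml_find_all(doc, "/votes/row")

# One named list per row; the outer list is left unnamed so it becomes a JSON array
row_list <- unname(lapply(xml_attrs(rows), as.list))

# Mirror the desired shape: { "votes": { "row": [ {...}, {...}, {...} ] } }
out <- list(votes = list(row = row_list))

json <- toJSON(out, auto_unbox = TRUE, pretty = TRUE)
cat(json, "\n")
```

Because the structure is built as an R list first, quoting and commas are left to jsonlite, which avoids hand-written template mistakes.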