Strategies for formatting JSON output from R
I'm trying to figure out the best way of producing a JSON file from R. I have the following data frame, tmp, in R.
> tmp
  gender age welcoming proud tidy unique
1      1  30         4     4    4      4
2      2  34         4     2    4      4
3      1  34         5     3    4      5
4      2  33         2     3    2      4
5      2  28         4     3    4      4
6      2  26         3     2    4      3
The output of dput(tmp) is as follows:
tmp <- structure(list(gender = c(1L, 2L, 1L, 2L, 2L, 2L), age = c(30,
34, 34, 33, 28, 26), welcoming = c(4L, 4L, 5L, 2L, 4L, 3L), proud = c(4L,
2L, 3L, 3L, 3L, 2L), tidy = c(4L, 4L, 4L, 2L, 4L, 4L), unique = c(4L,
4L, 5L, 4L, 4L, 3L)), .Names = c("gender", "age", "welcoming",
"proud", "tidy", "unique"), na.action = structure(c(15L, 39L,
60L, 77L, 88L, 128L, 132L, 172L, 272L, 304L, 305L, 317L, 328L,
409L, 447L, 512L, 527L, 605L, 618L, 657L, 665L, 670L, 708L, 709L,
729L, 746L, 795L, 803L, 826L, 855L, 898L, 911L, 957L, 967L, 983L,
984L, 988L, 1006L, 1161L, 1162L, 1224L, 1245L, 1256L, 1257L,
1307L, 1374L, 1379L, 1386L, 1387L, 1394L, 1401L, 1408L, 1434L,
1446L, 1509L, 1556L, 1650L, 1717L, 1760L, 1782L, 1814L, 1847L,
1863L, 1909L, 1930L, 1971L, 2004L, 2022L, 2055L, 2060L, 2065L,
2082L, 2109L, 2121L, 2145L, 2158L, 2159L, 2226L, 2227L, 2281L
), .Names = c("15", "39", "60", "77", "88", "128", "132", "172",
"272", "304", "305", "317", "328", "409", "447", "512", "527",
"605", "618", "657", "665", "670", "708", "709", "729", "746",
"795", "803", "826", "855", "898", "911", "957", "967", "983",
"984", "988", "1006", "1161", "1162", "1224", "1245", "1256",
"1257", "1307", "1374", "1379", "1386", "1387", "1394", "1401",
"1408", "1434", "1446", "1509", "1556", "1650", "1717", "1760",
"1782", "1814", "1847", "1863", "1909", "1930", "1971", "2004",
"2022", "2055", "2060", "2065", "2082", "2109", "2121", "2145",
"2158", "2159", "2226", "2227", "2281"), class = "omit"), row.names = c(NA,
6L), class = "data.frame")
Using the rjson package, I run toJSON(tmp), which produces the following JSON:
{"gender":[1,2,1,2,2,2],
"age":[30,34,34,33,28,26],
"welcoming":[4,4,5,2,4,3],
"proud":[4,2,3,3,3,2],
"tidy":[4,4,4,2,4,4],
"unique":[4,4,5,4,4,3]}
I also experimented with the RJSONIO package; the output of toJSON() was the same. What I would like to produce is the following structure:
{"traits":["gender","age","welcoming","proud", "tidy", "unique"],
"values":[
{"gender":1,"age":30,"welcoming":4,"proud":4,"tidy":4, "unique":4},
{"gender":2,"age":34,"welcoming":4,"proud":2,"tidy":4, "unique":4},
....
]
}
I'm not sure how best to do this. I realize that I can parse it line by line using Python, but I feel like there is probably a better way. I also realize that my data structure in R does not reflect the meta-information desired in my JSON file (specifically the traits line), but I am mainly interested in producing the data formatted like the line
{"gender":1,"age":30,"welcoming":4,"proud":4,"tidy":4, "unique":4}
as I can manually add the first line.
EDIT: I found a useful blog post where the author dealt with a similar problem and provided a solution. This function produces a formatted JSON string from a data frame.
toJSONarray <- function(dtf){
  clnms <- colnames(dtf)
  name.value <- function(i){
    quote <- ''
    # if(class(dtf[, i])!='numeric'){
    if(class(dtf[, i])!='numeric' && class(dtf[, i])!= 'integer'){ # I modified this line so integers are also not enclosed in quotes
      quote <- '"'
    }
    paste('"', i, '" : ', quote, dtf[,i], quote, sep='')
  }
  objs <- apply(sapply(clnms, name.value), 1, function(x){paste(x, collapse=', ')})
  objs <- paste('{', objs, '}')
  # res <- paste('[', paste(objs, collapse=', '), ']')
  res <- paste('[', paste(objs, collapse=',\n'), ']') # added newline for formatting output
  return(res)
}
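As a possible variation on the function above, here is a minimal base-R sketch that builds one JSON object string per row without going through apply, so it also handles a single-row data frame. The name rows_to_json is made up for illustration. It uses is.numeric to decide whether to quote a value and encodeString to escape embedded quotes, which follows R's escaping rules and only approximates JSON's; NA values are not handled. Treat it as a sketch under those assumptions rather than a replacement for a JSON package.

```r
# Hypothetical helper: returns one JSON object string per row of a data frame.
rows_to_json <- function(df) {
  fields <- Map(function(name, col) {
    # Numeric columns (integer or double) are written bare; everything else is quoted.
    value <- if (is.numeric(col)) {
      as.character(col)
    } else {
      encodeString(as.character(col), quote = '"')
    }
    paste0('"', name, '": ', value)
  }, names(df), df)
  # Paste the per-column fragments together element-wise, i.e. row by row.
  objs <- do.call(paste, c(unname(fields), sep = ", "))
  paste0("{", objs, "}")
}

df <- data.frame(gender = c(1L, 2L), age = c(30, 34), grade = c("A", "B"),
                 stringsAsFactors = FALSE)

# Wrap the row objects in a JSON array, one object per line.
json <- paste0("[", paste(rows_to_json(df), collapse = ",\n"), "]")
cat(json, "\n")
```

Because the columns are processed one at a time, numbers stay numbers even when the data frame also contains character columns.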
Building upon Andrie's idea with apply, you can get exactly what you want by modifying the tmp variable before calling toJSON.
library(RJSONIO)
modified <- list(
  traits = colnames(tmp),
  values = unname(apply(tmp, 1, function(x) as.data.frame(t(x))))
)
cat(toJSON(modified))
Using the package jsonlite:
> jsonlite::toJSON(list(traits = names(tmp), values = tmp), pretty = TRUE)
{
"traits": ["gender", "age", "welcoming", "proud", "tidy", "unique"],
"values": [
{
"gender": 1,
"age": 30,
"welcoming": 4,
"proud": 4,
"tidy": 4,
"unique": 4
},
{
"gender": 2,
"age": 34,
"welcoming": 4,
"proud": 2,
"tidy": 4,
"unique": 4
},
{
"gender": 1,
"age": 34,
"welcoming": 5,
"proud": 3,
"tidy": 4,
"unique": 5
},
{
"gender": 2,
"age": 33,
"welcoming": 2,
"proud": 3,
"tidy": 2,
"unique": 4
},
{
"gender": 2,
"age": 28,
"welcoming": 4,
"proud": 3,
"tidy": 4,
"unique": 4
},
{
"gender": 2,
"age": 26,
"welcoming": 3,
"proud": 2,
"tidy": 4,
"unique": 3
}
]
}
Building further on Andrie and Richie's ideas, use alply instead of apply to avoid converting numbers to characters:
library(RJSONIO)
library(plyr)
modified <- list(
  traits = colnames(tmp),
  values = unname(alply(tmp, 1, identity))
)
cat(toJSON(modified))
plyr's alply is similar to apply but returns a list automatically, whereas apply, without the more complicated function inside Richie Cotton's answer, would return a vector or array. Those extra steps, including t, mean that if your dataset has any non-numeric columns, the numbers will get converted to strings. Using alply avoids that concern.
For example, take your tmp dataset and add
tmp$grade <- c("A","B","C","D","E","F")
Then compare this code (with alply) vs the other example (with apply).
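To see the coercion without installing anything, here is a small base-R sketch. The data frame df is a made-up stand-in for tmp with a grade column added. It shows that apply first turns the data frame into a matrix, which can hold only one type, and contrasts that with iterating over row indices, which is one way (not the only one) to keep each column's type.

```r
# Small stand-in for tmp with one non-numeric column added.
df <- data.frame(age = c(5, 34), welcoming = c(4L, 5L),
                 grade = c("A", "B"), stringsAsFactors = FALSE)

# apply() converts the data frame to a matrix, and a matrix holds a single type.
m <- as.matrix(df)
typeof(m)  # "character"

# Each row handed to the function is therefore a character vector.
apply(df, 1, function(x) class(x))

# Iterating over row indices keeps each column's own type.
rows <- lapply(seq_len(nrow(df)), function(i) as.list(df[i, ]))
str(rows[[1]])
```

The numbers in m are also formatted as text, so they can pick up padding spaces when values in a column have different widths.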
It seems to me you can do this by sending each row of your data.frame to JSON with the appropriate apply statement.
For a single row:
library(RJSONIO)
> x <- toJSON(tmp[1, ])
> cat(x)
{
"gender": 1,
"age": 30,
"welcoming": 4,
"proud": 4,
"tidy": 4,
"unique": 4
}
The entire data.frame:
x <- apply(tmp, 1, toJSON)
cat(x)
{
"gender": 1,
"age": 30,
"welcoming": 4,
"proud": 4,
"tidy": 4,
"unique": 4
} {
...
} {
"gender": 2,
"age": 26,
"welcoming": 3,
"proud": 2,
"tidy": 4,
"unique": 3
}
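Note that cat(x) prints the row objects separated by spaces, which is not yet a JSON array. A minimal sketch of stitching them into the structure asked for in the question could look like this; row_json is a hand-written stand-in for the strings returned by apply(tmp, 1, toJSON), and trait_names stands in for colnames(tmp).

```r
# Stand-ins for the per-row JSON strings and the column names.
row_json <- c('{"gender": 1, "age": 30}', '{"gender": 2, "age": 34}')
trait_names <- c("gender", "age")

# Quote each trait name and join them with commas.
traits <- paste0('"', trait_names, '"', collapse = ", ")

# Assemble the outer object: a traits array plus a values array of row objects.
doc <- paste0('{"traits": [', traits, '],\n"values": [\n',
              paste(row_json, collapse = ",\n"), '\n]}')
cat(doc, "\n")
```

This only concatenates strings, so it relies on each element of row_json already being valid JSON.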
Another option is to use split to split your data.frame with N rows into N data.frames with 1 row.
library(RJSONIO)
modified <- list(
  traits = colnames(tmp),
  values = split(tmp, seq_len(nrow(tmp)))
)
cat(toJSON(modified))
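One thing worth checking with this approach: split returns a list named after the grouping values, and a serializer that maps named lists to JSON objects (which, as far as I know, RJSONIO does) may then emit values as an object keyed by "1", "2", and so on instead of an array. The base-R sketch below, using a small made-up data frame df, shows the names and how unname drops them; whether that step is needed depends on the JSON package you use.

```r
# Small made-up data frame standing in for tmp.
df <- data.frame(gender = c(1L, 2L, 1L), age = c(30, 34, 34))

# split() returns a list named after the grouping values "1", "2", "3".
rows <- split(df, seq_len(nrow(df)))
names(rows)

# Dropping the names leaves a plain positional list of one-row data frames.
rows_unnamed <- unname(rows)
is.null(names(rows_unnamed))

# Column types are preserved within each piece.
sapply(rows_unnamed[[1]], class)
```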