Strategies for formatting JSON output from R

Question

I'm trying to figure out the best way of producing a JSON file from R. I have the following data frame tmp in R.

> tmp
  gender age welcoming proud tidy unique
1      1  30         4     4    4      4
2      2  34         4     2    4      4
3      1  34         5     3    4      5
4      2  33         2     3    2      4
5      2  28         4     3    4      4
6      2  26         3     2    4      3

The output of dput(tmp) is as follows:

tmp <- structure(list(gender = c(1L, 2L, 1L, 2L, 2L, 2L), age = c(30, 
34, 34, 33, 28, 26), welcoming = c(4L, 4L, 5L, 2L, 4L, 3L), proud = c(4L, 
2L, 3L, 3L, 3L, 2L), tidy = c(4L, 4L, 4L, 2L, 4L, 4L), unique = c(4L, 
4L, 5L, 4L, 4L, 3L)), .Names = c("gender", "age", "welcoming", 
"proud", "tidy", "unique"), na.action = structure(c(15L, 39L, 
60L, 77L, 88L, 128L, 132L, 172L, 272L, 304L, 305L, 317L, 328L, 
409L, 447L, 512L, 527L, 605L, 618L, 657L, 665L, 670L, 708L, 709L, 
729L, 746L, 795L, 803L, 826L, 855L, 898L, 911L, 957L, 967L, 983L, 
984L, 988L, 1006L, 1161L, 1162L, 1224L, 1245L, 1256L, 1257L, 
1307L, 1374L, 1379L, 1386L, 1387L, 1394L, 1401L, 1408L, 1434L, 
1446L, 1509L, 1556L, 1650L, 1717L, 1760L, 1782L, 1814L, 1847L, 
1863L, 1909L, 1930L, 1971L, 2004L, 2022L, 2055L, 2060L, 2065L, 
2082L, 2109L, 2121L, 2145L, 2158L, 2159L, 2226L, 2227L, 2281L
), .Names = c("15", "39", "60", "77", "88", "128", "132", "172", 
"272", "304", "305", "317", "328", "409", "447", "512", "527", 
"605", "618", "657", "665", "670", "708", "709", "729", "746", 
"795", "803", "826", "855", "898", "911", "957", "967", "983", 
"984", "988", "1006", "1161", "1162", "1224", "1245", "1256", 
"1257", "1307", "1374", "1379", "1386", "1387", "1394", "1401", 
"1408", "1434", "1446", "1509", "1556", "1650", "1717", "1760", 
"1782", "1814", "1847", "1863", "1909", "1930", "1971", "2004", 
"2022", "2055", "2060", "2065", "2082", "2109", "2121", "2145", 
"2158", "2159", "2226", "2227", "2281"), class = "omit"), row.names = c(NA, 
6L), class = "data.frame")

Using the rjson package, I run the line toJSON(tmp), which produces the following JSON file:

 {"gender":[1,2,1,2,2,2],
 "age":[30,34,34,33,28,26],
 "welcoming":[4,4,5,2,4,3],
 "proud":[4,2,3,3,3,2],
  "tidy":[4,4,4,2,4,4],
  "unique":[4,4,5,4,4,3]}

I also experimented with the RJSONIO package; the output of toJSON() was the same. What I would like to produce is the following structure:

  {"traits":["gender","age","welcoming","proud", "tidy", "unique"],
   "values":[   
            {"gender":1,"age":30,"welcoming":4,"proud":4,"tidy":4, "unique":4},
            {"gender":2,"age":34,"welcoming":4,"proud":2,"tidy":4, "unique":4},
            ....
            ]
  }

I'm not sure how best to do this. I realize that I can parse it line by line using Python, but I feel there is probably a better way of doing this. I also realize that my data structure in R does not reflect the meta-information desired in my JSON file (specifically the traits line), but I am mainly interested in producing the data formatted like the line

{"gender":1,"age":30,"welcoming":4,"proud":4,"tidy":4, "unique":4}

as I can manually add the first line.

EDIT: I found a useful blog post where the author dealt with a similar problem and provided a solution. This function produces a formatted JSON string from a data frame.

toJSONarray <- function(dtf){
  clnms <- colnames(dtf)

  # Build the '"name" : value' strings for one column
  name.value <- function(i){
    quote <- ''
    # if(class(dtf[, i])!='numeric'){
    if(class(dtf[, i])!='numeric' && class(dtf[, i])!= 'integer'){ # I modified this line so integers are also not enclosed in quotes
      quote <- '"'
    }

    paste('"', i, '" : ', quote, dtf[,i], quote, sep='')
  }

  # One JSON object per row
  objs <- apply(sapply(clnms, name.value), 1, function(x){paste(x, collapse=', ')})
  objs <- paste('{', objs, '}')

  # res <- paste('[', paste(objs, collapse=', '), ']')
  res <- paste('[', paste(objs, collapse=',\n'), ']') # added newline for formatting output

  return(res)
}

Answer 1

Building upon Andrie's idea with apply, you can get exactly what you want by modifying the tmp variable before calling toJSON.

library(RJSONIO)
modified <- list(
  traits = colnames(tmp),
  values = unname(apply(tmp, 1, function(x) as.data.frame(t(x))))
)
cat(toJSON(modified))

Answer 2

Using the package jsonlite:

> jsonlite::toJSON(list(traits = names(tmp), values = tmp), pretty = TRUE)
{
  "traits": ["gender", "age", "welcoming", "proud", "tidy", "unique"],
  "values": [
    {
      "gender": 1,
      "age": 30,
      "welcoming": 4,
      "proud": 4,
      "tidy": 4,
      "unique": 4
    },
    {
      "gender": 2,
      "age": 34,
      "welcoming": 4,
      "proud": 2,
      "tidy": 4,
      "unique": 4
    },
    {
      "gender": 1,
      "age": 34,
      "welcoming": 5,
      "proud": 3,
      "tidy": 4,
      "unique": 5
    },
    {
      "gender": 2,
      "age": 33,
      "welcoming": 2,
      "proud": 3,
      "tidy": 2,
      "unique": 4
    },
    {
      "gender": 2,
      "age": 28,
      "welcoming": 4,
      "proud": 3,
      "tidy": 4,
      "unique": 4
    },
    {
      "gender": 2,
      "age": 26,
      "welcoming": 3,
      "proud": 2,
      "tidy": 4,
      "unique": 3
    }
  ]
}
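As a side note on why this works: jsonlite's toJSON() serializes data frames row by row by default. A minimal sketch of the dataframe argument, using a small made-up data frame df (not the tmp from the question):

```r
library(jsonlite)

# Small illustrative data frame (hypothetical, stands in for tmp)
df <- data.frame(gender = c(1L, 2L), age = c(30, 34))

# Default layout: one JSON object per row
by_row <- toJSON(df, dataframe = "rows")
# [{"gender":1,"age":30},{"gender":2,"age":34}]

# Column layout: one array per column, like the rjson output in the question
by_col <- toJSON(df, dataframe = "columns")
# {"gender":[1,2],"age":[30,34]}

cat(by_row, "\n")
cat(by_col, "\n")
```

So the row-wise layout asked for in the question is what jsonlite produces unless you request otherwise.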

Answer 3

Building further on Andrie and Richie's ideas, use alply instead of apply to avoid converting numbers to characters:

library(RJSONIO)
library(plyr)
modified <- list(
  traits = colnames(tmp),
  values = unname(alply(tmp, 1, identity))
)
cat(toJSON(modified))

plyr's alply is similar to apply but returns a list automatically, whereas without the more complicated function inside Richie Cotton's answer, apply would return a vector or array. And those extra steps, including t, mean that if your dataset has any non-numeric columns, the numbers will get converted to strings. Using alply avoids that concern.

For example, take your tmp dataset and add

tmp$grade <- c("A","B","C","D","E","F")

Then compare this code (with alply) vs the other example (with apply).
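The coercion can be seen without any JSON package. Here is a rough base R sketch, using a made-up two-row data frame df and lapply as a stand-in for alply (so this is not the plyr code above, just the same idea):

```r
# Hypothetical data frame with one integer and one character column
df <- data.frame(score = c(4L, 2L), grade = c("A", "B"),
                 stringsAsFactors = FALSE)

# apply() converts the data frame to a matrix first; because of the
# character column, every value in each row becomes a string
types_apply <- apply(df, 1, function(row) class(row))

# Pulling rows out one at a time keeps each column's own type
rows <- lapply(seq_len(nrow(df)), function(i) as.list(df[i, , drop = FALSE]))
types_rowwise <- sapply(rows[[1]], class)

print(types_apply)    # every entry is "character"
print(types_rowwise)  # score is "integer", grade is "character"
```

Once the numbers are strings, a JSON encoder will quote them, which is what the alply version avoids.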

Answer 4

It seems to me you can do this by sending each row of your data.frame to JSON with the appropriate apply statement.

For a single row:

library(RJSONIO)

> x <- toJSON(tmp[1, ])
> cat(x)
{
 "gender": 1,
"age":     30,
"welcoming": 4,
"proud": 4,
"tidy": 4,
"unique": 4 
}

The entire data.frame:

x <- apply(tmp, 1, toJSON)
cat(x)
{
 "gender": 1,
"age":     30,
"welcoming": 4,
"proud": 4,
"tidy": 4,
"unique": 4 
} {

...

} {
 "gender": 2,
"age":     26,
"welcoming": 3,
"proud": 2,
"tidy": 4,
"unique": 3 
}
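Note that the result of apply here is a character vector with one JSON object per row, so printing it with cat gives objects separated by spaces rather than a single JSON document. A minimal sketch of joining them into one array, with two hand-written strings standing in for the real per-row output:

```r
# Stand-ins for the per-row JSON strings (hypothetical values)
row_json <- c('{"gender": 1, "age": 30}',
              '{"gender": 2, "age": 34}')

# Separate the objects with commas and wrap them in [ ] to form a JSON array
json_array <- paste0("[", paste(row_json, collapse = ",\n"), "]")
cat(json_array, "\n")
```

The same paste0 call should work on the x produced above, assuming each element of x is a complete JSON object.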

Answer 5

Another option is to use split to break your data.frame with N rows into N data.frames with 1 row each.

library(RJSONIO)
modified <- list(
   traits = colnames(tmp),
   values = split(tmp, seq_len(nrow(tmp)))
)
cat(toJSON(modified))
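One thing to check with this approach: split returns a named list (the names are the group labels "1", "2", ...), and JSON encoders generally write a named list as an object keyed by those names rather than as an array. A small base R sketch with a made-up data frame df, showing the names and how unname drops them:

```r
# Hypothetical three-row data frame standing in for tmp
df <- data.frame(gender = c(1L, 2L, 1L), age = c(30, 34, 34))

# One single-row data frame per row, named by row index
pieces <- split(df, seq_len(nrow(df)))
print(names(pieces))  # "1" "2" "3"

# Dropping the names leaves a plain list, which is the usual shape
# for something that should become a JSON array
unnamed <- unname(pieces)
print(is.null(names(unnamed)))  # TRUE
```

If the "values" entry comes out keyed by "1", "2", ... instead of as an array, wrapping the split call in unname (as the other answers do) is likely the fix.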

Answer 1
14 2011-11-28 13:17:33

Answer 2
10 accepted 2015-07-02 04:32:48

Answer 3
9 2012-11-22 04:17:05

Answer 4
4 2011-11-27 23:09:16

Answer 5
2 2012-12-06 07:55:04