Reading JSON into a data.frame in R

Question

I have a list of JSON values (actually a text file in which every line is one JSON object), like this:

{ "id": 1, "name": "john", "age": 18, "education": "master" }
{ "id": 2, "name": "jack", "job": "clerk" }
...

Some of the values can be missing (e.g. the first item has no "job" value, and the second item has no "education" or "age").

I need to create a data frame in R with one column for every field name that appears in at least one row, and with all missing values filled in as NA. What is the easiest way to achieve this?

What I have done so far: I installed the "rjson" package and parsed these lines into R lists. Assume that the lines variable is a character vector of lines.

library(rjson)
# initialize "lines" here, e.g. lines <- readLines("data.txt")  (file name is a placeholder)
jsons <- sapply(lines, fromJSON)

The jsons variable is now a "list of lists" (every JSON object is converted to a list, in R terminology). How do I convert it to a data.frame?

For the example above, I want to get the following data frame:

"id" | "name" | "age" | "education" | "job"
-------------------------------------------
1    | "john" |  18   |  "master"   |   NA
2    | "jack" |  NA   |     NA      | "clerk"

Answer 1

With plyr, you can use rbind.fill to fill in the NAs for you:

library(plyr)
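# convert each parsed record to a one-row data frame, then stack them;
# rbind.fill adds NA for columns a record does not have
rbind.fill(lapply(jsons, data.frame, stringsAsFactors = FALSE))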

#   id name age education   job
# 1  1 john  18    master  <NA>
# 2  2 jack  NA      <NA> clerk

or, with data.table:

library(data.table)
rbindlist(jsons, fill = TRUE)

and with dplyr:

library(dplyr)
bind_rows(lapply(jsons, data.frame, stringsAsFactors = FALSE))
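
For readers who want to see what the NA filling amounts to, here is a minimal base R sketch of the same idea. It is not how plyr, data.table, or dplyr implement it; the jsons list is written by hand to mirror the question's example, and the fields variable is a name introduced only for this illustration.

```r
# Parsed records, hand-written to mirror the question's example
# (in practice these would come from a JSON parser).
jsons <- list(
  list(id = 1, name = "john", age = 18, education = "master"),
  list(id = 2, name = "jack", job = "clerk")
)

# Union of all field names, in order of first appearance.
fields <- unique(unlist(lapply(jsons, names)))

# Build each column by looking the field up in every record,
# substituting NA where a record does not have it.
df <- as.data.frame(
  lapply(setNames(fields, fields), function(f) {
    sapply(jsons, function(rec) if (is.null(rec[[f]])) NA else rec[[f]])
  }),
  stringsAsFactors = FALSE
)

print(df)
```

This sketch assumes every field holds a single scalar value; nested objects or arrays would need extra handling.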

Answer 2

Future me, correcting past me's mistakes: it makes more sense to use jsonlite's stream_in:

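library(jsonlite)
# stream_in expects a connection, so txtfile should be e.g. the result of file(...)
stream_in(txtfile)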

# To test on `txt` from below, try:
# stream_in(textConnection(txt))

# Found 2 records...
# Imported 2 records. Simplifying...
#  id name age education   job
#1 NA john  18    master  <NA>
#2  2 jack  NA      <NA> clerk
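
To try stream_in against an actual file, as in the question, something like the following sketch should work. It assumes jsonlite is installed; the temporary file and the path variable exist only for this illustration, and the records are the two from the question.

```r
library(jsonlite)

# Write the question's two records to a temporary file, one JSON object per line.
path <- tempfile(fileext = ".json")
writeLines(c(
  '{ "id": 1, "name": "john", "age": 18, "education": "master" }',
  '{ "id": 2, "name": "jack", "job": "clerk" }'
), path)

# stream_in takes a connection; file(path) is opened and closed by stream_in itself.
df <- stream_in(file(path), verbose = FALSE)

print(df)
```

Setting verbose = FALSE only suppresses the "Found ... records" progress messages.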

Use the jsonlite package's fromJSON function, after making a few inline edits to your original text data (I have also changed the first id value to an explicit null, to show that this is handled):

# jsonlite:: is spelled out because rjson also exports a fromJSON function
jsonlite::fromJSON(paste0("[", gsub("}\n", "},\n", txt), "]"))
#  id name age education   job
#1 NA john  18    master  <NA>
#2  2 jack  NA      <NA> clerk

All I did was wrap the JSON lines in [ and ] and add a comma after each closing } that is followed by a newline. The result is a single JSON array, shaped like the one below, which jsonlite::fromJSON can process all at once:

[{"1":"one"},{"2":"two"}]

Here txt is your data as presented, with a null in the id field:

txt <- "{ \"id\": null, \"name\": \"john\", \"age\": 18, \"education\": \"master\" }
{ \"id\": 2, \"name\": \"jack\", \"job\": \"clerk\" }"
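
If the data is already a character vector with one JSON object per element, like the lines variable in the question, the same wrapping can be done without a regular expression. A minimal sketch, assuming jsonlite is installed and that each element of lines holds exactly one complete JSON object (json_array is a name introduced only here):

```r
library(jsonlite)

# One JSON object per element, as in the question's `lines` vector.
lines <- c(
  '{ "id": null, "name": "john", "age": 18, "education": "master" }',
  '{ "id": 2, "name": "jack", "job": "clerk" }'
)

# Join the objects with commas and wrap them in [ ] to form one JSON array.
json_array <- paste0("[", paste(lines, collapse = ","), "]")

# fromJSON simplifies an array of objects into a data frame,
# using NA for fields that are missing or null.
df <- jsonlite::fromJSON(json_array)

print(df)
```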
