简体   繁体   中英

Convert R data table column from JSON to data table

I have a column that contains JSON data as in the following example,

library(data.table)
test <- data.table(a = list(1,2,3), 
           info = list("{'duration': '10', 'country': 'US'}", 
                       "{'duration': '20', 'country': 'US'}",
                       "{'duration': '30', 'country': 'GB', 'width': '20'}"))

I want to convert the last column to equivalent R storage, which would look similar to,

res <- data.table(a = list(1, 2, 3),
                  duration = list(10, 20, 30),
                  country = list('US', 'US', 'GB'),
                  width = list(NA, NA, 20))

Since I have 500K rows with different contents I would look for a quick way to do this.

A variation without the need to separate out the JSON string

library(data.table)
library(jsonlite)

test[, info := gsub("'", "\"", info)]
test[, rbindlist(lapply(info, fromJSON), use.names = TRUE, fill = TRUE)]

#    duration country width
# 1:       10      US    NA
# 2:       20      US    NA
# 3:       30      GB    20

Parse the JSON first, then build the data.frame (or data.table):

json_string <- paste(c("[{'duration': '10', 'country': 'US'}", 
    "{'duration': '20', 'country': 'US'}",
  "{'duration': '30', 'country': 'GB'}",
  "{'width': '20'}]"), collapse=", ")

# JSON standard requires double quotes
json_string <- gsub("'", "\"", json_string)

library("jsonlite")
fromJSON(json_string)

#  duration country width
# 1       10      US  <NA>
# 2       20      US  <NA>
# 3       30      GB  <NA>
# 4     <NA>    <NA>    20

This isn't exactly what you asked for as your JSON doesn't associate 'width' with the previous record, you might need to do some manipulation first:

json_string <- paste(c("[{'duration': '10', 'country': 'US'}", 
    "{'duration': '20', 'country': 'US'}",
  "{'duration': '30', 'country': 'GB', 'width': '20'}]"), 
  collapse=", ")

json_string <- gsub("'", "\"", json_string)
df <- jsonlite::fromJSON(json_string)
data.table::as.data.table(df)

#    duration country width
# 1:       10      US    NA
# 2:       20      US    NA
# 3:       30      GB    20

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM