I have a column that contains JSON data, as in the following example:
library(data.table)
test <- data.table(a = list(1, 2, 3),
                   info = list("{'duration': '10', 'country': 'US'}",
                               "{'duration': '20', 'country': 'US'}",
                               "{'duration': '30', 'country': 'GB', 'width': '20'}"))
I want to convert the last column to the equivalent R storage, which would look similar to this:
res <- data.table(a = list(1, 2, 3),
                  duration = list(10, 20, 30),
                  country = list('US', 'US', 'GB'),
                  width = list(NA, NA, 20))
Since I have 500K rows with varying contents, I am looking for a fast way to do this.
A variation that does not require separating out the JSON string:
library(data.table)
library(jsonlite)
test[, info := gsub("'", "\"", info)]
test[, rbindlist(lapply(info, fromJSON), use.names = TRUE, fill = TRUE)]
#    duration country width
# 1:       10      US    NA
# 2:       20      US    NA
# 3:       30      GB    20
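The result above drops column a and leaves every parsed value as a character string, whereas the question asks for a kept next to numeric duration and width. A minimal sketch of one way to get there, assuming the values contain no apostrophes (the quote replacement would corrupt them) and that duration and width are the only numeric fields; the names json, parsed, and num_cols are illustrative only:

```r
library(data.table)
library(jsonlite)

test <- data.table(
  a    = list(1, 2, 3),
  info = list("{'duration': '10', 'country': 'US'}",
              "{'duration': '20', 'country': 'US'}",
              "{'duration': '30', 'country': 'GB', 'width': '20'}")
)

# Swap single quotes for double quotes so each string is valid JSON.
# Assumption: no apostrophes occur inside the values themselves.
json <- gsub("'", "\"", unlist(test$info), fixed = TRUE)

# Parse each row, then stack the results; keys missing from a row become NA.
parsed <- rbindlist(lapply(json, fromJSON), use.names = TRUE, fill = TRUE)

# Convert the numeric-looking columns from character to numeric.
# Assumption: these are the only numeric fields in the data.
num_cols <- c("duration", "width")
parsed[, (num_cols) := lapply(.SD, as.numeric), .SDcols = num_cols]

# Bind the parsed columns back onto the original column a.
res <- cbind(test[, .(a)], parsed)
```

Here the parsed columns are plain atomic vectors rather than list columns, which is usually easier to work with downstream; wrap them with as.list() if list columns are really required.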
Parse the JSON first, then build the data.frame (or data.table):
json_string <- paste(c("[{'duration': '10', 'country': 'US'}",
                       "{'duration': '20', 'country': 'US'}",
                       "{'duration': '30', 'country': 'GB'}",
                       "{'width': '20'}]"), collapse=", ")
# JSON standard requires double quotes
json_string <- gsub("'", "\"", json_string)
library("jsonlite")
fromJSON(json_string)
#   duration country width
# 1       10      US  <NA>
# 2       20      US  <NA>
# 3       30      GB  <NA>
# 4     <NA>    <NA>    20
This isn't exactly what you asked for, because your JSON doesn't associate 'width' with the previous record. You might need to do some manipulation first so that 'width' sits inside the record it belongs to:
json_string <- paste(c("[{'duration': '10', 'country': 'US'}",
                       "{'duration': '20', 'country': 'US'}",
                       "{'duration': '30', 'country': 'GB', 'width': '20'}]"),
                     collapse=", ")
json_string <- gsub("'", "\"", json_string)
df <- jsonlite::fromJSON(json_string)
data.table::as.data.table(df)
#    duration country width
# 1:       10      US    NA
# 2:       20      US    NA
# 3:       30      GB    20
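The same idea can be applied directly to the info column of the table from the question: join all rows into one JSON array and call the parser once instead of once per row. This is only a sketch, assuming every row holds exactly one flat JSON object, no row is NA or empty, and no value contains an apostrophe; json_array and parsed are illustrative names. With 500K rows a single parse may well be faster than a per-row lapply, but that is worth benchmarking on the real data.

```r
library(data.table)
library(jsonlite)

test <- data.table(
  a    = list(1, 2, 3),
  info = list("{'duration': '10', 'country': 'US'}",
              "{'duration': '20', 'country': 'US'}",
              "{'duration': '30', 'country': 'GB', 'width': '20'}")
)

# Join all rows into a single JSON array so the parser is called only once.
json_array <- paste0("[", paste(unlist(test$info), collapse = ","), "]")

# Assumption: no apostrophes inside the values, so this swap is safe.
json_array <- gsub("'", "\"", json_array, fixed = TRUE)

# fromJSON simplifies an array of objects to a data.frame,
# filling keys that are missing from a record with NA.
parsed <- as.data.table(fromJSON(json_array))

# Put the original column a back next to the parsed columns.
res <- cbind(test[, .(a)], parsed)
```

All parsed columns come back as character here, because the source wraps the numbers in quotes; convert them with as.numeric() where needed.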