Reshaping JSON output from a grouped dataframe in R
I have an R dataframe of the form:
Country Region Year V1 V2
AAAA XXXX 2001 12 13
BBBB YYYY 2001 14 15
AAAA XXXX 2002 36 56
AAAA XXXX 1999 45 67
and would like to generate a JSON equivalent of the form:
[
{"Country": "AAAA",
"Region":"XXXX",
"V1": [ [1999,45], [2001,12] , [2002,36] ],
"V2":[ [1999,67], [2001,13] , [2002,56] ]
},
{"Country": "BBBB",
"Region":"YYYY",
"V1":[ [2001,14] ],
"V2":[ [2001,15] ]
}
]
I'm imagining this requires:
but am struggling to find a way to do it.
Here is another way to do this.
dat <- read.table(textConnection("Country Region Year V1 V2
AAAA XXXX 2001 12 13
BBBB YYYY 2001 14 15
AAAA XXXX 2002 36 56
AAAA XXXX 1999 45 67"), header = TRUE)
We add two helper functions to zip vectors together, plus a custom sort function that sorts a list by the element at a given position.
#' Return a function that plucks the element at a given position
pluck_ <- function(element) {
  function(x) x[[element]]
}
#' Zip any number of equal-length vectors into a list of tuples
zip_ <- function(..., names = FALSE) {
  x <- list(...)
  y <- lapply(seq_along(x[[1]]), function(i) lapply(x, pluck_(i)))
  if (names) names(y) <- seq_along(y)
  return(y)
}
#' Sort a list of tuples by the element at position i
sort_ <- function(v, i = 1) {
  v[order(sapply(v, "[[", i))]
}
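To make the zip-and-sort step concrete, here is a minimal base-R sketch of the same idea applied to the Year and V1 values of the AAAA rows. The names `zip_pairs` and `sort_by_position` are hypothetical stand-ins for illustration, not the helpers defined above.

```r
# Hypothetical base-R equivalents of the zip and sort helpers (illustrative names).
zip_pairs <- function(x, y) {
  # Map() walks both vectors in parallel and builds one list(x, y) per position
  unname(Map(function(a, b) list(a, b), x, y))
}

sort_by_position <- function(v, i = 1) {
  # order() on the i-th element of every pair gives the sorting permutation
  v[order(vapply(v, function(p) p[[i]], numeric(1)))]
}

# Year and V1 values of the AAAA rows, in their original (unsorted) order
years  <- c(2001, 2002, 1999)
values <- c(12, 36, 45)

pairs <- sort_by_position(zip_pairs(years, values))
str(pairs)  # pairs[[1]] is list(1999, 45): the earliest year comes first
```

Each pair is kept as a two-element list rather than a vector so that a JSON serialiser can emit it as a nested `[year, value]` array.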
Time to put things together and use the split-apply-combine magic to get the output you seek.
library(plyr)
dat2 <- dlply(dat, .(Country, Region), function(d) {
  list(
    Country = d$Country[1],
    Region = d$Region[1],
    V1 = sort_(zip_(d$Year, d$V1)),
    V2 = sort_(zip_(d$Year, d$V2))
  )
})
cat(rjson::toJSON(setNames(dat2, NULL)))
This gives you the output:
[
{"Country":"AAAA",
"Region":"XXXX",
"V1":[[1999,45],[2001,12],[2002,36]],
"V2":[[1999,67],[2001,13],[2002,56]]
},
{"Country":"BBBB",
"Region":"YYYY",
"V1":[[2001,14]],
"V2":[[2001,15]]
}
]
Here's a somewhat messy function to do this. It emits each year/value pair as a string rather than a nested array, and you could easily add sorting of the V1 and V2 arrays by year:
dat <- read.table(textConnection(
'Country Region Year V1 V2
AAAA XXXX 2001 12 13
BBBB YYYY 2001 14 15
AAAA XXXX 2002 36 56
AAAA XXXX 1999 45 67'
), header=TRUE, stringsAsFactors=FALSE)
library(RJSONIO)
myfunc <- function(nn) {
  # One data frame per country
  tt <- split(nn, nn$Country)
  bar <- function(w) {
    # Paste the year and the value of one row into a "year,value" string
    foo <- function(x, y, z) paste(x[y], x[z], sep = ",")
    V1 <- as.character(apply(w, 1, foo, y = "Year", z = "V1"))
    V2 <- as.character(apply(w, 1, foo, y = "Year", z = "V2"))
    list(Country = unique(w$Country),
         Region = unique(w$Region),
         V1 = V1, V2 = V2)
  }
  datlist <- lapply(tt, bar)
  names(datlist) <- NULL
  RJSONIO::toJSON(datlist)
}
cat(myfunc(dat))
[
{
"Country": "AAAA",
"Region": "XXXX",
"V1": [ "2001,12", "2002,36", "1999,45" ],
"V2": [ "2001,13", "2002,56", "1999,67" ]
},
{
"Country": "BBBB",
"Region": "YYYY",
"V1": "2001,14",
"V2": "2001,15"
}
]
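The function above leaves the pairs unsorted and encodes them as strings. A minimal base-R sketch that sorts by year and keeps every pair as a nested list (so that a serialiser can emit `[year, value]` arrays, even for a single-row group) might look like this. The helper name `year_pairs` is hypothetical, and the final serialisation step assumes the rjson package is installed; it is skipped otherwise.

```r
# Same sample data as in the question
dat <- read.table(text = "Country Region Year V1 V2
AAAA XXXX 2001 12 13
BBBB YYYY 2001 14 15
AAAA XXXX 2002 36 56
AAAA XXXX 1999 45 67", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: list(year, value) pairs for one column, sorted by year
year_pairs <- function(d, col) {
  d <- d[order(d$Year), ]
  unname(Map(function(y, v) list(y, v), d$Year, d[[col]]))
}

# Split on Country and Region together; drop = TRUE removes empty combinations
groups <- split(dat, list(dat$Country, dat$Region), drop = TRUE)

# Build one record per group; unname() so the outer level becomes a JSON array
result <- unname(lapply(groups, function(d) {
  list(
    Country = d$Country[1],
    Region  = d$Region[1],
    V1 = year_pairs(d, "V1"),
    V2 = year_pairs(d, "V2")
  )
}))

# Serialise only if rjson happens to be available (assumption, not required)
if (requireNamespace("rjson", quietly = TRUE)) {
  cat(rjson::toJSON(result))
}
```

Building the nested list first and serialising last keeps the reshaping logic independent of whichever JSON package is used.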