Data frame rows to nested list elements by groups
I have a data frame like this:
id key value
1 x a 1
2 x b 2
3 y a 3
4 y b 4
df <- read.table(text = "id key value
x a 1
x b 2
y a 3
y b 4", header = TRUE)
I would like to get a list with one element for each id, each containing a sub-list for each key.
So with my example, the expected output would be:
$x
$x$a
$x$a$value
[1] 1
$x$b
$x$b$value
[1] 2
$y
$y$a
$y$a$value
[1] 3
$y$b
$y$b$value
[1] 4
list(
  x = list(
    a = list(value = 1),
    b = list(value = 2)
  ),
  y = list(
    a = list(value = 3),
    b = list(value = 4)
  )
)
I can achieve this with nested lapply and split calls, but I think there should be a more straightforward way to do it.
Any help would be appreciated.
Two methods, one using base R and the other using plyr, to split your data frame by group, apply a function over each group, and return the results in a list.
Use base::split.data.frame() followed by lapply() to extract the value element for each unique id-key pair. The result is a flat list with names of the form id.key.
# split data frame
# based on 'id' and 'key' pairs
df.split <-
  split.data.frame(
    x = df
    , f = list( df$id, df$key )
  )
# keep only the value
# element within each list
df.split <-
  lapply(
    X = df.split
    , FUN = function( i )
      i[["value"]]
  )
# view results
df.split
# $x.a
# [1] 1
#
# $y.a
# [1] 3
#
# $x.b
# [1] 2
#
# $y.b
# [1] 4
# end of script #
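The base result above is flat (x.a, y.a, ...), while the question asks for one level per id and one per key. As a minimal sketch, assuming the example data is stored in df and that each id-key pair should end up as list(value = ...), splitting twice with base R gives the nested shape. The names by_id and nested are illustrative and not part of the original answer.

```r
# Example data from the question (assumed to be stored in `df`)
df <- data.frame(
  id    = c("x", "x", "y", "y"),
  key   = c("a", "b", "a", "b"),
  value = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Outer split: one data frame per id
by_id <- split(df, df$id)

# Inner split: within each id, one 'value' vector per key,
# wrapped in a list so the element is named 'value'
nested <- lapply(by_id, function(g) {
  lapply(split(g$value, g$key), function(v) list(value = v))
})

# Inspect the structure: two levels of names, then 'value'
str(nested)
```

After this, nested$x$a$value can be used to reach a single value, and a key with several rows under the same id would simply hold a longer vector.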
Use plyr::dlply() to do the same thing, without the need for lapply().
# load necessary packages
library( plyr )
# splits df by the 'id' and 'key' variables
# and return the 'value' for each pairing
df.split <-
  dlply(
    .data = df
    , .variables = c( "id", "key" )
    , .fun = function(i) i[["value"]]
  )
# view results
df.split
# $x.a
# [1] 1
#
# $x.b
# [1] 2
#
# $y.a
# [1] 3
#
# $y.b
# [1] 4
#
# attr(,"split_type")
# [1] "data.frame"
# attr(,"split_labels")
# id key
# 1 x a
# 2 x b
# 3 y a
# 4 y b
# end of script #
@Colonel Beauvel's answer to the SO post "Emulate split() with dplyr group_by: return a list of data frames" was helpful in answering this question.
One solution with a limited number of split and nested *apply calls:
lapply(split(df, df$id), function(x) setNames(apply(x, 1L, function(x) as.list(x["value"])), x[["key"]]))
Nested lapply and split alternative:
lapply(split(df, df$id), function(x) lapply(split(x["value"], x$key), as.list))
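The two one-liners above build the same nested shape, but they may differ in the type of value. apply() converts the data frame to a matrix first, and because id and key are text, the whole row becomes character. The sketch below, assuming the question's data is stored in df, runs both and checks the class of one value; the names via_apply and via_split are illustrative only.

```r
# Example data from the question (assumed to be stored in `df`)
df <- data.frame(
  id    = c("x", "x", "y", "y"),
  key   = c("a", "b", "a", "b"),
  value = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# First variant: apply() over rows, then name the elements by key
via_apply <- lapply(split(df, df$id), function(x)
  setNames(apply(x, 1L, function(x) as.list(x["value"])), x[["key"]]))

# Second variant: split the 'value' column by key inside each id
via_split <- lapply(split(df, df$id), function(x)
  lapply(split(x["value"], x$key), as.list))

# apply() works on a character matrix, so the value comes back as text
class(via_apply$x$a$value)

# the split-based variant keeps the original numeric column type
class(via_split$x$a$value)
```

If the values are needed as numbers downstream, the split-based variant avoids a later as.numeric() conversion.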
Improvements are welcome!