Error in performing aggregation (e.g. sum, mean) with values_fn using pivot_wider() in R
I have a dataset in the following format:
> library(tidyverse)
> library(tibble)
>
>
> data<-data.frame(ID=c(1,1,2,2,3,3,3,3,4,4),
+ Radius=c(5,5,5,5,10,10,15,15,10,10),
+ neighb_ID=c(1,11,2,12,3,4,7,8,3,4),
+ var_neighb=c(50,20,30,40,15,100,70,60,15,100))
> data
   ID Radius neighb_ID var_neighb
1   1      5         1         50
2   1      5        11         20
3   2      5         2         30
4   2      5        12         40
5   3     10         3         15
6   3     10         4        100
7   3     15         7         70
8   3     15         8         60
9   4     10         3         15
10  4     10         4        100
>
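Now I want to pivot this data so that, for each ID, var_neighb is aggregated by Radius. For example, for sum and mean, I would like to get the following table (the S prefix marks sums, the M prefix marks means):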
  ID Svar_neighb_Radius_5 Svar_neighb_Radius_10 Svar_neighb_Radius_15
1  1                   20                     0                     0
2  2                   40                     0                     0
3  3                    0                   100                   130
4  4                    0                    15                     0
  Mvar_neighb_Radius_5 Mvar_neighb_Radius_10 Mvar_neighb_Radius_15
1                   20                     0                     0
2                   40                     0                     0
3                    0                   100                    65
4                    0                    15                     0
>
I tried to do this with the following code:
> agdata<-data %>%
+ pivot_wider(
+ names_from = Radius,
+ values_from = var_neighb,
+ values_fn = sum,
+ values_fill = 0
+ )
All I get is the following error:
Error in values_fn[[value]] : object of type 'builtin' is not subsettable
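Furthermore, even if I take out values_fn = sum, I get the following error: Error in values_fill[[value]]: subscript out of bounds.
Can someone help me resolve these issues and achieve my goal?
Edit: Sorry, I overlooked an important requirement for the output table: the aggregation should be by sum and mean, and it should not include values of var_neighb where neighb_ID equals ID. The output table data_out needs to be aggregated by both sum and mean. I have updated data accordingly.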
values_fn and values_fill should be named lists, with one entry per column given in values_from:
library(tidyverse)
data <- data.frame(
  ID = c(1, 1, 2, 2, 3, 3, 3, 4, 4),
  Radius = c(5, 5, 5, 5, 10, 10, 15, 10, 10),
  neighb_ID = c(1, 11, 2, 12, 3, 4, 7, 3, 4),
  var_neighb = c(50, 20, 30, 40, 15, 100, 70, 15, 100)
)
data %>%
  select(-neighb_ID) %>%
  pivot_wider(
    names_from = Radius,
    values_from = var_neighb,
    values_fn = list(var_neighb = sum),
    values_fill = list(var_neighb = 0),
    names_prefix = "var_neighb_Radius_"
  )
# # A tibble: 4 x 4
# ID var_neighb_Radius_5 var_neighb_Radius_10 var_neighb_Radius_15
# <dbl> <dbl> <dbl> <dbl>
# 1 1 70 0 0
# 2 2 70 0 0
# 3 3 0 115 70
# 4 4 0 115 0
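Both error messages in the question are what base R produces when a bare function or an unnamed scalar is indexed by name, which is presumably what the tidyr version in the question did internally (judging from the values_fn[[value]] and values_fill[[value]] in the messages). The snippet below is only a sketch of that mechanism, not tidyr's actual code; the names msg_fn, msg_fill and fn are made up for illustration. As far as I know, tidyr 1.1.0 and later also accept a bare function and a scalar here, but the named-list form works either way.

```r
# Indexing a bare function by name fails: this reproduces the first error message.
msg_fn <- tryCatch(
  sum[["var_neighb"]],
  error = function(e) conditionMessage(e)
)

# Indexing an unnamed scalar by name fails too: this reproduces the second one.
values_fill <- 0
msg_fill <- tryCatch(
  values_fill[["var_neighb"]],
  error = function(e) conditionMessage(e)
)

# A named list can be indexed by the value column's name, so the lookup succeeds.
fn <- list(var_neighb = sum)[["var_neighb"]]

print(msg_fn)
print(msg_fill)
print(fn(c(50, 20)))
```

This is why wrapping the function and the fill value in lists keyed by var_neighb makes both lookups succeed.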
Update: to drop the values where ID == neighb_ID, just add a filter:
data %>%
  filter(ID != neighb_ID) %>%
  select(-neighb_ID) %>%
  pivot_wider(
    names_from = Radius,
    values_from = var_neighb,
    values_fn = list(var_neighb = sum),
    values_fill = list(var_neighb = 0),
    names_prefix = "var_neighb_Radius_"
  )
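I'm not quite sure what you mean by "aggregate by mean and by sum": you can't apply two different aggregations within a single column, but you can make two pivots and join them together: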
library(dplyr)
inner_join(
  data %>%
    filter(ID != neighb_ID) %>%
    select(-neighb_ID) %>%
    pivot_wider(
      names_from = Radius,
      values_from = var_neighb,
      values_fn = list(var_neighb = sum),
      values_fill = list(var_neighb = 0),
      names_prefix = "var_neighb_Radius_sum_"
    ),
  data %>%
    filter(ID != neighb_ID) %>%
    select(-neighb_ID) %>%
    pivot_wider(
      names_from = Radius,
      values_from = var_neighb,
      values_fn = list(var_neighb = mean),
      values_fill = list(var_neighb = 0),
      names_prefix = "var_neighb_Radius_mean_"
    ),
  by = "ID"
)
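As a possible alternative to joining two pivots, the aggregation can be done first with group_by() and summarise(), and the two summary columns pivoted in a single pivot_wider() call. This is a sketch rather than part of the original answer: it assumes dplyr 1.0.0 or later (for the .groups argument) and, as far as I know, tidyr 1.1.0 or later (for names_glue). The result name both and the intermediate column names sum and mean are chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Same 9-row data as in the answer above
data <- data.frame(
  ID = c(1, 1, 2, 2, 3, 3, 3, 4, 4),
  Radius = c(5, 5, 5, 5, 10, 10, 15, 10, 10),
  neighb_ID = c(1, 11, 2, 12, 3, 4, 7, 3, 4),
  var_neighb = c(50, 20, 30, 40, 15, 100, 70, 15, 100)
)

both <- data %>%
  # Drop rows where the neighbour is the ID itself
  filter(ID != neighb_ID) %>%
  # Aggregate once per ID/Radius pair, computing both statistics
  group_by(ID, Radius) %>%
  summarise(
    sum = sum(var_neighb),
    mean = mean(var_neighb),
    .groups = "drop"
  ) %>%
  # Spread both summary columns; missing ID/Radius combinations become 0
  pivot_wider(
    names_from = Radius,
    values_from = c(sum, mean),
    values_fill = list(sum = 0, mean = 0),
    names_glue = "var_neighb_Radius_{.value}_{Radius}"
  )

print(both)
```

The column names follow the same var_neighb_Radius_sum_* and var_neighb_Radius_mean_* pattern as the joined version, so the two approaches should be interchangeable for this data.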