R: add rows to a grouped df using dplyr
I have a grouped df and I would like to add additional rows to the top of each group, matched on a variable (item_code) from the df. The additional rows do not have an id column, and they should not be duplicated within the groups of df.
Example data:
library(dplyr)

# as.tibble() is deprecated; as_tibble() is the current name
df <- as_tibble(data.frame(id = rep(1:3, each = 2),
                           item_code = c("A", "A", "B", "B", "B", "Z"),
                           score = rep(1, 6)))
additional_rows <- as_tibble(data.frame(item_code = c("A", "Z"),
                                        score = c(6, 6)))
What I tried
I found this post and tried to apply it: Add row in each group using dplyr and add_row()
df %>% group_by(id) %>% do(add_row(additional_rows %>%
filter(item_code %in% .$item_code)))
What I get:
# A tibble: 9 x 3
# Groups: id [3]
id item_code score
<int> <fct> <dbl>
1 1 A 6
2 1 Z 6
3 1 NA NA
4 2 A 6
5 2 Z 6
6 2 NA NA
7 3 A 6
8 3 Z 6
9 3 NA NA
What I am looking for:
# A tibble: 8 x 3
id item_code score
<int> <fct> <dbl>
1 1 A 6
2 1 A 1
3 1 A 1
4 2 B 1
5 2 B 1
6 3 B 1
7 3 Z 6
8 3 Z 1
This should do the trick:
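library(plyr)  # provides join(); %>% is assumed to be available already (e.g. from dplyr)

df %>%
  join(
    # rows of df that have a matching additional row, one per id/item_code pair
    subset(df, item_code %in% additional_rows$item_code, select = c(id, item_code)) %>%
      join(additional_rows) %>%
      subset(!duplicated(.)),
    type = "full"
  ) %>%
  arrange(id, item_code, -score)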
Not sure if it's the best way, but it works.
Edit: to get the scores in the same order, I added the other arrange terms.
Edit 2: alright, there should now be no duplicated rows added from the additional rows, as per your comment.
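For comparison, here is a rough dplyr-only sketch of the same idea, in case you would rather not load plyr. It is not the code from the answer above. The names extra and result are just illustrative, and I build the data with tibble() so item_code is a character column rather than a factor. Note that the added rows end up on top only because they are sorted by descending score, which assumes the added score (6) is higher than the existing ones.

```r
library(dplyr)

df <- tibble(
  id = rep(1:3, each = 2),
  item_code = c("A", "A", "B", "B", "B", "Z"),
  score = rep(1, 6)
)
additional_rows <- tibble(item_code = c("A", "Z"), score = c(6, 6))

# One extra row per (id, item_code) pair that has a match in additional_rows
extra <- df %>%
  distinct(id, item_code) %>%
  inner_join(additional_rows, by = "item_code")

# Stack the extra rows onto df, then sort so the higher score comes first
result <- bind_rows(extra, df) %>%
  arrange(id, item_code, desc(score))

print(result)
```

distinct() is what keeps an additional row from being added twice when a group contains the same item_code more than once (id 1 has two A rows but gets only one extra A).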