R data.table group by continuous values
I need some help with grouping data by continuous (consecutive) values.
If I have this data.table:
dt <- data.table::data.table(
  a = c(1, 1, 1, 2, 2, 2, 2, 1, 1, 2),
  b = 1:10,
  c = 1:10 + 1
)
    a  b  c
 1: 1  1  2
 2: 1  2  3
 3: 1  3  4
 4: 2  4  5
 5: 2  5  6
 6: 2  6  7
 7: 2  7  8
 8: 1  8  9
 9: 1  9 10
10: 2 10 11
I need a group for every run of consecutive equal values in column a. For each group I need the first (which is also the smallest) value of column b and the last (which is also the largest) value of column c.
Like this:
   a  b  c
1: 1  1  4
2: 2  4  8
3: 1  8 10
4: 2 10 11
Thank you very much for your help. I can't get it solved on my own.
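We can probably try rleid(), which gives each run of consecutive equal values in a its own id (data.table must be attached for rleid() to be found):
> library(data.table)
> dt[, .(a = a[1], b = b[1], c = c[.N]), by = rleid(a)][, -1]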
   a  b  c
1: 1  1  4
2: 2  4  8
3: 1  8 10
4: 2 10 11
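To see why this works, it can help to look at the run ids before aggregating. The following is only a sketch that rebuilds the dt from the question; the names run_ids, run and res are illustrative and not part of the original answer, and first()/last() are used here in place of a[1] and c[.N].

```r
library(data.table)

# Same data as in the question
dt <- data.table(a = c(1, 1, 1, 2, 2, 2, 2, 1, 1, 2),
                 b = 1:10,
                 c = 1:10 + 1)

# rleid() starts a new id every time the value of a changes
run_ids <- rleid(dt$a)
print(run_ids)  # 1 1 1 2 2 2 2 3 3 4

# Same aggregation, with the run id kept under an explicit name (run)
# and dropped by name afterwards instead of by position
res <- dt[, .(a = first(a), b = first(b), c = last(c)),
          by = .(run = rleid(a))][, !"run"]
print(res)
```

Dropping the helper column by name rather than with [, -1] is just a design choice: it keeps working if the column order changes.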
An option with dplyr:
library(dplyr)
dt %>%
  group_by(grp = cumsum(c(TRUE, diff(a) != 0))) %>%
  summarise(across(a:b, first), c = last(c)) %>%
  select(-grp)
-output
# A tibble: 4 × 3
      a     b     c
  <dbl> <int> <dbl>
1     1     1     4
2     2     4     8
3     1     8    10
4     2    10    11
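The cumsum(diff(a) != 0) trick relies on a being numeric. If dplyr 1.1.0 or later is installed, consecutive_id() should give the same run ids and also works for non-numeric columns. This is a sketch under that version assumption, not part of the original answer; it rebuilds the data as a tibble, and res is just an illustrative name.

```r
library(dplyr)

# Same data as in the question, built as a tibble
dt <- tibble(a = c(1, 1, 1, 2, 2, 2, 2, 1, 1, 2),
             b = 1:10,
             c = 1:10 + 1)

res <- dt %>%
  # consecutive_id() (assumes dplyr >= 1.1.0) increments whenever a changes
  group_by(grp = consecutive_id(a)) %>%
  summarise(a = first(a), b = first(b), c = last(c)) %>%
  select(-grp)

print(res)
```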