R data.table: group by consecutive values

Question

I need some help with grouping data by runs of consecutive equal values.

If I have this data.table:

dt <- data.table::data.table( a = c(1,1,1,2,2,2,2,1,1,2), b = 1:10, c = 1:10 + 1 )
 
    a  b  c
 1: 1  1  2
 2: 1  2  3
 3: 1  3  4
 4: 2  4  5
 5: 2  5  6
 6: 2  6  7
 7: 2  7  8
 8: 1  8  9
 9: 1  9 10
10: 2 10 11

I need one group for each run of consecutive equal values in column a. For each group I need the first (which is also the smallest) value of column b and the last (which is also the largest) value of column c.

Like this:

Thank you very much for your help. I could not solve this on my own.

Answer 1

We can probably try rleid() from data.table:

> dt[, .(a = a[1], b = b[1], c = c[.N]), rleid(a)][, -1]
   a  b  c
1: 1  1  4
2: 2  4  8
3: 1  8 10
4: 2 10 11
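
To see why this works, it may help to look at the run ids that rleid() assigns before summarising. The following is a minimal sketch, assuming the data.table package is installed and attached; the column name grp is only an illustrative choice and does not appear in the answer above.

```r
library(data.table)

dt <- data.table(a = c(1, 1, 1, 2, 2, 2, 2, 1, 1, 2), b = 1:10, c = 1:10 + 1)

# rleid() gives every run of consecutive equal values its own id
run_id <- rleid(dt$a)
run_id
# [1] 1 1 1 2 2 2 2 3 3 4

# Same summary as in the answer, with the run id kept under the
# illustrative name `grp` so the grouping stays visible
res <- dt[, .(a = a[1], b = b[1], c = c[.N]), by = .(grp = rleid(a))]
res

# Drop the helper column by reference once it is no longer needed
res[, grp := NULL]
```

Because the id is based on runs rather than on values, the two separate runs of a == 1 end up in different groups, which is what a plain by = a would not do.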

Answer 2

An option with dplyr:

library(dplyr)
dt %>%
  group_by(grp = cumsum(c(TRUE, diff(a) != 0))) %>%
  summarise(across(a:b, first), c = last(c)) %>%
  select(-grp)

Output:

# A tibble: 4 × 3
      a     b     c
  <dbl> <int> <dbl>
1     1     1     4
2     2     4     8
3     1     8    10
4     2    10    11
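
The grouping expression cumsum(c(TRUE, diff(a) != 0)) can be unpacked in base R. This is a small sketch of the idea only; the variable names changed, grp, x and grp_chr are illustrative, and the character-vector variant is an addition that is not part of the answer above.

```r
a <- c(1, 1, 1, 2, 2, 2, 2, 1, 1, 2)

# TRUE wherever the value differs from the previous element;
# the first element always starts a new run
changed <- c(TRUE, diff(a) != 0)

# The running count of run starts is the group id
grp <- cumsum(changed)
grp
# [1] 1 1 1 2 2 2 2 3 3 4

# diff() needs numeric input; for other types, comparing each
# element with its predecessor gives the same kind of id
x <- c("u", "u", "v", "u")
grp_chr <- cumsum(c(TRUE, x[-1] != x[-length(x)]))
grp_chr
# [1] 1 1 2 3
```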

2 answers

Answer 1: 4, accepted, 2022-11-15 13:58:48

Answer 2: 1, 2022-11-15 16:48:58
