Calculate the median for each consecutive run

Question

I have a data.frame like the one below:

df <- data.frame(ID = c(2,3,5,8,9,10,12,14,15,16,17),
                 value = c(1,2,3,4,5,6,7,8,9,10,11))
> df
   ID value
1   2     1
2   3     2
3   5     3
4   8     4
5   9     5
6  10     6
7  12     7
8  14     8
9  15     9
10 16    10
11 17    11

I would like to obtain the median of value for each run of consecutive ID values. For example, the ID values in the first two rows are 2 and 3, which are consecutive. In this case, I want the median of value over the first two rows, which should be

> median(c(1,2))
[1] 1.5

The next runs of consecutive ID values are 8,9,10 and 14,15,16,17. The corresponding medians should be

> median(c(4,5,6))
[1] 5
> median(c(8,9,10,11))
[1] 9.5

What I finally want is a data.frame like the one below, with the first ID of each run and its median:

   ID   median
1   2    1.5
2   8    5
3  14    9.5

I suspect rle might be useful, but I am not sure how to implement this. Do you have any suggestions? I would be grateful for any help.

Answer 1

Here is a data.table option:

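library(data.table)

setDT(df)[
  ,
  # keep only runs with more than one row; return the first ID and the median value
  if (.N > 1) data.table(ID = min(ID), value = median(value)),
  # a new group starts wherever the gap to the previous ID is not 1
  .(grp = cumsum(c(TRUE, diff(ID) != 1)))
][
  ,
  grp := NULL  # drop the helper grouping column
][]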

which, for the data as printed in the question, gives

   ID value
1:  2   1.5
2:  8   5.0
3: 14   9.5
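
The grouping key used in the answer, cumsum(c(TRUE, diff(ID) != 1)), can be easier to follow when built step by step. The following is a minimal base R sketch of the same idea, not part of the original answer. It assumes the data as printed in the question (IDs ending in 14,15,16,17), and the names grp, keep, runs and res are illustrative only.

```r
# Data as printed in the question (assumption: last run is 14..17)
df <- data.frame(ID = c(2, 3, 5, 8, 9, 10, 12, 14, 15, 16, 17),
                 value = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11))

# A new run starts at the first row and wherever the gap to the previous ID is not 1
grp <- cumsum(c(TRUE, diff(df$ID) != 1))
# grp is 1 1 2 3 3 3 4 5 5 5 5

# Keep only rows that belong to a run of more than one ID
keep <- ave(df$ID, grp, FUN = length) > 1
runs <- df[keep, ]
runs$grp <- grp[keep]

# First ID and median of value for each remaining run
res <- data.frame(
  ID     = as.vector(tapply(runs$ID, runs$grp, min)),
  median = as.vector(tapply(runs$value, runs$grp, median))
)
res
```

Isolated IDs such as 5 and 12 form runs of length one, so they are dropped before the medians are computed, which mirrors the .N > 1 condition in the data.table version.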

Solution 1
0 · Accepted · 2021-04-08 22:16:27