Calculating median in each consecutive run
I have a data.frame as below:
df <- data.frame(ID = c(2,3,5,8,9,10,12,14,15,16,17),
                 value = c(1,2,3,4,5,6,7,8,9,10,11))
> df
   ID value
1   2     1
2   3     2
3   5     3
4   8     4
5   9     5
6  10     6
7  12     7
8  14     8
9  15     9
10 16    10
11 17    11
Here, I would like to obtain the list of medians of value over each run of consecutive ID. For example, ID in the first two rows is 2,3, which is consecutive. In this case, I would like to obtain the median of value in the first two rows, which should be
> median(c(1,2))
[1] 1.5
The next runs of consecutive ID are 8,9,10 and 14,15,16,17. The corresponding medians should be
> median(c(4,5,6))
[1] 5
> median(c(8,9,10,11))
[1] 9.5
What I finally want is a data.frame like the one below, with the first ID of each run and its median:
  ID median
1  2    1.5
2  8    5.0
3 14    9.5
I wonder whether rle might be useful, but I am not sure how to implement this. Do you have any suggestions? I would be grateful for any help.
Here is a data.table option:
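library(data.table)

# grp starts a new run wherever the gap to the previous ID is not 1.
# Runs with a single row (.N == 1) return NULL and are dropped.
setDT(df)[
  ,
  if (.N > 1) data.table(ID = min(ID), value = median(value)),
  .(grp = cumsum(c(TRUE, diff(ID) != 1)))
][
  ,
  grp := NULL   # drop the helper grouping column
][]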
which gives, for the df printed in the question (where the last run is 14, 15, 16, 17):
   ID value
1:  2   1.5
2:  8   5.0
3: 14   9.5
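Since the question mentions rle, the same grouping idea can also be sketched in base R. This is only a minimal sketch, not the answer's method: it assumes the data as printed in the question (last run 14, 15, 16, 17), and the names run, r, keep, sub, grp and res are made up for illustration.

```r
# Data as printed in the question (assumption: last run is 14, 15, 16, 17).
df <- data.frame(ID = c(2, 3, 5, 8, 9, 10, 12, 14, 15, 16, 17),
                 value = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11))

# A new run starts wherever the gap to the previous ID is not exactly 1.
run <- cumsum(c(TRUE, diff(df$ID) != 1))

# rle() on the run labels gives each run's length; keep runs of 2+ rows.
r <- rle(run)
keep <- r$values[r$lengths > 1]
sub <- df[run %in% keep, ]
grp <- run[run %in% keep]

# First ID and median of value for each remaining run.
res <- data.frame(
  ID = as.numeric(tapply(sub$ID, grp, min)),
  median = as.numeric(tapply(sub$value, grp, median))
)
res
```

Here cumsum over the "gap is not 1" flags does the real work of labelling runs. rle is only used to find which labels cover more than one row, so isolated IDs such as 5 and 12 are left out.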