R: calculate the group median of certain rows and the last row

Question

I'm working with grouping and medians. I'd like to group a data.frame and get, for each group, the median of certain rows (not all of them) and the last value.
My data look something like this:

 test <- data.frame(
id = c('A','A','A','A','A','B','B','B','B','B','C','C','C','C'),
value = c(1,2,3,4,5,3,4,5,1,8,3,4,2,9))
> test
   id value
1   A     1
2   A     2
3   A     3
4   A     4
5   A     5
6   B     3
7   B     4
8   B     5
9   B     1
10  B     8
11  C     3
12  C     4
13  C     2
14  C     9

For each id, I need the median of the three central rows (the number may vary; in this case it is three), then the last value.
First I tried with only one id.

test_a <- test[which(test$id == 'A'),]
> test_a
  id value
1  A     1
2  A     2
3  A     3
4  A     4
5  A     5

The desired output for this one is what these two expressions return:

median(test_a[(nrow(test_a)-3):(nrow(test_a)-1),]$value) # median of three central values
tail(test_a,1)$value                                     # last value

I used this:

library(tidyverse)

test_a %>% group_by(id) %>%
  summarise(m = median(test_a[(nrow(test_a)-3):(nrow(test_a)-1),]$value),
            last = tail(test_a,1)$value) %>%
  data.frame()
  id m last
1  A 3    5

But when I tried to generalize to all ids:

test %>% group_by(id) %>%
   summarise(m = median(test[(nrow(test)-3):(nrow(test)-1),]$value),
             last = tail(test,1)$value) %>%
   data.frame()
  id m last
1  A 3    9
2  B 3    9
3  C 3    9

I think the formulas use the full dataset to calculate the last value and the median, but I can't figure out how to make it work per group. Thanks in advance.

Answer 1

This works:

test %>% 
  group_by(id) %>%
  summarise(m = median(value[(length(value)-3):(length(value)-1)]),
            last = value[length(value)])

# A tibble: 3 x 3
      id     m  last
  <fctr> <dbl> <dbl>
1      A     3     5
2      B     4     8
3      C     3     9

You just refer to the variable value instead of the whole dataset within summarise. Inside summarise, value holds only the rows of the current group, whereas test is always the full data.frame.
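To make the difference visible, here is a small sketch (assuming dplyr is installed; the column names rows_in_group and rows_in_test are only illustrative) that compares the per-group row count with nrow(test) inside the same summarise call:

```r
library(dplyr)

test <- data.frame(
  id = c('A','A','A','A','A','B','B','B','B','B','C','C','C','C'),
  value = c(1,2,3,4,5,3,4,5,1,8,3,4,2,9))

check <- test %>%
  group_by(id) %>%
  summarise(rows_in_group = n(),         # size of the current group only
            rows_in_test  = nrow(test),  # always the full data.frame
            last          = last(value)) # last value within the group

print(check)
```

rows_in_test is the same for every id because test is not affected by group_by, which is why the original attempt returned identical results for all groups.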

Edit: Here's a generalized version.

test %>% 
  group_by(id) %>%
  # one row: the value itself; two rows: their median; otherwise: median of the middle three rows
  summarise(m = ifelse(length(value) == 1, value, 
                       ifelse(length(value) == 2, median(value), 
                              median(value[(ceiling(length(value)/2)-1):(ceiling(length(value)/2)+1)]))),
            last = value[length(value)])

If a group has only one row, the value itself will be stored in m. If it has only two rows, the median of these two rows will be stored in m. If it has three or more rows, the middle three rows will be chosen dynamically and their median will be stored in m. For a group with an even number of rows, the window is centred on the lower of the two middle rows (for example, rows 1 to 3 of a four-row group).
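Since the question mentions that the number of central rows may vary, the same idea can be sketched as a small helper. middle_median and its argument k are hypothetical names, not part of dplyr or of the original answer; for k = 3 the window is chosen so that it matches the positions used above.

```r
library(dplyr)

# Hypothetical helper: median of the k central elements of x.
# If x has k or fewer elements, the median of all of x is returned.
middle_median <- function(x, k = 3) {
  n <- length(x)
  if (n <= k) return(median(x))
  start <- floor((n - k) / 2) + 1   # first position of the central window
  median(x[start:(start + k - 1)])
}

test <- data.frame(
  id = c('A','A','A','A','A','B','B','B','B','B','C','C','C','C'),
  value = c(1,2,3,4,5,3,4,5,1,8,3,4,2,9))

res <- test %>%
  group_by(id) %>%
  summarise(m = middle_median(value, k = 3),
            last = value[length(value)])

print(res)
```

Keeping the window logic in one function avoids the nested ifelse calls and lets k be changed in a single place.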
