Mean over different percentage lengths

Question

I have a data frame that looks something like this:

ID	Time	Value
A	0	84
A	1	90
A	2	76
A	3	98
B	0	64
B	1	81
C	0	89
C	1	76

I need to take the mean of the first 10% of values for each ID.

I used to do a similar process with the slice_head function, but previously I took the same length for each variable and then used aggregate (grouped by ID) on the new data frame. Now that each ID has a different length, slice keeps giving an error.

I have attempted map2 and lapply, but I cannot quite get it to work.

Answer 1

I am assuming that in the original data the number of data points for each ID is far larger than in this example.

jnk<-data.frame(ID=c(rep("A",4),rep("B",2),rep("C",2)),Time=c(0,1,2,3,0,1,0,1),Value=c(84,90,76,98,64,81,89,76))

Mean of the first 10% of each ID:

# Note: 1:(n * 0.1) truncates a fractional upper bound, so with fewer than
# 20 rows per ID only the first value is used
mean(jnk$Value[which(jnk$ID=="A")[1:(length(jnk$Value[which(jnk$ID=="A")])*0.1)]])
mean(jnk$Value[which(jnk$ID=="B")[1:(length(jnk$Value[which(jnk$ID=="B")])*0.1)]])
mean(jnk$Value[which(jnk$ID=="C")[1:(length(jnk$Value[which(jnk$ID=="C")])*0.1)]])

With a sufficiently large number of data points for each ID, this gives the mean of the first 10% of values.
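The calls above repeat the same expression once per ID. As a rough sketch, the same idea can be applied to every ID in one step with tapply. The helper name mean_first and the choice to always keep at least one value are assumptions made for illustration, not part of the original answer.

```r
# Same example data as in this answer
jnk <- data.frame(ID = c(rep("A", 4), rep("B", 2), rep("C", 2)),
                  Time = c(0, 1, 2, 3, 0, 1, 0, 1),
                  Value = c(84, 90, 76, 98, 64, 81, 89, 76))

# Hypothetical helper: mean of the first `prop` share of a vector,
# rounding the count down but always keeping at least one value
mean_first <- function(x, prop = 0.1) {
  k <- max(1, floor(length(x) * prop))
  mean(x[seq_len(k)])
}

# Apply the helper to Value within each ID; the result is named by ID
first_means <- tapply(jnk$Value, jnk$ID, mean_first)
first_means
```

Because every ID here has fewer than 10 rows, the helper falls back to a single value per ID, so each result is simply the first Value of that ID.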

Answer 2

I am not sure if this is what you want, but you could use slice with ceiling to get the first 10% of rows per group and then summarise the mean of your Value column like this:

df <- data.frame(ID = c("A", "A", "A", "A", "B", "B", "C", "C"),
                 Time = c(0, 1, 2, 3, 0, 1, 0, 1),
                 Value = c(84, 90, 76, 98, 64, 81, 89, 76))

library(dplyr)
df %>%
  group_by(ID) %>%
  slice(1:ceiling(0.1 * n())) %>%
  summarise(avg_value = mean(Value))
#> # A tibble: 3 × 2
#>   ID    avg_value
#>   <chr>     <dbl>
#> 1 A            84
#> 2 B            64
#> 3 C            89

Created on 2022-07-08 by the reprex package (v2.0.1)

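Please note: the results look odd because there is so little data. With only two to four rows per ID, ceiling(0.1 * n()) is 1, so each mean is just the first Value of that ID.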
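To see the same pipeline average over more than one row, here is a small sketch on a larger, made-up data set. The data frame big, its group sizes (40, 20 and 25 rows) and the extra n_used column are assumptions chosen for illustration only.

```r
library(dplyr)

# Hypothetical larger data set: 40, 20 and 25 rows for IDs A, B and C
big <- data.frame(ID = rep(c("A", "B", "C"), times = c(40, 20, 25)),
                  Value = c(1:40, 1:20, 1:25))

res <- big %>%
  group_by(ID) %>%
  # keep the first 10% of rows per group, rounded up
  slice(seq_len(ceiling(0.1 * n()))) %>%
  # n_used records how many rows went into each mean
  summarise(n_used = n(), avg_value = mean(Value))
res
```

Here the groups contribute 4, 2 and 3 rows respectively, so the number of rows averaged now scales with the size of each ID.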


Answer 1
0 2022-07-08 07:19:56

Answer 2
0 2022-07-08 08:09:44