Average of Values of Different Percentage Lengths

I have a data frame that looks something like this:

ID  Time  Value
A   0     84
A   1     90
A   2     76
A   3     98
B   0     64
B   1     81
C   0     89
C   1     76

I need to take the mean of the first 10% of values for each ID.

I used to do something similar with the slice_head function, but back then I took the same number of rows for every group and then used aggregate (grouped by ID) on the new data frame. Now that each ID has a different number of rows, slice keeps giving an error.

I have attempted map2 and lapply, but I cannot quite get either to work.

In the real data, I expect the number of data points for each ID to be far larger.

jnk<-data.frame(ID=c(rep("A",4),rep("B",2),rep("C",2)),Time=c(0,1,2,3,0,1,0,1),Value=c(84,90,76,98,64,81,89,76))

Mean of the first 10% of values for each ID:

# With 1 to 9 rows for an ID, 1:(n * 0.1) reduces to 1, so only the first value is averaged.
mean(jnk$Value[which(jnk$ID=="A")[1:(length(jnk$Value[which(jnk$ID=="A")])*0.1)]])
mean(jnk$Value[which(jnk$ID=="B")[1:(length(jnk$Value[which(jnk$ID=="B")])*0.1)]])
mean(jnk$Value[which(jnk$ID=="C")[1:(length(jnk$Value[which(jnk$ID=="C")])*0.1)]])

Once each ID has enough data points, this gives the mean of the first 10% of values for that ID.
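The three per-ID lines above can be folded into a single call. The following is only a sketch in base R: `mean_first_frac` is a hypothetical helper name, and it assumes that at least one value should be kept per ID and that the fraction is rounded down otherwise.

```r
jnk <- data.frame(ID = c(rep("A", 4), rep("B", 2), rep("C", 2)),
                  Time = c(0, 1, 2, 3, 0, 1, 0, 1),
                  Value = c(84, 90, 76, 98, 64, 81, 89, 76))

# Hypothetical helper: mean of the first `frac` share of a vector.
# Assumption: round the count down, but always keep at least one value.
mean_first_frac <- function(x, frac = 0.1) {
  k <- max(1, floor(frac * length(x)))
  mean(x[seq_len(k)])
}

# Apply the helper to Value separately for each ID.
res <- tapply(jnk$Value, jnk$ID, mean_first_frac)
res
```

Because `tapply` splits `Value` by `ID` before calling the helper, IDs with different numbers of rows need no special handling.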

I am not sure if this is what you want, but you could use slice with ceiling to get the first 10% of rows per group and then summarise the mean of your Value column like this:

df <- data.frame(ID = c("A", "A", "A", "A", "B", "B", "C", "C"),
                 Time = c(0, 1, 2, 3, 0, 1, 0, 1),
                 Value = c(84, 90, 76, 98, 64, 81, 89, 76))

library(dplyr)
df %>%
  group_by(ID) %>%
  slice(1:ceiling(0.1 * n())) %>%
  summarise(avg_value = mean(Value))
#> # A tibble: 3 × 2
#>   ID    avg_value
#>   <chr>     <dbl>
#> 1 A            84
#> 2 B            64
#> 3 C            89

Created on 2022-07-08 by the reprex package (v2.0.1)

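Please note: the results look odd because the sample has so few rows per ID. ceiling(0.1 * n()) is 1 for every group here, so each mean is simply the first value of that ID.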
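To see how the ceiling-based cut-off behaves when groups are larger and of different sizes, here is a small sketch with made-up data (the data frame `big` and the column `n_used` are illustrative and not from the question). Value is set to the row index within each ID so the expected means are easy to check by hand.

```r
library(dplyr)

# Illustrative data only: 20, 35 and 50 rows for IDs A, B and C.
big <- data.frame(ID = rep(c("A", "B", "C"), times = c(20, 35, 50)),
                  Value = c(1:20, 1:35, 1:50))

out <- big %>%
  group_by(ID) %>%
  summarise(n_used = ceiling(0.1 * n()),                           # rows kept per ID
            avg_value = mean(Value[seq_len(ceiling(0.1 * n()))]))  # mean of those rows
out
```

With these sizes the cut-off keeps 2, 4 and 5 rows respectively, so a group whose 10% is not a whole number (35 rows here) is rounded up rather than down.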
