Average of Values of Different Percentage Lengths
I have a data frame that looks something like this:
| ID | Time | Value |
|---|---|---|
| A | 0 | 84 |
| A | 1 | 90 |
| A | 2 | 76 |
| A | 3 | 98 |
| B | 0 | 64 |
| B | 1 | 81 |
| C | 0 | 89 |
| C | 1 | 76 |
I need to take the mean of the first 10% of values for each ID.
I used to do a similar process with the slice_head function, but previously I had taken the same length for each variable and used aggregate (grouped by ID) with the new data frame. Now that the lengths of each ID are different, slice keeps giving an error.
I have attempted map2 and lapply, but I cannot quite get it to work.
In the original data, I expect that the number of data points for each ID will be far greater.
jnk <- data.frame(ID = c(rep("A", 4), rep("B", 2), rep("C", 2)),
                  Time = c(0, 1, 2, 3, 0, 1, 0, 1),
                  Value = c(84, 90, 76, 98, 64, 81, 89, 76))
Mean of the first 10% of each ID:
# 10% of the group size is rounded down, but at least one value is always used
mean(jnk$Value[jnk$ID == "A"][seq_len(max(1, floor(0.1 * sum(jnk$ID == "A"))))])
mean(jnk$Value[jnk$ID == "B"][seq_len(max(1, floor(0.1 * sum(jnk$ID == "B"))))])
mean(jnk$Value[jnk$ID == "C"][seq_len(max(1, floor(0.1 * sum(jnk$ID == "C"))))])
This gives the mean of the first 10% of values for each ID, provided each ID has a sufficiently large number of data points. With the small sample above, each result is simply the first value of the ID.
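The same calculation can be done without repeating the line for every ID. The following is only a sketch in base R, not part of the original answer: mean_first_prop is a hypothetical helper name, and it assumes the rows are already in time order within each ID and that at least one value should always be used.

```r
# Hypothetical helper (not from the original post): mean of the first `prop`
# share of a vector, rounding down but always using at least one value.
mean_first_prop <- function(x, prop = 0.1) {
  k <- max(1, floor(prop * length(x)))
  mean(x[seq_len(k)])
}

jnk <- data.frame(
  ID = c(rep("A", 4), rep("B", 2), rep("C", 2)),
  Time = c(0, 1, 2, 3, 0, 1, 0, 1),
  Value = c(84, 90, 76, 98, 64, 81, 89, 76)
)

# tapply splits Value by ID, keeping the row order within each ID,
# and applies the helper to each piece.
first_means <- tapply(jnk$Value, jnk$ID, mean_first_prop)
print(first_means)
```

With the sample data every ID has fewer than ten rows, so each mean is just the first Value of that ID (84, 64 and 89).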
I am not sure if this is what you want, but you could use slice with ceiling to get the first 10% of rows per group and then summarise the mean of your Value column like this:
df <- data.frame(ID = c("A", "A", "A", "A", "B", "B", "C", "C"),
Time = c(0, 1, 2, 3, 0, 1, 0, 1),
Value = c(84, 90, 76, 98, 64, 81, 89, 76))
library(dplyr)
df %>%
group_by(ID) %>%
slice(1:ceiling(0.1 * n())) %>%
summarise(avg_value = mean(Value))
#> # A tibble: 3 × 2
#> ID avg_value
#> <chr> <dbl>
#> 1 A 84
#> 2 B 64
#> 3 C 89
Created on 2022-07-08 by the reprex package (v2.0.1)
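Please note: the results look odd because there is so little data. With fewer than ten rows per ID, ceiling(0.1 * n()) is 1, so each mean is just the first value of the group.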
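To see how the number of rows taken depends on the group size, here is a small base R sketch. The group sizes are made up for illustration and are not from the question. It compares rounding 10% up, as the slice call above does, with rounding down. As far as I know, slice_head(prop = ) in dplyr rounds toward zero, which would correspond to the rounding-down column, but treat that as an assumption and check the documentation for your dplyr version.

```r
# Illustrative group sizes (made up for this sketch, not from the post).
group_sizes <- c(2, 4, 10, 11, 25, 100)

# Rows kept by the rule used above: round 10% of the group size up.
rows_ceiling <- ceiling(0.1 * group_sizes)

# Rows kept if 10% is rounded down instead; small groups end up with zero rows.
rows_floor <- floor(0.1 * group_sizes)

print(data.frame(group_sizes, rows_ceiling, rows_floor))
```

Rounding up guarantees at least one row per group, which is why the small example still returns a mean for every ID.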