Calculate a mean, by a condition, within a factor [r]
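I want to calculate a simple mean of an outcome variable, but only for the outcomes associated with the maximal instance of another running variable, grouped by a factor.
Of course, the statistic being calculated could be swapped for any other function, and the condition evaluated within each group could be any other function as well.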
library(data.table) #1.9.5
dt <- data.table(name = rep(LETTERS[1:7], each = 3),
                 target = rep(c(0, 1, 2), 7),
                 filter = 1:21)
dt
##    name target filter
## 1:    A      0      1
## 2:    A      1      2
## 3:    A      2      3
## 4:    B      0      4
## 5:    B      1      5
## 6:    B      2      6
## 7:    C      0      7
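Using this frame, the desired output should return the mean value of target for the rows that meet the criterion, which here is 2.
Something like: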
dt[ , .(mFilter = which.max(filter),
        target = target), by = name][ ,
    mean(target), by = c("name", "mFilter")]
...seems close, but isn't quite right.
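A quick check of what the attempt computes may help show where it goes wrong. This is only a sketch on the same data; the names step1 and attempt are introduced here for illustration. which.max(filter) returns the position of the maximum within each group (3 for every name in this data), and that single value is recycled across all rows of the group, so no rows are dropped and the mean is taken over every target value.

```r
library(data.table)

dt <- data.table(name = rep(LETTERS[1:7], each = 3),
                 target = rep(c(0, 1, 2), 7),
                 filter = 1:21)

# First step of the attempt: mFilter is a within-group position (always 3 here),
# recycled alongside all three target values of each group.
step1 <- dt[, .(mFilter = which.max(filter), target = target), by = name]

# Second step: grouping by name and mFilter still averages all rows per name,
# so V1 is mean(c(0, 1, 2)) = 1 rather than the desired 2.
attempt <- step1[, mean(target), by = c("name", "mFilter")]
attempt
```

The grouping column mFilter never acts as a filter on the rows, which is why the result is 1 for every name instead of 2.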
The solution should return:
## name V1
## 1: A 2
## 2: B 2
## 3: ...
You can do it like this:
dt[, .(meantarget = mean(target[filter == max(filter)])), by = name]
#    name meantarget
# 1:    A          2
# 2:    B          2
# 3:    C          2
# 4:    D          2
# 5:    E          2
# 6:    F          2
# 7:    G          2
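Since the question mentions that both the statistic and the within-group condition could be swapped out, here is a rough sketch of a few variations on the same pattern. The result names (res_median, res_single, res_ties) and the small dt_ties table are made up for illustration and are not part of the original post.

```r
library(data.table)

dt <- data.table(name = rep(LETTERS[1:7], each = 3),
                 target = rep(c(0, 1, 2), 7),
                 filter = 1:21)

# Same subsetting pattern with a different statistic (median instead of mean).
res_median <- dt[, .(medtarget = median(target[filter == max(filter)])), by = name]

# If the maximum of filter is unique within each group, which.max() can be used
# as an index into target to pick that single row directly.
res_single <- dt[, .(target_at_max = target[which.max(filter)]), by = name]

# Illustrative data with a tie on the maximum: filter == max(filter) keeps
# both tied rows, so the mean is taken over target values 4 and 6.
dt_ties <- data.table(name = c("A", "A", "A"),
                      target = c(0, 4, 6),
                      filter = c(1, 5, 5))
res_ties <- dt_ties[, .(meantarget = mean(target[filter == max(filter)])), by = name]
```

The difference between the two forms matters only when the maximum is tied: which.max() returns the first matching position, while filter == max(filter) keeps every tied row before the statistic is applied.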