summarise() a subset of cases with dplyr
Data:
structure(list(subjnum = c(1L, 1L, 1L, 1L, 1L, 1L), expVers = structure(c(2L,
2L, 2L, 2L, 2L, 2L), .Label = c("Angry", "Happy"), class = "factor"),
intendedSOA = c(1000L, 1000L, 100L, 100L, 50L, 50L), compatability = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = c("Comp", "Incomp"), class = "factor"),
T1RT = c(229L, 229L, 277L, 277L, 280L, 280L), T2RT = c(791L,
791L, 563L, 563L, 760L, 760L), T1ACC = c(1L, 1L, 1L, 1L,
1L, 1L), T2ACC = c(1L, 1L, 1L, 1L, 1L, 1L)), row.names = c(NA,
6L), class = "data.frame")
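I want to summarise the columns T1RT, T2RT, T1ACC, and T2ACC by their mean(), grouped by the other variables/factors in the data (subjnum, intendedSOA, compatability, expVers). However, the summaries of T1RT and T2RT should exclude trials where T1ACC or T2ACC == 0, while the summaries of T1ACC and T2ACC should include all values (summarised unconditionally). I tried to include an if() argument inside summarise() as follows: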
> backcomplong2 <- ACC %>%
+ select(subjnum, expVers,intendedSOA, compatability, T1RT, T2RT, T1ACC, T2ACC)%>%
+ group_by(subjnum, compatability, expVers, intendedSOA)%>%
+ summarise(T1RT = if(T1ACC == 1 && T2ACC == 1) round(mean(T1RT)),
+ T2RT = if(T1ACC == 1 && T2ACC == 1) round(mean(T2RT)),
+ T1ACC = mean(T1ACC),
+ T2ACC = mean(T2ACC))
But I received this error:
Problem with `summarise()` input `T1RT`.
x Input `T1RT` must be a vector, not NULL.
i Input `T1RT` is `if (T1ACC == 1 && T2ACC == 1) round(mean(T1RT))`.
i The error occured in group 14: subjnum = 3, compatability = "Comp", expVers = "Happy", intendedSOA = 100.
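****** Please note that my reproducible data does not return the same error ********
The larger data set (which I have not provided here because it is too large to paste into this question with dput()) does return the error.
I think I am using the if() statement incorrectly; perhaps I could try if_else() instead? Another workaround would be to simply run summarise() twice, once for RT and once for ACC, but doing it in a single call would be more concise.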
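You use an if to check a condition, but you do not supply an else to say what should happen when the condition is not met. In that case the expression returns a NULL object, which causes the error.
Since you only want to take the mean of a subset of the values, you do not need if / else here. Subset the column inside mean() instead. Try this: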
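The NULL behaviour can be checked in isolation with base R. This is a minimal sketch; `acc` and `result` are made-up names for illustration, not objects from the question.

```r
# An if() without an else branch evaluates to NULL when the condition is FALSE.
acc <- 0L                          # hypothetical accuracy value for one trial
result <- if (acc == 1) "kept"     # no else branch supplied
is.null(result)                    # TRUE: this NULL is what summarise() rejects
```

This would also explain why the small sample does not fail: every trial in it has T1ACC and T2ACC equal to 1, so the condition is never FALSE. Note too that `&&` is meant for length-one conditions, whereas `&` compares element by element, which is what a per-trial filter needs.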
library(dplyr)
ACC %>%
group_by(subjnum, compatability, expVers, intendedSOA)%>%
summarise(T1RT = mean(T1RT[T1ACC == 1 & T2ACC == 1]),
T2RT = mean(T2RT[T1ACC == 1 & T2ACC == 1]),
T1ACC = mean(T1ACC),
T2ACC = mean(T2ACC))
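To see the subsetting approach on a case that actually contains incorrect trials, here is a small self-contained sketch. The data frame `toy` and its column `grp` are invented for illustration and are not part of the original data.

```r
library(dplyr)

# Hypothetical toy data: group "a" has one incorrect trial,
# group "b" has no trial where both T1ACC and T2ACC are 1.
toy <- data.frame(
  grp   = c("a", "a", "a", "b", "b"),
  T1RT  = c(200, 300, 900, 400, 500),
  T1ACC = c(1L, 1L, 0L, 0L, 0L),
  T2ACC = c(1L, 1L, 1L, 0L, 1L)
)

res <- toy %>%
  group_by(grp) %>%
  summarise(
    T1RT  = mean(T1RT[T1ACC == 1 & T2ACC == 1]),  # correct trials only
    T1ACC = mean(T1ACC)                           # all trials
  )

res
```

Two things are worth keeping in mind. A group with no qualifying trials yields NaN for the RT mean rather than an error, so you may want to handle those groups afterwards. Also, summarise() evaluates its arguments in order and later expressions see columns summarised earlier, so the ACC means should come after the RT means that subset on the trial-level ACC values.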