Conditional recode and sum in R
My (sample) data looks like this:
mydata <- structure(list(x1 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), x2 = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 4L, 4L), x3 = c(1L, 3L, 5L, 1L, 3L, 5L, 1L, 4L, 5L, 2L,
1L, 5L, 6L, 6L), week = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 30L, 50L,
22L, 52L, 36L, 25L, 26L), newar1 = c(0L, 0L, 2L, 0L, 0L, 2L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L), newvar2 = c(0L, 2L, 0L, 0L,
2L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L)), .Names = c("x1", "x2",
"x3", "week", "newar1", "newvar2"), class = "data.frame", row.names = c(NA,
-14L))
x1 x2 x3 week
0 1 1 0
0 2 3 0
0 3 5 0
0 1 1 0
0 2 3 0
0 3 5 0
1 1 1 1
1 2 4 30
1 3 5 50
1 1 2 22
1 2 1 52
1 3 5 36
1 4 6 25
1 4 6 26
I want to create one new variable, newvar1:
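If x1 = 0 => I want to count how many times x1 equals 1 across the whole dataset (other rows only, excluding the row's own observation), but counting only rows with the same combination of x2 and x3 and with week greater than 24.
If x1 = 1 => I want to count how many times x1 equals 1 across the whole dataset, but counting only rows with the same combination of x2 and x3 and with week minus 25 greater than zero ((week - 25) > 0).
By "sum" I mean the number of times x1 equals 1 when the conditions hold.
By "if" I mean that I only want to sum x1 when the conditions are met. Basically, my question is: how do I sum certain values only when a condition holds?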
My data should look like this:
x1 x2 x3 week newvar1
0 1 1 0 0
0 2 3 0 0
0 3 5 0 2
0 1 1 0 0
0 2 3 0 0
0 3 5 0 2
1 1 1 1 0
1 2 4 30 0
1 3 5 50 1
1 1 2 22 0
1 2 1 52 0
1 3 5 36 0
1 4 6 25 0
1 4 6 26 1
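Currently I have the following code, but it does not take the x2 = x3 and week constraints into account. Any suggestions on how to do this?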
mydata[,newvar1:=sum(x1), by=list(x2,x3)]
I think we can do this with a for loop:
mydata$newvar1 <- 0L                               # create the column before filling it row by row
for (i in seq_len(nrow(mydata))) {
  if (mydata[i, 1] == 0) {                         # x1 == 0
    mydata$newvar1[i] <-
      sum(mydata[-i, 1] == 1 &                     # count other rows where x1 == 1
          mydata[i, 2] == mydata[-i, 2] &          # with the same x2 and the same x3
          mydata[i, 3] == mydata[-i, 3] &
          mydata[-i, 4] > 24)                      # and week > 24
  } else {                                         # x1 == 1
    mydata$newvar1[i] <-
      sum(mydata[-i, 1] == 1 &                     # count other rows where x1 == 1
          mydata[i, 2] == mydata[-i, 2] &          # with the same x2 and the same x3
          mydata[i, 3] == mydata[-i, 3] &
          mydata[-i, 4] > 25)                      # and week > 25
  }
}
# mydata
# x1 x2 x3 week newvar1
# 1 0 1 1 0 0
# 2 0 2 3 0 0
# 3 0 3 5 0 2
# 4 0 1 1 0 0
# 5 0 2 3 0 0
# 6 0 3 5 0 2
# 7 1 1 1 1 0
# 8 1 2 4 30 0
# 9 1 3 5 50 1
# 10 1 1 2 22 0
# 11 1 2 1 52 0
# 12 1 3 5 36 1
# 13 1 4 6 25 1
# 14 1 4 6 26 0
Or, if for x1 == 1 you want to compare against all rows (including the row itself):
mydata$newvar1 <- 0L                               # create the column before filling it row by row
for (i in seq_len(nrow(mydata))) {
  if (mydata[i, 1] == 0) {                         # x1 == 0
    mydata$newvar1[i] <-
      sum(mydata[-i, 1] == 1 &                     # count other rows where x1 == 1
          mydata[i, 2] == mydata[-i, 2] &          # with the same x2 and the same x3
          mydata[i, 3] == mydata[-i, 3] &
          mydata[-i, 4] > 24)                      # and week > 24
  } else {                                         # x1 == 1: compare against all rows, including row i
    mydata$newvar1[i] <-
      sum(mydata[, 1] == 1 &
          mydata[i, 2] == mydata[, 2] &
          mydata[i, 3] == mydata[, 3] &
          mydata[, 4] > 25)                        # and week > 25
  }
}
# mydata
# x1 x2 x3 week newvar1
# 1 0 1 1 0 0
# 2 0 2 3 0 0
# 3 0 3 5 0 2
# 4 0 1 1 0 0
# 5 0 2 3 0 0
# 6 0 3 5 0 2
# 7 1 1 1 1 0
# 8 1 2 4 30 1
# 9 1 3 5 50 2
# 10 1 1 2 22 0
# 11 1 2 1 52 1
# 12 1 3 5 36 2
# 13 1 4 6 25 1
# 14 1 4 6 26 1
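The two loop variants above differ only in whether a row with x1 == 1 may count itself. As a rough sketch of the same rules without the explicit loop, the counting can be moved into a small helper and applied with vapply. The names count_matches and include_self are made up for this illustration and do not come from the answer; the data frame is rebuilt here with only the four input columns.

```r
# Sample data from the question (only the four input columns).
mydata <- data.frame(
  x1   = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
  x2   = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 4),
  x3   = c(1, 3, 5, 1, 3, 5, 1, 4, 5, 2, 1, 5, 6, 6),
  week = c(0, 0, 0, 0, 0, 0, 1, 30, 50, 22, 52, 36, 25, 26)
)

# Hypothetical helper: for row i, count rows with x1 == 1, the same
# (x2, x3) pair, and week above a cutoff (24 if x1[i] == 0, else 25).
count_matches <- function(i, d, include_self = FALSE) {
  is_zero <- d$x1[i] == 0
  cutoff  <- if (is_zero) 24 else 25
  hit <- d$x1 == 1 & d$x2 == d$x2[i] & d$x3 == d$x3[i] & d$week > cutoff
  # Rows with x1 == 0 never count themselves; rows with x1 == 1 skip
  # themselves only when include_self is FALSE.
  if (is_zero || !include_self) hit[i] <- FALSE
  sum(hit)
}

rows <- seq_len(nrow(mydata))
# First variant: every row excludes itself.
first  <- vapply(rows, count_matches, integer(1), d = mydata)
# Second variant: rows with x1 == 1 are compared against all rows.
second <- vapply(rows, count_matches, integer(1), d = mydata, include_self = TRUE)
```

On the sample data, first and second should reproduce the newvar1 columns printed after the first and second loop respectively.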
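Another option uses ifelse with sapply, in two steps:
# Step 1: rows with x1 == 0 get the sum of x1 over rows with the same
# (x2, x3) pair and week > 25; all other rows start at 0.
mydata$newvar1 <- ifelse(
  mydata$x1 == 0,
  sapply(seq_len(nrow(mydata)), function(i)
    with(mydata, sum(x1[week > 25 & x2 == x2[i] & x3 == x3[i]]))),
  0)
# Step 2: rows with x1 == 1 get the sum of x1 over rows with the same
# (x2, x3) pair and an earlier week; other rows keep their step 1 value.
mydata$newvar1 <- ifelse(
  mydata$x1 == 1,
  sapply(seq_len(nrow(mydata)), function(i)
    with(mydata, sum(x1[week < week[i] & week[i] != 0 & week - week[i] < 25 &
                        x2 == x2[i] & x3 == x3[i]]))),
  mydata$newvar1)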
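The two ifelse steps above can also be folded into a single pass over the rows. The following is only a sketch that reuses the same conditions as that code; count_for_row is a hypothetical name introduced here, and the data frame is rebuilt with just the four input columns.

```r
# Sample data from the question (only the four input columns).
mydata <- data.frame(
  x1   = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
  x2   = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 4),
  x3   = c(1, 3, 5, 1, 3, 5, 1, 4, 5, 2, 1, 5, 6, 6),
  week = c(0, 0, 0, 0, 0, 0, 1, 30, 50, 22, 52, 36, 25, 26)
)

# Hypothetical helper combining both steps for a single row i.
count_for_row <- function(i, d) {
  same <- d$x2 == d$x2[i] & d$x3 == d$x3[i]   # same (x2, x3) pair as row i
  if (d$x1[i] == 0) {
    # x1 == 0: sum x1 over matching rows with week > 25
    sum(d$x1[same & d$week > 25])
  } else {
    # x1 == 1: sum x1 over matching rows with an earlier week
    sum(d$x1[same & d$week < d$week[i] & d$week[i] != 0 &
             d$week - d$week[i] < 25])
  }
}

mydata$newvar1 <- vapply(seq_len(nrow(mydata)), count_for_row,
                         numeric(1), d = mydata)
mydata$newvar1
# 0 0 2 0 0 2 0 0 1 0 0 0 0 1
```

On the sample data this matches the newvar1 column the question lists as the desired result, because a row with x1 == 1 only counts matching rows from earlier weeks.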
Using dplyr:
library(dplyr)
mydata %>% group_by(x2, x3) %>%
mutate(newvar1 = ifelse(x1 == 0,
sum(x1 * week > 24),
sum(x1 * week > 25) - (week > 25) * (x1 == 1)))
# Source: local data frame [14 x 6]
# Groups: x2, x3 [7]
#
# x1 x2 x3 week newvar2 newvar1
# <int> <int> <int> <int> <int> <int>
# 1 0 1 1 0 0 0
# 2 0 2 3 0 2 0
# 3 0 3 5 0 0 2
# 4 0 1 1 0 0 0
# 5 0 2 3 0 2 0
# 6 0 3 5 0 0 2
# 7 1 1 1 1 0 0
# 8 1 2 4 30 0 0
# 9 1 3 5 50 1 1
# 10 1 1 2 22 0 0
# 11 1 2 1 52 0 0
# 12 1 3 5 36 0 1
# 13 1 4 6 25 0 1
# 14 1 4 6 26 0 0
The odd-looking bit in the else branch, - (week > 25) * (x1 == 1), subtracts 1 for rows that would otherwise match themselves.
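For readers without dplyr, the same grouped count and self-match subtraction can be sketched in base R with ave. This is an illustrative rewrite under the same rules as the dplyr pipeline, not part of the original answer; the names n24, n25 and self are made up for this example.

```r
# Sample data from the question (only the four input columns).
mydata <- data.frame(
  x1   = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
  x2   = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 4),
  x3   = c(1, 3, 5, 1, 3, 5, 1, 4, 5, 2, 1, 5, 6, 6),
  week = c(0, 0, 0, 0, 0, 0, 1, 30, 50, 22, 52, 36, 25, 26)
)

# Per (x2, x3) group: number of rows with x1 == 1 and week above each cutoff.
n24 <- ave(as.integer(mydata$x1 == 1 & mydata$week > 24),
           mydata$x2, mydata$x3, FUN = sum)
n25 <- ave(as.integer(mydata$x1 == 1 & mydata$week > 25),
           mydata$x2, mydata$x3, FUN = sum)

# A row with x1 == 1 and week > 25 is included in its own group total,
# so that self-match is subtracted again.
self <- as.integer(mydata$x1 == 1 & mydata$week > 25)

mydata$newvar1 <- ifelse(mydata$x1 == 0, n24, n25 - self)
```

On the sample data this should give the same newvar1 column as the dplyr output above.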