
Conditional recode and sum in R

My (sample) data looks like this:

mydata <- structure(list(x1 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L), x2 = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 4L, 4L), x3 = c(1L, 3L, 5L, 1L, 3L, 5L, 1L, 4L, 5L, 2L, 
1L, 5L, 6L, 6L), week = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 30L, 50L, 
22L, 52L, 36L, 25L, 26L), newar1 = c(0L, 0L, 2L, 0L, 0L, 2L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L), newvar2 = c(0L, 2L, 0L, 0L, 
2L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L)), .Names = c("x1", "x2", 
"x3", "week", "newar1", "newvar2"), class = "data.frame", row.names = c(NA, 
-14L))



x1  x2  x3  week
0   1   1   0
0   2   3   0
0   3   5   0
0   1   1   0
0   2   3   0
0   3   5   0
1   1   1   1
1   2   4   30
1   3   5   50
1   1   2   22
1   2   1   52
1   3   5   36
1   4   6   25
1   4   6   26

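I want to create one new variable, newvar1:

  • If x1 = 0 => I want to count how many times x1 equals 1 in the whole dataset (other rows only, excluding the row's own observation), but counting only rows with the same combination of x2 and x3 and with week greater than 24.

  • If x1 = 1 => I want to count how many times x1 equals 1 in the whole dataset, but counting only rows with the same combination of x2 and x3 and where week minus 25 is greater than zero ((week - 25) > 0).

By "sum" I mean the number of times x1 equals 1 when the conditions hold.

By "if" I mean that I only want to sum x1 when the if condition holds. Basically my question is: how do I sum certain values only under a condition?

My data should look like this: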

x1  x2  x3  week newvar1
0   1   1   0    0       
0   2   3   0    0       
0   3   5   0    2       
0   1   1   0    0       
0   2   3   0    0       
0   3   5   0    2       
1   1   1   1    0       
1   2   4   30   0       
1   3   5   50   1       
1   1   2   22   0       
1   2   1   52   0       
1   3   5   36   0       
1   4   6   25   0       
1   4   6   26   1       

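Currently I have the following code, but it does not take the matching x2 and x3 values or the week constraint into account. Any suggestions on how to do this?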

mydata[,newvar1:=sum(x1), by=list(x2,x3)]

I think we can use a for loop to do this:

mydata$newvar1 <- 0L # create the column first so it can be filled row by row
for (i in seq_len(nrow(mydata))) {
  if (mydata$x1[i] == 0) { # x1 == 0
    mydata$newvar1[i] <-
      sum(mydata$x1[-i] == 1 & # count other rows where x1 == 1
          mydata$x2[-i] == mydata$x2[i] & # with the same x2 and x3 as row i
          mydata$x3[-i] == mydata$x3[i] &
          mydata$week[-i] > 24) # and week > 24
  } else { # x1 == 1
    mydata$newvar1[i] <-
      sum(mydata$x1[-i] == 1 & # count other rows where x1 == 1
          mydata$x2[-i] == mydata$x2[i] & # with the same x2 and x3 as row i
          mydata$x3[-i] == mydata$x3[i] &
          mydata$week[-i] > 25) # and week > 25
  }
}

# mydata
#    x1 x2 x3 week newvar1
# 1   0  1  1    0       0
# 2   0  2  3    0       0
# 3   0  3  5    0       2
# 4   0  1  1    0       0
# 5   0  2  3    0       0
# 6   0  3  5    0       2
# 7   1  1  1    1       0
# 8   1  2  4   30       0
# 9   1  3  5   50       1
# 10  1  1  2   22       0
# 11  1  2  1   52       0
# 12  1  3  5   36       1
# 13  1  4  6   25       1
# 14  1  4  6   26       0
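
The same counts can also be computed without an explicit loop. The sketch below is one possible base R version, assuming the same interpretation as the loop above (week > 24 for x1 == 0 rows, week > 25 for x1 == 1 rows, with the row itself excluded). The names hit24, hit25, n24 and n25 are illustrative and do not come from the original answers.

```r
# Sample data from the question (only the four input columns)
mydata <- data.frame(
  x1   = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  x2   = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 4L, 4L),
  x3   = c(1L, 3L, 5L, 1L, 3L, 5L, 1L, 4L, 5L, 2L, 1L, 5L, 6L, 6L),
  week = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 30L, 50L, 22L, 52L, 36L, 25L, 26L)
)

# 1 for rows that would be counted under each threshold, 0 otherwise
hit24 <- as.integer(mydata$x1 == 1 & mydata$week > 24)
hit25 <- as.integer(mydata$x1 == 1 & mydata$week > 25)

# Per-row totals of those indicators within each (x2, x3) combination
n24 <- ave(hit24, mydata$x2, mydata$x3, FUN = sum)
n25 <- ave(hit25, mydata$x2, mydata$x3, FUN = sum)

# x1 == 0 rows are never counted themselves, so the group total is enough;
# x1 == 1 rows subtract their own indicator to exclude themselves
mydata$newvar1 <- ifelse(mydata$x1 == 0, n24, n25 - hit25)
```

Because ave() returns one value per row, the result lines up with the original row order and needs no merge step.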

Or, if for x1 == 1 you want to compare against all rows (including the row itself):

mydata$newvar1 <- 0L # create the column first so it can be filled row by row
for (i in seq_len(nrow(mydata))) {
  if (mydata$x1[i] == 0) { # x1 == 0
    mydata$newvar1[i] <-
      sum(mydata$x1[-i] == 1 & # count other rows where x1 == 1
          mydata$x2[-i] == mydata$x2[i] & # with the same x2 and x3 as row i
          mydata$x3[-i] == mydata$x3[i] &
          mydata$week[-i] > 24) # and week > 24
  } else { # x1 == 1: compare against all rows, row i included
    mydata$newvar1[i] <-
      sum(mydata$x1 == 1 &
          mydata$x2 == mydata$x2[i] &
          mydata$x3 == mydata$x3[i] &
          mydata$week > 25) # and week > 25
  }
}

# mydata
#    x1 x2 x3 week newvar1
# 1   0  1  1    0       0
# 2   0  2  3    0       0
# 3   0  3  5    0       2
# 4   0  1  1    0       0
# 5   0  2  3    0       0
# 6   0  3  5    0       2
# 7   1  1  1    1       0
# 8   1  2  4   30       1
# 9   1  3  5   50       2
# 10  1  1  2   22       0
# 11  1  2  1   52       1
# 12  1  3  5   36       2
# 13  1  4  6   25       1
# 14  1  4  6   26       1
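
Another approach uses sapply, filling in the x1 == 0 rows first and then the x1 == 1 rows:

# x1 == 0 rows: sum x1 over rows with the same x2 and x3 and week > 25
mydata$newvar1 <- ifelse(
  mydata$x1 == 0,
  sapply(seq_len(nrow(mydata)), function(i)
    with(mydata, sum(x1[week > 25 & x2 == x2[i] & x3 == x3[i]]))),
  0)
# x1 == 1 rows: sum x1 over rows with the same x2 and x3 and an earlier week
mydata$newvar1 <- ifelse(
  mydata$x1 == 1,
  sapply(seq_len(nrow(mydata)), function(i)
    with(mydata, sum(x1[week < week[i] & week[i] != 0 & week - week[i] < 25 &
                        x2 == x2[i] & x3 == x3[i]]))),
  mydata$newvar1)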
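
The answers in this thread read the x1 == 1 rule in different ways, so it can help to check a result against the newvar1 column requested in the question. The sketch below wraps the same conditions as the sapply lines above in a per-row helper and compares the outcome with that column. The helper name count_for_row and the vector expected are illustrative, not part of the original answer.

```r
# Sample data from the question (only the four input columns)
mydata <- data.frame(
  x1   = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  x2   = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 4L, 4L),
  x3   = c(1L, 3L, 5L, 1L, 3L, 5L, 1L, 4L, 5L, 2L, 1L, 5L, 6L, 6L),
  week = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 30L, 50L, 22L, 52L, 36L, 25L, 26L)
)

# Hypothetical helper: the count for row i of data frame d,
# using the same conditions as the sapply version
count_for_row <- function(i, d) {
  same <- d$x2 == d$x2[i] & d$x3 == d$x3[i] # same (x2, x3) combination
  if (d$x1[i] == 0) {
    sum(d$x1[same & d$week > 25])
  } else {
    sum(d$x1[same & d$week < d$week[i] & d$week[i] != 0 &
             d$week - d$week[i] < 25])
  }
}

mydata$newvar1 <- vapply(seq_len(nrow(mydata)), count_for_row,
                         integer(1), d = mydata)

# newvar1 as listed in the question's desired result
expected <- c(0L, 0L, 2L, 0L, 0L, 2L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L)
stopifnot(identical(mydata$newvar1, expected))
```

On the sample data the check passes, so this reading (count matching x1 == 1 rows from an earlier week) agrees with the desired column, whereas the loop and dplyr versions give different values for rows 12 to 14.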

Using dplyr:

library(dplyr)
mydata %>% group_by(x2, x3) %>%
    mutate(newvar1 = ifelse(x1 == 0,
                            sum(x1 * week > 24),
                            sum(x1 * week > 25) - (week > 25) * (x1 == 1)))
# Source: local data frame [14 x 6]
# Groups: x2, x3 [7]
# 
#       x1    x2    x3  week newvar2 newvar1
#    <int> <int> <int> <int>   <int>   <int>
# 1      0     1     1     0       0       0
# 2      0     2     3     0       2       0
# 3      0     3     5     0       0       2
# 4      0     1     1     0       0       0
# 5      0     2     3     0       2       0
# 6      0     3     5     0       0       2
# 7      1     1     1     1       0       0
# 8      1     2     4    30       0       0
# 9      1     3     5    50       1       1
# 10     1     1     2    22       0       0
# 11     1     2     1    52       0       0
# 12     1     3     5    36       0       1
# 13     1     4     6    25       0       1
# 14     1     4     6    26       0       0

The odd-looking part of the else branch, - (week > 25) * (x1 == 1), subtracts 1 for rows that would otherwise be matched against themselves.
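
To see the effect of that subtraction, the sketch below works through the (x2 = 4, x3 = 6) group by hand in base R. The data frame grp and the variables total and self are illustrative names; the values are rows 13 and 14 of the sample data.

```r
# Rows 13 and 14 of the sample data: the (x2 = 4, x3 = 6) group
grp <- data.frame(x1 = c(1L, 1L), week = c(25L, 26L))

# `*` binds tighter than `>`, so this is sum((x1 * week) > 25):
# the number of rows in the group with x1 == 1 and week > 25
total <- sum(grp$x1 * grp$week > 25)

# 1 for a row that is itself among the counted rows, 0 otherwise
self <- (grp$week > 25) * (grp$x1 == 1)

# Count over the other rows only
total - self
# → 1 0
```

Row 13 (week 25) is not counted itself, so it keeps the full group total of 1; row 14 (week 26) is the only counted row, so removing it leaves 0.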

