
Cumulatively sum successive columns of a data frame and zero the values after a threshold

I have a data frame with 20 columns that contains only 0s and 1s. For each row, I want a running sum across the columns: first the sum of the first two columns, then of the first three, and so on until all 20 columns are included. As soon as the running sum over the first k columns exceeds five, I want to set the values in the remaining columns to 0. I have not been able to write a loop that does this.

0 0 0 1 1 0 1 1 1 0 0 0 0 1 0 0
0 1 1 0 0 0 0 0 1 1 0 0 0 0 1 0
1 0 1 0 0 0 1 1 1 0 0 0 0 1 0 1
0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 0
0 1 0 1 0 1 0 0 0 0 0 0 0 1 1 0

For example, in the first row the sum reaches 5 at the 9th column, so I want to change the remaining values to 0, i.e. set the third-last value to 0.

EDIT: Base R solution

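# dt is the data.table built with fread() in the example further down
df <- as.data.frame(dt)

# Transpose so that each original row becomes a column (X1..X5)
tdf <- data.frame(t(df))

# In every column, zero the entries where the running sum exceeds 5
tdf[] <- lapply(tdf, function(x) replace(x, cumsum(x) > 5, 0))

# Transpose back to the original orientation
t(tdf)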
#>    V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16
#> X1  0  0  0  1  1  0  1  1  1   0   0   0   0   0   0   0
#> X2  0  1  1  0  0  0  0  0  1   1   0   0   0   0   1   0
#> X3  1  0  1  0  0  0  1  1  1   0   0   0   0   0   0   0
#> X4  0  1  1  0  0  0  0  0  0   0   0   0   0   1   1   0
#> X5  0  1  0  1  0  1  0  0  0   0   0   0   0   1   1   0
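
To see why the `cumsum(x) > 5` mask does what the question asks, it may help to trace the first sample row on its own. This is only an illustrative sketch in base R; the variable names `running`, `over` and `capped` are made up for the example.

```r
# First row of the sample data from the question
x <- c(0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0)

# Running total of ones, left to right
running <- cumsum(x)

# Positions where the running total has gone past 5
over <- running > 5

# Zero those positions; everything before them is left untouched
capped <- replace(x, over, 0)

print(running)      # reaches 5 at position 9 and 6 at position 14
print(which(over))  # positions 14, 15 and 16
print(capped)
```

Because the running total never decreases for 0/1 data, every position after the first one flagged is flagged as well, so no explicit loop is needed.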

You can transpose your data frame with t() first, apply cumsum() to each column, and then transpose the result back.

library(data.table)

dt <- fread('0 0 0 1 1 0 1 1 1 0 0 0 0 1 0 0
0 1 1 0 0 0 0 0 1 1 0 0 0 0 1 0
1 0 1 0 0 0 1 1 1 0 0 0 0 1 0 1
0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 0
0 1 0 1 0 1 0 0 0 0 0 0 0 1 1 0')

tdt <- data.table(t(dt))

tdt[,names(tdt):=lapply(.SD,function(x) {x[cumsum(x)>5] <- 0
                                         x})]

t(tdt)
#>    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#> V1    0    0    0    1    1    0    1    1    1     0     0     0     0     0
#> V2    0    1    1    0    0    0    0    0    1     1     0     0     0     0
#> V3    1    0    1    0    0    0    1    1    1     0     0     0     0     0
#> V4    0    1    1    0    0    0    0    0    0     0     0     0     0     1
#> V5    0    1    0    1    0    1    0    0    0     0     0     0     0     1
#>    [,15] [,16]
#> V1     0     0
#> V2     1     0
#> V3     0     0
#> V4     1     0
#> V5     1     0

Created on 2020-04-23 by the reprex package (v0.3.0)
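
If the data are held as a plain matrix, the same idea can be sketched in base R without data.table: compute the row-wise running totals with `apply()` and use them as a logical mask. This is an alternative under that assumption, not the approach of the answer above; `m`, `row_totals`, `cap` and `capped` are names chosen for the example.

```r
# Sample rows from the question, read as a numeric matrix
m <- as.matrix(read.table(text = "
0 0 0 1 1 0 1 1 1 0 0 0 0 1 0 0
0 1 1 0 0 0 0 0 1 1 0 0 0 0 1 0
1 0 1 0 0 0 1 1 1 0 0 0 0 1 0 1
0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 0
0 1 0 1 0 1 0 0 0 0 0 0 0 1 1 0
"))

# apply() over rows returns one column per input row, so transpose back
row_totals <- t(apply(m, 1, cumsum))

# Zero every cell whose row-wise running total exceeds the cap
cap <- 5
capped <- m
capped[row_totals > cap] <- 0

print(capped)
```

The mask has the same shape as `m`, so the assignment touches only the cells past the threshold and leaves the rest of each row as it was.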

Here is an option with dplyr, where df1 is the input data frame:

library(dplyr)
df1 %>%
   t %>% 
   as_tibble %>%
   mutate_all(~ replace(., cumsum(.) > 5, 0)) %>% 
   t
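
The pipeline above hard-codes the threshold and returns a matrix. If a data frame with the original column names is wanted, or the threshold needs to vary, the same logic could be wrapped in a small function. The sketch below uses base R only; `cap_rows()` and its `cap` argument are hypothetical names, not part of dplyr or base R, and the two-row `df1` is just a subset of the sample data.

```r
# cap_rows() is a hypothetical helper, not a dplyr or base R function.
# It zeroes every value in a row once the row's running total exceeds `cap`.
cap_rows <- function(df, cap = 5) {
  m <- as.matrix(df)
  # Row-wise running totals; apply() returns them transposed, so flip back
  totals <- t(apply(m, 1, cumsum))
  m[totals > cap] <- 0
  # Return a data frame that keeps the original column names
  as.data.frame(m)
}

# Two of the sample rows from the question (rows 1 and 3)
df1 <- read.table(text = "
0 0 0 1 1 0 1 1 1 0 0 0 0 1 0 0
1 0 1 0 0 0 1 1 1 0 0 0 0 1 0 1
")

res <- cap_rows(df1, cap = 5)
print(res)
```

Calling `cap_rows(df1, cap = 10)` on this input leaves it unchanged, since neither row sums to more than 10.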
