
join columns recursively in R

Hello, I have a data frame with 245 columns. To sum some sets of columns and generate new columns, I am trying to do it recursively, as follows:

cl1<-sample(1:4,10,replace=TRUE)
cl2<-sample(1:4,10,replace=TRUE)
cl3<-sample(1:4,10,replace=TRUE)
cl4<-sample(1:4,10,replace=TRUE)
cl5<-sample(1:4,10,replace=TRUE)
cl6<-sample(1:4,10,replace=TRUE)
dat<-data.frame(cl1,cl2,cl3,cl4,cl5,cl6)

My intention is to add column 1 to columns 3 and 5, and likewise column 2 to columns 4 and 6, so that in the end I obtain a data frame with two columns.


The result should look something like this: [image]

I have written the following code:



revisar<- function(a){
  todos = list()
  i=1
  j=3
  l=5
  k=1
  while(i<=2 ){
    
    cl<-a[,i]
    cl2<-a[,j]
    cl3<-a[,l]
    cl[is.na(cl)] <- 0
    cl2[is.na(cl2)] <- 0
    cl3[is.na(cl3)] <- 0
    
    colu<-cl+cl2+cl3
    
    col<-cbind(colu,colu)
    
    i<-i+1
    j<-j+1
    l<-l+1
    k<-k+1
  }
 
  return(col)
}

It turns out that it only returns column 2 repeated twice, and I need to apply the same procedure to join those 245 columns.

I would like to know what is wrong with this example.

base R

Literal programming:

with(dat, data.frame(s1 = cl1+cl3+cl5, s2 = cl2+cl4+cl6))
#    s1 s2
# 1   7 11
# 2   7  7
# 3   4 11
# 4   4 10
# 5   9  8
# 6  12  5
# 7   7  6
# 8   7 10
# 9   4  9
# 10  6  5

Programmatically,

L <- list(s1 = c(1,3,5), s2 = c(2,4,6))
out <- data.frame(lapply(L, function(z) rowSums(dat[, z])))
out
#    s1 s2
# 1   7 11
# 2   7  7
# 3   4 11
# 4   4 10
# 5   9  8
# 6  12  5
# 7   7  6
# 8   7 10
# 9   4  9
# 10  6  5

dplyr

library(dplyr)
dat %>%
  transmute(
    s1 = rowSums(cbind(cl1, cl3, cl5)),
    s2 = rowSums(cbind(cl2, cl4, cl6))
  )

Or programmatically, using purrr:

purrr::map_dfc(L, ~ rowSums(dat[, .]))

Data

set.seed(42)
# your `dat` above
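
As for why the `revisar` function in the question returns the second sum twice: `col` is overwritten on every pass of the `while` loop, and `cbind(colu, colu)` binds the same vector to itself, so only the result of the last iteration survives. Below is a minimal sketch of a corrected loop, assuming the same "column i, i+2, i+4" pattern as in the question. The name `revisar_fixed` and its parameters `n_out`, `step` and `n_terms` are hypothetical, and the sample data is rebuilt here only so the block runs on its own.

```r
# Hypothetical corrected version of the loop from the question.
revisar_fixed <- function(a, n_out = 2, step = 2, n_terms = 3) {
  todos <- vector("list", n_out)                # one slot per output column
  for (i in seq_len(n_out)) {
    idx <- i + step * (seq_len(n_terms) - 1)    # i, i + 2, i + 4 by default
    block <- as.matrix(a[, idx, drop = FALSE])
    block[is.na(block)] <- 0                    # treat NA as 0, as the question does
    todos[[i]] <- rowSums(block)                # store each sum instead of overwriting
  }
  names(todos) <- paste0("s", seq_len(n_out))
  as.data.frame(todos)
}

# Sample data with the same shape as the question's `dat`.
set.seed(42)
dat <- data.frame(replicate(6, sample(1:4, 10, replace = TRUE)))
names(dat) <- paste0("cl", 1:6)

res <- revisar_fixed(dat)
```

Each result is kept in its own list slot, and the data frame is built once after the loop, so no iteration overwrites an earlier one.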

Here is an alternative general approach:

Here we sum all odd-numbered columns into s1 and all even-numbered columns into s2:

library(dplyr)

dat %>%
  rowwise() %>% 
  mutate(s1 = sum(c_across(seq(1,ncol(dat),2)), na.rm = TRUE),
         s2 = sum(c_across(seq(2,ncol(dat),2)), na.rm = TRUE))
     cl1   cl2   cl3   cl4   cl5   cl6    s1    s2
   <int> <int> <int> <int> <int> <int> <int> <int>
 1     1     1     3     2     3     2     7     5
 2     2     4     1     4     2     3     5    11
 3     2     2     2     2     1     3     5     7
 4     2     4     4     3     1     4     7    11
 5     2     4     4     3     2     2     8     9
 6     3     3     3     2     2     2     8     7
 7     2     1     1     2     1     4     4     7
 8     2     4     1     3     2     3     5    10
 9     3     1     1     2     3     4     7     7
10     2     4     1     3     4     4     7    11
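
The same odd/even split can also be sketched without `rowwise()`, which may matter for a wide frame such as the 245-column one, since row-wise evaluation is generally slower than a vectorized `rowSums()`. This is a base R sketch that assumes the grouping rule really is "every second column belongs together"; the question does not state the rule for the full data, and `n_groups` is a hypothetical parameter. It returns only the two summed columns.

```r
# Sample data with the same shape as the question's `dat`.
set.seed(42)
dat <- data.frame(replicate(6, sample(1:4, 10, replace = TRUE)))
names(dat) <- paste0("cl", 1:6)

n_groups <- 2  # hypothetical: number of output columns

# Label column positions 1, 2, 1, 2, ... and collect the positions per label.
grp <- rep(seq_len(n_groups), length.out = ncol(dat))
L <- split(seq_len(ncol(dat)), grp)
names(L) <- paste0("s", seq_along(L))

# One vectorized row sum per group; NA values are ignored.
out <- data.frame(lapply(L, function(z) rowSums(dat[, z, drop = FALSE], na.rm = TRUE)))
```

The index list `L` is generated rather than typed by hand, so the same lines apply unchanged to a frame with any number of columns.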
