
Splitting a dataframe into smaller chunks

Let's say I have the following data.frame

> df <- data.frame(a=rep(1:3),b=rep(1:3),c=rep(4:6),d=rep(4:6))
> df
  a b c d
1 1 1 4 4
2 2 2 5 5
3 3 3 6 6

I would like to split df into two data frames, df1 and df2 , where df1 holds the first two columns of df and df2 holds the last two. Is there a way to do this programmatically, so that I do not have to write out each subset by hand as below?

> df1 <- df[,1:2]
> df1
  a b
1 1 1
2 2 2
3 3 3
> df2 <- df[,3:4]
> df2
  c d
1 4 4
2 5 5
3 6 6

This matters because my real data has a variable number of columns, and I want to create n = ncol(df)/2 data frames automatically. If the example above had two more columns, df3 would be df[,5:6] .

Thank you!

Assuming your data.frame has an even number of columns, here is a very short solution:

> lapply(seq(1, length(df), 2), function(u) df[u:(u+1)])
[[1]]
  a b
1 1 1
2 2 2
3 3 3

[[2]]
  c d
1 4 4
2 5 5
3 6 6
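Since the question asks for pieces called df1, df2 and so on, one option is to name the list elements rather than create separate variables. The following is only a sketch built on the lapply() call above; the names starts and pieces are illustrative and do not come from the original answer, and it still assumes an even number of columns.

```r
# Example data from the question
df <- data.frame(a = 1:3, b = 1:3, c = 4:6, d = 4:6)

# Starting column of each pair: 1, 3, ...
starts <- seq(1, ncol(df), by = 2)

# One two-column data frame per starting position
pieces <- lapply(starts, function(u) df[u:(u + 1)])

# Label the pieces df1, df2, ... so they can be looked up by name
names(pieces) <- paste0("df", seq_along(pieces))

pieces$df2   # the data frame holding columns c and d
```

Keeping the pieces in one named list means the code does not need to know in advance how many data frames there will be.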

This can help:

df <- data.frame(a=rep(1:3),b=rep(1:3),c=rep(4:6),d=rep(4:6),e=rep(4:6),f=rep(4:6) )

mylist <- list()
for (i in seq_len(ncol(df))) {
  # at every even column index, store that column together with the one before it
  if (i %% 2 == 0) {
    mylist[[length(mylist) + 1]] <- df[, (i-1):i]
  }
}

Output:

> mylist
[[1]]
  a b
1 1 1
2 2 2
3 3 3

[[2]]
  c d
1 4 4
2 5 5
3 6 6

[[3]]
  e f
1 4 4
2 5 5
3 6 6

I used six columns here to show that the loop works for any even number of columns. The resulting data frames are stored in a list, and each one can be accessed as mylist[[ <number_here> ]] .
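As a possible loop-free variant of the same idea, base R's split.default() can split the columns of a data frame by a grouping vector. This is a sketch rather than part of the answer above; the names group and mylist2 are made up for illustration.

```r
# Same six-column example as above
df <- data.frame(a = 1:3, b = 1:3, c = 4:6, d = 4:6, e = 4:6, f = 4:6)

# Group label for each column position: 1 1 2 2 3 3
group <- ceiling(seq_along(df) / 2)

# split.default() treats the data frame as a list of columns and
# returns one data frame per group
mylist2 <- split.default(df, group)

mylist2[[3]]   # the data frame holding columns e and f
```

With an odd number of columns, this variant would leave the final column in a one-column data frame of its own instead of dropping it.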

Hope this helps!

The method below should work with both even and odd numbers of columns:

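fsplit <- function(df, n, Ncol=TRUE){
  # assign column positions to consecutive groups of size n
  grp <- as.numeric(gl(ncol(df), n, ncol(df)))
  lst <- lapply(split(seq_along(df), grp), function(i) df[i])
  if (Ncol) {
    # keep only the pieces that have exactly n columns
    lst[sapply(lst, ncol) == n]
  } else {
    # keep every piece, including a narrower last one
    lst
  }
}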

fsplit(df,2)
fsplit(df,3)
fsplit(df,3,FALSE)
fsplit(df1,2)

Data (df1, used in the last call above):

 set.seed(24)
 df1 <- as.data.frame(matrix(sample(1:10, 7*3, replace=TRUE), ncol=7))
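To make the effect of the Ncol argument concrete, the sketch below restates fsplit and counts the pieces it returns. The four-column df is the one from the question; df7 is a hypothetical seven-column frame with fixed values, used here in place of the randomly sampled df1 so the result does not depend on the seed.

```r
fsplit <- function(df, n, Ncol = TRUE) {
  # group column positions into consecutive runs of n
  grp <- as.numeric(gl(ncol(df), n, ncol(df)))
  lst <- lapply(split(seq_along(df), grp), function(i) df[i])
  # optionally drop a trailing piece that has fewer than n columns
  if (Ncol) lst[sapply(lst, ncol) == n] else lst
}

df  <- data.frame(a = 1:3, b = 1:3, c = 4:6, d = 4:6)
df7 <- as.data.frame(matrix(1:21, ncol = 7))  # hypothetical 7-column frame

length(fsplit(df, 2))          # 2 pieces: (a, b) and (c, d)
length(fsplit(df, 3))          # 1 piece: (a, b, c); column d is dropped
length(fsplit(df, 3, FALSE))   # 2 pieces: (a, b, c) and (d)
length(fsplit(df7, 2))         # 3 pieces; the seventh column is dropped
```

With Ncol = TRUE an incomplete final group is discarded, so passing FALSE is the way to keep every column when the column count is not a multiple of n.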

