
Splitting a dataframe into smaller chunks

Let's say I have the following data.frame:

> df <- data.frame(a=rep(1:3),b=rep(1:3),c=rep(4:6),d=rep(4:6))
> df
  a b c d
1 1 1 4 4
2 2 2 5 5
3 3 3 6 6

I would like to split this df into two data frames, df1 and df2. I want df1 to be the first two columns of df and df2 to be the last two columns of df. Is there a way to do this in code so that I do not have to do the following manually:

> df1 <- df[,1:2]
> df1
  a b
1 1 1
2 2 2
3 3 3
> df2 <- df[,3:4]
> df2
  c d
1 4 4
2 5 5
3 6 6

This would be useful because the problem I'm dealing with has a variable number of columns, and I just want to be able to create n = ncol(df)/2 data frames. So if there were 2 more columns in the example above, df3 would be df[,5:6].

Thank you!

Assuming your data.frame has an even number of columns, here is a very short solution:

>lapply(seq(1,length(df),2), function(u) df[u:(u+1)])
[[1]]
  a b
1 1 1
2 2 2
3 3 3

[[2]]
  c d
1 4 4
2 5 5
3 6 6
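
With an odd number of columns, the last call would ask for a column index past the end of the data frame and fail with an "undefined columns selected" error. A minimal sketch of one way around that, assuming a single-column final chunk is acceptable, is to clamp the end index. The fifth column `e` and the names `starts` and `chunks` are made up for illustration:

```r
# Example data from the question plus a hypothetical fifth column `e`,
# so that the number of columns is odd
df <- data.frame(a = 1:3, b = 1:3, c = 4:6, d = 4:6, e = 7:9)

# Start index of each chunk: 1, 3, 5, ...
starts <- seq(1, ncol(df), by = 2)

# Clamp the end index to ncol(df) so the last chunk may hold one column
chunks <- lapply(starts, function(u) df[u:min(u + 1, ncol(df))])

# Optional: name the pieces df1, df2, ... as in the question
names(chunks) <- paste0("df", seq_along(chunks))
```

After this, `chunks$df1` holds columns a and b, and `chunks$df3` holds only column e.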

This can help:

df <- data.frame(a=rep(1:3),b=rep(1:3),c=rep(4:6),d=rep(4:6),e=rep(4:6),f=rep(4:6) )

mylist <- list()
for (i in seq_len(ncol(df))) {
  # every even index i closes a pair: columns (i - 1) and i
  if (i %% 2 == 0) {
    mylist[[length(mylist) + 1]] <- df[, (i - 1):i]
  }
}

Output:

> mylist
[[1]]
  a b
1 1 1
2 2 2
3 3 3

[[2]]
  c d
1 4 4
2 5 5
3 6 6

[[3]]
  e f
1 4 4
2 5 5
3 6 6

I am using 6 columns here to show that it works for any even number of columns. All the data frames you want are stored in a list (so you have a list of data frames), and you can access each one as mylist[[ <number_here> ]].
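
If you would rather refer to the pieces by name than by position, one option is to name the list elements. The sketch below rebuilds the list from the 6-column example. The names df1, df2, df3 and the environment `env` are illustrative choices rather than part of the answer above:

```r
df <- data.frame(a = 1:3, b = 1:3, c = 4:6, d = 4:6, e = 4:6, f = 4:6)

# Same loop as above: collect consecutive pairs of columns
mylist <- list()
for (i in seq_len(ncol(df))) {
  if (i %% 2 == 0) {
    mylist[[length(mylist) + 1]] <- df[, (i - 1):i]
  }
}

# Name the elements so they can be accessed as mylist$df1, mylist[["df2"]], ...
names(mylist) <- paste0("df", seq_along(mylist))

second <- mylist[["df2"]]   # same data frame as mylist[[2]]

# Optional: copy the elements into an environment as separate variables
env <- new.env()
list2env(mylist, envir = env)
third <- get("df3", envir = env)
```

Keeping the data frames in a list is usually more convenient than creating separate variables, since you can loop over them with lapply or sapply.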

Hope this helps!

The method below should work with both even and odd numbers of columns:

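fsplit <- function(df, n, Ncol = TRUE) {
  # group the column indices into consecutive runs of n columns
  lst <- lapply(split(seq_along(df), as.numeric(gl(ncol(df), n, ncol(df)))),
                function(i) df[i])
  if (Ncol) {
    # keep only the chunks that have exactly n columns
    lst[sapply(lst, ncol) == n]
  } else {
    lst
  }
}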

fsplit(df,2)
fsplit(df,3)
fsplit(df,3,FALSE)
fsplit(df1,2)

data

 set.seed(24)
 df1 <- as.data.frame(matrix(sample(1:10, 7*3, replace=TRUE), ncol=7))
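
As an alternative sketch (not the method above), base R's split.default treats a data frame as a list of columns, so it can split by column when given a grouping vector. The names `n`, `groups`, `chunks` and `full` are chosen here for illustration:

```r
set.seed(24)
df1 <- as.data.frame(matrix(sample(1:10, 7 * 3, replace = TRUE), ncol = 7))

n <- 2                                  # columns per chunk
groups <- ceiling(seq_along(df1) / n)   # 1 1 2 2 3 3 4 for 7 columns

# split.default splits the columns of df1 according to `groups`
chunks <- split.default(df1, groups)

# Drop an incomplete trailing chunk, similar in spirit to Ncol = TRUE above
full <- chunks[vapply(chunks, ncol, integer(1)) == n]
```

With 7 columns and n = 2, `chunks` has four elements, the last holding a single column, while `full` keeps only the three complete pairs.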
