Splitting a dataframe into smaller chunks
Let's say I have the following data.frame:
> df <- data.frame(a=rep(1:3),b=rep(1:3),c=rep(4:6),d=rep(4:6))
> df
a b c d
1 1 1 4 4
2 2 2 5 5
3 3 3 6 6
I would like to be able to split this df into two frames, df1 and df2. I want df1 to be the first two columns of df and df2 to be the last two columns of df. Is there a way to do this in code so that I do not have to do the following manually:
> df1 <- df[,1:2]
> df1
a b
1 1 1
2 2 2
3 3 3
> df2 <- df[,3:4]
> df2
c d
1 4 4
2 5 5
3 6 6
It would be nice because the problem I'm dealing with has a variable number of columns and I just want to be able to create n = ncol(df)/2 data frames. So if there were 2 more columns in the example above, df3 would be df[,5:6].
Thank you!
Assuming your data.frame has an even number of columns, here is a very short piece of code:
> lapply(seq(1, length(df), 2), function(u) df[u:(u+1)])
[[1]]
a b
1 1 1
2 2 2
3 3 3
[[2]]
c d
1 4 4
2 5 5
3 6 6
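The one-liner above assumes an even number of columns; with an odd number, the last index `u+1` would point past the final column. A minimal sketch of one way to guard against that, clamping the end of each chunk with `min()`. The five-column `df` and the names `starts` and `chunks` are made up for illustration and are not part of the original answer:

```r
# Example data with an odd number of columns (five), for illustration only
df <- data.frame(a = 1:3, b = 1:3, c = 4:6, d = 4:6, e = 7:9)

# Start index of each pair of columns: 1, 3, 5
starts <- seq(1, ncol(df), by = 2)

# Clamp the end index to ncol(df) so the last chunk may hold a single column
chunks <- lapply(starts, function(u) df[u:min(u + 1, ncol(df))])

# Number of columns in each chunk
sapply(chunks, ncol)
```

Here the last element of `chunks` is a one-column data frame containing only `e`, rather than an error.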
This can help:
df <- data.frame(a=rep(1:3),b=rep(1:3),c=rep(4:6),d=rep(4:6),e=rep(4:6),f=rep(4:6) )
mylist <- list()
for ( i in 1:ncol(df) ) {
  # at every even column index, store that column together with the one before it
  if (i %% 2 == 0) {
    mylist[[length(mylist)+1]] <- df[, (i-1):i ]
  }
}
Output:
> mylist
[[1]]
a b
1 1 1
2 2 2
3 3 3
[[2]]
c d
1 4 4
2 5 5
3 6 6
[[3]]
e f
1 4 4
2 5 5
3 6 6
I am using 6 columns here to show you that it works for any number of columns (assuming an even number of columns). All the dataframes you want are stored in a list (so you have a list of dataframes) and you can access each dataframe as mylist[[ <number_here> ]].
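Since the question asks for objects called df1, df2 and so on, here is a small sketch of how the list elements could be given those names and, if really needed, copied into the workspace as separate variables. It rebuilds `mylist` with `lapply` so that it runs on its own; the `paste0("df", ...)` naming scheme is an assumption based on the question, and `list2env()` will overwrite any existing global objects with the same names:

```r
# Six-column example, as in the answer above
df <- data.frame(a = 1:3, b = 1:3, c = 4:6, d = 4:6, e = 4:6, f = 4:6)

# Build the list of two-column data frames (assumes an even number of columns)
mylist <- lapply(seq(1, ncol(df), by = 2), function(u) df[u:(u + 1)])

# Name the elements df1, df2, df3 so they can be accessed by name
names(mylist) <- paste0("df", seq_along(mylist))
mylist$df2
mylist[["df3"]]

# Optional: create df1, df2, df3 as separate objects in the global environment
# (this overwrites any existing objects with those names)
list2env(mylist, envir = globalenv())
```

Keeping the data frames in the named list is usually easier to work with when the number of columns varies, because the list can be looped over with `lapply` without knowing how many chunks there are.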
Hope this helps!
The method below should work with both even and odd numbers of columns:
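fsplit <- function(df, n, Ncol=TRUE){
    # group consecutive column indices into blocks of n and subset df by each block
    lst <- lapply(split(seq_along(df), as.numeric(gl(ncol(df),
                   n, ncol(df)))), function(i) df[i])
    if(Ncol){
        # keep only the chunks that have exactly n columns
        lst[sapply(lst, ncol)==n]
    }
    else {
        # keep every chunk, including a smaller trailing one
        lst
    }
}
fsplit(df,2)
fsplit(df,3)
fsplit(df,3,FALSE)
# a data frame with an odd number of columns (7)
set.seed(24)
df1 <- as.data.frame(matrix(sample(1:10, 7*3, replace=TRUE), ncol=7))
fsplit(df1,2)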
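For comparison, a more compact sketch that should give the same grouping as `fsplit(df, n, FALSE)`: it builds the group index with `ceiling()` and lets `split.default()` split the data frame by columns. The function name `split_cols` and the test data `df7` are hypothetical and only used for illustration:

```r
# Split a data frame into chunks of n consecutive columns;
# a smaller trailing chunk is kept when ncol(df) is not a multiple of n
split_cols <- function(df, n) {
  # Group index per column, e.g. 1 1 1 2 2 2 3 for seven columns and n = 3
  grp <- ceiling(seq_along(df) / n)
  # split.default() treats the data frame as a list of columns
  split.default(df, grp)
}

# Seven-column example
df7 <- as.data.frame(matrix(1:21, ncol = 7))
res <- split_cols(df7, 3)

# Number of columns in each chunk
sapply(res, ncol)
```

To drop incomplete chunks, as `fsplit` does by default, the result can be filtered with `res[sapply(res, ncol) == 3]`.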