Split data.frame into list based on row values across columns
I would like to split a data.frame into a list based on row values/characters across all columns of the data.frame.
I wrote lists of data.frames to a file using write.list {erer}. Now, when I read them back in, they look like this:
Dummy data:
set.seed(1)
df <- cbind(data.frame(col1=c(sample(LETTERS, 4),"col1",sample(LETTERS, 7))),
data.frame(col2=c(sample(LETTERS, 4),"col2",sample(LETTERS, 7))),
data.frame(col3=c(sample(LETTERS, 4),"col3",sample(LETTERS, 7))))
col1 col2 col3
1 G E Q
2 J R D
3 N J G
4 U Y I
5 col1 col2 col3
6 F M A
7 W R J
8 Y X U
9 P I H
10 N Y K
11 B T M
12 E E Y
And I would like to split it into a list at the rows equal to c("col1","col2","col3"), producing:
[[1]]
col1 col2 col3
1 G E Q
2 J R D
3 N J G
4 U Y I
[[2]]
col1 col2 col3
1 F M A
2 W R J
3 Y X U
4 P I H
5 N Y K
6 B T M
7 E E Y
It feels like this should be straightforward using split, but my attempts so far have failed. Also, as you can see, I can't split by a fixed row interval, because the pieces have different lengths.
Any pointers would be highly appreciated, thanks!
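Try
# cumsum(grepl(...)) starts a new group at every repeated header row; the header rows are then dropped from each piece
lapply(split(df, cumsum(grepl(names(df)[1], df$col1))), function(x) x[!grepl(names(df)[1], x$col1),])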
#$`0`
# col1 col2 col3
#1 G E Q
#2 J R D
#3 N J G
#4 U Y I
#$`1`
# col1 col2 col3
#6 F M A
#7 W R J
#8 Y X U
#9 P I H
#10 N Y K
#11 B T M
#12 E E Y
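To see why this works, it can help to look at the grouping vector on its own. The following is a minimal sketch on a small hand-written data.frame (not the seeded dummy data from the question, so the values are an assumption made for illustration); the names is_header and grp are hypothetical helpers, not part of the answer above.

```r
# Small hand-written example: row 3 is a repeated header row.
df <- data.frame(
  col1 = c("G", "J", "col1", "F", "W"),
  col2 = c("E", "R", "col2", "M", "R"),
  col3 = c("Q", "D", "col3", "A", "J"),
  stringsAsFactors = FALSE
)

# TRUE where the first column contains its own column name.
is_header <- grepl(names(df)[1], df$col1)  # FALSE FALSE TRUE FALSE FALSE

# The running count of header rows seen so far labels each block.
grp <- cumsum(is_header)                   # 0 0 1 1 1

# split() then cuts the data.frame wherever the label changes.
pieces <- split(df, grp)
```

Note that grepl() does pattern matching, so a data value that merely contains "col1" would also be treated as a header; the next answer avoids this by testing for exact equality in every column.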
This should work in general, if you want to split wherever a row is exactly equal to the colnames:
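# a row is a header row only if every cell equals its own column name
dfSplit <- split(df, cumsum(Reduce("&", Map("==", df, colnames(df)))))
# every piece after the first starts with a header row; drop it
for (i in 2:length(dfSplit)) dfSplit[[i]] <- dfSplit[[i]][-1, ]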
The second line can be written in a slightly more R-like style, as @DavidArenburg suggested in the comments:
dfSplit[-1] <- lapply(dfSplit[-1], function(x) x[-1, ])
It also has the added benefit of doing nothing if dfSplit has length 1 (unlike my original second line, which would throw an error, because 2:length(dfSplit) then counts down from 2 to 1).
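For reuse, the two steps can be wrapped in one function. This is only a sketch under some assumptions: the function name split_on_header_rows is hypothetical, the example data is hand-written rather than the seeded dummy data, and, unlike the code above, it removes the header rows before splitting and resets the row names so that the result matches the [[1]], [[2]] layout requested in the question.

```r
# Hypothetical helper: split a data.frame at rows that repeat the column names.
split_on_header_rows <- function(x) {
  # TRUE for rows where every cell equals its own column name
  is_header <- Reduce(`&`, Map(`==`, x, colnames(x)))
  # Label each row with the number of header rows seen so far
  grp <- cumsum(is_header)
  # Drop the header rows first, then split the remaining rows by label
  pieces <- split(x[!is_header, , drop = FALSE], grp[!is_header])
  # Renumber rows within each piece and drop the "0", "1", ... list names
  pieces <- lapply(pieces, function(p) {
    rownames(p) <- NULL
    p
  })
  unname(pieces)
}

# Small hand-written example: row 3 is a repeated header row.
df <- data.frame(
  col1 = c("G", "J", "col1", "F", "W"),
  col2 = c("E", "R", "col2", "M", "R"),
  col3 = c("Q", "D", "col3", "A", "J"),
  stringsAsFactors = FALSE
)

res <- split_on_header_rows(df)
```

Because the header rows are filtered out before split() is called, a data.frame without any repeated header simply comes back as a list of length 1, and no special case is needed.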