Split data.frame into list based on row values across columns

I would like to split a data.frame into a list based on row values (characters) that appear across all columns of the data.frame.

I wrote lists of data.frames to file using write.list from the erer package.

So now, when I read them in again, the column names of every data.frame after the first show up as an ordinary data row, like this:

Dummy data:

set.seed(1)
df <- cbind(data.frame(col1=c(sample(LETTERS, 4),"col1",sample(LETTERS, 7))),
            data.frame(col2=c(sample(LETTERS, 4),"col2",sample(LETTERS, 7))),
            data.frame(col3=c(sample(LETTERS, 4),"col3",sample(LETTERS, 7))))
   col1 col2 col3
1     G    E    Q
2     J    R    D
3     N    J    G
4     U    Y    I
5  col1 col2 col3
6     F    M    A
7     W    R    J
8     Y    X    U
9     P    I    H
10    N    Y    K
11    B    T    M
12    E    E    Y

I would like to split this into a list at every row equal to c("col1","col2","col3"), producing:

[[1]]
       col1 col2 col3
    1     G    E    Q
    2     J    R    D
    3     N    J    G
    4     U    Y    I

[[2]]     
       col1 col2 col3
    1     F    M    A
    2     W    R    J
    3     Y    X    U
    4     P    I    H
    5     N    Y    K
    6     B    T    M
    7     E    E    Y

It feels like this should be straightforward using split, but my attempts so far have failed. Also, as you can see, the blocks have different lengths, so I can't split at a fixed row interval.

Any pointers would be highly appreciated, thanks!

Try:

lapply(split(df, cumsum(grepl(names(df)[1], df$col1))), function(x) x[!grepl(names(df)[1], x$col1),])
#$`0`
#  col1 col2 col3
#1    G    E    Q
#2    J    R    D
#3    N    J    G
#4    U    Y    I

#$`1`
#   col1 col2 col3
#6     F    M    A
#7     W    R    J
#8     Y    X    U
#9     P    I    H
#10    N    Y    K
#11    B    T    M
#12    E    E    Y
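
To see why this works, it may help to look at the grouping vector on its own. The following is a minimal sketch on a small hand-written data.frame (the name d and its values are made up for illustration; they are not the sampled data from the question):

```r
# Small illustrative data.frame with one embedded header row (row 3).
# stringsAsFactors = FALSE keeps the columns as plain character vectors.
d <- data.frame(col1 = c("G", "J", "col1", "F", "W"),
                col2 = c("E", "R", "col2", "M", "R"),
                stringsAsFactors = FALSE)

# TRUE wherever the first column contains its own column name
is_header <- grepl(names(d)[1], d$col1)

# cumsum() increases by one at every header row, so each block gets its own id
grp <- cumsum(is_header)
grp
# [1] 0 0 1 1 1

# split on the id, then drop the header row from each piece
pieces <- lapply(split(d, grp), function(x) x[!grepl(names(d)[1], x$col1), ])
```

Note that grepl() does pattern matching, so a data value that merely contains the text col1 would also be treated as a header row; comparing with == is stricter.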

This should work in general, if you want to split wherever a row is exactly equal to the column names:

dfSplit<-split(df,cumsum(Reduce("&",Map("==",df,colnames(df)))))
for (i in 2:length(dfSplit)) dfSplit[[i]]<-dfSplit[[i]][-1,]

The second line can be written in a slightly more idiomatic R style, as @DavidArenburg suggested in the comments:

dfSplit[-1] <- lapply(dfSplit[-1], function(x) x[-1, ])

It also has the added benefit of doing nothing if dfSplit has length 1. My original second line would throw an error in that case, because 2:length(dfSplit) then counts down as c(2, 1) and dfSplit[[2]] does not exist.
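
The desired output in the question also restarts the row numbers in each piece and uses [[1]], [[2]] rather than $`0`, $`1`. One way to get there is sketched below; split_at_header is a hypothetical helper name, not something from the answers above, and the small data.frame d is made up for illustration.

```r
# Hypothetical helper: split a data.frame at every row that is identical to
# its column names, drop those rows, and renumber the rows of each piece.
split_at_header <- function(df) {
  # compare each column with its own name; as.character() also handles factors
  hits <- Map(function(column, name) as.character(column) == name,
              df, colnames(df))
  # a header row is one where every column matches its name
  is_header <- Reduce(`&`, hits)
  is_header[is.na(is_header)] <- FALSE  # treat rows with NA as data rows
  # drop the header rows, then split the rest by the running header count
  pieces <- split(df[!is_header, , drop = FALSE],
                  cumsum(is_header)[!is_header])
  pieces <- lapply(pieces, function(x) {
    rownames(x) <- NULL  # restart row numbering at 1
    x
  })
  unname(pieces)  # plain positions instead of the names "0", "1", ...
}

d <- data.frame(col1 = c("G", "J", "col1", "F", "W", "Y"),
                col2 = c("E", "R", "col2", "M", "R", "X"),
                stringsAsFactors = FALSE)
res <- split_at_header(d)
length(res)
# [1] 2
rownames(res[[2]])
# [1] "1" "2" "3"
```

Because the header rows are removed before splitting, no special case is needed for the first piece, and a data.frame without any embedded header row simply comes back as a list of length 1.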
