I would like to split a data.frame into a list based on row values/characters across all columns of the data.frame.
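I wrote lists of data.frames to a file using write.list() from the erer package, so when I read them back in, the column names show up again as a data row between the original data.frames.
Dummy data that looks like what I get: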
set.seed(1)
df <- cbind(data.frame(col1=c(sample(LETTERS, 4),"col1",sample(LETTERS, 7))),
data.frame(col2=c(sample(LETTERS, 4),"col2",sample(LETTERS, 7))),
data.frame(col3=c(sample(LETTERS, 4),"col3",sample(LETTERS, 7))))
col1 col2 col3
1 G E Q
2 J R D
3 N J G
4 U Y I
5 col1 col2 col3
6 F M A
7 W R J
8 Y X U
9 P I H
10 N Y K
11 B T M
12 E E Y
I would like to split this into a list at every row that equals c("col1","col2","col3"), dropping those header rows and producing:
[[1]]
col1 col2 col3
1 G E Q
2 J R D
3 N J G
4 U Y I
[[2]]
col1 col2 col3
1 F M A
2 W R J
3 Y X U
4 P I H
5 N Y K
6 B T M
7 E E Y
It feels like this should be straightforward using split, but my attempts so far have failed. Also, as you can see, the pieces have different lengths, so I can't split at a fixed row interval.
Any pointers would be highly appreciated, thanks!
Try:
lapply(split(df, cumsum(grepl(names(df)[1], df$col1))), function(x) x[!grepl(names(df)[1], x$col1),])
#$`0`
# col1 col2 col3
#1 G E Q
#2 J R D
#3 N J G
#4 U Y I
#$`1`
# col1 col2 col3
#6 F M A
#7 W R J
#8 Y X U
#9 P I H
#10 N Y K
#11 B T M
#12 E E Y
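To see why this works, it may help to look at the grouping vector on its own. The following is only a small sketch on hypothetical toy data (the names toy, is_header, group and parts are made up for illustration and are not from the answer above): each header row bumps the running count, so every row after it lands in a new group.

```r
# Hypothetical toy data: the header is repeated as row 3
toy <- data.frame(col1 = c("A", "B", "col1", "C"),
                  col2 = c("X", "Y", "col2", "Z"),
                  stringsAsFactors = FALSE)

# TRUE only on rows where the first column holds its own column name
is_header <- toy$col1 == names(toy)[1]

# Running count of header rows seen so far; this is the group id per row
group <- cumsum(is_header)

# Drop the header rows, then split what is left by group id
parts <- split(toy[!is_header, ], group[!is_header])
```

Dropping the header rows before the split, as done here, avoids the second pass with lapply, at the cost of computing the header test once up front.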
This should be more general, if you want to split wherever a row is exactly equal to the colnames:
dfSplit<-split(df,cumsum(Reduce("&",Map("==",df,colnames(df)))))
for (i in 2:length(dfSplit)) dfSplit[[i]]<-dfSplit[[i]][-1,]
The second line can be written a little more R-style as @DavidArenburg suggested in the comments.
dfSplit[-1] <- lapply(dfSplit[-1], function(x) x[-1, ])
It also has the added benefit of doing nothing if dfSplit has length 1 (unlike my original second line, which would throw an error in that case).
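Putting the pieces together, one possible way to wrap this up is a small helper. This is a sketch under assumptions, not a tested general solution: the function name split_on_header is hypothetical, as.character() is added only as a guard in case the columns were read in as factors, and the row names are reset so each piece starts at 1 as in the desired output.

```r
# Hypothetical helper: split a data.frame at rows that repeat the column names
split_on_header <- function(x) {
  # Compare every column with its own name, element by element
  hits <- Map(function(column, name) as.character(column) == name,
              x, colnames(x))
  # A row is a header row only if all of its cells match
  is_header <- Reduce(`&`, hits)
  is_header[is.na(is_header)] <- FALSE
  keep <- !is_header
  # Group id = number of header rows seen so far; header rows are dropped
  pieces <- split(x[keep, , drop = FALSE], cumsum(is_header)[keep])
  # Reset row names and drop the "0", "1", ... list names
  unname(lapply(pieces, function(p) { rownames(p) <- NULL; p }))
}

# Usage on made-up data with one embedded header row
demo <- data.frame(col1 = c("G", "J", "col1", "F"),
                   col2 = c("E", "R", "col2", "M"),
                   stringsAsFactors = FALSE)
res <- split_on_header(demo)
```

Because the header rows are removed before splitting, the helper needs no special case for a data.frame that contains no embedded header at all; it then simply returns a list of length 1.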