
R: how to remove rows, given as row numbers in a separate list, from multiple data frames using lapply

I have many data frames organized in a list. I also have a second list of vectors containing the row numbers that I want to remove from my data frames. The rows to be removed are different for each data frame, so the list of vectors has the same number of elements as the list of data frames. Here is the code I've tried:

test_list<-vector(mode="list",5)
test_list<-lapply(test_list, function(x) data.frame(1,1:10,"c"))
vec_list<-vector(mode="list",5)
vec_list<-lapply(vec_list, function (x) x<-sample(seq(1,10),4))
clean_list<-lapply(test_list, function (x,y) clean_list<-x[-y,],vec_list)

When you have several lists of corresponding objects, the Map or mapply functions are more natural to use than lapply.

Map(function(l, v) l[-v,], test_list, vec_list)
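To illustrate the difference, here is a minimal sketch on small made-up inputs (the names `dfs`, `drop`, `attempt` and `cleaned` are hypothetical, not from the question). It assumes the original attempt fails because lapply hands the whole second list to every call as `y`, whereas Map pairs the two lists element by element.

```r
# Small illustrative inputs (hypothetical, not the question's data)
dfs  <- list(data.frame(id = 1:5), data.frame(id = 6:10))
drop <- list(c(1, 2), 5)

# lapply passes the *entire* `drop` list as `y` to each call,
# so `-y` is applied to a list and raises an error
attempt <- tryCatch(
  lapply(dfs, function(x, y) x[-y, , drop = FALSE], drop),
  error = function(e) conditionMessage(e)
)

# Map walks both lists in parallel: dfs[[1]] with drop[[1]], and so on
cleaned <- Map(function(d, v) d[-v, , drop = FALSE], dfs, drop)
```

`drop = FALSE` is used here only because the toy data frames have a single column; with several columns, as in the question, `l[-v, ]` already returns a data frame.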

If you want to use lapply, one way is to iterate over the indices:

  lapply(seq_along(test_list), function(i) test_list[[i]][-vec_list[[i]],])
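Two caveats apply to the index-based version, sketched below under assumptions of my own (the helper `drop_rows` and the sample data are hypothetical). First, negative indexing with an empty vector selects zero rows rather than all rows, so an empty removal vector needs a guard. Second, iterating over `seq_along()` returns an unnamed list, so names have to be restored by hand if the input list was named.

```r
# Hypothetical helper: remove rows `v` from data frame `d`,
# returning `d` unchanged when there is nothing to remove
drop_rows <- function(d, v) {
  if (length(v) == 0L) return(d)
  d[-v, , drop = FALSE]
}

dfs  <- list(a = data.frame(id = 1:3), b = data.frame(id = 4:6))
drop <- list(integer(0), 2L)

# Without the guard, the empty vector wipes out the first data frame
naive <- lapply(seq_along(dfs), function(i) dfs[[i]][-drop[[i]], , drop = FALSE])

# With the guard, and with the list names restored afterwards
safe <- lapply(seq_along(dfs), function(i) drop_rows(dfs[[i]], drop[[i]]))
names(safe) <- names(dfs)
```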

Benchmarks

On a medium-sized list of data frames:

 set.seed(45)
 test_list<-vector(mode="list",25e3)
 test_list<-lapply(test_list, function(x) data.frame(1,1:10,"c"))
 vec_list<-vector(mode="list",25e3)
 vec_list<-lapply(vec_list, function (x) x<-sample(seq(1,10),4))

 library(microbenchmark)
 f1 <- function() lapply(seq_along(test_list), function(i) test_list[[i]][-vec_list[[i]],])
 f2 <- function() Map(function(l, v) l[-v,], test_list, vec_list)

 microbenchmark(f1(), f2(), unit="relative", times=25L)
 #Unit: relative
 #expr       min        lq  median       uq       max neval
 #f1() 0.9874164 0.9977816 1.00573 1.000419 0.9837334    25
 #f2() 1.0000000 1.0000000 1.00000 1.000000 1.0000000    25
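The relative timings above differ by only about one to two percent, so the choice between the two forms is mostly a matter of readability. Before comparing speed, it can be worth confirming that both functions return the same result. The following is a minimal sketch on a smaller list (the size `n = 100` and the variable `same` are my own choices, not part of the benchmark):

```r
set.seed(45)
n <- 100  # smaller than the benchmark's 25e3, for a quick check

test_list <- lapply(seq_len(n), function(i) data.frame(1, 1:10, "c"))
vec_list  <- lapply(seq_len(n), function(i) sample(seq(1, 10), 4))

f1 <- function() lapply(seq_along(test_list), function(i) test_list[[i]][-vec_list[[i]], ])
f2 <- function() Map(function(l, v) l[-v, ], test_list, vec_list)

# Compare the two results element by element
same <- isTRUE(all.equal(f1(), f2()))

# Each data frame started with 10 rows and lost 4 distinct rows
rows_left <- unique(vapply(f2(), nrow, integer(1)))
```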

