简体   繁体   中英

Add rows to list of dataframes from another dataframe

Let's have a list lis

chicago = data.frame('city' = rep('chicago'), 'year' = c(2018,2019,2020), 'population' = c(100, 105, 110))
paris = data.frame('city' = rep('paris'), 'year' = c(2018,2019,2020), 'population' = c(200, 205, 210))
berlin = data.frame('city' = rep('berlin'), 'year' = c(2018,2019,2020), 'population' = c(300, 305, 310))
bangalore = data.frame('city' = rep('bangalore'), 'year' = c(2018,2019,2020), 'population' = c(400, 405, 410))
lis = list(chicago = chicago, paris = paris, berlin = berlin, bangalore = bangalore)

Now I have a new df containing latest data for each city ,

df = data.frame('city' = c('chicago', 'paris', 'berlin', 'bangalore'), 'year' = rep(2021), 'population' = c(115, 215, 315, 415))

I want to add each row of df to lis based on city .

I do it by,

#convert to datframe
lis = dplyr::bind_rows(lis)
#rbind
lis = rbind(lis, df)
#again convert to list
lis = split(lis, lis$city)

which is inefficient for large datsets. Is their any efficient alternate for large datsets ?

Thank you.

Edit

Unit: seconds
 expr      min       lq     mean   median       uq      max neval
 ac() 22.43719 23.17452 27.85401 24.80335 25.62127 43.23373     5

The list contains 2239 dataframes and dimension of each dataframe is 310x15 . Each of these dataframe grow daily.

We may use imap to loop over the list , and filter the 'df' based on the names of the list to append the row in each of the list elements

library(dplyr)
library(purrr)
lis2 <- imap(lis, ~ .x %>%
                   bind_rows(df %>%
                              filter(city == .y)))

-output

> lis2
$chicago
     city year population
1 chicago 2018        100
2 chicago 2019        105
3 chicago 2020        110
4 chicago 2021        115

$paris
   city year population
1 paris 2018        200
2 paris 2019        205
3 paris 2020        210
4 paris 2021        215

$berlin
    city year population
1 berlin 2018        300
2 berlin 2019        305
3 berlin 2020        310
4 berlin 2021        315

$bangalore
       city year population
1 bangalore 2018        400
2 bangalore 2019        405
3 bangalore 2020        410
4 bangalore 2021        415

Or using base R with Map and rbind

Map(function(x, nm) rbind(x, df[df$city == nm,]), lis, names(lis))

Or use rbindlist from data.table

library(data.table)
rbindlist(c(lis, list(df)))[, .(split(.SD, city))]$V1

Or a slightly more efficient, will be with split

Map(rbind, lis, split(df, df$city)[names(lis)])

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM