将行添加到来自另一个数据帧的数据帧列表

Question

Let's have a list lis让我们有一个列表lis

chicago = data.frame('city' = rep('chicago'), 'year' = c(2018,2019,2020), 'population' = c(100, 105, 110))
paris = data.frame('city' = rep('paris'), 'year' = c(2018,2019,2020), 'population' = c(200, 205, 210))
berlin = data.frame('city' = rep('berlin'), 'year' = c(2018,2019,2020), 'population' = c(300, 305, 310))
bangalore = data.frame('city' = rep('bangalore'), 'year' = c(2018,2019,2020), 'population' = c(400, 405, 410))
lis = list(chicago = chicago, paris = paris, berlin = berlin, bangalore = bangalore)

Now I have a new df containing latest data for each city ,现在我有一个新的df包含每个city最新数据，

df = data.frame('city' = c('chicago', 'paris', 'berlin', 'bangalore'), 'year' = rep(2021), 'population' = c(115, 215, 315, 415))

I want to add each row of df to lis based on city .我想根据city将df每一行添加到lis 。

I do it by,我这样做，

#convert to datframe
lis = dplyr::bind_rows(lis)
#rbind
lis = rbind(lis, df)
#again convert to list
lis = split(lis, lis$city)

which is inefficient for large datsets.这对于大型数据集效率低下。 Is their any efficient alternate for large datsets ?对于大型数据集，它们是否有任何有效的替代方案？

Thank you.谢谢你。

Edit编辑

Unit: seconds
 expr      min       lq     mean   median       uq      max neval
 ac() 22.43719 23.17452 27.85401 24.80335 25.62127 43.23373     5

The list contains 2239 dataframes and dimension of each dataframe is 310x15 .该列表包含2239数据帧，每个数据帧的尺寸为310x15 。 Each of these dataframe grow daily.这些数据帧中的每一个每天都在增长。

Answer 1

We may use imap to loop over the list , and filter the 'df' based on the names of the list to append the row in each of the list elements我们可以使用imap循环遍历list ，并根据list名称filter 'df' 以将行附加到每个list元素中

library(dplyr)
library(purrr)
lis2 <- imap(lis, ~ .x %>%
                   bind_rows(df %>%
                              filter(city == .y)))

-output -输出

> lis2
$chicago
     city year population
1 chicago 2018        100
2 chicago 2019        105
3 chicago 2020        110
4 chicago 2021        115

$paris
   city year population
1 paris 2018        200
2 paris 2019        205
3 paris 2020        210
4 paris 2021        215

$berlin
    city year population
1 berlin 2018        300
2 berlin 2019        305
3 berlin 2020        310
4 berlin 2021        315

$bangalore
       city year population
1 bangalore 2018        400
2 bangalore 2019        405
3 bangalore 2020        410
4 bangalore 2021        415

Or using base R with Map and rbind或者使用带有Map和rbind base R

Map(function(x, nm) rbind(x, df[df$city == nm,]), lis, names(lis))

Or use rbindlist from data.table或者使用rbindlist的data.table

library(data.table)
rbindlist(c(lis, list(df)))[, .(split(.SD, city))]$V1

Or a slightly more efficient, will be with split或者稍微更有效的，将与split

Map(rbind, lis, split(df, df$city)[names(lis)])

将行添加到来自另一个数据帧的数据帧列表

问题描述

1 个解决方案

解决方案1
3 已采纳 2021-10-28 17:08:59

将行添加到来自另一个数据帧的数据帧列表

问题描述

1 个解决方案

解决方案1 3 已采纳 2021-10-28 17:08:59

解决方案1
3 已采纳 2021-10-28 17:08:59