Add rows to list of dataframes from another dataframe

Suppose we have a list, lis:

chicago = data.frame('city' = rep('chicago'), 'year' = c(2018,2019,2020), 'population' = c(100, 105, 110))
paris = data.frame('city' = rep('paris'), 'year' = c(2018,2019,2020), 'population' = c(200, 205, 210))
berlin = data.frame('city' = rep('berlin'), 'year' = c(2018,2019,2020), 'population' = c(300, 305, 310))
bangalore = data.frame('city' = rep('bangalore'), 'year' = c(2018,2019,2020), 'population' = c(400, 405, 410))
lis = list(chicago = chicago, paris = paris, berlin = berlin, bangalore = bangalore)

Now I have a new df containing the latest data for each city:

df = data.frame('city' = c('chicago', 'paris', 'berlin', 'bangalore'), 'year' = rep(2021), 'population' = c(115, 215, 315, 415))

I want to add each row of df to the matching element of lis, based on city.

I currently do it like this:

# stack the list into a single data frame
lis = dplyr::bind_rows(lis)
# append the new rows
lis = rbind(lis, df)
# split back into a list, one element per city
lis = split(lis, lis$city)

This is inefficient for large datasets. Is there a more efficient alternative?

Thank you.

Edit

Unit: seconds
 expr      min       lq     mean   median       uq      max neval
 ac() 22.43719 23.17452 27.85401 24.80335 25.62127 43.23373     5

The list contains 2239 dataframes, and each dataframe has dimensions 310x15. Each of these dataframes grows daily.
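
For anyone who wants to reproduce the timing, here is a rough base R sketch that builds a synthetic list of the same shape and times the stack/append/split round trip. The helper names make_list and round_trip, the "cityN" labels, and the random values are all made up for illustration, and base rbind stands in for dplyr::bind_rows so the sketch has no dependencies. The frame count is kept small here; raising n_frames toward 2239 should approximate the real workload.

```r
# Hypothetical helper: a named list of n_frames data frames,
# each with n_rows rows and n_cols columns (first column is the key).
make_list <- function(n_frames, n_rows, n_cols) {
  ids <- paste0("city", seq_len(n_frames))
  frames <- lapply(ids, function(id) {
    values <- as.data.frame(matrix(runif(n_rows * (n_cols - 1)), nrow = n_rows))
    data.frame(city = rep(id, n_rows), values)
  })
  setNames(frames, ids)
}

# Hypothetical helper: the stack / append / split approach from the question,
# written with base rbind instead of dplyr::bind_rows.
round_trip <- function(lis, df) {
  stacked <- rbind(do.call(rbind, lis), df)
  split(stacked, stacked$city)
}

set.seed(1)
n_frames <- 50  # assumption: small for a quick run; the real list has 2239
lis_big <- make_list(n_frames, n_rows = 310, n_cols = 15)

# One new row per list element, with the same column names as the frames
new_big <- data.frame(
  city = names(lis_big),
  as.data.frame(matrix(runif(n_frames * 14), nrow = n_frames))
)

timing <- system.time(res <- round_trip(lis_big, new_big))
print(timing)
```

Note that split() orders the result by the sorted key values, so the element order of res may differ from names(lis_big) even though the set of names is the same.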

We may use imap to loop over the list and filter 'df' by the list names (available as .y), appending the matching row to each list element:

library(dplyr)
library(purrr)
lis2 <- imap(lis, ~ .x %>%
                   bind_rows(df %>%
                              filter(city == .y)))

-output

> lis2
$chicago
     city year population
1 chicago 2018        100
2 chicago 2019        105
3 chicago 2020        110
4 chicago 2021        115

$paris
   city year population
1 paris 2018        200
2 paris 2019        205
3 paris 2020        210
4 paris 2021        215

$berlin
    city year population
1 berlin 2018        300
2 berlin 2019        305
3 berlin 2020        310
4 berlin 2021        315

$bangalore
       city year population
1 bangalore 2018        400
2 bangalore 2019        405
3 bangalore 2020        410
4 bangalore 2021        415

Or using base R with Map and rbind

Map(function(x, nm) rbind(x, df[df$city == nm,]), lis, names(lis))

Or use rbindlist from data.table

library(data.table)
rbindlist(c(lis, list(df)))[, .(split(.SD, city))]$V1
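
A possibly more readable variant of the same idea, shown as a sketch rather than a drop-in replacement, is to stack everything with rbindlist and then use the by argument of data.table's split method. The two-city data below is a cut-down copy of the question's example; the names stacked and lis_dt are only illustrative. The elements of the result are data.tables, not plain data frames.

```r
library(data.table)

# Cut-down version of the question's data (two cities only)
lis <- list(
  chicago = data.frame(city = "chicago", year = c(2018, 2019, 2020),
                       population = c(100, 105, 110)),
  paris   = data.frame(city = "paris", year = c(2018, 2019, 2020),
                       population = c(200, 205, 210))
)
df <- data.frame(city = c("chicago", "paris"), year = 2021,
                 population = c(115, 215))

# Stack the existing frames and the new rows into one data.table
stacked <- rbindlist(c(lis, list(df)))

# Split back into a named list, one data.table per city
lis_dt <- split(stacked, by = "city")
```

With the default sorted = FALSE, the list elements come out in order of first appearance of each city, which here matches the order of the original list.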

Or, slightly more efficiently, with split:

Map(rbind, lis, split(df, df$city)[names(lis)])
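
To make the alignment step explicit, here is the split approach unrolled on a cut-down copy of the question's data. It is a sketch that assumes every name in lis appears in df$city; the rows of df are deliberately given in a different order than the list to show that indexing by names(lis) lines the pieces up. The names new_rows and lis2 are only illustrative.

```r
# Cut-down version of the question's data (two cities only)
lis <- list(
  chicago = data.frame(city = "chicago", year = c(2018, 2019, 2020),
                       population = c(100, 105, 110)),
  paris   = data.frame(city = "paris", year = c(2018, 2019, 2020),
                       population = c(200, 205, 210))
)
# New rows, intentionally not in the same order as the list
df <- data.frame(city = c("paris", "chicago"), year = 2021,
                 population = c(215, 115))

# Split the new rows by city once, then reorder to match names(lis)
new_rows <- split(df, df$city)[names(lis)]

# Append pairwise: the i-th element of lis gets the i-th element of new_rows
lis2 <- Map(rbind, lis, new_rows)
```

Because df is split a single time, this avoids re-scanning df$city for every list element, which is what the df[df$city == nm, ] version does.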
