How to modify multiple data frames in RStudio?
I am working with multiple data frames (more than 20) and I would like to write a loop that adds two new columns to every data frame, each holding the mean of one of the two existing columns. I want to use a loop because the number of data frames can change.
Example of data:
df_1:
Width Thickness
1 1000 1
2 1500 2
df_2:
Width Thickness
1 1200 3
2 1200 4
3 1000 2
df_3:
Width Thickness
1 1200 3
2 1500 4
desired outcome:
df_1:
Width Thickness mean_width mean_thick
1 1000 1 1250 1.5
2 1500 2 1250 1.5
You can get all the dataframes in a list based on the pattern in their name using ls and mget. We can then use lapply to add new columns to each dataframe:
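# Collect every object named df_<number> into a named list, then process each one
new_data <- lapply(mget(ls(pattern = 'df_\\d+')), function(x) {
  # Add one mean_<column> column per existing column (the mean is recycled down the rows)
  x[paste0('mean_', names(x))] <- as.list(colMeans(x, na.rm = TRUE))
  x
})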
new_data will be a list of dataframes. If you want the changes to be reflected in the original dataframes, use list2env:
list2env(new_data, .GlobalEnv)
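For anyone who wants to try the steps above end to end, here is a minimal sketch using the sample data from the question. The environment `env` is an assumption added here so the example does not touch .GlobalEnv, and the anchored pattern `^df_\\d+$` is a slightly stricter variant of the one above. Note that the new columns come out as mean_Width and mean_Thickness, not mean_width and mean_thick as in the desired outcome.

```r
# Hypothetical sandbox environment so the sketch leaves .GlobalEnv alone
env <- new.env()
env$df_1 <- data.frame(Width = c(1000, 1500), Thickness = c(1, 2))
env$df_2 <- data.frame(Width = c(1200, 1200, 1000), Thickness = c(3, 4, 2))
env$df_3 <- data.frame(Width = c(1200, 1500), Thickness = c(3, 4))

# Collect every object whose name looks like df_<number>
df_names <- ls(envir = env, pattern = "^df_\\d+$")

# Add one mean_<column> column per existing column
new_data <- lapply(mget(df_names, envir = env), function(x) {
  x[paste0("mean_", names(x))] <- as.list(colMeans(x, na.rm = TRUE))
  x
})

# Write the modified data frames back over the originals
list2env(new_data, envir = env)

env$df_1
```

Because mget returns a named list, list2env can match each element back to the object it came from.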
I would suggest making a list of dataframes and then applying a function over that list.
Below I'm using tidyverse's map function, but this is also achievable using base R and the apply family of functions:
library(tidyverse)
df_list <- list(df_1, df_2, df_3)
map(df_list, mutate, mean_width = mean(Width), mean_thick = mean(Thickness))
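The base R route mentioned above could look roughly like the following sketch. The helper name `add_means` and the list element names are illustrative assumptions, not part of the original answer.

```r
# Sample data from the question, stored in a named list
df_list <- list(
  df_1 = data.frame(Width = c(1000, 1500), Thickness = c(1, 2)),
  df_2 = data.frame(Width = c(1200, 1200, 1000), Thickness = c(3, 4, 2))
)

# Hypothetical helper: add the two mean columns to a single data frame
add_means <- function(df) {
  df$mean_width <- mean(df$Width, na.rm = TRUE)
  df$mean_thick <- mean(df$Thickness, na.rm = TRUE)
  df
}

# lapply returns a new list; the original data frames are left unchanged
result <- lapply(df_list, add_means)

result$df_1
```

As with map, the result has to be assigned to a name to be kept, and naming the list elements makes it easier to look up individual data frames afterwards.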
It would be better to create a single dataset and then do a group-by operation:
library(dplyr)
mget(ls(pattern = 'df_\\d+')) %>%
  bind_rows(.id = 'grp') %>%
  group_by(grp) %>%
  mutate(across(everything(), ~ mean(.x, na.rm = TRUE), .names = "mean_{.col}")) %>%
  ungroup()
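For readers without dplyr, the same single-dataset idea can be sketched in base R. This is an alternative written for illustration, not the answerer's code. It uses ave(), whose default function is mean, and the names `dfs` and `combined` are assumptions.

```r
# Sample data from the question, stored in a named list
dfs <- list(
  df_1 = data.frame(Width = c(1000, 1500), Thickness = c(1, 2)),
  df_2 = data.frame(Width = c(1200, 1200, 1000), Thickness = c(3, 4, 2))
)

# Stack the data frames, tagging each row with the name of its source
combined <- do.call(
  rbind,
  Map(function(df, id) data.frame(grp = id, df), dfs, names(dfs))
)
rownames(combined) <- NULL

# ave() computes the per-group mean and repeats it on every row of the group
combined$mean_Width <- ave(combined$Width, combined$grp)
combined$mean_Thickness <- ave(combined$Thickness, combined$grp)

combined
```

Keeping everything in one table means the code does not change when the number of data frames does.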