Function over multiple dataframes in R
I have a set of survey design data for each quarter/year, stored in RDS format on my disk. The data looks like this:
Year Quarter Age
2010 1 27
2010 1 32
2010 1 34
...
I'm using the function svymean(formula=~Age, na.rm = T, design = data20101) to estimate the mean of the age variable for each year/quarter file. I would like to do this more efficiently: run the function over all the files and save the results in one single data frame.
The output I'm looking for is a data frame like this:
Year Quarter Mean_Age
2010 1 31.1
2010 1 32.4
2010 1 30.9
2010 1 34.5
2010 2 36.3
2010 2 31.2
2010 2 30.8
2010 2 35.6
...
Regards,
lapply and the dplyr package should do the job. Here is an example using ordinary data frames.
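library(dplyr)

# Two example data frames standing in for two of the files
df1 <- data.frame(Year    = rep(2010, 6),
                  Quarter = c(1, 1, 1, 2, 2, 2),
                  Age     = c(27, 32, 34, 30, 28, 21))
df2 <- data.frame(Year    = rep(2010, 6),
                  Quarter = c(1, 1, 1, 2, 2, 2),
                  Age     = c(23, 19, 31, 41, 26, 23))

# Collect the data frames in a list so one function can be applied to each
df.list <- list(df1, df2)

# Mean age per Year/Quarter, computed separately for every data frame
mean.list <- lapply(df.list, function(x) {
  x %>%
    group_by(Year, Quarter) %>%
    summarize(Mean_Age = mean(Age, na.rm = TRUE))
})

# Stack the per-data-frame results into a single data frame
mean.df <- do.call(rbind, mean.list)
mean.df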
The result will be
# A tibble: 4 x 3
# Groups:   Year [1]
   Year Quarter Mean_Age
  <dbl>   <dbl>    <dbl>
1  2010       1     31
2  2010       2     26.3
3  2010       1     24.3
4  2010       2     30
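The example above works on plain data frames with mean(). The same lapply pattern can probably be applied to the survey design objects from the question. The following is only a sketch under several assumptions: the files are named like data20101.rds, each file holds a design object created with survey::svydesign, and the design's data contains the Year and Quarter columns. The folder design_dir, the helpers make_design and one_row, and the weight column w are made up for illustration; the first half of the block only creates two toy files so the sketch can run on its own.

```r
library(survey)

# Hypothetical setup: write two toy design objects to a temporary folder,
# named like the objects in the question (data<year><quarter>.rds).
design_dir <- file.path(tempdir(), "designs")
dir.create(design_dir, showWarnings = FALSE)

make_design <- function(year, quarter, ages) {
  # w is an assumed weight column; real designs will have their own structure
  df <- data.frame(Year = year, Quarter = quarter, Age = ages, w = 1)
  svydesign(ids = ~1, weights = ~w, data = df)
}
saveRDS(make_design(2010, 1, c(27, 32, 34)), file.path(design_dir, "data20101.rds"))
saveRDS(make_design(2010, 2, c(30, 28, 21)), file.path(design_dir, "data20102.rds"))

# List the design files; the name pattern is an assumption
files <- list.files(design_dir, pattern = "^data[0-9]{5}\\.rds$", full.names = TRUE)

# Read one design, estimate the mean age, and return a one-row data frame
one_row <- function(path) {
  design <- readRDS(path)
  est <- svymean(~Age, design, na.rm = TRUE)
  data.frame(
    Year     = design$variables$Year[1],     # assumes one year per file
    Quarter  = design$variables$Quarter[1],  # assumes one quarter per file
    Mean_Age = unname(coef(est)),
    SE       = as.numeric(SE(est))
  )
}

# Apply to every file and stack the rows into one data frame
result <- do.call(rbind, lapply(files, one_row))
result
```

Reading the files one at a time inside the function keeps only one design in memory at once, which may matter if the quarterly files are large. If Year and Quarter are not stored in the design's data, they would have to be taken from the file name instead.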