How to get the mean of certain columns from different dataframes?

Problem

I want to use a function to get the mean of a column from several different dataframes.

Example dataframes

WEEK.1 <- 1:52
DIF.4 <- runif(52, 0, 20)
df <- as.data.frame(cbind(WEEK.1, DIF.4))

WEEK.2 <- 1:52
DIF.5 <- runif(52, 0, 20)
df2 <- as.data.frame(cbind(WEEK.2, DIF.5))

WEEK.3 <- 1:52
DIF.6 <- runif(52, 0, 20)
df3 <- as.data.frame(cbind(WEEK.3, DIF.6))

Attempt

a) Without a function, I'd have to repeat the following three times:

colnames(df) <- str_replace_all(colnames(df), "\\.\\d", "")
df_ <- filter(df, WEEK >= 26)
mean(df_$DIF) 

b) My attempt to do so with a function:

meanChange <- function(x) {
  colnames(x) <- str_replace_all(colnames(x), "\\.\\d", "")
  x_ <- filter(x, WEEK >= 26)
  mean(x_$DIF) 
}

x <- c("df", "df2", "df3")
changes <- as.data.frame(lapply(x, meanChange))

Result

Error in `colnames<-`(`*tmp*`, value = character(0)) : 
  attempt to set 'colnames' on an object with less than two dimensions
Called from: `colnames<-`(`*tmp*`, value = character(0))
Browse[1]> Q

Help would be appreciated.

This is a possible solution:

# create a list of your dataframes
dfl <- list(df, df2, df3)

# apply mean function 
lapply(1:3, function(i) mean(dfl[[i]][[2]][which(dfl[[i]][[1]] >= 26)]))

Index i selects a single dataframe from the list dfl. [[1]] and [[2]] then select column 1 and column 2 of that dataframe, respectively.
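To make the indexing concrete, here is a minimal sketch of what happens for a single value of i. The list dfl_demo and its values are made up for illustration (the question's data is random), and the cutoff is 3 rather than 26 so the result is easy to check by hand.

```r
# Hypothetical data frames with known values (not the question's random data)
dfl_demo <- list(
  data.frame(WEEK.1 = 1:4, DIF.4 = c(10, 20, 30, 40)),
  data.frame(WEEK.2 = 1:4, DIF.5 = c(1, 2, 3, 4))
)

i <- 1
weeks <- dfl_demo[[i]][[1]]    # column 1 of the i-th data frame
difs  <- dfl_demo[[i]][[2]]    # column 2 of the i-th data frame

keep <- which(weeks >= 3)      # positions of the rows that pass the filter
demo_mean <- mean(difs[keep])  # mean of 30 and 40
print(demo_mean)
```

Because the columns are picked by position, this works even though the column names differ between dataframes, but it silently depends on the column order.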


Another possible solution is:

lapply(dfl, function(x) mean(x[[2]][which(x[[1]] >= 26)]))

Here lapply iterates over the elements of dfl directly, without an index.


NOTE: both solutions return a list. If you want a vector, use sapply rather than lapply.
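The difference in return type can be seen in a small sketch. The data and the helper mean_from_week (with its from parameter) are hypothetical, introduced only for this illustration; vapply is shown as a stricter base-R alternative that also checks the result type.

```r
# Hypothetical data frames with known values
dfl_demo <- list(
  data.frame(WEEK.1 = 1:4, DIF.4 = c(10, 20, 30, 40)),
  data.frame(WEEK.2 = 1:4, DIF.5 = c(1, 2, 3, 4))
)

# Illustrative helper: mean of column 2 where column 1 is at least `from`
mean_from_week <- function(x, from = 3) mean(x[[2]][x[[1]] >= from])

as_list    <- lapply(dfl_demo, mean_from_week)             # list of two numbers
as_vector  <- sapply(dfl_demo, mean_from_week)             # numeric vector
as_checked <- vapply(dfl_demo, mean_from_week, numeric(1)) # vector, type enforced

str(as_list)
str(as_vector)
```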

  • You are passing y as an argument to the function but using x inside.
  • For the last line, I think you mean mean(x_$DIF).
library(tidyverse)

meanChange <- function(x) {
  colnames(x) <- str_replace_all(colnames(x), "\\.\\d", "")
  x_ <- filter(x, WEEK >= 26)
  mean(x_$DIF) 
}
  • Put the dataframes in a list and use sapply to apply the function.
y <- list(df, df2, df3)
sapply(y, meanChange)
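For context on the error in the question: c("df", "df2", "df3") is a character vector, so lapply hands the function the string "df" rather than the dataframe, and a string has no columns to rename. If you do want to start from the names, one option is mget, sketched below. The data is made up, and mean_change_base is a hypothetical base-R stand-in for meanChange (using sub instead of str_replace_all, and a from parameter instead of the hard-coded 26) so the example runs without extra packages.

```r
# Hypothetical data frames with known values, named like those in the question
df  <- data.frame(WEEK.1 = 1:4, DIF.4 = c(10, 20, 30, 40))
df2 <- data.frame(WEEK.2 = 1:4, DIF.5 = c(1, 2, 3, 4))

# Base-R stand-in for meanChange(); `from` is an added, illustrative parameter
mean_change_base <- function(x, from = 3) {
  colnames(x) <- sub("\\.\\d+$", "", colnames(x))  # "WEEK.1" becomes "WEEK"
  mean(x$DIF[x$WEEK >= from])
}

nms <- c("df", "df2")
is.character(nms[[1]])  # the element is a string, not a data frame
colnames("df")          # NULL: a string has no column names

# mget() looks the names up and returns a named list of the actual objects
dfs <- mget(nms)
changes <- sapply(dfs, mean_change_base)
print(changes)
```

A side benefit of mget is that the result is named after the dataframes, which makes the output easier to read than an unnamed vector.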
