Apply a user defined function to a list of data frames

I have a series of data frames structured similarly to this:

df <- data.frame(x = c('notes','year',1995:2005), y = c(NA,'value',11:21))  
df2 <- data.frame(x = c('notes','year',1995:2005), y = c(NA,'value',50:60))

To clean them, I wrote a user defined function with a set of cleaning steps:

clean <- function(df){
  colnames(df) <- df[2,]
  df <- df[grep('^[0-9]{4}', df$year),]
  return(df)
}

I'd now like to put my data frames in a list:

df_list <- list(df,df2)

and clean them all at once. I tried

lapply(df_list, clean)

and

for(df in df_list){
  clean(df)
}

But with both methods I get the error:

Error in df[2, ] : incorrect number of dimensions

What's causing this error and how can I fix it? Is my approach to this problem wrong?

You are close, but there is one problem in your code. Because the columns of your data frames contain text, they are created as factors rather than character vectors (this is the default in R versions before 4.0.0, where stringsAsFactors is TRUE). As a result, assigning the second row as the column names does not give the expected names.
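To see the difference the answer describes, the column classes can be inspected directly. The following is a minimal sketch, not part of the original answer; the names `as_factor` and `as_char` are made up for illustration, and `stringsAsFactors` is set explicitly in both calls so the result does not depend on the R version in use.

```r
# Hypothetical example objects: one column built as a factor, one as character.
as_factor <- data.frame(x = c("notes", "year", 1995:2005), stringsAsFactors = TRUE)
as_char   <- data.frame(x = c("notes", "year", 1995:2005), stringsAsFactors = FALSE)

# Report the class of each column
sapply(as_factor, class)
sapply(as_char, class)

# A factor stores integer codes plus labels, so the two views differ:
as.integer(as_factor$x[2])    # the internal level code
as.character(as_factor$x[2])  # the label, "year"
```

Running `str()` on a data frame gives the same information in a more compact form.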

# Set stringsAsFactors = FALSE so that text columns stay character
# (this is already the default from R 4.0.0 onward)
df <- data.frame(x = c('notes','year',1995:2005), y = c(NA,'value',11:21), stringsAsFactors = FALSE)
df2 <- data.frame(x = c('notes','year',1995:2005), y = c(NA,'value',50:60), stringsAsFactors = FALSE)

clean <- function(df){
  # use the second row as column names, passed as a plain character vector
  colnames(df) <- as.character(unlist(df[2,]))
  # keep only the rows whose year column starts with four digits
  df <- df[grep('^[0-9]{4}', df$year),]

  # convert every column to numeric while keeping the data frame structure
  df[] <- lapply(df, as.numeric)

  return(df)
}

df_list <- list(df,df2)
# lapply() returns a new list, so assign the result to keep the cleaned data frames
df_list <- lapply(df_list, clean)
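One more point about the two attempts in the question: neither `lapply()` nor a `for` loop over the list elements changes `df_list` in place, so the cleaned data frames have to be stored somewhere. Below is a self-contained sketch of both patterns. The names `clean_years`, `cleaned`, `cleaned_loop`, and the list names `first` and `second` are made up for illustration and do not come from the original post.

```r
df  <- data.frame(x = c("notes", "year", 1995:2005),
                  y = c(NA, "value", 11:21),
                  stringsAsFactors = FALSE)
df2 <- data.frame(x = c("notes", "year", 1995:2005),
                  y = c(NA, "value", 50:60),
                  stringsAsFactors = FALSE)

# Hypothetical variant of the cleaning function from the answer
clean_years <- function(d) {
  names(d) <- as.character(unlist(d[2, ]))  # second row holds the headers
  d <- d[grepl("^[0-9]{4}$", d$year), ]      # keep rows with a four-digit year
  d[] <- lapply(d, as.numeric)              # convert every column to numeric
  rownames(d) <- NULL                       # renumber the remaining rows
  d
}

df_list <- list(first = df, second = df2)

# Pattern 1: lapply() returns a new list; assign it to keep the result
cleaned <- lapply(df_list, clean_years)

# Pattern 2: a loop over indices that writes each result back into a list
cleaned_loop <- df_list
for (i in seq_along(cleaned_loop)) {
  cleaned_loop[[i]] <- clean_years(cleaned_loop[[i]])
}
```

Naming the list elements is optional, but it makes it easier to pull out a single result afterwards, for example with `cleaned$first`.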
