How to index dataframe column inside a function in R

I have a function that takes a data frame, a percentile threshold, and the name of a column, and adds a new column flagging the values of that column relative to the threshold (0 for <, 1 for >=). However, I can't use df$column_name inside the quantile function, because column_name is not an actual column name but a variable storing the column name, so df$column_name returns NULL. Is there any way to work around this and keep the code format somewhat similar to what it is currently? Or do I have to pass the numerical column index instead of the name? While I can do that, it is definitely not as convenient or comprehensible as just passing in the column name.

func1 <- function(df, threshold, column_name) {
  threshold_value <- quantile(df$column_name, c(threshold)) 
  new_df <- df %>%
    mutate(ifelse(column_name > threshold_value, 1, 0)) 
  return(new_df)
}

Thank you so much for your help!

I modified your function as follows. It now takes a data frame, a threshold, and a column name given as a string, and it needs only base R.

# Modified function
func1 <- function(df, threshold, column_name) {
  threshold_value <- quantile(df[[column_name]], threshold) 
  new_df <- df
  new_df[["new_col"]] <- ifelse(df[[column_name]] > threshold_value, 1, 0) 
  return(new_df)
}

# Take the trees data frame as an example
head(trees)
#   Girth Height Volume
# 1   8.3     70   10.3
# 2   8.6     65   10.3
# 3   8.8     63   10.2
# 4  10.5     72   16.4
# 5  10.7     81   18.8
# 6  10.8     83   19.7

# Apply the function
func1(trees, 0.5, "Volume")
#    Girth Height Volume new_col
# 1    8.3     70   10.3       0
# 2    8.6     65   10.3       0
# 3    8.8     63   10.2       0
# 4   10.5     72   16.4       0
# 5   10.7     81   18.8       0
# 6   10.8     83   19.7       0
# 7   11.0     66   15.6       0
# 8   11.0     75   18.2       0
# 9   11.1     80   22.6       0
# 10  11.2     75   19.9       0
# 11  11.3     79   24.2       0
# 12  11.4     76   21.0       0
# 13  11.4     76   21.4       0
# 14  11.7     69   21.3       0
# 15  12.0     75   19.1       0
# 16  12.9     74   22.2       0
# 17  12.9     85   33.8       1
# 18  13.3     86   27.4       1
# 19  13.7     71   25.7       1
# 20  13.8     64   24.9       1
# 21  14.0     78   34.5       1
# 22  14.2     80   31.7       1
# 23  14.5     74   36.3       1
# 24  16.0     72   38.3       1
# 25  16.3     77   42.6       1
# 26  17.3     81   55.4       1
# 27  17.5     82   55.7       1
# 28  17.9     80   58.3       1
# 29  18.0     80   51.5       1
# 30  18.0     80   51.0       1
# 31  20.6     87   77.0       1
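
To see why df$column_name returned NULL in the original function while df[[column_name]] works, here is a minimal sketch using the built-in trees data. The variable name column_name is taken from the question; the rest is only an illustration.

```r
# `$` uses the text typed after it as the column name; it never looks up a variable.
column_name <- "Volume"

# trees has no column literally called "column_name", so this is NULL
is.null(trees$column_name)

# `[[` evaluates its argument first, so the string stored in the variable is used
identical(trees[[column_name]], trees$Volume)

# The threshold that func1 computes for the 50th percentile of Volume
threshold_value <- quantile(trees[[column_name]], 0.5)
threshold_value
```

The same reasoning applies to any data frame: use [[ ]] whenever the column name is held in a variable.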

If you still want to use dplyr, it is essential to learn how to deal with non-standard evaluation. See the "Programming with dplyr" vignette to learn more ( https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html ). The following code also works; here the column name is passed unquoted. Note that this version compares with >=, as the question describes, while the base R version above uses >, which is why row 11, whose Volume equals the median (24.2), is flagged 1 here.

library(dplyr)

func2 <- function(df, threshold, column_name) {
  # Capture the unquoted column name as a quosure
  col_en <- enquo(column_name)
  # Unquote it with !! wherever dplyr expects a column
  threshold_value <- quantile(df %>% pull(!!col_en), threshold)
  new_df <- df %>%
    mutate(new_col = ifelse(!!col_en >= threshold_value, 1, 0))
  return(new_df)
}

func2(trees, 0.5, Volume)
#    Girth Height Volume new_col
# 1    8.3     70   10.3       0
# 2    8.6     65   10.3       0
# 3    8.8     63   10.2       0
# 4   10.5     72   16.4       0
# 5   10.7     81   18.8       0
# 6   10.8     83   19.7       0
# 7   11.0     66   15.6       0
# 8   11.0     75   18.2       0
# 9   11.1     80   22.6       0
# 10  11.2     75   19.9       0
# 11  11.3     79   24.2       1
# 12  11.4     76   21.0       0
# 13  11.4     76   21.4       0
# 14  11.7     69   21.3       0
# 15  12.0     75   19.1       0
# 16  12.9     74   22.2       0
# 17  12.9     85   33.8       1
# 18  13.3     86   27.4       1
# 19  13.7     71   25.7       1
# 20  13.8     64   24.9       1
# 21  14.0     78   34.5       1
# 22  14.2     80   31.7       1
# 23  14.5     74   36.3       1
# 24  16.0     72   38.3       1
# 25  16.3     77   42.6       1
# 26  17.3     81   55.4       1
# 27  17.5     82   55.7       1
# 28  17.9     80   58.3       1
# 29  18.0     80   51.5       1
# 30  18.0     80   51.0       1
# 31  20.6     87   77.0       1
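
As a possible alternative to enquo() and !!, recent dplyr versions also support the embrace operator {{ }} for unquoted column names and the .data pronoun for column names passed as strings. The sketch below assumes a dplyr installation built on rlang 0.4.0 or later; the names func3 and func4 are hypothetical and only serve the illustration.

```r
library(dplyr)

# Unquoted column name: {{ }} forwards the argument to dplyr verbs
func3 <- function(df, threshold, column_name) {
  threshold_value <- quantile(pull(df, {{ column_name }}), threshold)
  mutate(df, new_col = ifelse({{ column_name }} >= threshold_value, 1, 0))
}

# Column name as a string: .data[[ ]] looks the column up by name inside mutate()
func4 <- function(df, threshold, column_name) {
  threshold_value <- quantile(df[[column_name]], threshold)
  mutate(df, new_col = ifelse(.data[[column_name]] >= threshold_value, 1, 0))
}

res3 <- func3(trees, 0.5, Volume)
res4 <- func4(trees, 0.5, "Volume")
```

Both calls should give the same new_col as func2 above; func4 keeps the string-based interface that the question asked for.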
