
Is it possible to set na.rm to TRUE globally?

For commands like max, the na.rm option defaults to FALSE. I understand why this is a good idea in general, but I'd like to switch that default off reversibly for a while, i.e. for the duration of a session.

How can I make R use na.rm = TRUE wherever it is an option? I found

options(na.action = na.omit)

but this doesn't work. I know that I can set na.rm = TRUE in each and every function I write:

my.max <- function(x) {max(x, na.rm=TRUE)}

But that's not what I am looking for. I'm wondering if there's something I could do more globally/universally instead of doing it for each function.

One (dangerous) workaround is to do the following:

  1. List all functions that have na.rm as an argument. Here I limited my search to the base package.
  2. Fetch each function and add this line at the beginning of its body: na.rm = TRUE
  3. Assign the function back to the base package.

So first I store the names of all functions that have na.rm as an argument (na.rm.f):

uses_arg <- function(x,arg) 
  is.function(fx <- get(x)) && 
  arg %in% names(formals(fx))
basevals <- ls(pos="package:base")      
na.rm.f <- basevals[sapply(basevals,uses_arg,'na.rm')]

EDIT: a better method to get all functions with an na.rm argument (thanks to mnel's comment):

Funs <- Filter(is.function,sapply(ls(baseenv()),get,baseenv()))
na.rm.f <- names(Filter(function(x) any(names(formals(args(x)))%in% 'na.rm'),Funs))

So the na.rm.f list looks like:

 [1] "all"                     "any"                     "colMeans"                "colSums"                
 [5] "is.unsorted"             "max"                     "mean.default"            "min"                    
 [9] "pmax"                    "pmax.int"                "pmin"                    "pmin.int"               
[13] "prod"                    "range"                   "range.default"           "rowMeans"               
[17] "rowsum.data.frame"       "rowsum.default"          "rowSums"                 "sum"                    
[21] "Summary.data.frame"      "Summary.Date"            "Summary.difftime"        "Summary.factor"         
[25] "Summary.numeric_version" "Summary.ordered"         "Summary.POSIXct"         "Summary.POSIXlt" 
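
A note on why the edited version goes through args(): many of the functions in this list (sum, max, min, ...) are primitives. formals() returns NULL for a primitive, whereas args() returns a closure with the same argument list, so the first version of the search would miss them. A minimal check, using only base R:

```r
# Primitives have no formals of their own ...
is.primitive(sum)             # TRUE
is.null(formals(sum))         # TRUE

# ... but args() returns a closure whose formals can be inspected
sum_args <- names(formals(args(sum)))
"na.rm" %in% sum_args         # TRUE
```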

Then for each function I change the body. The code is inspired by the data.table package (FAQ 2.23), which adds one line to the start of rbind.data.frame and cbind.data.frame.

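ll <- lapply(na.rm.f, function(x) {
  tt <- get(x)
  ss <- body(tt)
  # Wrap a single-expression body in braces so a line can be prepended
  if (class(ss) != "{") ss <- as.call(c(as.name("{"), ss))
  if (length(ss) < 2) {
    print(x)  # empty body (e.g. a primitive): nothing to patch, just report it
  } else {
    # Skip functions that have already been patched
    if (!length(grep("na.rm = TRUE", ss[[2]], fixed = TRUE))) {
      ss <- ss[c(1, NA, 2:length(ss))]               # make room at position 2
      ss[[2]] <- parse(text = "na.rm = TRUE")[[1]]  # insert na.rm = TRUE there
      body(tt) <- ss
      (unlockBinding)(x, baseenv())
      assign(x, tt, envir = asNamespace("base"), inherits = FALSE)
      lockBinding(x, baseenv())
    }
  }
})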

Now, if you check the first line of each function in our list:

unique(lapply(na.rm.f,function(x) body(get(x))[[2]]))
[[1]]
na.rm = TRUE
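
To see why this workaround is labelled dangerous, it may help to apply the same kind of patch to a toy function rather than to base. The sketch below is only an illustration: toy_sum is a hypothetical stand-in, and the body is rebuilt with as.list()/as.call() instead of the NA-index trick used above. The point it shows is that the prepended line overwrites whatever the caller passed, so an explicit na.rm = FALSE is silently ignored.

```r
# Hypothetical toy function standing in for a base function with an na.rm argument
toy_sum <- function(x, na.rm = FALSE) {
  sum(x, na.rm = na.rm)
}

# Prepend `na.rm <- TRUE` to the body, analogous to the patch above
old_body <- as.list(body(toy_sum))   # list(`{`, <original statements>)
new_body <- c(list(as.name("{"), quote(na.rm <- TRUE)), old_body[-1])
body(toy_sum) <- as.call(new_body)

patched_default  <- toy_sum(c(1, NA, 2))                 # 3
patched_explicit <- toy_sum(c(1, NA, 2), na.rm = FALSE)  # also 3: the caller's FALSE is overwritten
```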

It is not possible to change na.rm to TRUE globally. (See Hong Ooi's comment under the question.)

EDIT:

Unfortunately, the answer you don't want is the only one that works generally. There's no global option for this like there is for na.action, which only affects modeling functions like lm, glm, etc (and even there, it isn't guaranteed to work in all cases). – Hong Ooi Jul 2 '13 at 6:23
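
The difference between na.action and na.rm can be checked directly. This is a small sketch with made-up data (x, y and the other variable names are purely illustrative): the option is honoured by lm(), but mean() never consults it.

```r
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, NA, 10)

old_opts <- options(na.action = "na.omit")  # set the option, remembering the old value

fit    <- lm(y ~ x)
n_used <- nobs(fit)   # 3: lm() dropped the two incomplete rows
m      <- mean(x)     # NA: mean() does not look at na.action

options(old_opts)     # restore the previous setting
```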

For my R package, I overwrote the existing functions mean and sum. Thanks to the great Ben (comments below), I altered my functions to this:

mean <- function(x, ..., na.rm = TRUE) {
  base::mean(x, ..., na.rm = na.rm)
}

After this, mean(c(2, NA, 3)) returns 2.5 instead of NA.

And for sum:

sum <- function(x, ..., na.rm = TRUE) {
  base::sum(x, ..., na.rm = na.rm)
}

This will yield sum(c(2, NA, 3)) = 5 instead of NA.

sum(c(2, NA, 3, NaN)) also works.
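
Since the question asks for something reversible, it is worth noting that a wrapper defined in the session (rather than inside a package) only masks the base function and can be removed again. A minimal sketch, assuming the wrapper is defined at the top level; the variable names masked, explicit and restored are just for illustration:

```r
# Session-level wrapper, same shape as the one above
mean <- function(x, ..., na.rm = TRUE) {
  base::mean(x, ..., na.rm = na.rm)
}

masked   <- mean(c(2, NA, 3))                 # 2.5
explicit <- mean(c(2, NA, 3), na.rm = FALSE)  # NA: callers can still opt out

rm(mean)                                      # drop the wrapper; base::mean is visible again
restored <- mean(c(2, NA, 3))                 # NA
```

Because the default is passed through as an argument rather than hard-coded in the body, an explicit na.rm = FALSE still behaves as expected.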

There are already several answers about changing the na.rm argument globally. I just want to point out the partial() function from the purrr or pryr packages. Using this function you can create a copy of an existing function with predefined arguments:

library(purrr)
.mean <- partial(mean, na.rm = TRUE)

# Create sample vector
df <- c(1, 2, 3, 4, NA, 6, 7)

mean(df)
>[1] NA

.mean(df)
>[1] 3.833333

We can combine this tip with @agstudy's answer and create copies of all functions with the na.rm = TRUE argument:

library(purrr)

# Create a vector of function names https://stackoverflow.com/a/17423072/9300556
Funs <- Filter(is.function,sapply(ls(baseenv()),get,baseenv()))
na.rm.f <- names(Filter(function(x) any(names(formals(args(x)))%in% 'na.rm'),Funs))

# Create strings. Dot "." is optional
fs <- lapply(na.rm.f,
             function(x) paste0(".", x, "=partial(", x ,", na.rm = T)"))

eval(parse(text = fs)) 

So now there are .all, .min, .max, etc. in our .GlobalEnv. You can run them:

.min(df)
> [1] 1
.max(df)
> [1] 7
.all(df)
> [1] TRUE

To overwrite the functions, just remove the dot "." from the lapply call. Inspired by this blog post.
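
If overwriting is the goal, one variation that keeps it easy to undo is to put the wrappers in their own environment and attach() it, instead of building strings for eval(parse()). This is a base-R sketch of a different technique, not the partial() approach above: the list of names, na_rm_env and the label "na_rm_defaults" are all chosen for illustration.

```r
# Names chosen for illustration; extend as needed
fun_names <- c("sum", "prod", "min", "max", "range", "mean")

na_rm_env <- new.env()
for (nm in fun_names) {
  local({
    f <- get(nm, envir = baseenv())   # the original base function
    assign(nm,
           function(..., na.rm = TRUE) f(..., na.rm = na.rm),
           envir = na_rm_env)
  })
}

# Switch the new defaults on for the session ...
attach(na_rm_env, name = "na_rm_defaults", warn.conflicts = FALSE)
with_defaults <- max(c(1, NA, 3))      # 3

# ... and off again
detach("na_rm_defaults", character.only = TRUE)
without_defaults <- max(c(1, NA, 3))   # NA
```

Note that this only affects calls resolved through the search path, such as code typed at the console. Functions inside packages look up base functions through their namespace and are not affected.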
