Separating error messages from error conditions in a package

Question

Background

Packages can include a lot of functions. Some of them require informative error messages, and perhaps some comments in the function to explain what is happening and why. An example is f1 in a hypothetical f1.R file, where all documentation and comments (both why the error is raised and why the condition exists) are in one place.

f1 <- function(x) {
  if (!is.numeric(x)) stop("Only numeric input supported")
  # user input ...
  # .... NaN problem in g()
  # ....
  # ratio of magnitude negative integer i base ^ i is positive
  if (x < .Machine$longdouble.min.exp / .Machine$longdouble.min.exp) stop("oof, an error")
  log(x)
}

f1(-1)
#> Error in f1(-1) : oof, an error

I create a separate conds.R that defines a function e for errors (and similarly w for warnings, s for suggestions, etc.), for example:

e <- function(x) {
  switch(
    as.character(x),
    "1" = "Only numeric input supported",
    # user input ...
    # .... NaN problem in g()
    # ....
    "2" = "oof, and error"
  ) |>
    stop()
}

Then in, say, an f.R script I can define f2 as:

f2 <- function(x) {
  if (!is.numeric(x)) e(1)
  # ratio of magnitude negative integer i base ^ i is positive
  if (x < .Machine$longdouble.min.exp / .Machine$longdouble.min.exp) e(2)
  log(x)
}

f2(-1)
#> Error in e(2) : oof, and error

This does throw the error, and on top of that I get a nice traceback and the "Rerun with Debug" option in the console. Further, as a package maintainer I would prefer this, as it avoids having to write terse if statements with one-line error messages, or to align comments in a tryCatch statement.

Question

Is there a reason (not an opinion on syntax) to avoid writing a conds.R in a package?

Answer 1

There is no reason to avoid writing conds.R. This is very common and good practice in package development, especially as many of the checks you want to do will be applicable across many functions (such as asserting the input type, as you've done above). Here's a nice example from dplyr.

library(dplyr)

df <- data.frame(x = 1:3, x = c("a", "b", "c"), y = 4:6)
names(df) <- c("x", "x", "y")
df
#>   x x y
#> 1 1 a 4
#> 2 2 b 5
#> 3 3 c 6

df2 <- data.frame(x = 2:4, z = 7:9)

full_join(df, df2, by = "x")
#> Error: Input columns in `x` must be unique.
#> x Problem with `x`.

nest_join(df, df2, by = "x")
#> Error: Input columns in `x` must be unique.
#> x Problem with `x`.

traceback()
#> 7: stop(fallback)
#> 6: signal_abort(cnd)
#> 5: abort(c(glue("Input columns in `{input}` must be unique."), x = glue("Problem with {err_vars(vars[dup])}.")))
#> 4: check_duplicate_vars(x_names, "x")
#> 3: join_cols(tbl_vars(x), tbl_vars(y), by = by, suffix = c("", ""), keep = keep)
#> 2: nest_join.data.frame(df, df2, by = "x")
#> 1: nest_join(df, df2, by = "x")

Here, both functions rely on code written in join-cols.R. Both call join_cols(), which in turn calls check_duplicate_vars(), whose source code I've copied here:

check_duplicate_vars <- function(vars, input, error_call = caller_env()) {
  dup <- duplicated(vars)
  if (any(dup)) {
    bullets <- c(
      glue("Input columns in `{input}` must be unique."),
      x = glue("Problem with {err_vars(vars[dup])}.")
    )
    abort(bullets, call = error_call)
  }
}

Although check_duplicate_vars() differs in syntax from what you wrote, it's designed to provide the same behaviour, and shows that this approach can be included in a package; there is no reason (from my understanding) not to do this. However, I would add a few syntax points based on your code above:

I would bundle the check (the if() statement) together with the error raising inside the package, to avoid repeating yourself in the other places where you use the function.
It's often nicer to include the name of the variable or argument passed in so the error message is explicit, such as in the dplyr example above. This makes it clearer to the user what is causing the problem: in this case, that the x column is not unique in df.
The error message #> Error in e(2) : oof, and error in your example is more obscure to the user, especially as e() is likely not exported in the NAMESPACE, so they would need to read the source code to understand where the error is generated. If you use stop(..., call. = FALSE), or pass the calling environment through the nested functions as in join-cols.R, then you can keep this unhelpful call out of the error message. This is, for instance, suggested in Hadley's Advanced R:

By default, the error message includes the call, but this is typically not useful (and recapitulates information that you can easily get from traceback()), so I think it's good practice to use call. = FALSE.
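To make the three points above concrete, here is a minimal base R sketch. The helper names check_numeric() and check_positive() and the function f3() are hypothetical and appear neither in the question nor in dplyr. It assumes one plausible layout in which each helper in conds.R bundles the check with its error, names the offending argument, and either drops the call (call. = FALSE) or reports the caller's call instead of its own.

```r
# conds.R (hypothetical): each helper bundles the check with its error.

# Option 1: drop the call from the message entirely with call. = FALSE.
check_numeric <- function(x, arg = deparse(substitute(x))) {
  if (!is.numeric(x)) {
    stop("`", arg, "` must be numeric, not ", class(x)[1], ".", call. = FALSE)
  }
  invisible(x)
}

# Option 2: report the call of the function that invoked the helper,
# rather than the call to the (unexported) helper itself.
check_positive <- function(x, arg = deparse(substitute(x))) {
  caller <- sys.call(-1)  # call of the function that invoked check_positive()
  if (any(x <= 0, na.rm = TRUE)) {
    stop(simpleError(paste0("`", arg, "` must be positive."), call = caller))
  }
  invisible(x)
}

# f.R (hypothetical): the exported function only lists its preconditions.
f3 <- function(x) {
  check_numeric(x)
  check_positive(x)
  log(x)
}
```

With these definitions, f3("a") should fail with "Error: `x` must be numeric, not character." and f3(-1) with "Error in f3(-1) : `x` must be positive.", so the user sees the function they actually called rather than an internal helper. The rlang approach used by dplyr (abort() with call = error_call) serves the same purpose with richer formatting.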
