简体   繁体   中英

Separating error message from error condition in package

Background

Packages can include a lot of functions. Some of them require informative error messages, and perhaps some comments in the function to explain what/why is happening. An example, f1 in a hypothetical f1.R file. All documentation and comments (both why the error and why the condition) in one place.

f1 <- function(x){
  if(!is.character(x)) stop("Only characters suported")
  # user input ...
  # .... NaN problem in g()
  # .... 
  # ratio of magnitude negative integer i base ^ i is positive
  if(x < .Machine$longdouble.min.exp / .Machine$longdouble.min.exp) stop("oof, an error")
  log(x)
}

f1(-1)
# >Error in f1(-1) : oof, an error

I create a separate conds.R , specifying a function (and w warning, s suggestion) etc, for example.

e <- function(x){
  switch(
    as.character(x),
    "1" = "Only character supported",
    # user input ...
    # .... NaN problem in g()
    # .... 
    "2" = "oof, and error") |>
    stop()
}

Then in, say, f.R script I can define f2 as

f2 <- function(x){
  if(!is.character(x)) e(1)
  # ratio of magnitude negative integer i base ^ i is positive
  if(x < .Machine$longdouble.min.exp / .Machine$longdouble.min.exp) e(2)
  log(x)
}

f2(-1)
#> Error in e(2) : oof, and error

Which does throw the error, and on top of it a nice traceback & rerun with debug option in the console. Further, as package maintainer I would prefer this as it avoids considering writing terse if statements + 1-line error message or aligning comments in a tryCatch statement.

Question

Is there a reason ( not opinion on syntax ) to avoid writing a conds.R in a package?

There is no reason to avoid writing conds.R . This is very common and good practice in package development, especially as many of the checks you want to do will be applicable across many functions (like asserting the input is character, as you've done above. Here's a nice example from dplyr .

library(dplyr)

df <- data.frame(x = 1:3, x = c("a", "b", "c"), y = 4:6)
names(df) <- c("x", "x", "y")
df
#>   x x y
#> 1 1 a 4
#> 2 2 b 5
#> 3 3 c 6

df2 <- data.frame(x = 2:4, z = 7:9)

full_join(df, df2, by = "x")
#> Error: Input columns in `x` must be unique.
#> x Problem with `x`.

nest_join(df, df2, by = "x")
#> Error: Input columns in `x` must be unique.
#> x Problem with `x`.

traceback()
#> 7: stop(fallback)
#> 6: signal_abort(cnd)
#> 5: abort(c(glue("Input columns in `{input}` must be unique."), x = glue("Problem with {err_vars(vars[dup])}.")))
#> 4: check_duplicate_vars(x_names, "x")
#> 3: join_cols(tbl_vars(x), tbl_vars(y), by = by, suffix = c("", ""), keep = keep)
#> 2: nest_join.data.frame(df, df2, by = "x")
#> 1: nest_join(df, df2, by = "x")

Here, both functions rely code written in join-cols.R . Both call join_cols() which in turn calls check_duplicate_vars() , which I've copied the source code from:

check_duplicate_vars <- function(vars, input, error_call = caller_env()) {
  dup <- duplicated(vars)
  if (any(dup)) {
    bullets <- c(
      glue("Input columns in `{input}` must be unique."),
      x = glue("Problem with {err_vars(vars[dup])}.")
    )
    abort(bullets, call = error_call)
  }
}

Although different in syntax from what you wrote, it's designed to provide the same behaviour, and shows it is possible to include in a package and no reason (from my understanding) not to do this. However, I would add a few syntax points based on your code above:

  • I would bundle the check ( if() statement) inside the package with the error raising to reduce repeating yourself in other areas you use the function.
  • It's often nicer to include the name of the variable or argument passed in so the error message is explicit, such as in the dplyr example above. This makes the error more clear to the user what is causing the problem, in this case, that the x column is not unique in df .
  • The traceback showing #> Error in e(2): oof, and error in your example is more obscure to the user, especially as e() is likely not exported in the NAMESPACE and they would need to parse the source code to understand where the error is generated. If you use stop(..., .call = FALSE ) or passing the calling environment through the nested functions, like in join-cols.R , then you can avoid not helpful information in the traceback() . This is for instance suggested in Hadley's Advanced R :

By default, the error message includes the call, but this is typically not useful (and recapitulates information that you can easily get from traceback() ), so I think it's good practice to use call. = FALSE call. = FALSE

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM