Error handling function not working within dplyr::mutate

Why is R complaining about an error when my function already handles errors?

I've created a function that gets the parent element of an href attribute, which is invariably "<a>". The function has some error handling so that it returns NA if it can't find the href attribute.

The function works fine in isolation, but not in combination with dplyr::mutate. I can't figure out why.

Minimal reproducible example:

library(dplyr)
library(rvest)

# Create html doc
html.test <- "<a href=\"hello\"</a><a id=\"ctl00_ctl00_btnSearch\" data-action=\"search\" class=\"go\" href=\"javascript:__doPostBack('ctl00%24ctl00%24btnSearch','')\"><span>GO</span><i class=\"fal fa-search\"></i></a>" %>%
  minimal_html()

# Create function
fun.get.node.name <- function(href.target){
  # treat warnings as errors
  options(warn=2)  
  
  xpath <- paste0("//a/@href[.= \'", href.target, "\']/..")
  
  res <- try({
    node_name <- html_nodes(x = html.test, xpath = xpath) %>% html_name()
  }, silent = TRUE)
  
  if (inherits(res, "try-error")) {
    # print warnings as they occur
    options(warn=1)  
    return(NA)
  } else {
    # print warnings as they occur
    options(warn=1)
    return(node_name)
  }
}

Now, if I apply the function to the attribute href = "hello", it works fine both in isolation and within dplyr::mutate:

href.target <- "hello"
fun.get.node.name(href.target)
[1] "a"

data.frame(href = href.target) %>% mutate(node_name = fun.get.node.name(href.target = href))
   href node_name
1 hello         a

But if I apply the same function to the attribute href = "javascript:__doPostBack('ctl00%24ctl00%24btnSearch','')" (which for some reason can't be found), the function works only in isolation and NOT within dplyr::mutate:

href.target <- "javascript:__doPostBack('ctl00%24ctl00%24btnSearch','')"
fun.get.node.name(href.target)
[1] NA

data.frame(href = href.target) %>% mutate(node_name = fun.get.node.name(href.target = href))
 Error: (converted from warning) Problem while computing `node_name = fun.get.node.name(href.target = href)`.
ℹ Invalid predicate [1206] 

Why is R complaining about an error when the function already handles errors?

Your function handles errors correctly, but the message that pops up says the error was converted from a warning. So your function should also call suppressWarnings(), and then it will work as expected.

Although this solves your problem, it is still not clear why the warning is thrown inside mutate() but not outside of it.

library(dplyr)
library(rvest)


# Create html doc
html.test <- "<a href=\"hello\"</a><a id=\"ctl00_ctl00_btnSearch\" data-action=\"search\" class=\"go\" href=\"javascript:__doPostBack('ctl00%24ctl00%24btnSearch','')\"><span>GO</span><i class=\"fal fa-search\"></i></a>" %>%
  minimal_html()

# Create function
fun.get.node.name <- function(href.target){
  # treat warnings as errors
  options(warn=2)  
  
  xpath <- paste0("//a/@href[.= \'", href.target, "\']/..")
  
  res <- try({
    node_name <- suppressWarnings(
      html_nodes(x = html.test, xpath = xpath) %>% html_name()
    )
  }, silent = TRUE)
  
  if (inherits(res, "try-error")) {
    # print warnings as they occur
    options(warn=1)  
    return(NA)
  } else {
    # print warnings as they occur
    options(warn=1)
    return(node_name)
  }
}

href.target <- "javascript:__doPostBack('ctl00%24ctl00%24btnSearch','')"
fun.get.node.name(href.target)
#> [1] NA

data.frame(href = href.target) %>%
  mutate(node_name = fun.get.node.name(href.target = href))
#>                                                      href node_name
#> 1 javascript:__doPostBack('ctl00%24ctl00%24btnSearch','')        NA

Created on 2022-11-19 with reprex v2.0.2
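As an alternative to toggling options(warn = ...), a warning can also be caught directly with tryCatch(). The following is only a minimal base R sketch of that pattern: risky_lookup() and safe_lookup() are hypothetical names, and risky_lookup() is a stand-in that merely imitates a lookup that warns on a bad input; it is not the rvest call from the question.

```r
# Hypothetical stand-in for the lookup: warns when the input contains a
# single quote, otherwise returns the node name "a".
risky_lookup <- function(x) {
  if (grepl("'", x, fixed = TRUE)) {
    warning("Invalid predicate")
  }
  "a"
}

# Hypothetical wrapper: turn both warnings and errors into NA without
# changing any global option.
safe_lookup <- function(x) {
  tryCatch(
    risky_lookup(x),
    warning = function(w) NA_character_,
    error = function(e) NA_character_
  )
}

safe_lookup("hello")
safe_lookup("it's")
```

Because the handler is local to the tryCatch() call, the global warn option is never modified and does not need to be restored on either exit path.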

Using the insight provided by @TimTeaFan, I have a solution that takes advantage of the fact that the code wrapped in suppressWarnings() returns an empty character vector when it cannot find the href. So I don't need to go down the error-handling try path at all:

# Create function
fun.get.node.name <- function(href.target){

  xpath <- paste0("//a/@href[.= \'", href.target, "\']/..")
  
  node_name <- suppressWarnings(
    html_nodes(x = html.test, xpath = xpath) %>% html_name()
  )
  
  if (length(node_name) == 0){
    return(NA)
  } else {
    return(node_name)
  }
}

# Run
data.frame(href = href.target) %>% mutate(node_name = fun.get.node.name(href.target = href))

#> href                                                           node_name
#> javascript:__doPostBack('ctl00%24ctl00%24btnSearch','')        NA
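The question mentions that the second href "for some reason can't be found". A likely cause is that this href contains single quotes while the XPath string literal is also delimited by single quotes, so the literal ends early and the predicate becomes invalid. XPath 1.0 has no escape character inside string literals, so the usual workaround is to pick the other quote character, or to use concat() when the value contains both kinds. The sketch below is base R only; xpath_literal() is a hypothetical helper name, not part of rvest or xml2.

```r
# Hypothetical helper: quote a string as an XPath 1.0 literal.
xpath_literal <- function(x) {
  has_single <- grepl("'", x, fixed = TRUE)
  has_double <- grepl('"', x, fixed = TRUE)
  if (!has_single) {
    return(paste0("'", x, "'"))
  }
  if (!has_double) {
    return(paste0('"', x, '"'))
  }
  # Both quote types present: splice each single quote in as its own
  # double-quoted argument of concat().
  paste0("concat('", gsub("'", "', \"'\", '", x, fixed = TRUE), "')")
}

href.target <- "javascript:__doPostBack('ctl00%24ctl00%24btnSearch','')"
xpath <- paste0("//a/@href[. = ", xpath_literal(href.target), "]/..")
cat(xpath, "\n")
```

The resulting string can be passed as the xpath argument of html_nodes() in place of the hand-built one. Whether this resolves the lookup for a given document has not been verified here, so keeping the length check that returns NA is still a reasonable safeguard.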
