How do I use R's mutate function to convert NA values?

Question

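I am trying to use the mutate function to create a variable based on three other variables. The conditions are built with case_when, as you can see in the code below.

However, some of my conditions return NA, and these seem to cause an error in the mutate function. Please take a look: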

# About the variables being used:

unique(x1)
#   [1]  1  0 NA

str(pemg$x1)
# num [1:1622989] 1 0 0 1 1 0 1 1 0 0 ...

unique(x2)
#   [1]  16  66  38  11   8   6  14  17  53  59  10  31  50  19  48  42  44  21  54  55  56  18  57  61  13  43   7   4  15
#  [30]  39   5  20   3  37  23  51  36  52  68  58  27  65  62   2  12  32  41  49  46  35  34  45  81  69  33  40   0  70
#  [59]   9  47  63  29  25  22  64  24  60  30  67  26  71  72  28   1  75  80  87  77  73  78  76  79  74  83  92 102  85
#  [88]  86  90  82  91  84  88  93  89  96  95 105 115 106  94 100  99  97 104  98 103 108 109 101 117 107 114 113  NA 112
# [117] 110 111

str(pemg$x2)
# num [1:1622989] 16 66 38 11 8 6 14 17 53 59 ...

unique(x3)
#   [1]  6  3  4  5  0  8  2  1 11  9 10  7 NA 15

str(pemg$anoest)
# num [1:1622989] 6 3 4 5 3 0 5 8 4 2 ...

    df <- mutate(df,
                   y = case_when(
                       x1 == 1 & x2 >=  7 & x3 ==  0 ~ 1,
                       x1 == 1 & x2 >=  8 & x3 ==  1 ~ 1,
                       x1 == 1 & x2 >= 10 & x3 ==  3 ~ 1,
                       x1 == 1 & x2 >= 11 & x3 ==  4 ~ 1,
                       x1 == 1 & x2 >= 12 & x3 ==  5 ~ 1,
                       x1 == 1 & x2 >= 13 & x3 ==  6 ~ 1,
                       x1 == 1 & x2 >= 14 & x3 ==  7 ~ 1,
                       x1 == 1 & x2 >= 15 & x3 ==  8 ~ 1,
                       x1 == 1 & x2 >= 16 & x3 ==  9 ~ 1,
                       x1 == 1 & x2 >= 17 & x3 == 10 ~ 1,
                       x1 == 1 & x2 >= 18 & x3 == 11 ~ 1,
                       x1 == 1 & !is.na(x3) ~ 0,
                       x1 == 1 & x3 %in% 12:16 ~ 0,
                       x2 %in% 0:7 ~ NA,
                       x2 > 18 ~ NA,
                       x1 == 0 ~ NA,
                       is.na(x3) ~ NA))

# Error: Problem with `mutate()` input `defasado`.
# x must be a double vector, not a logical vector.
# i Input `defasado` is `case_when(...)`.
# Run `rlang::last_error()` to see where the error occurred.

last_error()
# <error/dplyr_error>
# Problem with `mutate()` input `y`.
# x must be a double vector, not a logical vector.
# i Input `y` is `case_when(...)`.
# Backtrace:
#  1. dplyr::mutate(...)
#  2. dplyr:::mutate.data.frame(...)
#  3. dplyr:::mutate_cols(.data, ...)
#  Run `rlang::last_trace()` to see the full context.

last_trace()
# <error/dplyr_error>
# Problem with `mutate()` input `defasado`.
# x must be a double vector, not a logical vector.
# i Input `defasado` is `case_when(...)`.
# Backtrace:
#     x
#  1. +-dplyr::mutate(...)
#  2. \-dplyr:::mutate.data.frame(...)
#  3.   \-dplyr:::mutate_cols(.data, ...)
# <parent: error/rlang_error>
# must be a double vector, not a logical vector.
# Backtrace:
#     x
#  1. +-mask$eval_all_mutate(dots[[i]])
#  2. \-dplyr::case_when(...)
#  3.   \-dplyr:::replace_with(...)
#  4.     \-dplyr:::check_type(val, x, name)
#  5.       \-dplyr:::glubort(header, "must be {friendly_type_of(template)}, not {friendly_type_of(x)}.")

Can someone give me a hint on how to solve this?

Answer 1

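The problem here is the result of your case_when. dplyr's if_else is stricter than base R's ifelse: all result values have to be of the same type. Since case_when is a vectorization of multiple if_else calls, you have to tell R which type of NA the output should be: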

library(dplyr)
# does not work
dplyr::tibble(d = c(6,2,4, NA, 5)) %>% 
  dplyr::mutate(v = case_when(d < 4 ~ 0,
                              is.na(d) ~ NA))
# works
dplyr::tibble(d = c(6,2,4, NA, 5)) %>% 
  dplyr::mutate(v = case_when(d < 4 ~ 0,
                              is.na(d) ~ NA_real_))
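For contrast, here is a small sketch of the leniency of base R's ifelse that the answer refers to. It reuses the vector from the example above; the variable names `d` and `v` are only illustrative.

```r
# Same values as in the tibble above.
d <- c(6, 2, 4, NA, 5)

# Base R's ifelse() accepts a double (0) and a logical (NA) side by side
# and silently coerces the NA, so no typed NA is needed here.
v <- ifelse(d < 4, 0, NA)

v          # NA  0 NA NA NA
typeof(v)  # "double"
```

This silent coercion is convenient, but it also hides type mismatches, which is what the stricter dplyr functions are designed to surface.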

Answer 2

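R has different types of NA. You are using the logical one, but you need the double one, NA_real_, so that it is consistent with the output of your other conditions. For more information, see: https://stat.ethz.ch/R-manual/R-patched/library/base/html/NA.html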
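The different NA types can be inspected directly in base R. This is just a quick illustration of the constants described on the linked help page:

```r
# Each atomic vector type has its own missing-value constant.
typeof(NA)             # "logical"
typeof(NA_integer_)    # "integer"
typeof(NA_real_)       # "double"
typeof(NA_character_)  # "character"

# All of them are recognised as missing by is.na().
is.na(c(NA_real_, 1))  # TRUE FALSE
```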

Answer 3

You need to make sure your NA has the right class. In your case, wrap the NA that follows the ~ in as.numeric(). For example:

x2 %in% 0:7 ~ as.numeric(NA)
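Applied to a shortened version of the question's code, the fix might look like the sketch below. The toy data and the reduced set of conditions are assumptions made for illustration, not the asker's real data. as.numeric(NA) and NA_real_ are the same value, so either spelling works. As far as I know, newer dplyr releases (1.1.0 onward) also accept a plain NA here, but the typed NA works in both old and new versions.

```r
library(dplyr)

# Hypothetical toy data standing in for the asker's columns x1, x2, x3.
df <- tibble(
  x1 = c(1, 1, 1, 0, 1),
  x2 = c(16, 5, 30, 16, 12),
  x3 = c(9, 0, 12, 6, NA)
)

# as.numeric(NA) is just another way of writing NA_real_.
identical(as.numeric(NA), NA_real_)  # TRUE

# Every right-hand side is now a double, so case_when() is satisfied.
df <- df %>%
  mutate(y = case_when(
    x1 == 1 & x2 >= 16 & x3 == 9 ~ 1,
    x1 == 1 & x3 %in% 12:16      ~ 0,
    x2 %in% 0:7                  ~ as.numeric(NA),
    x1 == 0                      ~ NA_real_,
    is.na(x3)                    ~ NA_real_
  ))

df$y  # 1 NA 0 NA NA
```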

Answer 4

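In base R, we can construct a logical vector and assign NA to the column values selected by that vector. Unlike with case_when, we don't really have to specify the type of NA, because it gets converted automatically.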

df1$d[df1$d %in% 0:7] <- NA

Also, for simple operations, this can be done in base R in a compact way.
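A minimal runnable version of the base R approach, assuming a hypothetical data frame `df1` with a numeric column `d` (the names follow the snippet above, the values are made up):

```r
# Hypothetical data; only the column name d comes from the answer.
df1 <- data.frame(d = c(16, 5, 30, 0, 12))

# The logical index picks the rows whose value lies in 0:7;
# the plain NA is coerced to the column's type on assignment.
df1$d[df1$d %in% 0:7] <- NA

df1$d          # 16 NA 30 NA 12
typeof(df1$d)  # "double"
```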


Answer 1
2, accepted, 2021-03-07 15:41:02

Answer 2
1, 2021-03-07 15:38:17

Answer 3
1, 2021-03-07 15:40:57

Answer 4
1, 2021-03-07 17:29:20
