
Replace values with NAs based on a column condition

I have a dataframe with several columns, one of which tells me whether the data in the other columns can be "trusted". It contains either "yes" or "no" (column name: "inside_calibration_range"). What I would like to do is replace the values in the whole row with NA every time there is a "no" in the "inside_calibration_range" column.

I looked at the dplyr::na_if and replace_with_na_all() functions, but (I may be wrong) they do not seem to accept conditions; instead they replace specific values across the whole dataframe.

Suppose the rows of mtcars where cyl equals 6 cannot be trusted. We can mutate across everything, setting the values to NA where that condition holds:

library(tidyverse)
data(mtcars)
# Set every column to NA in the rows where cyl == 6
as_tibble(mtcars %>% mutate(across(everything(), ~replace(., cyl == 6, NA))))

# A tibble: 32 × 11
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1  NA      NA   NA     NA NA    NA     NA      NA    NA    NA    NA
 2  NA      NA   NA     NA NA    NA     NA      NA    NA    NA    NA
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
 4  NA      NA   NA     NA NA    NA     NA      NA    NA    NA    NA
 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
 6  NA      NA   NA     NA NA    NA     NA      NA    NA    NA    NA
 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
 8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
10  NA      NA   NA     NA NA    NA     NA      NA    NA    NA    NA
# … with 22 more rows
# ℹ Use `print(n = ...)` to see more rows

To replace values in only some columns instead of all of them:

# Set only mpg and disp to NA in the rows where cyl == 6
as_tibble(mtcars %>% mutate(across(c(mpg, disp), ~replace(., cyl == 6, NA))))

# A tibble: 32 × 11
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1  NA       6   NA    110  3.9   2.62  16.5     0     1     4     4
 2  NA       6   NA    110  3.9   2.88  17.0     0     1     4     4
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
 4  NA       6   NA    110  3.08  3.22  19.4     1     0     3     1
 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
 6  NA       6   NA    105  2.76  3.46  20.2     1     0     3     1
 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
 8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
10  NA       6   NA    123  3.92  3.44  18.3     1     0     4     4
# … with 22 more rows
# ℹ Use `print(n = ...)` to see more rows
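
Applied to the kind of data described in the question, the same pattern might look like the sketch below. The data frame `df` and its `conc` and `signal` columns are made up for illustration; only the column name `inside_calibration_range` and its "yes"/"no" values come from the question.

```r
library(dplyr)

# Hypothetical example data; only inside_calibration_range is from the question
df <- tibble(
  conc = c(1.2, 3.4, 5.6),
  signal = c(10, 20, 30),
  inside_calibration_range = c("yes", "no", "yes")
)

# Set every measurement column to NA in rows flagged "no",
# leaving the flag column itself untouched (an assumption made here
# so that the reason for the NAs stays visible)
result <- df %>%
  mutate(across(-inside_calibration_range,
                ~ replace(.x, inside_calibration_range == "no", NA)))

print(result)
```

Excluding the flag column with `-inside_calibration_range` is a design choice in this sketch. Using `everything()` instead, as in the mtcars example, would blank the flag as well.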
