Using dplyr case_when to alter NA values based on value from another column

structure(list(a = c(NA, 3, 4, NA, 3, "Council" , "Council", 1), b = c("Council A", 3, 4, 
"Council B", 6, 7, 2, 6), c = c(6, 3, 6, 5, 3, 6, 5, 3), d = c(6, 2, 4, 
5, 3, 7, 2, 6), e = c(1, 2, 4, 5, 6, 7, 6, 3), f = c(2, 3, 4, 
2, 2, 7, 5, 2)), .Names = c("a", "b", "c", "d", "e", "f"), row.names = c(NA, 
8L), class = "data.frame")

I am trying to convert values in column a using dplyr's mutate and case_when, based on the text in column b. I want to set a to "Council" whenever b contains the string "Council".

The code I've used is: DF %>% select(a, b) %>% mutate(a = case_when(grepl("Council", b) ~ "Council"))

However, every value in a becomes NA wherever b does not contain the string "Council". I've reviewed other posts and tried various methods, including ifelse. I want to keep the same data frame and only convert the NA values in a to "Council", leaving the non-NA values untouched.

From ?case_when:

If no cases match, NA is returned.

So for the rows where b does not contain the word "Council", it returns NA.
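A minimal sketch of that behaviour, using a small made-up vector (the name b_demo and its values are illustrative only, not the question's data):

```r
library(dplyr)

# Hypothetical input: two entries mention "Council", two do not
b_demo <- c("Council A", "3", "Council B", "6")

# Only one case is supplied, so entries where grepl() is FALSE
# match no case and come back as NA
no_fallback <- case_when(grepl("Council", b_demo) ~ "Council")

print(no_fallback)
```

The non-matching positions are NA because nothing tells case_when what to return for them.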

You need to add a TRUE case to case_when that returns a, so that the existing values are kept unchanged when the condition is not met.

library(dplyr)

df %>%
  mutate(a = case_when(grepl("Council", b) ~ "Council",
                       TRUE ~ a))


#        a         b c d e f
#1 Council Council A 6 6 1 2
#2       3         3 3 2 2 3
#3       4         4 6 4 4 4
#4 Council Council B 5 5 5 2
#5       3         6 3 3 6 2
#6 Council         7 6 7 7 7
#7 Council         2 5 2 6 5
#8       1         6 3 6 3 2
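Note that the condition above replaces a wherever b mentions "Council", whether or not a was NA. In the question's data the two coincide. If only NA values should change, one possible variant is to add is.na(a) to the condition. This is a sketch on hypothetical data (df_demo is made up so that row 3 has "Council" in b but a non-missing a):

```r
library(dplyr)

# Hypothetical data: row 3 mentions "Council" in b but a is not NA
df_demo <- data.frame(a = c(NA, "3", "5", NA),
                      b = c("Council A", "3", "Council C", "7"),
                      stringsAsFactors = FALSE)

res <- df_demo %>%
  mutate(a = case_when(is.na(a) & grepl("Council", b) ~ "Council",  # fill NA only
                       TRUE ~ a))                                   # keep everything else

print(res)
```

Row 3 keeps its original value, and row 4 stays NA because its b does not contain "Council".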

In this case you could also achieve your result using base R:

df$a[grepl("Council", df$b)] <- "Council"
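The same NA-only restriction can be sketched in base R by combining is.na() with grepl() in the index. The data frame df_base below is hypothetical and only serves to show the difference:

```r
# Hypothetical data: row 3 mentions "Council" in b but a is not NA
df_base <- data.frame(a = c(NA, "3", "5", NA),
                      b = c("Council A", "3", "Council C", "7"),
                      stringsAsFactors = FALSE)

# Assign only where a is missing AND b contains "Council"
df_base$a[is.na(df_base$a) & grepl("Council", df_base$b)] <- "Council"

print(df_base)
```

Only row 1 is modified here; rows with an existing value in a are left alone.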

You can also use str_detect from the stringr package to achieve your objective.

library(dplyr)
library(stringr)

df <- structure(list(a = c(NA, 3, 4, NA, 3, "Council", "Council", 1),
                     b = c("Council A", 3, 4, "Council B", 6, 7, 2, 6),
                     c = c(6, 3, 6, 5, 3, 6, 5, 3),
                     d = c(6, 2, 4, 5, 3, 7, 2, 6),
                     e = c(1, 2, 4, 5, 6, 7, 6, 3),
                     f = c(2, 3, 4, 2, 2, 7, 5, 2)),
                .Names = c("a", "b", "c", "d", "e", "f"),
                row.names = c(NA, 8L), class = "data.frame")

# Replace a only where it is NA and b contains "council" (case-insensitive)
df %>%
  mutate(a = ifelse(str_detect(b, fixed("council", ignore_case = TRUE)) & is.na(a), "Council", a))
