How to use ifelse and str_detect across multiple columns

Question

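I have a data frame of ICD-10 codes for decedents (people who have died). Each row corresponds to one decedent, and each decedent can have up to 20 conditions listed as contributing factors in his or her death. I want to create a new column indicating whether the decedent has any ICD-10 diabetes code (1 for yes, 0 for no). Diabetes codes fall in the range E10-E14, i.e. a diabetes code must start with one of the strings in the following vector, while the fourth position can take different values: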

diabetes <- c("E10","E11","E12","E13","E14")
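The "starts with" requirement can be written as an anchored regular expression. A minimal base R sketch; the names `diabetes_regex` and `codes` are illustrative and not part of the original post:

```r
diabetes <- c("E10", "E11", "E12", "E13", "E14")

# "^" anchors the match at the start of the code; whatever sits in the
# fourth position is left unconstrained.
diabetes_regex <- paste0("^(", paste(diabetes, collapse = "|"), ")")
diabetes_regex
# "^(E10|E11|E12|E13|E14)"

# Illustrative codes: two diabetes codes, two others
codes <- c("E112", "E149", "E669", "I10")
is_diabetes <- grepl(diabetes_regex, codes)
is_diabetes
# TRUE TRUE FALSE FALSE
```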

Here is a small, fictional example of the data:

original <- structure(list(acond1 = c("E112", "I250", "A419", "E149"), acond2 = c("I255", 
"B341", "F179", "F101"), acond3 = c("I258", "B348", "I10", "I10"
), acond4 = c("I500", "E669", "I694", "R092")), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame"))

acond1	acond2	acond3	acond4
E112	I255	I258	I500
I250	B341	B348	E669
A419	F179	I10	I694
E149	F101	I10	R092

This is the result I want:

acond1	acond2	acond3	acond4	diabetes
E112	I255	I258	I500	1
I250	B341	B348	E669	0
A419	F179	I10	I694	0
E149	F101	I10	R092	1

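There are a few other posts about this kind of problem (e.g., "Using if else on a dataframe across multiple columns" and "Str_detect multiple columns using across"), but I can't seem to put them together. Here are my unsuccessful attempts so far: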

library(tidyverse)
library(stringr)

#attempt 1
original %>%
  mutate_at(vars(contains("acond")), ifelse(str_detect(.,paste0("^(", 
  paste(diabetes, collapse = "|"), ")")), 1, 0))

#attempt 2
original %>%
  unite(col = "all_conditions", starts_with("acond"), sep = ", ", remove = FALSE) %>%
  mutate(diabetes = if_else(str_detect(.,paste0("^(", paste(diabetes, collapse = "|"), ")")), 1, 0))

Any help would be greatly appreciated.

Answer 1

Here is a base R approach using apply:

dia <- paste(c("E10","E11","E12","E13","E14"), collapse="|")

df$diabetes <- apply(df, 1, function(x) any(grepl(dia,x)))*1

df
  acond1 acond2 acond3 acond4 diabetes
1   E112   I255   I258   I500        1
2   I250   B341   B348   E669        0
3   A419   F179    I10   I694        0
4   E149   F101    I10   R092        1
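As a possible variation on the base R approach, the sketch below anchors the pattern at the start of each code and checks only the condition columns, column by column instead of row by row. The names `dia_anchored`, `cond_cols` and `hit` are illustrative assumptions, not part of the answer:

```r
df <- data.frame(
  acond1 = c("E112", "I250", "A419", "E149"),
  acond2 = c("I255", "B341", "F179", "F101"),
  acond3 = c("I258", "B348", "I10", "I10"),
  acond4 = c("I500", "E669", "I694", "R092")
)

# Anchored pattern: the code must start with one of E10..E14
dia_anchored <- paste0("^(", paste(c("E10", "E11", "E12", "E13", "E14"), collapse = "|"), ")")

# Only look at the condition columns (assumed to be named acond1, acond2, ...)
cond_cols <- grep("^acond", names(df), value = TRUE)

# One logical vector per column, combined element-wise with OR
hit <- Reduce(`|`, lapply(df[cond_cols], grepl, pattern = dia_anchored))

df$diabetes <- as.integer(hit)
df
```

Because the combination is done per column, the work grows with the number of condition columns (up to 20 here) and not with the number of rows.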

With dplyr:

library(dplyr)

df %>% 
  rowwise() %>% 
  mutate(diabetes=any(grepl(dia,c_across(starts_with("ac"))))*1) %>% 
  ungroup
# A tibble: 4 × 5
  acond1 acond2 acond3 acond4 diabetes
  <chr>  <chr>  <chr>  <chr>     <dbl>
1 E112   I255   I258   I500          1
2 I250   B341   B348   E669          0
3 A419   F179   I10    I694          0
4 E149   F101   I10    R092          1
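rowwise() evaluates the expression once per row, which can be slow on large tables. A vectorised sketch of the same idea with if_any(), assuming dplyr 1.0.4 or later is available; the name `flagged` is illustrative:

```r
library(dplyr)
library(stringr)

df <- data.frame(
  acond1 = c("E112", "I250", "A419", "E149"),
  acond2 = c("I255", "B341", "F179", "F101"),
  acond3 = c("I258", "B348", "I10", "I10"),
  acond4 = c("I500", "E669", "I694", "R092")
)

dia <- paste(c("E10", "E11", "E12", "E13", "E14"), collapse = "|")

# if_any() returns TRUE for a row when the predicate holds
# in at least one of the selected columns.
flagged <- df %>%
  mutate(diabetes = as.integer(if_any(starts_with("ac"), ~ str_detect(.x, dia))))

flagged
```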

Data

df <- structure(list(acond1 = c("E112", "I250", "A419", "E149"), acond2 = c("I255", 
"B341", "F179", "F101"), acond3 = c("I258", "B348", "I10", "I10"
), acond4 = c("I500", "E669", "I694", "R092")), class = "data.frame", row.names = c(NA, 
-4L))

Answer 2

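If we want to use across with ifelse and str_detect, we can:

Create a pattern for str_detect with paste and collapse
mutate across all columns, using an anonymous ~ifelse with the condition and .names to name the new columns
unite the new columns
Set diabetes to 1 when the united string contains a 1 (readr's parse_number would read only the first number in the string, so it would miss a match in any column after the first)

diabetes <- c("E10","E11","E12","E13","E14")

pattern <- paste(diabetes, collapse = "|")

library(tidyverse)

original %>% 
  mutate(across(everything(), ~ifelse(str_detect(., pattern), 1, 0), .names = "new_{.col}")) %>% 
  unite(New_Col, starts_with('new'), na.rm = TRUE, sep = ' ') %>% 
  mutate(diabetes = as.numeric(str_detect(New_Col, "1")), .keep = "unused")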

  acond1 acond2 acond3 acond4 diabetes
  <chr>  <chr>  <chr>  <chr>     <dbl>
1 E112   I255   I258   I500          1
2 I250   B341   B348   E669          0
3 A419   F179   I10    I694          0
4 E149   F101   I10    R092          1
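To see what the intermediate steps produce, the sketch below stops the pipeline right after unite. It rebuilds the sample data with tibble() so that it runs on its own; the name `intermediate` is illustrative:

```r
library(dplyr)
library(tidyr)
library(stringr)

original <- tibble(
  acond1 = c("E112", "I250", "A419", "E149"),
  acond2 = c("I255", "B341", "F179", "F101"),
  acond3 = c("I258", "B348", "I10", "I10"),
  acond4 = c("I500", "E669", "I694", "R092")
)

pattern <- paste(c("E10", "E11", "E12", "E13", "E14"), collapse = "|")

intermediate <- original %>%
  # one 0/1 indicator column per condition column: new_acond1, new_acond2, ...
  mutate(across(everything(), ~ ifelse(str_detect(.x, pattern), 1, 0), .names = "new_{.col}")) %>%
  # paste the indicators of each row into a single string such as "1 0 0 0"
  unite(New_Col, starts_with("new"), sep = " ")

intermediate$New_Col
# "1 0 0 0" "0 0 0 0" "0 0 0 0" "1 0 0 0"
```

Each position in New_Col corresponds to one condition column, so a row counts as a diabetes case whenever any position holds a 1, not only the first.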

Answer 3

library(tidyverse)

diabetes_pattern <- c("E10","E11","E12","E13","E14") %>% 
  str_c(collapse = "|")

original <-
  structure(
    list(
      acond1 = c("E112", "I250", "A419", "E149"),
      acond2 = c("I255", "B341", "F179", "F101"),
      acond3 = c("I258", "B348", "I10", "I10"),
      acond4 = c("I500", "E669", "I694", "R092")
    ),
    row.names = c(NA,-4L),
    class = c("tbl_df", "tbl", "data.frame")
  )

original %>% 
  rowwise() %>% 
  mutate(diabetes = +any(str_detect(string = c_across(everything()), pattern = diabetes_pattern)))
#> # A tibble: 4 x 5
#> # Rowwise: 
#>   acond1 acond2 acond3 acond4 diabetes
#>   <chr>  <chr>  <chr>  <chr>     <int>
#> 1 E112   I255   I258   I500          1
#> 2 I250   B341   B348   E669          0
#> 3 A419   F179   I10    I694          0
#> 4 E149   F101   I10    R092          1

original %>% 
  # rowSums counts the matching columns; "> 0" turns the count into a 0/1 flag
  mutate(diabetes = as.numeric(rowSums(across(.cols = everything(), ~str_detect(.x, diabetes_pattern))) > 0))
#> # A tibble: 4 x 5
#>   acond1 acond2 acond3 acond4 diabetes
#>   <chr>  <chr>  <chr>  <chr>     <dbl>
#> 1 E112   I255   I258   I500          1
#> 2 I250   B341   B348   E669          0
#> 3 A419   F179   I10    I694          0
#> 4 E149   F101   I10    R092          1

Created on 2022-01-23 by the reprex package (v2.0.1)
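The unite idea from the question's second attempt can probably be made to work as well. That attempt passes `.` (the whole data frame) to str_detect and anchors the pattern with `^`, which only matches at the start of the united string. A sketch under those assumptions, detecting on the united column and allowing a match either at the start or right after the ", " separator; `united_pattern` and `result` are illustrative names:

```r
library(dplyr)
library(tidyr)
library(stringr)

diabetes <- c("E10", "E11", "E12", "E13", "E14")

original <- tibble(
  acond1 = c("E112", "I250", "A419", "E149"),
  acond2 = c("I255", "B341", "F179", "F101"),
  acond3 = c("I258", "B348", "I10", "I10"),
  acond4 = c("I500", "E669", "I694", "R092")
)

# A code starts either at the beginning of the string or after ", "
united_pattern <- paste0("(^|, )(", paste(diabetes, collapse = "|"), ")")

result <- original %>%
  unite(col = "all_conditions", starts_with("acond"), sep = ", ", remove = FALSE) %>%
  # detect on the united column, not on `.`
  mutate(diabetes = if_else(str_detect(all_conditions, united_pattern), 1, 0)) %>%
  select(-all_conditions)

result
```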

3 solutions

Solution 1
0 2022-01-23 15:50:44

Solution 2
0 2022-01-23 15:54:07

Solution 3
0 2022-01-23 15:55:38
