
Add a column based on multiple conditions with mutate() in tidy R (grep)

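I am struggling to classify the rows of a wide data frame in newly added columns, based on a threshold (> 0) applied across multiple columns. Previous examples on SO require the full column names along with if else() statements using > and ==. However, I need to be able to use grep() or contains() to isolate columns based on a common string.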

Input data frame:

library(tidyverse)
df <- data.frame(
    "ID" = c("asdf","vfdkun", "seifu", "seijd", "qweri"),
    "elephant_zoo" = c(1,1,1,2,0), #Should not be useful there
    "rhino_zoo" = c(1,2,3,1,0), #Should not be useful there
    "hippo_zoo" = c(1,1,0,0,0),
    "elephant_wild_A" = c(0,0,1,1,3),
    "rhino_wild_A" = c(0,0,4,3,1),
    "elephant_wild_B" = c(0,0,0,0,0),
    "rhino_wild_C" = c(0,0,1,5,7),
    "hippo_wild_B" = c(0,0,0,0,0)) %>% 
  column_to_rownames(var = "ID")
df 

In reality, this has many more columns and rows!

The desired output data frame classifies each row (ZOO, WILD) and compiles those classifications into a CLASSIFICATION column.

df_goal <- data.frame(
    "ID" = c("asdf","vfdkun", "seifu", "seijd", "qweri"),
    "elephant_zoo" = c(1,1,1,2,0), #Should not be useful there
    "rhino_zoo" = c(1,2,3,1,0), #Should not be useful there
    "hippo_zoo" = c(1,1,0,0,0),
    "elephant_wild_A" = c(0,0,1,1,3),
    "rhino_wild_A" = c(0,0,4,3,1),
    "elephant_wild_B" = c(0,0,0,0,0),
    "rhino_wild_C" = c(0,0,1,5,7),
    "hippo_wild_B" = c(0,0,0,0,0)) %>% 
  column_to_rownames(var = "ID") %>% 
    add_column(ZOO = c("zoo", "zoo", "zoo", "zoo", "")) %>% 
    add_column(WILD = c("", "", "wild", "wild", "wild")) %>% 
    add_column(CLASSIFICATION = c("zoo only", "zoo only", "both", "both", "wild only"))
df_goal 

I was hoping to use mutate() with case_when(), but I cannot select multiple columns correctly. Attempted examples:

# using an if else statement
df %>%
   mutate(ZOO = ifelse(select(contains("zoo")) > 0, "zoo", "F"))

# using mutate and case_when
df %>%
   mutate(ZOO = case_when(
       select(contains("zoo")) > 0 ~ "zoo",
       TRUE ~ ""))
 

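My actual data frame has many more categories, so it would help to be able to break this down into ZOO versus WILD first and then follow up with the compiled column.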

You can try using reduce from the purrr package. An intermediate function, any_cols, makes the code clearer, and it can be combined with across:

library(tidyverse)
any_cols <- function(df) reduce(df, `|`)
df %>%
    mutate(ZOO = ifelse(any_cols(across(contains("zoo"), ~`>`(.,0))), "zoo", "F"))
  elephant_zoo rhino_zoo hippo_zoo elephant_wild_A rhino_wild_A elephant_wild_B rhino_wild_C hippo_wild_B ZOO
1            1         1         1               0            0               0            0            0 zoo
2            1         2         1               0            0               0            0            0 zoo
3            1         3         0               1            4               0            1            0 zoo
4            2         1         0               1            3               0            5            0 zoo
5            0         0         0               3            1               0            7            0   F
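
To see what any_cols is doing, here is a minimal sketch on a toy data frame. The name flags and its values are made up for illustration; they stand in for the logical columns that across() produces from the > 0 test. Since a data frame is a list of columns, reduce() combines them pairwise with `|`, which leaves one logical value per row.

```r
library(purrr)

# Hypothetical toy data: two logical columns standing in for the `> 0` tests
flags <- data.frame(a = c(TRUE, FALSE, FALSE),
                    b = c(FALSE, FALSE, TRUE))

# reduce() walks the columns left to right and combines them with `|`,
# giving TRUE for every row where at least one column is TRUE
row_any <- reduce(flags, `|`)
row_any

# Cross-check with base R: a row qualifies if its count of TRUEs is above zero
same_as_rowsums <- all(row_any == unname(rowSums(flags) > 0))
same_as_rowsums
```

The base R comparison is only there as a sanity check; rowSums() over the selected columns would be a reasonable alternative if purrr is not available.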

df %>%
    mutate(ZOO = 
             case_when(any_cols(across(contains("zoo"), ~`>`(.,0))) ~ "zoo", 
                       TRUE ~ "F"))
  elephant_zoo rhino_zoo hippo_zoo elephant_wild_A rhino_wild_A elephant_wild_B rhino_wild_C hippo_wild_B ZOO
1            1         1         1               0            0               0            0            0 zoo
2            1         2         1               0            0               0            0            0 zoo
3            1         3         0               1            4               0            1            0 zoo
4            2         1         0               1            3               0            5            0 zoo
5            0         0         0               3            1               0            7            0   F
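
The same pattern can probably be extended to the WILD and CLASSIFICATION columns asked for in the question. The following is a sketch, not a tested solution for the full data: df_small is a cut-down stand-in for the question's data frame, the "neither" fallback is an assumption (the question does not say what a row with no positive values should get), and ignore.case = FALSE is added because contains() is case-insensitive by default and could otherwise pick up the new ZOO or WILD columns if the code is re-run on its own output.

```r
library(dplyr)
library(purrr)

# Same helper as in the answer: TRUE for a row if any supplied column is TRUE
any_cols <- function(df) reduce(df, `|`)

# Cut-down stand-in for the question's data (three columns only, for brevity)
df_small <- data.frame(
  elephant_zoo = c(1, 1, 1, 2, 0),
  hippo_zoo    = c(1, 1, 0, 0, 0),
  rhino_wild_A = c(0, 0, 4, 3, 1)
)

result <- df_small %>%
  mutate(
    # Flag rows with at least one positive value among the "zoo" columns
    ZOO  = ifelse(any_cols(across(contains("zoo", ignore.case = FALSE),
                                  ~ . > 0)), "zoo", ""),
    # Same test for the "wild" columns
    WILD = ifelse(any_cols(across(contains("wild", ignore.case = FALSE),
                                  ~ . > 0)), "wild", ""),
    # Compile the two flags; "neither" is an assumed fallback label
    CLASSIFICATION = case_when(
      ZOO == "zoo" & WILD == "wild" ~ "both",
      ZOO == "zoo"                  ~ "zoo only",
      WILD == "wild"                ~ "wild only",
      TRUE                          ~ "neither"
    )
  )
result
```

With more categories, each one would need its own flag column built the same way, and the case_when() branches would have to be extended to cover the combinations of interest.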

