Adding a column of factors based on other categorical values

I have a massive dataset, and I want to add a factor column whose value depends on another factor. Currently, my data look like this:

     Type      Value
 1   Wild      68.51
 2   Wild      91.94
 3   Captive   72.58
 4   Hybrid    85.38

But I want to add another column of factors ({Australia, Costa Rica, Brazil}) based on whether the animals are wild, captive, or hybrid. The data frame should then look like this:

     Type      Value    Status
 1   Wild      68.51    Costa Rica
 2   Wild      91.94    Costa Rica
 3   Captive   72.58    Australia
 4   Hybrid    85.38    Brazil 

Something like this, using dplyr::case_when?

library(dplyr)

# Map each Type to its Status; rows matching no condition get NA
df %>%
    mutate(Status = case_when(
        Type == "Wild" ~ "Costa Rica",
        Type == "Captive" ~ "Australia",
        Type == "Hybrid" ~ "Brazil"))
#     Type Value     Status
#1    Wild 68.51 Costa Rica
#2    Wild 91.94 Costa Rica
#3 Captive 72.58  Australia
#4  Hybrid 85.38     Brazil

Sample data

df <- read.table(text =
    "Type      Value
    1   Wild      68.51
    2   Wild      91.94
    3   Captive   72.58
    4   Hybrid    85.38", header = TRUE)
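
The question asks for a column of factors, while case_when with string results returns a character vector. A minimal sketch of one way to get a factor is to wrap the call in factor(). The data frame is rebuilt with data.frame() here only to keep the example self-contained, and the explicit `TRUE ~ NA_character_` fallback is an addition that is not in the answer above.

```r
library(dplyr)

# Same four rows as the sample data, rebuilt so the block runs on its own
df <- data.frame(
    Type  = c("Wild", "Wild", "Captive", "Hybrid"),
    Value = c(68.51, 91.94, 72.58, 85.38)
)

df <- df %>%
    mutate(Status = factor(case_when(
        Type == "Wild"    ~ "Costa Rica",
        Type == "Captive" ~ "Australia",
        Type == "Hybrid"  ~ "Brazil",
        TRUE              ~ NA_character_   # any other Type becomes NA (assumed fallback)
    )))

# Check the column type and its levels
is.factor(df$Status)
levels(df$Status)
```

By default factor() orders the levels alphabetically; pass `levels =` explicitly if a different order matters for plotting or modelling.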

A base R option is to create a named vector of key/value pairs and index it by the 'Type' column:

df$Status <- setNames( c('Costa Rica', 'Australia', 'Brazil'), 
            c('Wild', 'Captive', 'Hybrid'))[as.character(df$Type)]
df
#      Type Value     Status
#1    Wild 68.51 Costa Rica
#2    Wild 91.94 Costa Rica
#3 Captive 72.58  Australia
#4  Hybrid 85.38     Brazil
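
The named-vector lookup above produces a character column. If a factor is wanted, as the question's wording suggests, one possible base R variant is to relabel the levels with factor(levels = , labels = ). This is a sketch under that assumption, not part of the original answer, and the data frame is rebuilt so the block stands alone.

```r
# Same four rows as the sample data, rebuilt so the block runs on its own
df <- data.frame(
    Type  = c("Wild", "Wild", "Captive", "Hybrid"),
    Value = c(68.51, 91.94, 72.58, 85.38)
)

# Each entry of `levels` is replaced by the entry of `labels` at the same
# position; a Type value not listed in `levels` would become NA.
df$Status <- factor(df$Type,
                    levels = c("Wild", "Captive", "Hybrid"),
                    labels = c("Costa Rica", "Australia", "Brazil"))

str(df)
```

Because the mapping is applied once per level rather than once per row, this form also keeps the key/value pairs in one place when more categories are added.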
