
Assign a level of a factor based on two simultaneous conditions (threshold and count)

I need to assign a level of a factor (in a new column) based on how many times a particular threshold is met for a particular observation across a set of attributes.

Here is an example of a species database (n = 26) with several attributes (n = 6). I want to add a new column/variable based on how many times a particular threshold is met for a particular observation across the set of attributes. A solution based on tidyverse logic would be great.

Database for 26 species and 6 attributes:

at1 <- rnorm(26, 2, 1)
at2 <- rnorm(26, 1.6, 1.2)
at3 <- rnorm(26, 2, 1)
at4 <- rnorm(26, 1.6, 1.2)
at5 <- rnorm(26, 2, 1)
at6 <- rnorm(26, 1.6, 1.2)
sp <- paste("sp_", letters, sep = "")
data <- data.frame(sp, at1, at2, at3, at4, at5, at6)

Condition 1: assign the "high" level if at least three attributes exceed a threshold of 3.

Condition 2: assign the "moderate" level if at least three attributes exceed a threshold of 2.5.

Assign "low" if neither of the above conditions is met.

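The dplyr function case_when is a good fit when a column value depends on multiple conditions. Conditions are checked in order and the first one that is true determines the result, so the stricter "high" test goes before the "moderate" one.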
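As a small illustration of that ordering, here is a minimal sketch on hand-made counts. The vectors n_above_3 and n_above_2_5 are hypothetical names invented for this example; they stand for the number of attributes above each threshold for four made-up observations.

```r
library(dplyr)

# Hypothetical counts of attributes above each threshold (four observations)
n_above_3   <- c(3, 1, 0, 4)
n_above_2_5 <- c(4, 3, 2, 5)

# The first condition that is TRUE wins, so "high" is tested before "moderate"
level <- case_when(
  n_above_3   >= 3 ~ "high",
  n_above_2_5 >= 3 ~ "moderate",
  TRUE             ~ "low"
)

level
# [1] "high"     "moderate" "low"      "high"
```

The first and last observations also satisfy the "moderate" condition, but they are labelled "high" because that condition comes first.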

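# set.seed(4)

library(dplyr)
library(tidyr)

data %>%
    gather(key, value, -sp) %>%   # wide to long: one row per species and attribute
    group_by(sp) %>%
    mutate(threshold = case_when(
        sum(value > 3.0) > 2 ~ 'high',      # at least three attributes above 3
        sum(value > 2.5) > 2 ~ 'moderate',  # at least three attributes above 2.5
        TRUE                 ~ 'low')
    ) %>%
    spread(key, value)            # long back to wide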

# A tibble: 26 x 8
# Groups:   sp [26]
   sp    threshold   at1    at2   at3     at4   at5    at6
   <fct> <chr>     <dbl>  <dbl> <dbl>   <dbl> <dbl>  <dbl>
 1 sp_a  low       2.22   3.11  1.92   0.644  2.04   2.89 
 2 sp_b  low       1.46   2.69  2.44   1.54   0.274  2.40 
 3 sp_c  high      2.89   0.486 3.97   3.14   3.56   0.442
 4 sp_d  moderate  2.60   3.09  1.40   1.34   2.78  -0.770
 5 sp_e  low       3.64   1.78  1.45   0.910  0.901  0.898
 6 sp_f  moderate  2.69   2.86  2.70  -0.165  0.272  2.76 
 7 sp_g  low       0.719  0.695 1.84   0.361  2.43   2.26 
 8 sp_h  low       1.79  -0.179 3.35   0.0322 2.74   1.50 
 9 sp_i  moderate  3.90   2.63  0.931  0.594  2.87  -0.412
10 sp_j  high      3.78   1.11  3.06   0.243  2.31   3.06 
# ... with 16 more rows
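The result above stores threshold as a character column, while the question asks for a factor. A possible variation, sketched here under some assumptions, counts the exceedances per row with rowSums instead of reshaping, and then converts the result to a factor. The seed, the helper names attrs, n_above_3 and n_above_2_5, and the level order low < moderate < high are choices made for this sketch, not part of the original answer.

```r
library(dplyr)

set.seed(4)  # assumed seed, only to make this sketch reproducible
data <- data.frame(
  sp  = paste0("sp_", letters),
  at1 = rnorm(26, 2, 1),
  at2 = rnorm(26, 1.6, 1.2),
  at3 = rnorm(26, 2, 1),
  at4 = rnorm(26, 1.6, 1.2),
  at5 = rnorm(26, 2, 1),
  at6 = rnorm(26, 1.6, 1.2)
)

# Count, for each species (row), how many attributes exceed each threshold
attrs       <- data[, paste0("at", 1:6)]
n_above_3   <- rowSums(attrs > 3)
n_above_2_5 <- rowSums(attrs > 2.5)

result <- data %>%
  mutate(
    threshold = case_when(
      n_above_3   >= 3 ~ "high",
      n_above_2_5 >= 3 ~ "moderate",
      TRUE             ~ "low"
    ),
    # Convert to a factor with an explicit (assumed) level order
    threshold = factor(threshold, levels = c("low", "moderate", "high"))
  )

table(result$threshold)
```

This keeps the data in wide format throughout. If the reshaping approach is preferred, note that gather and spread have been superseded in tidyr by pivot_longer and pivot_wider, although they still work.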


 