How do I create a new column based on specific conditions of other columns in R?

I have the following data frame describing predator-prey interactions, where SS = surf smelt, SL = sandlance and H = herring.

Essentially, all I need is another column that states whether more than one species was available to the predator during the interaction. For example, at index 12 the Prey was SL, but there were 1000 sandlance and 100 herring available. I need a column that simply shows 1 if more than one species was available to the predator and 0 otherwise.

If possible, I would also like to show which other species were available, in long format.

Data frame I have:

index   date     site  Pred  Prey  attack  passive surf.smelt  sandlance  herring
 17   2015-06-06  cb    JCK   SS     0       1         20         0          0
 26   2015-07-05  cb    JCK   SS     0       1         100        0          0
 12   2016-07-26  cb    JCK   SL     1       0         0         1000       100
 88   2016-07-26  cb    JCK   H      1       0         0         1000       1000
 89   2016-07-26  cb    JCK   H      1       0         0          0         100
 90   2018-08-21  cb    JCO   SL     1       0        100        500         0
100   2018-08-26  cb    JCO   SL     1       0         0         1000       100
108   2019-06-22  cb    JCO   SS     0       1        50          0         100

Data frame I want:

index   date     site  Pred  Prey  attack  passive surf.smelt  sandlance  herring OtherPrey?
 17   2015-06-06  cb    JCK   SS     0       1         20         0          0       0
 26   2015-07-05  cb    JCK   SS     0       1         100        0          0       0
 12   2016-07-26  cb    JCK   SL     1       0         0         1000       100      1  
 88   2016-07-26  cb    JCK   H      1       0         0         1000       1000     1
 89   2016-07-26  cb    JCK   H      1       0         0          0         100      0
 90   2018-08-21  cb    JCO   SL     1       0        100        500         0       1
100   2018-08-26  cb    JCO   SL     1       0         0         1000       100      1
108   2019-06-22  cb    JCO   SS     0       1        50          0         100      1

And, if possible, I would like to name the other species that was available:

Data frame I want:

index   date     site  Pred  Prey  attack  passive surf.smelt  sandlance  herring OtherAvailable
 17   2015-06-06  cb    JCK   SS     0       1         20         0          0       0
 26   2015-07-05  cb    JCK   SS     0       1         100        0          0       0
 12   2016-07-26  cb    JCK   SL     1       0         0         1000       100      H  
 88   2016-07-26  cb    JCK   H      1       0         0         1000       1000     SL
 89   2016-07-26  cb    JCK   H      1       0         0          0         100      0
 90   2018-08-21  cb    JCO   SL     1       0        100        500         0       SS
100   2018-08-26  cb    JCO   SL     1       0         0         1000       100      H
108   2019-06-22  cb    JCO   SS     0       1        50          0         100      H

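We can use rowSums with dplyr::across to count how many species columns are greater than zero in each row, then flag the rows where that count exceeds one:

library(dplyr)

df %>% mutate(OtherPrey = +(rowSums(across(surf.smelt:herring) > 0) > 1))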

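For "OtherAvailable", we can use toString to collapse the names of the non-zero species columns in each row. Note that this lists every species that was present, including the one recorded in Prey:

library(dplyr)

df %>%
  rowwise() %>%
  mutate(OtherAvailable = toString(names(across(surf.smelt:herring))[c_across(surf.smelt:herring) > 0])) %>%
  ungroup()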
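If the captured prey should be left out of "OtherAvailable", one option is to map each count column to its prey code and drop the code that matches Prey. The following is only a sketch: the `codes` lookup is an assumption built from the legend in the question (SS = surf.smelt, SL = sandlance, H = herring), and the data frame is a trimmed copy of the question's example.

```r
library(dplyr)

# Example data copied from the question (only the columns used here)
df <- data.frame(
  index      = c(17, 26, 12, 88, 89, 90, 100, 108),
  Prey       = c("SS", "SS", "SL", "H", "H", "SL", "SL", "SS"),
  surf.smelt = c(20, 100, 0, 0, 0, 100, 0, 50),
  sandlance  = c(0, 0, 1000, 1000, 0, 500, 1000, 0),
  herring    = c(0, 0, 100, 1000, 100, 0, 100, 100)
)

# Assumed lookup from count column to prey code (from the question's legend)
codes <- c(surf.smelt = "SS", sandlance = "SL", herring = "H")

result <- df %>%
  rowwise() %>%
  mutate(OtherAvailable = {
    # Counts for the current row, in the same order as `codes`
    counts <- c_across(all_of(names(codes)))
    # Species that were present and are not the prey that was taken
    others <- codes[counts > 0 & codes != Prey]
    # Use "0" when nothing else was available, as in the desired output
    if (length(others) == 0) "0" else toString(others)
  }) %>%
  ungroup()

result
```

With this in place, the 1/0 indicator could also be derived as `+(OtherAvailable != "0")`.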

With dplyr, working row by row:

df %>% 
   rowwise() %>% 
   mutate(Other_Prey=ifelse(sum(c_across(surf.smelt:herring)>0)>=2,1,0))
# A tibble: 8 x 11
# Rowwise: 
  index date       site  Pred  Prey  attack passive surf.smelt sandlance herring Other_Prey
  <int> <chr>      <chr> <chr> <chr>  <int>   <int>      <int>     <int>   <int>      <dbl>
1    17 2015-06-06 cb    JCK   SS         0       1         20         0       0          0
2    26 2015-07-05 cb    JCK   SS         0       1        100         0       0          0
3    12 2016-07-26 cb    JCK   SL         1       0          0      1000     100          1
4    88 2016-07-26 cb    JCK   H          1       0          0      1000    1000          1
5    89 2016-07-26 cb    JCK   H          1       0          0         0     100          0
6    90 2018-08-21 cb    JCO   SL         1       0        100       500       0          1
7   100 2018-08-26 cb    JCO   SL         1       0          0      1000     100          1
8   108 2019-06-22 cb    JCO   SS         0       1         50         0     100          1
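The question also asks for the other available species in long format. A rough sketch of one way to get there uses tidyr::pivot_longer, keeping one row per other species that was present. The `codes` lookup and the output column names are assumptions for illustration, and the data frame is a trimmed copy of the question's example.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question (only the columns used here)
df <- data.frame(
  index      = c(17, 26, 12, 88, 89, 90, 100, 108),
  Prey       = c("SS", "SS", "SL", "H", "H", "SL", "SL", "SS"),
  surf.smelt = c(20, 100, 0, 0, 0, 100, 0, 50),
  sandlance  = c(0, 0, 1000, 1000, 0, 500, 1000, 0),
  herring    = c(0, 0, 100, 1000, 100, 0, 100, 100)
)

# Assumed lookup from count column to prey code (from the question's legend)
codes <- c(surf.smelt = "SS", sandlance = "SL", herring = "H")

long <- df %>%
  # One row per interaction and species
  pivot_longer(surf.smelt:herring, names_to = "species", values_to = "count") %>%
  # Keep species that were present and are not the prey that was taken
  filter(count > 0, unname(codes[species]) != Prey) %>%
  select(index, Prey, OtherAvailable = species, count)

long
```

Interactions with no other species available drop out of this result entirely; a left join back onto `df` would restore them if they are needed.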

It's not very elegant, but here's a shot.

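library(dplyr)
library(stringr)

# Example data: the prey taken plus the count of each species present
df <- data.frame(Prey = c("SS", "SS", "SL", "H", "H", "SL", "SL", "SS"),
                 SS = c(20, 100, 0, 0, 0, 100, 0, 50),
                 SL = c(0, 0, 1000, 1000, 0, 500, 1000, 0),
                 H = c(0, 0, 100, 1000, 100, 0, 100, 100))

# Turn counts into presence flags, then count the species present,
# subtracting one for the prey itself
tb <- df %>%
    mutate(SS = SS > 0, SL = SL > 0, H = H > 0,
           OtherPrey = rowSums(across(where(is.logical))) - 1)

# For each row, paste together the names of the species that were present
tb1 <- which(as.matrix(tb[, 2:4]), arr.ind = TRUE)
tb$Available <- tapply(names(tb[, 2:4])[tb1[, 2]], tb1[, 1], paste, collapse = ",")

# Remove the prey's own code and the leftover comma
tb <- tb %>%
    mutate(other = str_replace(Available, Prey, ""),
           AvailableOther = str_replace(other, ",", "")) %>%
    select(Prey, OtherPrey, AvailableOther)

# Bind the new columns back onto the original counts
df_new <- cbind(df, tb)

df_new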
