Create new column based on conditions in other columns in R

For context, I am determining behavior states of a seabird using GPS relocations, time, and speed. In my case, I am trying to determine whether a foraging event occurred for each individual, given the speed they are traveling at between successive relocations and the duration between relocations. For example, I would like to create a new column (named "event") based on the following condition for each "id": IF "speed" < 4 AND "duration" > 240 for consecutive relocations, THEN "event" = 1. If the condition is not met, then "event" = 0.

Below is some simplified sample data, and below that the desired output. Thanks in advance.

# Sample data
id <- c(1,1,1,1,1,
        2,2,2,2,2,
        3,3,3,3,3,
        4,4,4,4,4)
time <- c("00:00", "00:02", "00:04", "00:06", "00:08",
          "00:00", "00:02", "00:04", "00:06", "00:08",
          "00:00", "00:02", "00:04", "00:06", "00:08",
          "00:00", "00:02", "00:04", "00:06", "00:08")

x <- c(-123.1, -123.3, -123.6, -123.2, -123.4,
       -123.0, -123.2, -123.9, -123.1, -123.3,
       -123.4, -123.7, -123.3, -123.5, -123.1,
       -123.8, -123.5, -123.1, -123.0, -123.9)

y <- c(37.1, 37.2, 37.3, 37.4, 37.5,
       37.0, 37.1, 37.2, 37.3, 37.4,
       37.3, 37.4, 37.5, 37.6, 37.7,
       37.2, 37.3, 37.4, 37.5, 37.6)

duration <- c(120, 120, 120, 120, 120, 
          120, 120, 120, 120, 120, 
          120, 120, 120, 120, 120,
          120, 120, 120, 120, 120)

speed <-c(3, 3, 3, 3, 5,
          2, 2, 2, 5, 5, 
          5, 5, 5, 2, 2,
          3, 3, 3, 3, 5)

data <- cbind(id, time, x, y, duration, speed)

> data
      id  time    x        y      duration speed
 [1,] "1" "00:00" "-123.1" "37.1" "120"    "3"  
 [2,] "1" "00:02" "-123.3" "37.2" "120"    "3"  
 [3,] "1" "00:04" "-123.6" "37.3" "120"    "3"  
 [4,] "1" "00:06" "-123.2" "37.4" "120"    "3"  
 [5,] "1" "00:08" "-123.4" "37.5" "120"    "5"  
 [6,] "2" "00:00" "-123"   "37"   "120"    "2"  
 [7,] "2" "00:02" "-123.2" "37.1" "120"    "2"  
 [8,] "2" "00:04" "-123.9" "37.2" "120"    "2"  
 [9,] "2" "00:06" "-123.1" "37.3" "120"    "5"  
[10,] "2" "00:08" "-123.3" "37.4" "120"    "5"  
[11,] "3" "00:00" "-123.4" "37.3" "120"    "5"  
[12,] "3" "00:02" "-123.7" "37.4" "120"    "5"  
[13,] "3" "00:04" "-123.3" "37.5" "120"    "5"  
[14,] "3" "00:06" "-123.5" "37.6" "120"    "2"  
[15,] "3" "00:08" "-123.1" "37.7" "120"    "2"  
[16,] "4" "00:00" "-123.8" "37.2" "120"    "3"  
[17,] "4" "00:02" "-123.5" "37.3" "120"    "3"  
[18,] "4" "00:04" "-123.1" "37.4" "120"    "3"  
[19,] "4" "00:06" "-123"   "37.5" "120"    "3"  
[20,] "4" "00:08" "-123.9" "37.6" "120"    "5" 
>

# Desired output

> data
      id  time    x        y      duration speed event
 [1,] "1" "00:00" "-123.1" "37.1" "120"    "3"   "1"  
 [2,] "1" "00:02" "-123.3" "37.2" "120"    "3"   "1"  
 [3,] "1" "00:04" "-123.6" "37.3" "120"    "3"   "1"  
 [4,] "1" "00:06" "-123.2" "37.4" "120"    "3"   "1"  
 [5,] "1" "00:08" "-123.4" "37.5" "120"    "5"   "0"  
 [6,] "2" "00:00" "-123"   "37"   "120"    "2"   "1"  
 [7,] "2" "00:02" "-123.2" "37.1" "120"    "2"   "1"  
 [8,] "2" "00:04" "-123.9" "37.2" "120"    "2"   "1"  
 [9,] "2" "00:06" "-123.1" "37.3" "120"    "5"   "0"  
[10,] "2" "00:08" "-123.3" "37.4" "120"    "5"   "0"  
[11,] "3" "00:00" "-123.4" "37.3" "120"    "5"   "0"  
[12,] "3" "00:02" "-123.7" "37.4" "120"    "5"   "0"  
[13,] "3" "00:04" "-123.3" "37.5" "120"    "5"   "0"  
[14,] "3" "00:06" "-123.5" "37.6" "120"    "2"   "0"  
[15,] "3" "00:08" "-123.1" "37.7" "120"    "2"   "0"  
[16,] "4" "00:00" "-123.8" "37.2" "120"    "3"   "1"  
[17,] "4" "00:02" "-123.5" "37.3" "120"    "3"   "1"  
[18,] "4" "00:04" "-123.1" "37.4" "120"    "3"   "1"  
[19,] "4" "00:06" "-123"   "37.5" "120"    "3"   "1"  
[20,] "4" "00:08" "-123.9" "37.6" "120"    "5"   "0"  
> 

This looks like a run-length problem; using rle you can get the result you want.

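library(dplyr)

# Build a data frame rather than using cbind(), so numeric columns stay numeric
data <- data.frame(id, time, x, y, duration, speed)

data |> 
  group_by(id, speed) |> 
  # speed is constant within each id/speed group, so rle(speed)$lengths is the
  # group size; multiplied by duration it gives the total time at that speed.
  # The unary + turns TRUE/FALSE into 1/0.
  mutate(event = +(rle(speed)$lengths * duration > 240 & speed < 4)) |> 
  ungroup()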
    id time      x     y duration speed event
   <dbl> <chr> <dbl> <dbl>    <dbl> <dbl> <int>
 1     1 00:00 -123.  37.1      120     3     1
 2     1 00:02 -123.  37.2      120     3     1
 3     1 00:04 -124.  37.3      120     3     1
 4     1 00:06 -123.  37.4      120     3     1
 5     1 00:08 -123.  37.5      120     5     0
 6     2 00:00 -123   37        120     2     1
 7     2 00:02 -123.  37.1      120     2     1
 8     2 00:04 -124.  37.2      120     2     1
 9     2 00:06 -123.  37.3      120     5     0
10     2 00:08 -123.  37.4      120     5     0
11     3 00:00 -123.  37.3      120     5     0
12     3 00:02 -124.  37.4      120     5     0
13     3 00:04 -123.  37.5      120     5     0
14     3 00:06 -124.  37.6      120     2     0
15     3 00:08 -123.  37.7      120     2     0
16     4 00:00 -124.  37.2      120     3     1
17     4 00:02 -124.  37.3      120     3     1
18     4 00:04 -123.  37.4      120     3     1
19     4 00:06 -123   37.5      120     3     1
20     4 00:08 -124.  37.6      120     5     0
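
One caveat with grouping by id and speed: rows that share a speed but are not adjacent within an id end up in the same group, so their durations are pooled even though the relocations are not consecutive. The following is a minimal base R sketch of a run-aware variant, not the answer author's method. It assumes that a "run" means consecutive rows with speed below the threshold (rather than consecutive rows with identical speed), and the helper name flag_events and its parameters max_speed and min_duration are made up for illustration.

```r
# Sample data from the question (only the columns the rule needs)
id <- rep(1:4, each = 5)
duration <- rep(120, 20)
speed <- c(3, 3, 3, 3, 5,
           2, 2, 2, 5, 5,
           5, 5, 5, 2, 2,
           3, 3, 3, 3, 5)
data <- data.frame(id, duration, speed)

# Hypothetical helper: flag rows that sit inside a run of consecutive slow
# relocations whose summed duration exceeds min_duration.
flag_events <- function(speed, duration, max_speed = 4, min_duration = 240) {
  slow <- speed < max_speed
  runs <- rle(slow)
  # Label every row with the index of the run it belongs to
  run_id <- rep(seq_along(runs$lengths), runs$lengths)
  # Total duration of each run, repeated for every row in that run
  run_duration <- ave(duration, run_id, FUN = sum)
  as.integer(slow & run_duration > min_duration)
}

# Apply the helper separately to each individual, passing row indices through ave()
data$event <- ave(seq_len(nrow(data)), data$id,
                  FUN = function(i) flag_events(data$speed[i], data$duration[i]))
```

On the sample data this should give the same event column as the desired output. The two approaches only differ when a slow run is interrupted, for example speeds of 3, 5, 3, 3, 3 within one id, where only the last three rows form a run longer than 240.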
