繁体   English   中英

确定变量何时从R中的正数连续减少到负数

[英]Determining when a variable continuously decreases from positive to negative in R

注意:在使用小示例数据之前,我已经问过类似的问题 但是,原始数据有更多变化。 所以,我问的是一个不同的数据集和我尝试过的问题。 提到前一个问题是没有必要的。

背景和目标:

我有一个车辆(SV)的数据框,它跟随同一车道上的另一辆车(LV)。 数据帧由该车辆对Vehicle.ID2的唯一id,SV sacc的加速度(或减速度)以及SV和LV dV之间的速度差组成。 这些数据是连续的; 每0.1秒收集一次。
dV = SV速度 - LV速度。 如果SV比LV快,则dV为正,如果SV慢于LVdV为负。

我的目标是确定dV何时从dV连续减小到负值。

dV在数据中的插图:

在此输入图像描述

红点表明dV从较高的正值减少到负值。 这表示SV减速直到其速度变得与LV几乎相同。
问题是将其与其他下降趋势区分开来。

数据和我的代码:

> dput(v4)
structure(list(Vehicle.ID2 = c("1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229", "1244-1229", "1244-1229", "1244-1229", "1244-1229", 
"1244-1229"), sacc = c(1.33473, 1.28881, 1.12739, 0.92515, 0.79816, 
0.77705, 0.77709, 0.7018, 0.54152, 0.35914, 0.21269, 0.11578, 
0.05379, 0.00906, -0.03176, -0.0799, -0.14646, -0.23743, -0.33985, 
-0.42841, -0.51378, -0.67205, -0.98247, -1.41625, -1.81232, -1.98915, 
-1.86429, -1.48063, -0.96691, -0.46229, -0.05322, 0.22623, 0.36564, 
0.3728, 0.28016, 0.28016, 0.1318, -0.04474, -0.25664, -0.54304, 
-0.94809, -1.47003, -2.02141, -2.45992, -2.67431, -2.64058, -2.43398, 
-2.21456, -2.14105, -2.24268, -2.39479, -2.4354, -2.28833, -2.00501, 
-1.73923, -1.65306, -1.79031, -2.03431, -2.19941, -2.15638, -1.88802, 
-1.47322, -1.03007, -0.6549, -0.38989, -0.22549, -0.12923, -0.07337, 
-0.04122, -0.02295, -0.01268, -0.00696, -0.00378, -0.00201, -0.00102, 
-0.00041, 1e-05, 4e-04, 0.00089, 0.00156, 0.00254, 0.00383, 0.00562, 
0.00901, 0.01616, 0.02995, 0.05467, 0.09779, 0.17244, 0.29969, 
0.50292, 0.78585, 1.10851, 1.39059, 1.55465, 1.58478, 1.58478, 
1.53221, 1.44536, 1.3151, 1.10763, 0.83332, 0.55819, 0.34436, 
0.20481, 0.12039, 0.07033, 0.04069, 0.02289, 0.01212, 0.00597, 
0.00283, 0.00135, 0.00066, 0.00036, 0.00028, 0.00033, 0.00053, 
0.00095, 0.00174, 0.00319, 0.00585, 0.01067, 0.01935, 0.03485, 
0.06229, 0.11024, 0.11024, 0.19323, 0.3351, 0.56547, 0.89479, 
1.27622, 1.59639, 1.72741, 1.61302, 1.31188, 0.95446, 0.65358, 
0.44288, 0.28747, 0.14665, 0.01949, -0.06719, -0.09937, -0.09473, 
-0.08409, -0.08806, -0.11677, -0.18112, -0.30004, -0.49642, -0.77007, 
-1.05809, -1.23331, -1.18125, -0.90438, -0.53983, -0.25886, -0.14969, 
-0.18594, -0.26766, -0.27471, -0.12709, 0.14865, 0.41108, 0.5024, 
0.38064, 0.14679, -0.05745, -0.15241, -0.12129, 0.04399, 0.34721, 
0.73806, 1.09385, 1.27053, 1.17782, 0.83126, 0.36119, -0.03701, 
-0.20736, -0.13405, 0.04995, 0.17193, 0.15255, 0.03416, -0.11381, 
-0.29232, -0.55588, -0.91217, -1.27137, -1.50188, -1.5161, -1.30449, 
-0.91828, -0.44017, 0.03285, 0.40236, 0.5957, 0.59973, 0.47027, 
0.29558, 0.13746, 0.00655, -0.10752, -0.18416, -0.1507, 0.06662, 
0.45103, 0.86238, 1.11731, 1.1189, 0.91304), dV = c(4.21179, 
4.56119, 4.9035, 5.21174, 5.47554, 5.70108, 5.89899, 6.07561, 
6.23723, 6.39737, 6.57255, 6.76887, 6.97441, 7.16521, 7.31834, 
7.42281, 7.48066, 7.50046, 7.49184, 7.46365, 7.42074, 7.35858, 
7.2637, 7.125, 6.94731, 6.75368, 6.57596, 6.44293, 6.3723, 6.3708, 
6.43996, 6.57845, 6.77695, 7.01294, 7.2537, 7.2537, 7.46557, 
7.62299, 7.71318, 7.73335, 7.68319, 7.56209, 7.37473, 7.13693, 
6.87372, 6.6113, 6.36752, 6.14355, 5.92392, 5.68938, 5.43157, 
5.15601, 4.87316, 4.5876, 4.29195, 3.96996, 3.60889, 3.21271, 
2.80501, 2.41936, 2.0838, 1.8074, 1.57649, 1.36382, 1.14568, 
0.91604, 0.68813, 0.48505, 0.32763, 0.225309999999999, 0.174119999999999, 
0.15957, 0.16086, 0.15743, 0.1382, 0.10588, 0.0712099999999998, 
0.0426600000000006, 0.0217799999999997, 0.00577000000000005, 
-0.0093399999999999, -0.0279300000000005, -0.0556799999999997, 
-0.10074, -0.174270000000001, -0.28749, -0.44417, -0.63324, -0.82883, 
-0.99765, -1.10932, -1.14662, -1.11022, -1.01578, -0.88621, -0.74265, 
-0.74265, -0.598030000000001, -0.45855, -0.3301, -0.221340000000001, 
-0.13957, -0.0855199999999998, -0.0537800000000006, -0.0379100000000001, 
-0.0340199999999999, -0.0413199999999998, -0.06229, -0.10313, 
-0.17405, -0.28624, -0.44438, -0.63954, -0.84933, -1.04619, -1.20815, 
-1.32708, -1.40841, -1.46279, -1.49754, -1.51543, -1.51855, -1.51151, 
-1.49999, -1.4874, -1.4734, -1.45477, -1.45477, -1.42599, -1.37842, 
-1.29907, -1.17264, -0.98977, -0.75783, -0.503509999999999, -0.26115, 
-0.0548000000000002, 0.110539999999999, 0.243110000000001, 0.35304, 
0.44737, 0.53312, 0.62007, 0.716550000000001, 0.822159999999999, 
0.925949999999999, 1.0132, 1.07443, 1.1082, 1.11731, 1.10303, 
1.06236, 0.9904, 0.88733, 0.76534, 0.64763, 0.55687, 0.50178, 
0.473750000000001, 0.45518, 0.43117, 0.397699999999999, 0.36429, 
0.34917, 0.36642, 0.41341, 0.47028, 0.513599999999999, 0.53162, 
0.527559999999999, 0.512580000000001, 0.49927, 0.50059, 0.52937, 
0.59246, 0.68312, 0.77911, 0.85017, 0.87031, 0.82803, 0.729839999999999, 
0.59696, 0.4569, 0.33289, 0.23586, 0.1641, 0.10962, 0.0632599999999996, 
0.0138299999999996, -0.0532699999999995, -0.15096, -0.28166, 
-0.43363, -0.58579, -0.71575, -0.80599, -0.84704, -0.8392, -0.793299999999999, 
-0.72877, -0.66691, -0.62213, -0.597770000000001, -0.58982, -0.59375, 
-0.60753, -0.62777, -0.643879999999999, -0.63773, -0.59278, -0.506419999999999, 
-0.3943, -0.28177, -0.18967)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -214L), .Names = c("Vehicle.ID2", "sacc", 
"dV"))

我试着找到这个下降的第一个( start_flag )和最后一个( end_flag )点:

library(dplyr)
v4 <- v4 %>% 
  group_by(Vehicle.ID2) %>% 
  mutate(end_flag = sign(dV)-sign(lag(dV)),
         delta_dV = dV-lag(dV),
         start_flag = as.numeric(NA)) %>%
  ungroup()




flag_start <- function(df){
  for(i in nrow(df):2){
    if((sign(df$delta_dV[i]) + sign(df$dV[i])>0)&&(df$dV[i]-df$dV[i-1] > 0 ) && (df$sacc[i]<0)){
      df$start_flag[i] <- 1
      return(df)
      break
    }
  }
}

library(purrr)
library(tidyr)
library(dplyr)

v44 <- v4 %>%
  group_by(Vehicle.ID2) %>% 
  nest()

v45 <-v44 %>% 
  mutate(dV_app = map(data, flag_start))

v4 <- unnest(v45, dV_app)


v4 <- v4 %>% 
  group_by(Vehicle.ID2) %>% 
  mutate(flag = as.numeric(ifelse(end_flag==0 & start_flag==1, start_flag,
                                  ifelse(end_flag==-2 & sacc>0, end_flag, NA)))) %>% 
  ungroup()

您的输出似乎依赖于找到曲线与x轴交叉的那些索引,即从笛卡尔平面的正半部到负半部的x=0 这可以通过连续元素的逻辑连接来完成,如下所示:

goneg <- which(v4$dV[-nrow(v4)]>=0 & v4$dV[-1L]<0)+1L;
goneg;
## [1]  81 190

但我们也要求前一下降的起始指数。 这更难计算。 可以通过查找相邻元素的下降对的包含游程长度的起始索引来完成。

res <- with(rle(diff(v4$dV)<0),{
    ends <- cumsum(lengths);
    starts <- c(1L,ends[-length(ends)]+1L);
    i <- findInterval(goneg,starts,rightmost.closed=T);
    data.frame(start=starts[i],end=goneg);
});
res;
##   start end
## 1    73  81
## 2   179 190

这是结果图:

plotRes <- function(v4,res) {
    plot(seq_len(nrow(v4)),v4$dV,type='l');
    points(unlist(res),v4$dV[unlist(res)],col=rainbow(nrow(res)),pch=16L);
    abline(h=0);
};
plotRes(v4,res);

plot1

这似乎不是你所期望的。 原因是dV矢量实际上在指数72和73上略有上升,在其他方面持续下降的下降:

diff(v4$dV[72:73]);
## [1] 0.00129

我们可以做的是,不是采用连续下降(或上升)段的运行长度,而是应用一个限制最大允许上升的阈值,从而允许相邻索引的小幅上升:

threshold <- 0.01;
res <- with(rle(diff(v4$dV)<threshold),{
    ends <- cumsum(lengths);
    starts <- c(1L,ends[-length(ends)]+1L);
    i <- findInterval(goneg,starts,rightmost.closed=T);
    data.frame(start=starts[i],end=goneg);
});
res;
##   start end
## 1    40  81
## 2   179 190

由此产生的情节:

plotRes(v4,res);

plot2

library(ggplot2); library(dplyr)

首先,我们将数据点划分为减少的块。

chunks <- split(v4$dV, cumsum(seq_along(v4$dV) %in% which(c(1, diff(v4$dV)) >= 0)))

然后我们根据需要进行过滤。

appr <- chunks %>%
  sapply(function(x) max(x) > 0 & min(x) < 0 & length(x) > 1) %>%
  which %>%
  chunks[.] %>%
  unlist

最后我们准备我们的情节。

mutate(v4, p = row_number(),
       label = ifelse(dV %in% appr & dV > 0, "Approaching LV", "None")) %>%
  ggplot(aes(x = p, y = dV)) +
  geom_point(aes(color = label))

在此输入图像描述

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM