How to find the longest run of a string in one column of a data frame in R, and the corresponding first and last values in another column?

Question

Data

df <- data.frame(
  Vehid = rep(c(1,2,3), each=15),
  gap = c(rep(5,3), rep(7,2), 20,20,21,21,22,23,24,28,29,30,
          20,20,21,21,22,23,24,28,29,30, rep(7,5),
          rep(5,3), rep(7,3), rep(5,3), 7, 20,24,26,28,30),
  State = c(rep('Following',3), rep('.',2), rep('Following',10),
            rep('Following',10), rep('.',5),
            rep('Following',3), rep('.',3), rep('Following',3), '.', rep('Following',5)))

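Need

Vehid is the unique ID of a vehicle, gap is the distance the vehicle keeps from another vehicle, and State indicates whether the Vehid vehicle is "Following" (a technical term in traffic engineering) another vehicle. As the example df shows, a single vehicle can have several separate runs of consecutive "Following" rows. The '.' means "not following". For each Vehid, I want to find the longest consecutive run of the "Following" State and get the first and last values of gap within that run.

Desired output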

df2 <- data.frame(
  Vehid = rep(c(1,2,3), each=15),
  gap = c(rep(5,3), rep(7,2), 20,20,21,21,22,23,24,28,29,30,
          20,20,21,21,22,23,24,28,29,30, rep(7,5),
          rep(5,3), rep(7,3), rep(5,3), 7, 20,24,26,28,30),
  State = c(rep('Following',3), rep('.',2), rep('Following',10),
            rep('Following',10), rep('.',5),
            rep('Following',3), rep('.',3), rep('Following',3), '.', rep('Following',5)),
  dx_safe = rep(20,45), dx_CC2 = rep(30,45))

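dx_safe is the gap in the first row of the longest run of Following, and dx_CC2 is the gap in the last row of that run.

What I have tried

I have no idea how to approach this. Please help.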

Answer 1

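To find the longest run, I would use the rle function. I would handle the vehicle ID by splitting the data with split and then recombining the pieces with do.call(rbind, ...). I ended up with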

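dx <- do.call(rbind, lapply(split(df, df$Vehid), function(x) {
    # Run-length encode the "Following" indicator for this vehicle
    rr <- rle(x$State == "Following")
    # Index of the longest TRUE run; FALSE runs are zeroed out by rr$values
    i <- which.max(rr$lengths * rr$values)
    # s = number of rows before run i; pick the first and last row of the run
    s <- sum(c(0, rr$lengths)[1:i])
    v <- x$gap[c(s + 1, s + rr$lengths[i])]
    data.frame(Vehid = x$Vehid[1], dx_safe = v[1], dx_CC2 = v[2])
}))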

which returns

  Vehid dx_safe dx_CC2
1     1      20     30
2     2      20     30
3     3      20     30
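
To illustrate what rle and the index arithmetic are doing, here is a minimal sketch on the rows of Vehid 1 only. The variable names state, gap, starts and ends are introduced here for illustration, and cumsum is used in place of the sum(c(0, ...)) expression above; it should give the same positions.

```r
# State and gap for Vehid 1, copied from the question's data
state <- c(rep("Following", 3), rep(".", 2), rep("Following", 10))
gap   <- c(rep(5, 3), rep(7, 2), 20, 20, 21, 21, 22, 23, 24, 28, 29, 30)

# Run-length encoding of the logical indicator
rr <- rle(state == "Following")
rr$lengths  # 3 2 10
rr$values   # TRUE FALSE TRUE

# Multiplying by rr$values sets the length of non-"Following" runs to 0
i <- which.max(rr$lengths * rr$values)  # 3

# Last row of each run, and the first row derived from it
ends   <- cumsum(rr$lengths)     # 3 5 15
starts <- ends - rr$lengths + 1  # 1 4 6

first_gap <- gap[starts[i]]  # 20
last_gap  <- gap[ends[i]]    # 30
```

Note that which.max returns the first maximum, so if two "Following" runs tie for longest, the earlier one is used.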

Then, to join the result back to the original data, a simple merge is enough.

merge(df, dx)

The result should look like your desired output.
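
By default merge sorts its result on the key column, which is harmless here because df is already ordered by Vehid. If the original row order has to be preserved in other data, one alternative (not part of the answer above, only a sketch) is to look the values up with match. The name out is hypothetical.

```r
# Data from the question
df <- data.frame(
  Vehid = rep(c(1, 2, 3), each = 15),
  gap = c(rep(5, 3), rep(7, 2), 20, 20, 21, 21, 22, 23, 24, 28, 29, 30,
          20, 20, 21, 21, 22, 23, 24, 28, 29, 30, rep(7, 5),
          rep(5, 3), rep(7, 3), rep(5, 3), 7, 20, 24, 26, 28, 30),
  State = c(rep("Following", 3), rep(".", 2), rep("Following", 10),
            rep("Following", 10), rep(".", 5),
            rep("Following", 3), rep(".", 3), rep("Following", 3), ".",
            rep("Following", 5))
)

# Per-vehicle summary, computed the same way as in the answer
dx <- do.call(rbind, lapply(split(df, df$Vehid), function(x) {
  rr <- rle(x$State == "Following")
  i <- which.max(rr$lengths * rr$values)
  s <- sum(c(0, rr$lengths)[1:i])
  data.frame(Vehid = x$Vehid[1],
             dx_safe = x$gap[s + 1],
             dx_CC2 = x$gap[s + rr$lengths[i]])
}))

# For every row of df, find the matching row of dx; df's row order is kept
idx <- match(df$Vehid, dx$Vehid)
out <- df
out$dx_safe <- dx$dx_safe[idx]
out$dx_CC2  <- dx$dx_CC2[idx]
```

Here out has the same 45 rows as df, in the same order, with the two new columns appended.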

Answer 1
3 Accepted 2014-07-23 23:27:46