Combining data frame rows in R based on common values
Given a data frame:
> df <- data.frame( L=c('a','b','b'), t0=c(1,10,20), t1=c(9,19,39))
> df
  L t0 t1
1 a  1  9
2 b 10 19
3 b 20 39
I want:
> df
  L t0 t1
1 a  1  9
2 b 10 39
The repeated value "b" in df$L signifies that the start (t0) of the first 'b' row should become the new 't0' value, and the end (t1) of the last (contiguous) 'b' row should become the new 't1' value. In effect, if t0 and t1 are times, then I want to merge the time durations of adjacent rows that have the same value for 'L'.
After grouping by 'L', use summarise to take the first value of 't0' and the last value of 't1' (or min and max):
library(dplyr)

df %>%
  group_by(L) %>%
  summarise(t0 = first(t0), t1 = last(t1))
# A tibble: 2 x 3
# L t0 t1
# <fct> <dbl> <dbl>
#1 a 1 9
#2 b 10 39
Based on the OP's comments, if we also need to group by adjacent runs of equal values in 'L', use rleid from data.table to build a run id:
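library(dplyr)
library(data.table)

# Example data in which the same 'L' value occurs in separate, non-adjacent runs
df1 <- data.frame(L = c('a','b','b','a','b','b'),
                  t0 = c(1,10,20,40,60,70), t1 = c(9,19,39,49,69,79))

df1 %>%
  group_by(grp = rleid(L), L) %>%
  summarise(t0 = first(t0), t1 = last(t1))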
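If loading data.table only for rleid is not desirable, the same run-based grouping can be sketched in base R. This is a minimal illustration, not the answerer's code: it rebuilds the six-row df1 example, and the names runs, grp, first_row, last_row and merged are made up for the sketch.

```r
# Example data: 'L' repeats in separate, non-adjacent runs.
df1 <- data.frame(L  = c('a','b','b','a','b','b'),
                  t0 = c(1,10,20,40,60,70),
                  t1 = c(9,19,39,49,69,79))

# Run id: one integer per row, changing whenever 'L' differs from the previous row.
runs <- rle(as.character(df1$L))
grp  <- rep(seq_along(runs$lengths), runs$lengths)

# Logical masks for the first and the last row of each run.
first_row <- !duplicated(grp)
last_row  <- !duplicated(grp, fromLast = TRUE)

# Keep 'L' and 't0' from the first row of each run, 't1' from the last.
merged <- data.frame(L  = df1$L[first_row],
                     t0 = df1$t0[first_row],
                     t1 = df1$t1[last_row])
merged
#   L t0 t1
# 1 a  1  9
# 2 b 10 39
# 3 a 40 49
# 4 b 60 79
```

Because the grouping is by run rather than by value, the second 'a' and the second pair of 'b' rows stay separate instead of being folded into the first ones.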
You can split by L and return the range of each group. Note that range is applied to both columns together, so this relies on every t0 being no larger than its t1.
df <- do.call(rbind, lapply(split(df[-1], df[1]), range))
df
# [,1] [,2]
#a 1 9
#b 10 39
df <- data.frame(L=rownames(df), t0=df[,1], t1=df[,2])
df
# L t0 t1
#a a 1 9
#b b 10 39
Maybe you can try aggregate and merge:
res <- merge(aggregate(t0 ~ L,df,min),aggregate(t1 ~ L,df,max))
such that
> res
  L t0 t1
1 a  1  9
2 b 10 39
Using data.table:
library(data.table)
setDT(df)
df[, .(t0 = t0[1], t1 = t1[.N]), by = L]
# L t0 t1
# 1: a 1 9
# 2: b 10 39
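The call above groups by 'L' alone, so it would also merge rows with the same 'L' that are not adjacent. A possible contiguous-run variant, sketched here on the six-row df1 example from the earlier answer and assuming data.table's rleid is acceptable, adds a run id to the by clause; the column name grp and the result name res are illustrative only.

```r
library(data.table)

# Example data with two separate runs of 'a' and two of 'b'.
df1 <- data.table(L  = c('a','b','b','a','b','b'),
                  t0 = c(1,10,20,40,60,70),
                  t1 = c(9,19,39,49,69,79))

# rleid(L) yields a run id that changes whenever 'L' changes between rows,
# so each contiguous run forms its own group.
res <- df1[, .(t0 = t0[1], t1 = t1[.N]), by = .(grp = rleid(L), L)]
res
```

With this data, res should contain four rows, one per run, rather than the two rows that grouping by 'L' alone would give.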