Calculate minimum distance between groups of points in data frame
My data frame looks like this:
Time, Value, Group
0, 1.0, A
1, 2.0, A
2, 3.0, A
0, 4.0, B
1, 6.0, B
2, 6.0, B
0, 7.0, C
1, 7.0, C
2, 9.0, C
For each combination (A, B), (A, C), (B, C), I need to find the maximum difference over the corresponding Time points.
So comparing A and B gives a maximum distance of 4 at t=1, which is 6 (B) - 2 (A) = 4.
The full output should be something like this:
combination,time,distance
AB, 1, 4
AC, 0, 6
BC, 0, 3
One way in base R using combn:
do.call(rbind, combn(unique(df$Group), 2, function(x) {
  # rows belonging to each group of the current pair
  df1 <- subset(df, Group == x[1])
  df2 <- subset(df, Group == x[2])
  # align the two groups on Time
  df3 <- merge(df1, df2, by = 'Time')
  value <- abs(df3$Value.x - df3$Value.y)
  data.frame(combn = paste(x, collapse = ''),
             time = df3$Time[which.max(value)],
             max_difference = max(value))
}, simplify = FALSE))
#  combn time max_difference
#1    AB    1              4
#2    AC    0              8
#3    BC    0              5
We create all combinations of the unique Group values, subset the data for each pair, and merge the two subsets on Time. We then take the absolute difference of the corresponding Value columns and return the max difference, together with the Time at which it occurs.
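To make the steps above concrete, here is a minimal sketch that walks through a single pair (A, B) by hand. It assumes the same data as in the data section of this answer; the names pairs, a, b, ab, diffs, best_time and best_diff are only illustrative and do not appear in the original answer.

```r
# Same data as in the answer's data section
df <- data.frame(
  Time  = c(0L, 1L, 2L, 0L, 1L, 2L, 0L, 0L, 0L),
  Value = c(1, 2, 3, 4, 6, 6, 7, 7, 9),
  Group = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  stringsAsFactors = FALSE
)

# All unordered pairs of groups, one pair per column: (A,B), (A,C), (B,C)
pairs <- combn(unique(df$Group), 2)

# Walk through the first pair by hand
a <- subset(df, Group == "A")
b <- subset(df, Group == "B")

# After the merge, the columns are Time, Value.x, Group.x, Value.y, Group.y
ab <- merge(a, b, by = "Time")

# One absolute difference per shared Time point
diffs <- abs(ab$Value.x - ab$Value.y)

best_time <- ab$Time[which.max(diffs)]  # Time at which the difference is largest
best_diff <- max(diffs)                 # the largest difference itself
```

The combn call in the answer does the same thing for every column of pairs and stacks the one-row results with rbind.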
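data (note that all three rows of group C have Time 0 here, unlike the table in the question, so the results differ from the expected output there)
df <- structure(list(Time = c(0L, 1L, 2L, 0L, 1L, 2L, 0L, 0L, 0L),
    Value = c(1, 2, 3, 4, 6, 6, 7, 7, 9), Group = c("A", "A",
    "A", "B", "B", "B", "C", "C", "C")),
    class = "data.frame", row.names = c(NA, -9L))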
One dplyr option could be:
library(dplyr)

df %>%
  # pair every row with every other row at the same Time
  inner_join(df, by = "Time") %>%
  filter(Group.x != Group.y) %>%
  # label each unordered pair the same way regardless of join order
  group_by(Time,
           Group = paste(pmax(Group.x, Group.y), pmin(Group.x, Group.y), sep = "-")) %>%
  summarise(Max_Distance = abs(max(Value.x[Group.x == first(Group.x)]) - max(Value.y[Group.y == first(Group.y)])))
   Time Group Max_Distance
  <int> <chr>        <dbl>
1     0 B-A              3
2     0 C-A              8
3     0 C-B              5
4     1 B-A              4
5     2 B-A              3
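The dplyr result above has one row per Time and pair. If only the single largest distance per pair is wanted, as the question asks, one possible variation is sketched below. This is not part of the original answer: it assumes dplyr 1.0.0 or later (for slice_max) and the same data as in the data section, and the column names combination, time and distance are chosen only to mirror the question's expected output.

```r
library(dplyr)

# Same data as in the data section
df <- data.frame(
  Time  = c(0L, 1L, 2L, 0L, 1L, 2L, 0L, 0L, 0L),
  Value = c(1, 2, 3, 4, 6, 6, 7, 7, 9),
  Group = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # pair up rows that share a Time
  # (recent dplyr versions may warn that this self-join is many-to-many)
  inner_join(df, by = "Time") %>%
  # keep each unordered pair once, e.g. A-B but not B-A
  filter(Group.x < Group.y) %>%
  mutate(combination = paste0(Group.x, Group.y),
         distance = abs(Value.x - Value.y)) %>%
  group_by(combination) %>%
  # keep only the row with the largest distance in each pair
  slice_max(distance, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(combination, time = Time, distance)
```

With with_ties = FALSE, only one row is kept per pair even when several Time points share the same largest distance.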