How can I fill a data frame column by matching against another data frame?

Question

Suppose I have a data frame with x and y coordinates, as shown below:

          x        y
1  3.984804 4.470310
2  3.985005 4.470310
3  3.985071 4.470310
4  3.985262 4.469213
5  3.985262 4.469213
6  3.985262 4.469213
7  3.985001 4.471442
8  3.985001 4.471759
9  3.984981 4.472782
10 3.985001 4.478800

dput output:

structure(list(x = c(3.98480399, 3.98500453380952, 3.98507138190476,
3.98526204428571, 3.98526204428571, 3.98526204428571, 3.98500133714286,
3.98500133714286, 3.98498099190476, 3.98500133714286), y = c(4.47030988428572,
4.47030988428572, 4.47030988428572, 4.46921270476191, 4.46921270476191,
4.46921270476191, 4.47144165047619, 4.47175932380952, 4.47278151761905,
4.47880045571429)), numFrames = 68418L, fps = 50, units = "mm", timeUnits = "s", row.names = c(NA,
10L), class = c("Trajectory", "data.frame"))

I have another data frame with the following coordinates:

          x1       y1
1  0.1466667 3.053333
2  0.1466667 3.446667
3  0.1466667 3.753333
4  0.1933333 4.053333
5  0.2800000 4.400000
6  0.4066667 4.653333
7  0.5400000 4.920000
8  0.7133333 5.193333
9  0.8400000 5.366667
10 8.2133333 5.233333
11 8.3733333 5.066667
12 8.5133333 4.853333
13 8.6866667 4.613333
14 8.7933333 4.440000
15 8.9066667 4.180000
16 9.0066667 3.526667
17 9.1200000 3.513333
18 9.1533333 3.046667
19 9.1400000 2.880000

dput output:

structure(list(x1 = c(0.146666666666667, 0.146666666666667, 0.146666666666667,
0.193333333333333, 0.28, 0.406666666666667, 0.54, 0.713333333333333,
0.84, 8.21333333333333, 8.37333333333333, 8.51333333333333, 8.68666666666667,
8.79333333333333, 8.90666666666667, 9.00666666666667, 9.12, 9.15333333333333,
9.14), y1 = c(3.05333333333333, 3.44666666666667, 3.75333333333333,
4.05333333333333, 4.4, 4.65333333333333, 4.92, 5.19333333333333,
5.36666666666667, 5.23333333333333, 5.06666666666667, 4.85333333333333,
4.61333333333333, 4.44, 4.18, 3.52666666666667, 3.51333333333333,
3.04666666666667, 2.88)), class = "data.frame", row.names = c(NA,
-19L))

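I want to add a column to the first data frame. For each row, the new column should hold the y1 value from the row of the second data frame whose x1 is closest to that row's x.

For example, the first row would be: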

          x        y        y1
1  3.984804 4.470310 4.4653333

Because row 6 of the second data frame has the x1 value closest to x in the first data frame, its y1 value is added.

Answer 1


d1 <- structure(list(x = c(3.98480399, 3.98500453380952, 3.98507138190476,
3.98526204428571, 3.98526204428571, 3.98526204428571, 3.98500133714286,
3.98500133714286, 3.98498099190476, 3.98500133714286), y = c(4.47030988428572,
4.47030988428572, 4.47030988428572, 4.46921270476191, 4.46921270476191,
4.46921270476191, 4.47144165047619, 4.47175932380952, 4.47278151761905,
4.47880045571429)), numFrames = 68418L, fps = 50, units = "mm", timeUnits = "s", row.names = c(NA,
10L), class = c("Trajectory", "data.frame"))

d2 <- structure(list(x1 = c(0.146666666666667, 0.146666666666667, 0.146666666666667,
0.193333333333333, 0.28, 0.406666666666667, 0.54, 0.713333333333333,
0.84, 8.21333333333333, 8.37333333333333, 8.51333333333333, 8.68666666666667,
8.79333333333333, 8.90666666666667, 9.00666666666667, 9.12, 9.15333333333333,
9.14), y1 = c(3.05333333333333, 3.44666666666667, 3.75333333333333,
4.05333333333333, 4.4, 4.65333333333333, 4.92, 5.19333333333333,
5.36666666666667, 5.23333333333333, 5.06666666666667, 4.85333333333333,
4.61333333333333, 4.44, 4.18, 3.52666666666667, 3.51333333333333,
3.04666666666667, 2.88)), class = "data.frame", row.names = c(NA,
-19L))



## absolute difference between every x (rows) and every x1 (columns):
dm <- abs(outer( d1$x, d2$x1, FUN="-" ))

## column index of the smallest difference in each row:
i.closest <- apply( dm, 1, which.min )

## look up the matching y1 values:
d1$y1 <- d2$y1[i.closest]
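
The outer() call above builds a matrix with one cell per (x, x1) pair, which can become large when both data frames are long. As a rough sketch of an alternative (not part of the original answer), the same nearest-value lookup can be done with findInterval() on a sorted copy of x1. The helper name nearest_index is hypothetical, and the small vectors are just a few values copied from the question.

```r
# Hypothetical helper (not from the answer): nearest-value lookup via
# findInterval() on sorted x1, which avoids the n-by-m matrix from outer().
nearest_index <- function(x, x1) {
  ord <- order(x1)                      # x1 need not be sorted on input
  xs  <- x1[ord]
  lo  <- findInterval(x, xs)            # count of sorted values <= x (0..n)
  left  <- pmax(lo, 1L)                 # neighbour at or below x
  right <- pmin(lo + 1L, length(xs))    # neighbour above x
  pick_right <- abs(xs[right] - x) < abs(x - xs[left])
  ord[ifelse(pick_right, right, left)]  # map back to original row numbers
}

# A few values taken from the question's data
x  <- c(3.98480399, 3.98526204428571)
x1 <- c(0.54, 0.713333333333333, 0.84, 8.21333333333333, 8.37333333333333)
y1 <- c(4.92, 5.19333333333333, 5.36666666666667, 5.23333333333333,
        5.06666666666667)

idx <- nearest_index(x, x1)
y1[idx]
```

With the data as posted, every x is about 3.985, which lies between x1 = 0.84 (row 9) and x1 = 8.2133 (row 10) and is nearer to 0.84, so both approaches return y1 = 5.366667 for every row. When x1 contains duplicated values or exact ties, this sketch may pick a different row than which.min, which always takes the first minimum.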

Answer 2

You can try max.col together with outer, as shown below:

d1$y1 <- d2$y1[max.col(-abs(outer(d1$x, d2$x1, "-")))]
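
One detail worth knowing: max.col breaks ties at random by default (ties.method = "random"), so when two x1 values are equally close the one-liner can return different rows on different runs. A small sketch with made-up values (not from the question) shows how ties.method = "first" makes the choice deterministic and consistent with which.min:

```r
# Made-up data: x = 1 is equally close to 0 and 2; x = 5 to 4 and 6
x  <- c(1, 5)
x1 <- c(0, 2, 4, 6)
y1 <- c(10, 20, 30, 40)

# Negate the distances so the closest x1 becomes the row maximum
neg_dist <- -abs(outer(x, x1, "-"))

# "first" always takes the leftmost of the tied columns
idx <- max.col(neg_dist, ties.method = "first")
y1[idx]
```

Here idx is 1 and 3, so the lookup returns 10 and 30 on every run.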

Answer 1: 0, accepted, 2021-04-07 21:03:24
Answer 2: 0, 2021-04-07 21:34:34