Calculate distances between adjacent timepoints and find the "n" shortest paths through all timepoints

Question

As the title says: I want to calculate the distances between adjacent timepoints and find the n shortest paths through all timepoints.

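I have posted an example below. In this example there are 2 distinct regions (in 3D space) where the points are located, and each region contains points from multiple timepoints. I want to calculate the distances along T1 --> T2 --> ... --> T8 while enforcing the timepoint order. I ultimately think of this as a kind of tree: we branch from the first T1 point to the 2 (or more) T2 points, then from each T2 point to each T3 point, and so on. Once the tree is built, we can compute the total distance of every path from start to end and return the n paths with the smallest distance. In short, the goal is to connect each T1 node to its respective shortest path. There may well be a more efficient or better way to do this.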
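The tree idea described above can be sketched by brute force: enumerate every path that takes one point per timepoint, sum the Euclidean distances between consecutive points, and keep the n smallest totals. The following is only an illustrative Python sketch (the question itself uses R). The names `layers`, `path_length`, and `n_shortest_paths` are hypothetical, the point names are shortened, and only the first three timepoints of the example data are included. It assumes Python 3.8+ for `math.dist`.

```python
import heapq
import itertools
import math

# Points grouped by timepoint, in time order (T1, T2, T4 from the example
# data; point names are shortened for readability).
layers = [
    {"T1.C2": (5.102, 28.529, 0.789), "T1.C2.1": (37.904, 50.845, 0.837)},
    {"T2.C2": (37.905, 50.843, 1.022), "T2.C2.1": (5.083, 28.491, 0.972)},
    {"T4.C2": (37.925, 50.851, 0.858), "T4.C2.1": (5.074, 28.479, 0.785)},
]


def path_length(path, layers):
    """Sum of Euclidean distances between consecutive points of a path."""
    coords = [layer[name] for name, layer in zip(path, layers)]
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def n_shortest_paths(layers, n):
    """Enumerate every path (one point per timepoint), keep the n shortest."""
    all_paths = itertools.product(*layers)  # iterating a dict yields its keys
    scored = ((path_length(p, layers), p) for p in all_paths)
    return heapq.nsmallest(n, scored)


best = n_shortest_paths(layers, 2)
for total, path in best:
    print(round(total, 3), " -> ".join(path))
```

The number of paths is the product of the number of points at each timepoint, so full enumeration is only practical for small inputs.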

Example data:

> example
                   Timepoint Centre.int.X Centre.int.Y Centre.int.Z
FOV4.Beads.T1.C2          T1        5.102       28.529        0.789
FOV4.Beads.T1.C2.1        T1       37.904       50.845        0.837
FOV4.Beads.T2.C2          T2       37.905       50.843        1.022
FOV4.Beads.T2.C2.1        T2        5.083       28.491        0.972
FOV4.Beads.T4.C2          T4       37.925       50.851        0.858
FOV4.Beads.T4.C2.1        T4        5.074       28.479        0.785
FOV4.Beads.T5.C2          T5       37.908       50.847        0.977
FOV4.Beads.T5.C2.1        T5        5.102       28.475        0.942
FOV4.Beads.T6.C2          T6        5.114       28.515        0.643
FOV4.Beads.T6.C2.1        T6       37.927       50.869        0.653
FOV4.Beads.T7.C2          T7       37.930       50.875        0.614
FOV4.Beads.T7.C2.1        T7        5.132       28.525        0.579
FOV4.Beads.T8.C2          T8        4.933       28.674        0.800
FOV4.Beads.T8.C2.1        T8       37.918       50.816        0.800

This data.frame produces the following 3D scatter plot:

The baseline code used to generate the plot is:

require(scatterplot3d)
    with(example, {
      s3d <- scatterplot3d(Centre.int.X, Centre.int.Y, Centre.int.Z,
                           pch=19,
                           cex.symbols=2,
                           col.axis="grey", col.grid="lightblue",
                           angle=45, 
                           xlab="X",
                           ylab="Y",
                           zlab="Z")
    })

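This is a relatively clean example, but some of my data is very messy, which is why I am trying to avoid clustering methods (e.g. k-means, dbscan, etc.). Any help would be greatly appreciated!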

Edit: added the structure details.

structure(list(Timepoint = structure(c(1L, 1L, 2L, 2L, 4L, 4L, 
5L, 5L, 6L, 6L, 7L, 7L, 8L, 8L), .Label = c("T1", "T2", "T3", 
"T4", "T5", "T6", "T7", "T8"), class = "factor"), Centre.int.X = c(5.102, 
37.904, 37.905, 5.083, 37.925, 5.074, 37.908, 5.102, 5.114, 37.927, 
37.93, 5.132, 4.933, 37.918), Centre.int.Y = c(28.529, 50.845, 
50.843, 28.491, 50.851, 28.479, 50.847, 28.475, 28.515, 50.869, 
50.875, 28.525, 28.674, 50.816), Centre.int.Z = c(0.789, 0.837, 
1.022, 0.972, 0.858, 0.785, 0.977, 0.942, 0.643, 0.653, 0.614, 
0.579, 0.8, 0.8)), .Names = c("Timepoint", "Centre.int.X", "Centre.int.Y", 
"Centre.int.Z"), class = "data.frame", row.names = c("FOV4.Beads.T1.C2", 
"FOV4.Beads.T1.C2.1", "FOV4.Beads.T2.C2", "FOV4.Beads.T2.C2.1", 
"FOV4.Beads.T4.C2", "FOV4.Beads.T4.C2.1", "FOV4.Beads.T5.C2", 
"FOV4.Beads.T5.C2.1", "FOV4.Beads.T6.C2", "FOV4.Beads.T6.C2.1", 
"FOV4.Beads.T7.C2", "FOV4.Beads.T7.C2.1", "FOV4.Beads.T8.C2", 
"FOV4.Beads.T8.C2.1"))

Answer 1

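Not very elegant, but it works for finding the shortest path. Starting from each T1 point, it steps to the nearest point at each subsequent timepoint.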

# Pairwise Euclidean distances between all points (columns 2:4 are X, Y, Z)
distance.matrix <- as.matrix(dist(example[, 2:4], upper = TRUE, diag = TRUE))

# Start one path from every T1 point
t1s <- grep("T1", rownames(distance.matrix))
paths <- lapply(t1s, function(t) {
    path <- rownames(distance.matrix)[t]
    distance <- 0
    # T3 is absent from the example data, so it is skipped here
    for (i in c(2, 4:8)) {
        # Candidate points at the next timepoint
        next.nodes <- grep(paste0("T", i), rownames(distance.matrix))
        # Pick the candidate closest to the current point
        next.t <- names(which.min(distance.matrix[t, next.nodes]))
        path <- c(path, next.t)
        distance <- distance + distance.matrix[t, next.t]
        t <- next.t
    }
    list(Path = path, `Total Distance` = distance)
})

Edit: cut out some unneeded lines.

Answer 2

Here is an implementation in Python:

from io import StringIO
import numpy as np
import pandas as pd

# Read data
s = """Name                      Timepoint Centre.int.X Centre.int.Y Centre.int.Z
FOV4.Beads.T1.C2          T1        5.102       28.529        0.789
FOV4.Beads.T1.C2.1        T1       37.904       50.845        0.837
FOV4.Beads.T2.C2          T2       37.905       50.843        1.022
FOV4.Beads.T2.C2.1        T2        5.083       28.491        0.972
FOV4.Beads.T4.C2          T4       37.925       50.851        0.858
FOV4.Beads.T4.C2.1        T4        5.074       28.479        0.785
FOV4.Beads.T5.C2          T5       37.908       50.847        0.977
FOV4.Beads.T5.C2.1        T5        5.102       28.475        0.942
FOV4.Beads.T6.C2          T6        5.114       28.515        0.643
FOV4.Beads.T6.C2.1        T6       37.927       50.869        0.653
FOV4.Beads.T7.C2          T7       37.930       50.875        0.614
FOV4.Beads.T7.C2.1        T7        5.132       28.525        0.579
FOV4.Beads.T8.C2          T8        4.933       28.674        0.800
FOV4.Beads.T8.C2.1        T8       37.918       50.816        0.800"""

df = pd.read_table(StringIO(s), sep=" ", skipinitialspace=True, index_col=0, header=0)

# Get time point ids
ts = sorted(df.Timepoint.unique())
# Get the spatial points in each time point
points = [df[df.Timepoint == t].iloc[:, -3:].values.copy() for t in ts]
# Get the spatial point names in each time point
point_names = [list(df[df.Timepoint == t].index) for t in ts]

# Find the best next point starting from the end
best_nexts = []
accum_dists = [np.zeros(len(points[-1]))]
for t_prev, t_next in zip(reversed(points[:-1]), reversed(points[1:])):
    t_dists = np.linalg.norm(t_prev[:, np.newaxis, :] - t_next[np.newaxis, :, :], axis=-1)
    t_dists += accum_dists[-1][np.newaxis, :]
    t_best_nexts = np.argmin(t_dists, axis=1)
    t_accum_dists = t_dists[np.arange(len(t_dists)), t_best_nexts]
    best_nexts.append(t_best_nexts)
    accum_dists.append(t_accum_dists)
# Reverse back the best next points and accumulated distances
best_nexts = list(reversed(best_nexts))
accum_dists = list(reversed(accum_dists))

# Reconstruct the paths
paths = []
for i, p in enumerate(point_names[0]):
    cost = accum_dists[0][i]
    path = [p]
    idx = i
    for t_best_nexts, t_point_names in zip(best_nexts, point_names[1:]):
        next_idx = t_best_nexts[idx]
        path.append(t_point_names[next_idx])
        idx = next_idx
    paths.append((path, cost))

for i, (path, cost) in enumerate(paths):
    print("Path {} (total distance {}):".format(i, cost))
    print("\n".join("\t{}".format(p) for p in path))
    print()

Output:

Path 0 (total distance 1.23675871386137):
    FOV4.Beads.T1.C2
    FOV4.Beads.T2.C2.1
    FOV4.Beads.T4.C2.1
    FOV4.Beads.T5.C2.1
    FOV4.Beads.T6.C2
    FOV4.Beads.T7.C2.1
    FOV4.Beads.T8.C2

Path 1 (total distance 1.031072818390815):
    FOV4.Beads.T1.C2.1
    FOV4.Beads.T2.C2
    FOV4.Beads.T4.C2
    FOV4.Beads.T5.C2
    FOV4.Beads.T6.C2.1
    FOV4.Beads.T7.C2
    FOV4.Beads.T8.C2.1

Explanation:

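It is basically the same as the Viterbi algorithm. Start from the end and assign an initial cost of zero to each final node. Then, for each pair of consecutive timepoints t_prev and t_next, compute the distance between every possible pair of points and add the cost already accumulated at the point in t_next. For each point in t_prev, choose the next point with the lowest total cost, then move on to the previous timepoint. At the end, best_nexts contains, for every point at every timepoint, the best point to go to at the next timepoint.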
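To make a single backward step concrete, here is a minimal sketch using only the T7 and T8 points from the example data. The variable names (`t_prev`, `t_next`, `accum`, `best_next`, `new_accum`) are illustrative. Because T8 is the last timepoint, the accumulated cost added to each column is zero.

```python
import numpy as np

# Rows: T7.C2, T7.C2.1
t_prev = np.array([[37.930, 50.875, 0.614],
                   [5.132, 28.525, 0.579]])
# Rows: T8.C2, T8.C2.1
t_next = np.array([[4.933, 28.674, 0.800],
                   [37.918, 50.816, 0.800]])

# Cost already accumulated at each T8 point (zero at the final timepoint).
accum = np.zeros(len(t_next))

# Broadcasting gives a (2, 2) matrix: dists[i, j] is the distance from
# point i at T7 to point j at T8.
dists = np.linalg.norm(t_prev[:, np.newaxis, :] - t_next[np.newaxis, :, :], axis=-1)
dists = dists + accum[np.newaxis, :]

# For each T7 point, the index of the cheapest T8 point and its cost.
best_next = np.argmin(dists, axis=1)
new_accum = dists[np.arange(len(dists)), best_next]

print(best_next)
print(new_accum)
```

Here `best_next` pairs T7.C2 with T8.C2.1 and T7.C2.1 with T8.C2, which matches the last step of the two paths printed above.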

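Reconstruction is just a matter of following those indices in best_nexts. For each possible initial point, pick the best point at the next timepoint and continue from there.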
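The procedure above returns one best path per starting point. If the n cheapest paths overall are wanted, one possible extension (not part of the original answer) is to keep the k cheapest suffixes at every point instead of only the best one. The sketch below is a hedged illustration in plain Python. The function name `k_best_paths`, the `layers` structure, and the shortened point names are assumptions, and only T1, T2 and T4 from the example data are included. It assumes Python 3.8+ for `math.dist`.

```python
import heapq
import math


def k_best_paths(layers, k):
    """Backward dynamic programming keeping the k cheapest suffixes per point.

    layers: list (in time order) of dicts mapping point name -> (x, y, z).
    Returns up to k (cost, path) pairs, cheapest first, over all start points.
    """
    # At the last timepoint every point has a single zero-cost suffix.
    suffixes = {name: [(0.0, (name,))] for name in layers[-1]}
    for prev, nxt in zip(reversed(layers[:-1]), reversed(layers[1:])):
        new_suffixes = {}
        for p_name, p_xyz in prev.items():
            # Extend every kept suffix of every next point by one step.
            candidates = (
                (math.dist(p_xyz, n_xyz) + cost, (p_name,) + tail)
                for n_name, n_xyz in nxt.items()
                for cost, tail in suffixes[n_name]
            )
            new_suffixes[p_name] = heapq.nsmallest(k, candidates)
        suffixes = new_suffixes
    # Merge the per-start lists and keep the k cheapest overall.
    return heapq.nsmallest(k, (s for lst in suffixes.values() for s in lst))


# T1, T2 and T4 from the example data, with shortened names.
layers = [
    {"T1.C2": (5.102, 28.529, 0.789), "T1.C2.1": (37.904, 50.845, 0.837)},
    {"T2.C2": (37.905, 50.843, 1.022), "T2.C2.1": (5.083, 28.491, 0.972)},
    {"T4.C2": (37.925, 50.851, 0.858), "T4.C2.1": (5.074, 28.479, 0.785)},
]

top3 = k_best_paths(layers, 3)
for cost, path in top3:
    print(round(cost, 3), " -> ".join(path))
```

Keeping k suffixes per point is sufficient because any path among the k cheapest from a given point must continue with one of the k cheapest suffixes of its next point.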
