
How to speed up code that finds shortest driving time between two sets of points using osrm

Say that I have two sets of coordinates, A and B. My goal is, for each element of A, to find the element of B with the shortest driving time (retaining B's index, driving time, and distance). Based on the answer by @Ben in this question (Calculating distance of multiple coordinates in R), I have come up with the code below. My question is how to make this faster.

library(osrm)
library(sf)

apotheke.df <- st_read(system.file("gpkg/apotheke.gpkg", package = "osrm"),
                       quiet = TRUE)

points <- data.frame(
    id = 1:3,
    longitude = c(13.4, 13.5, 13.3),
    latitude = c(52.4, 52.5, 52.3)
)

route <- list()

#overall goal: for each point, find distance to nearest apotheke

#find distance from each of the 3 points to the first apotheke
for (x in 1:nrow(points)) {
    route[[x]] <- osrmRoute(src = c(points$longitude[x], points$latitude[x]),
                            dst = apotheke.df[1,],
                            overview = FALSE, osrm.profile = "car")
    #add index
    route[[x]][3] <- 1
}

#replace if duration is less than the lowest one
for (x in 1:nrow(points)) {
    for(y in 2:nrow(apotheke.df)) {
        temp <- osrmRoute(src = c(points$longitude[x], points$latitude[x]),
                          dst = apotheke.df[y,],
                          overview = FALSE, osrm.profile = "car")
        temp[3] <- y
        
        print(y)
        print(temp)
        if(temp[2] < route[[x]][2])  route[[x]] <- temp
    }
}

do.call(rbind, route)

Result:

     duration distance   
[1,]     3.52     1.84 18
[2,]     2.05     1.00 14
[3,]    17.10    17.76 76

In my actual application, one set of points has about 150 elements and the other has many thousands. This takes a really long time to run, which leads to my question: how can I speed up the above code?

My best guess is to use parallel processing (though other aspects of the code may be slow), but I am not good at figuring out how to do this. It may be related to these questions: 1) parallelizing API in R (OSRM) and 2) How to apply an osrm function to every row of a dataframe.

Here is a reproducible example elaborating on my comment.

1. Define sample source and dest data

This is just for the purpose of the answer.

library(osrm)

source <- data.frame(
    id = 1:3,
    longitude = c(13.4, 13.5, 13.3),
    latitude = c(52.4, 52.5, 52.3)
)

dest <- data.frame(
    id = 4:5,
    longitude = c(13.9, 13.6),
    latitude = c(52.2, 52.4)
)

2. Calculate distances

You will save a huge amount of time using osrmTable() rather than looping through your data and calling osrmRoute() each time. This is particularly true if you are using a remote server rather than hosting your own, as each request travels over the internet. Using osrmTable() makes one request to the server for the entire table, rather than one request for every pair of coordinates.

distances <- osrmTable(
    src = source[c("longitude", "latitude")],
    dst = dest[c("longitude", "latitude")],
    osrm.profile = "car"
)

distances[["durations"]]

#      1    2
# 1 60.6 25.7
# 2 71.4 25.4
# 3 55.7 25.3

Even with just 100 source coordinates and 100 destinations, R needs to interpret one command, make one network request and do one high-level assignment ( <- ) operation, rather than 10,000 of each if you iterate over every pair individually.
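The question also asks to keep the index of the nearest destination, which can be read straight off the matrix. The following is a minimal sketch, not part of the original answer: nearest_dest() is a hypothetical helper, the matrix is copied by hand from the output above so the example runs without a server, and it assumes no row is entirely NA. Recent versions of the osrm package also document a measure argument for osrmTable() (for example measure = c("duration", "distance")), which would supply a matching distances matrix; check the documentation of your installed version before relying on it.

```r
# Durations (minutes) copied from the osrmTable() output above:
# rows are the 3 sources, columns are the 2 destinations.
durations <- matrix(
    c(60.6, 25.7,
      71.4, 25.4,
      55.7, 25.3),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("1", "2", "3"), c("1", "2"))
)

# Hypothetical helper: for each source (row), find the destination (column)
# with the smallest duration. `distances` is optional; if supplied it must
# have the same shape as `durations`.
nearest_dest <- function(durations, distances = NULL) {
    best <- apply(durations, 1, which.min)         # column index of each row minimum
    idx <- cbind(seq_len(nrow(durations)), best)   # (row, column) pairs
    out <- data.frame(
        source_index = seq_len(nrow(durations)),
        dest_index = unname(best),
        duration = durations[idx]
    )
    if (!is.null(distances)) out$distance <- distances[idx]
    out
}

nearest <- nearest_dest(durations)

# Map the column index back to the destination ids used in this answer.
dest_id <- 4:5
nearest$dest_id <- dest_id[nearest$dest_index]
nearest
```

Working on the matrix directly avoids reshaping, which may matter when one side has thousands of points.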

3. Find the closest destination to each source

Again, it is much quicker if you skip the loop. I am going to use data.table in the example because it's fast. Firstly, make the distance matrix into a data.table and get it into long form using melt():

library(data.table)

distances_dt <- data.table(
    source = source$id,
    distances[["durations"]],
    key = "source"
) |>
    setnames(
        seq_len(nrow(dest)) + 1,
        as.character(dest$id)
    ) |>
    melt(
        id.vars = "source",
        variable.name = "dest",
        value.name = "distance"
    )

#    source   dest distance
#     <int> <fctr>    <num>
# 1:      1      4     60.6
# 2:      2      4     71.4
# 3:      3      4     55.7
# 4:      1      5     25.7
# 5:      2      5     25.4
# 6:      3      5     25.3

Then we can simply find the minimum for each source (note that the distance column here holds driving durations in minutes, because it was built from distances[["durations"]]):

distances_dt[,
    .(min_distance = min(distance)),
    by = source
]

#    source min_distance
#     <int>        <num>
# 1:      1         25.7
# 2:      2         25.4
# 3:      3         25.3
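The summary above returns only the minimum value, while the question also asks for the matching destination. One possible way to keep it, sketched here as a suggestion rather than as part of the original answer, is to select the whole row with which.min() inside each group. The table is rebuilt by hand from the melt() output above so the example runs on its own.

```r
library(data.table)

# Long-format table matching the melt() output above.
distances_dt <- data.table(
    source = rep(1:3, times = 2),
    dest = factor(rep(c("4", "5"), each = 3)),
    distance = c(60.6, 71.4, 55.7, 25.7, 25.4, 25.3)
)

# For each source, keep the entire row with the smallest value, so the
# destination id is retained alongside the minimum.
nearest_dt <- distances_dt[, .SD[which.min(distance)], by = source]
nearest_dt
```

If several destinations tie for the minimum, which.min() keeps only the first one.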

4. Setting up your own osrm-backend instance

The only problem I can foresee is that you will get a rate limit error if you use the default R osrm package server, rather than a local one. According to the R osrm package docs:

The OSRM demo server does not allow large queries (more than 10000 distances or durations).

If you are not sure whether you are hosting your own server, then you probably aren't. You can check the default server in your R session by running getOption("osrm.server"). The default is "https://routing.openstreetmap.de/".

If you make a request larger than the max-table-size parameter on the server you are using, you will get this response:

{"code":"TooBig","message":"Too many table coordinates"}

If this is the case, and if it is practical, you could break up your data into smaller chunks and make one request per chunk.
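Chunking could look something like the sketch below. chunk_indices() is a hypothetical helper, and the max_cells limit of 10000 is an assumption based on the package note quoted above; the real limit depends on the server you use. The osrmTable() call is left in comments because it needs a reachable server.

```r
# Hypothetical helper: split destination row indices into consecutive chunks
# so that (number of sources) x (chunk length) stays within `max_cells`.
chunk_indices <- function(n_dst, n_src, max_cells = 10000) {
    chunk_size <- max(1L, floor(max_cells / n_src))
    split(seq_len(n_dst), ceiling(seq_len(n_dst) / chunk_size))
}

# Example sizes: 150 sources against 25,000 destinations.
chunks <- chunk_indices(n_dst = 25000, n_src = 150)
length(chunks)

# Usage sketch (not run here; `source` and `dest` as defined in step 1):
# parts <- lapply(chunks, function(i) {
#     osrmTable(
#         src = source[c("longitude", "latitude")],
#         dst = dest[i, c("longitude", "latitude")],
#         osrm.profile = "car"
#     )$durations
# })
# durations <- do.call(cbind, parts)  # one column per destination, in order
```

Chunking only the larger set keeps every source in each request, so the per-chunk matrices can be bound column-wise without reordering.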

Alternatively, you can run your own osrm-backend instance. If you are going to be calculating distances regularly, or even a lot of distances once, I would recommend doing this. There are two approaches:

a. Install osrm-backend and change the max-table-size parameter (Linux/Mac)

This is the tutorial I followed some years ago, and I think it is a good way to do it. The instructions are for Ubuntu, although only minor changes are required to install on other popular Linux flavours or macOS.

You need to change one line in the tutorial. Rather than running:

osrm-routed map.xml.osrm

You can run:

osrm-routed map.xml.osrm --max-table-size 100000

(Or whatever number is sufficiently large.)

b. Expose the API in a Docker image (Windows/Linux/Mac)

Alternatively, if you're running Windows, or you're comfortable with Docker, then it's probably easier to use the Docker images. Even if you haven't used Docker before, this is a good set of instructions.

If you are running an osrm-backend instance in a Docker container, you can change the max-table-size parameter by passing it to docker run as described here, with syntax similar to:

docker run -t -i -p 5000:5000 -v "${PWD}:/data" osrm/osrm-backend osrm-routed --algorithm mld --max-table-size 10000 /data/berlin-latest.osrm

In either case, once you have set up your own osrm-routed instance, you need to tell the R osrm package to use it. If it is running on port 5000 of your localhost, you can do this with:

options(osrm.server = "http://127.0.0.1:5000/")

You can see an example of integrating a local osrm-routed instance with the R osrm package here.
