How to loop over columns and calculate distance using lat and long in R

I have a dataframe with the lat and long of various areas in a city.

A subset of the dataframe:

df <- structure(list(Locality = c("ADYAR", "AMBATTUR", "KOLATHUR", 
"AVADI", "AGARAM", "ANNA NAGAR WEST", "CHROMPET", "MADIPAKKAM", 
"MOGAPPAIR", "MYLAPORE"), Transactions = c(607, 569, 498, 409, 
103, 257, 303, 343, 316, 205), lon = c(80.2564957, 80.1547844, 
80.2121332, 80.0969511, 80.2294222, 80.2017906, 80.1461663, 80.1960832, 
80.1749627, 80.2676303), lat = c(13.0011774, 13.1143393, 13.1239583, 
13.1067448, 13.1116221, 13.0861782, 12.951611, 12.9647462, 13.0837224, 
13.0367914), Ambatturlon = c(80.15478, 80.15478, 80.15478, 80.15478, 
80.15478, 80.15478, 80.15478, 80.15478, 80.15478, 80.15478), 
    Ambatturlat = c(13.11434, 13.11434, 13.11434, 13.11434, 13.11434, 
    13.11434, 13.11434, 13.11434, 13.11434, 13.11434), Guindylon = c(80.22064, 
    80.22064, 80.22064, 80.22064, 80.22064, 80.22064, 80.22064, 
    80.22064, 80.22064, 80.22064), Guindylat = c(13.00666, 13.00666, 
    13.00666, 13.00666, 13.00666, 13.00666, 13.00666, 13.00666, 
    13.00666, 13.00666), OMRlon = c(80.22915, 80.22915, 80.22915, 
    80.22915, 80.22915, 80.22915, 80.22915, 80.22915, 80.22915, 
    80.22915), OMRlat = c(12.91261, 12.91261, 12.91261, 12.91261, 
    12.91261, 12.91261, 12.91261, 12.91261, 12.91261, 12.91261
    )), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))
> 
> df
# A tibble: 10 x 10
   Locality        Transactions   lon   lat Ambatturlon Ambatturlat Guindylon Guindylat OMRlon OMRlat
   <chr>                  <dbl> <dbl> <dbl>       <dbl>       <dbl>     <dbl>     <dbl>  <dbl>  <dbl>
 1 ADYAR                    607  80.3  13.0        80.2        13.1      80.2      13.0   80.2   12.9
 2 AMBATTUR                 569  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9
 3 KOLATHUR                 498  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9
 4 AVADI                    409  80.1  13.1        80.2        13.1      80.2      13.0   80.2   12.9
 5 AGARAM                   103  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9
 6 ANNA NAGAR WEST          257  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9
 7 CHROMPET                 303  80.1  13.0        80.2        13.1      80.2      13.0   80.2   12.9
 8 MADIPAKKAM               343  80.2  13.0        80.2        13.1      80.2      13.0   80.2   12.9
 9 MOGAPPAIR                316  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9
10 MYLAPORE                 205  80.3  13.0        80.2        13.1      80.2      13.0   80.2   12.9
> 

The columns Ambatturlon, Ambatturlat, Guindylon, etc. hold the coordinates of reference localities within the same city. I need to calculate the distance between each locality (one per row) and each of the reference localities given by the column pairs (Ambatturlon, Ambatturlat), (Guindylon, Guindylat), and (OMRlon, OMRlat).

I learnt that we can use the distHaversine function from the geosphere package for this.
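
For reference, the haversine formula that distHaversine is based on can be sketched in base R. This is only a minimal sketch, not the geosphere implementation: the helper name `haversine_m` is hypothetical, and the default radius of 6378137 m is assumed to match geosphere's default.

```r
# Hypothetical helper (not part of geosphere): haversine great-circle
# distance in metres between (lon1, lat1) and (lon2, lat2), in degrees.
# The default radius is an assumption chosen to mirror geosphere's default.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * r * asin(pmin(1, sqrt(a)))
}

# ADYAR (first row of df) to the Ambattur reference point
d <- haversine_m(80.2564957, 13.0011774, 80.15478, 13.11434)
d
```

Because the arithmetic is vectorised, the same helper also accepts whole columns, e.g. `haversine_m(df$lon, df$lat, df$Ambatturlon, df$Ambatturlat)`.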

I tried it for the first locality using the code below:

> df %>% 
+   rowwise() %>% 
+     mutate(disttoAmbattur = distHaversine(c(lon, lat), c(Ambatturlon, Ambatturlat)))
Source: local data frame [10 x 11]
Groups: <by row>

# A tibble: 10 x 11
   Locality        Transactions   lon   lat Ambatturlon Ambatturlat Guindylon Guindylat OMRlon OMRlat disttoAmbattur
   <chr>                  <dbl> <dbl> <dbl>       <dbl>       <dbl>     <dbl>     <dbl>  <dbl>  <dbl>          <dbl>
 1 ADYAR                    607  80.3  13.0        80.2        13.1      80.2      13.0   80.2   12.9      16744.   
 2 AMBATTUR                 569  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9          0.483
 3 KOLATHUR                 498  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9       6309.   
 4 AVADI                    409  80.1  13.1        80.2        13.1      80.2      13.0   80.2   12.9       6326.   
 5 AGARAM                   103  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9       8098.   
 6 ANNA NAGAR WEST          257  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9       5984.   
 7 CHROMPET                 303  80.1  13.0        80.2        13.1      80.2      13.0   80.2   12.9      18139.   
 8 MADIPAKKAM               343  80.2  13.0        80.2        13.1      80.2      13.0   80.2   12.9      17245.   
 9 MOGAPPAIR                316  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9       4050.   
10 MYLAPORE                 205  80.3  13.0        80.2        13.1      80.2      13.0   80.2   12.9      14975.   
> 

I could do the same manually, but there are many such locality columns. Could someone let me know how to loop through the other localities and add a new column, similar to disttoAmbattur, for each lat/long pair among the locality columns?

We can gather the names of all the lat and lon columns into two vectors and use map2 to pass them in parallel. Calculate distHaversine for each pair and add the results as new columns in the original dataframe.

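library(dplyr)
library(purrr)

# Names of the reference-locality columns, e.g. "Ambatturlon" / "Ambatturlat".
# '.+' requires a prefix (so plain "lon"/"lat" are skipped); '$' anchors the suffix.
# The two vectors are paired by position, so both must come out in the same order.
lon_col <- grep('.+lon$', names(df), value = TRUE)
lat_col <- grep('.+lat$', names(df), value = TRUE)

df %>%
  bind_cols(map2_dfc(lon_col, lat_col, ~{
       # Build the new column name, e.g. "Ambatturlon" -> "distAmbattur"
       newcol <- paste0('dist', sub('lon$', '', .x))
       df %>% 
       rowwise() %>% 
       # Distance in metres from each row's (lon, lat) to the reference point
       transmute(!!newcol := geosphere::distHaversine(c(lon, lat),
                             c(.data[[.x]], .data[[.y]])))
}))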

# A tibble: 10 x 13
#   Locality        Transactions   lon   lat Ambatturlon Ambatturlat Guindylon Guindylat OMRlon OMRlat distAmbattur distGuindy distOMR
#   <chr>                  <dbl> <dbl> <dbl>       <dbl>       <dbl>     <dbl>     <dbl>  <dbl>  <dbl>        <dbl>      <dbl>   <dbl>
# 1 ADYAR                    607  80.3  13.0        80.2        13.1      80.2      13.0   80.2   12.9    16744.         3937.  10296.
# 2 AMBATTUR                 569  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9        0.483     13953.  23861.
# 3 KOLATHUR                 498  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9     6309.        13090.  23599.
# 4 AVADI                    409  80.1  13.1        80.2        13.1      80.2      13.0   80.2   12.9     6326.        17437.  25935.
# 5 AGARAM                   103  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9     8098.        11723.  22154.
# 6 ANNA NAGAR WEST          257  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9     5984.         9085.  19548.
# 7 CHROMPET                 303  80.1  13.0        80.2        13.1      80.2      13.0   80.2   12.9    18139.        10140.   9995.
# 8 MADIPAKKAM               343  80.2  13.0        80.2        13.1      80.2      13.0   80.2   12.9    17245.         5373.   6823.
# 9 MOGAPPAIR                316  80.2  13.1        80.2        13.1      80.2      13.0   80.2   12.9     4050.         9906.  19934.
#10 MYLAPORE                 205  80.3  13.0        80.2        13.1      80.2      13.0   80.2   12.9    14975.         6101.  14440.
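
As a possible variation on the approach above, distHaversine also accepts two-column (lon, lat) matrices, so the distances can be computed without rowwise(). The following is a sketch under stated assumptions: it uses a three-row, two-reference stand-in for df rather than the full data, pairs each lon column with its lat column by name, and requires the geosphere package to be installed.

```r
library(geosphere)

# Small stand-in for the question's data (three rows, two reference points).
df <- data.frame(
  Locality    = c("ADYAR", "AMBATTUR", "KOLATHUR"),
  lon         = c(80.2564957, 80.1547844, 80.2121332),
  lat         = c(13.0011774, 13.1143393, 13.1239583),
  Ambatturlon = 80.15478, Ambatturlat = 13.11434,
  Guindylon   = 80.22064, Guindylat   = 13.00666
)

# Reference columns end in "lon" and have a non-empty prefix.
lon_col <- grep(".+lon$", names(df), value = TRUE)
# Derive the matching lat column from the name instead of relying on order.
lat_col <- sub("lon$", "lat", lon_col)

for (i in seq_along(lon_col)) {
  newcol <- paste0("dist", sub("lon$", "", lon_col[i]))
  # Matrix inputs: one (lon, lat) row per locality, compared row by row.
  df[[newcol]] <- distHaversine(
    cbind(df$lon, df$lat),
    cbind(df[[lon_col[i]]], df[[lat_col[i]]])
  )
}

df[, c("Locality", "distAmbattur", "distGuindy")]
```

Deriving `lat_col` from `lon_col` avoids a silent mismatch if the lon and lat columns ever appear in a different order in the dataframe.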

