
Geocoding with R: Errors stopping program altogether

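I have a working program that pulls addresses from a list in Excel and geocodes them using the Google API, but whenever it reaches an address with an apartment or unit number, or an address that cannot be found, the program stops. I can't get a workable tryCatch routine inside the loop. :(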

Here is the code:

library("readxl")
library(ggplot2)
library(ggmap)
fileToLoad <- file.choose(new = TRUE)
origAddress <- read_excel(fileToLoad, sheet = "Sheet1")
geocoded <- data.frame(stringsAsFactors = FALSE)
for(i in 1:nrow(origAddress))
{
  # Print("Working...")
  result <- geocode(origAddress$addresses[i], output = "latlona", source = "google")
  origAddress$lon[i] <- as.numeric(result[1])
  origAddress$lat[i] <- as.numeric(result[2])
  origAddress$geoAddress[i] <- as.character(result[3])
}

write.csv(origAddress, "geocoded1.csv", row.names=FALSE)

Here is the error message:

Warning: Geocoding "[removed address]" failed with error:
You must use an API key to authenticate each request to Google Maps Platform APIs. For additional information, please refer to http://g.co/dev/maps-no-account

Error: Can't subset columns that don't exist.
x Location 3 doesn't exist.
i There are only 2 columns.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning messages:
1: Unknown or uninitialised column: `lon`. 
2: Unknown or uninitialised column: `lat`. 
3: Unknown or uninitialised column: `geoAddress`. 

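Now, this isn't an API key error, because the key works in calls made after the error. The program stops on any address that ends with a number after the street name.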

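I have batches of thousands of addresses to process each month, and they are not all perfect, so I need to be able to skip these bad addresses, put "NA" in the lon/lat columns, and move on.
I'm new to R and can't work out an error-handling routine for these kinds of errors. Can anyone point me in the right direction? Thanks in advance.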

When geocode cannot find an address and output = "latlona", the address field is not returned. Your code will work with the following modification.

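#
#  example data
#
  library(ggmap)   # assumes register_google() has already been called with your API key

  origAddress <- data.frame(addresses = c("white house, Washington",
                                          "white house, # 100, Washington",
                                          "white hose, Washington",
                                          "Washington Apartments, Washington, DC 20001",
                                          "1278 7th st nw, washington, dc 20001"),
                            stringsAsFactors = FALSE)
#
#  create the result columns up front so that failed rows stay NA
#
  origAddress$lon <- NA_real_
  origAddress$lat <- NA_real_
  origAddress$geoAddress <- NA_character_
#
#  simple fix for fatal error
#
  for (i in seq_len(nrow(origAddress)))
  {
    result <- geocode(origAddress$addresses[i], output = "latlona",
                      source = "google")
    origAddress$lon[i] <- result$lon[1]
    origAddress$lat[i] <- result$lat[1]
    # the address column is only returned when the lookup succeeds
    if (!is.na(result$lon[1])) {
      origAddress$geoAddress[i] <- result$address[1]
    }
  }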
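
Since the question asks specifically about tryCatch, here is a minimal sketch of how a lookup could be wrapped so that an error produces a row of NAs instead of stopping the batch. The names safe_geocode and fake_geocode are hypothetical and not part of ggmap. fake_geocode is a stand-in that simulates a failing lookup so the example runs without an API key.

```r
# Hypothetical helper: call any geocoding function and return one row,
# falling back to NAs if the call raises an error.
safe_geocode <- function(address, geocode_fn) {
  empty <- data.frame(lon = NA_real_, lat = NA_real_,
                      address = NA_character_, stringsAsFactors = FALSE)
  tryCatch({
    res <- geocode_fn(address)
    data.frame(
      lon = as.numeric(res$lon[1]),
      lat = as.numeric(res$lat[1]),
      # the address column may be missing when the lookup fails
      address = if ("address" %in% names(res)) as.character(res$address[1]) else NA_character_,
      stringsAsFactors = FALSE
    )
  },
  error = function(e) empty)
}

# Stand-in geocoder (an assumption for illustration only, not a real API).
fake_geocode <- function(address) {
  if (grepl("#", address, fixed = TRUE)) stop("lookup failed")
  if (address == "nowhere") return(data.frame(lon = NA_real_, lat = NA_real_))
  data.frame(lon = -77.0365, lat = 38.8977, address = tolower(address),
             stringsAsFactors = FALSE)
}

addresses <- c("White House, Washington",
               "white house, # 100, Washington",
               "nowhere")
rows <- lapply(addresses, safe_geocode, geocode_fn = fake_geocode)
geocoded <- data.frame(addresses = addresses, do.call(rbind, rows),
                       stringsAsFactors = FALSE)
```

With ggmap loaded, the same wrapper could presumably be used by passing function(a) geocode(a, output = "latlona", source = "google") as geocode_fn. Only errors are caught here. Warnings are left alone so that a lookup that merely warns still returns its result.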

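However, you mention that some of your addresses may be inaccurate. Google's geocoding will try to interpret every address you supply. Sometimes it fails and returns NA, but other times its interpretation may be incorrect, so you should always check the geocoding results. A simple approach that catches many errors is to set output = "more" in geocode and then examine the values returned in the loctype column. If loctype != "rooftop", you may have a problem. Examining the type column will give you more information. This check is not exhaustive. For a more complete check, you can use output = "all" to return all the data Google supplies for an address, but this requires parsing a moderately complex list. You should read more about the data returned by Google geocoding at https://developers.google.com/maps/documentation/geocoding/overview.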
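
As a rough illustration of what parsing the output = "all" list might involve, the sketch below pulls a few quality-check fields out of a nested list shaped like the Google Geocoding API JSON response (status, results, geometry$location_type, partial_match). The helper summarise_response and the mock responses are hypothetical. That geocode returns exactly this shape is an assumption, so inspect a real result with str() first.

```r
# Hypothetical helper: summarise one response list for quality checks.
summarise_response <- function(resp) {
  if (is.null(resp$status) || resp$status != "OK" || length(resp$results) == 0) {
    return(data.frame(
      status = if (is.null(resp$status)) NA_character_ else resp$status,
      n_results = 0L, loctype = NA_character_, partial_match = NA,
      formatted_address = NA_character_, stringsAsFactors = FALSE))
  }
  first <- resp$results[[1]]
  data.frame(
    status = resp$status,
    n_results = length(resp$results),
    loctype = tolower(first$geometry$location_type),
    partial_match = isTRUE(first$partial_match),   # field is absent on exact matches
    formatted_address = first$formatted_address,
    stringsAsFactors = FALSE
  )
}

# Mock responses (made-up data that mimics the documented JSON layout).
mock_ok <- list(
  results = list(list(
    formatted_address = "1600 Pennsylvania Avenue NW, Washington, DC 20500, USA",
    geometry = list(location = list(lat = 38.8977, lng = -77.0365),
                    location_type = "ROOFTOP"))),
  status = "OK")
mock_partial <- list(
  results = list(list(
    formatted_address = "Washington, DC, USA",
    geometry = list(location = list(lat = 38.9072, lng = -77.0369),
                    location_type = "APPROXIMATE"),
    partial_match = TRUE)),
  status = "OK")
mock_none <- list(results = list(), status = "ZERO_RESULTS")

checks <- do.call(rbind, lapply(list(mock_ok, mock_partial, mock_none),
                                summarise_response))

# Flag anything that is not an exact rooftop match.
needs_review <- is.na(checks$loctype) | checks$loctype != "rooftop" |
  checks$partial_match
```

Here the first mock response passes, while the partial match and the empty response are both flagged for manual review.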

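Also, geocode takes at least tens of minutes to return results for thousands of addresses. To minimize response time, you should pass the addresses to geocode as a single character vector. It then returns a data frame of results, which you can use to update your origAddress data frame and check for errors, as shown below.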

#
#  Solution should check for wrongly interpreted addresses
#
#  see https://developers.google.com/maps/documentation/geocoding/overview
#  for more information on fields returned by google geocoding
#
#  return all addresses in single call to geocode
#
  origAddress <- data.frame(addresses = c(
    "white house, Washington",                               # identified by name
    "white hose, Washington",                                # misspelling
    "Washington Apartments, apt 100, Washington, DC 20001",  # identified by name of apartment building
    "Washington Apartments, # 100, Washington, DC 20001",    # invalid apartment number specification
    "1206 7th st nw, washington, dc 20001"))                 # address on street but no structure with that address

  result <- suppressWarnings(geocode(location = origAddress$addresses,
                                     output = "more",
                                     source = "google"))
  origAddress <- cbind(origAddress,
                       result[, c("address", "lon", "lat", "type", "loctype")])
#
#  Addresses which need to be checked
#
  check_addresses <- origAddress[origAddress$loctype != "rooftop" |
                                   is.na(origAddress$loctype), ]

    

