Retrieving country coordinates from latitude and longitude

I'm trying to retrieve the name of the country for a given (latitude, longitude) pair using coordinates2politics() from the RDSTK package:

library(dplyr)
library(plyr)
library(rjson)
library(RDSTK)

df2 <- df %>%
  mutate(politics = coordinates2politics(place_lat, place_lon),
         country = ldply(fromJSON(coordinates2politics(place_lat, place_lon)), 
                         data.frame)[["politics.name"]]) 

Here, coordinates2politics(latitude, longitude) returns a JSON string, which I convert to a data frame to extract politics.name.

In df2, I get the correct value of politics (which is the whole JSON string) but the wrong value for country.

  1. Using this method (converting to a data frame), how can I retrieve an element from a JSON string?
  2. Would there be a more efficient method to extract an element from a JSON string?
  3. Is there a better way to retrieve a country name from a given latitude/longitude (other than using the RDSTK package)?

df1

> dput(head(df1, 10L))
structure(list(place_lat = c(-23.682803, 30.109684, 36.232855, 
26.674996, 40.655138, 40.00134, 44.0752271, 32.230987, -9.5333295, 
38.3045585), place_lon = c(-46.5955455, -93.767675, -115.223125, 
-81.816602, -73.9487755, -74.1880345, -103.2334107, -90.1580165, 
-35.6871125, -92.4367735), location = c("South West", "North West", 
"North West", "North West", "North West", "North West", "North West", 
"North West", "South West", "North West"), sentiment = c("positive", 
"positive", "neutral", "positive", "neutral", "positive", "positive", 
"neutral", "positive", "neutral"), id = 1:10), .Names = c("place_lat", 
"place_lon", "location", "sentiment", "id"), row.names = c(NA, 
10L), class = "data.frame")

df2

> dput(head(df2, n=2L))
structure(list(place_lat = c(-23.682803, 30.109684), place_lon = c(-46.5955455, 
-93.767675), location = c("South West", "North West"), sentiment = c("positive", 
"positive"), id = 1:2, politics = structure(c("[\n  {\n    \"politics\": [\n      {\n        \"type\": \"admin2\",\n        \"friendly_type\": \"country\",\n        \"name\": \"Brazil\",\n        \"code\": \"bra\"\n      },\n      {\n        \"type\": \"admin4\",\n        \"friendly_type\": \"state\",\n        \"name\": \"São Paulo\",\n        \"code\": \"br32\"\n      }\n    ],\n    \"location\": {\n      \"latitude\": -23.682803,\n      \"longitude\": -46.5955455\n    }\n  }\n]", 
"[\n  {\n    \"politics\": [\n      {\n        \"type\": \"admin2\",\n        \"friendly_type\": \"country\",\n        \"name\": \"United States\",\n        \"code\": \"usa\"\n      },\n      {\n        \"type\": \"constituency\",\n        \"friendly_type\": \"constituency\",\n        \"name\": \"Eighth district, TX\",\n        \"code\": \"48_08\"\n      },\n      {\n        \"type\": \"admin6\",\n        \"friendly_type\": \"county\",\n        \"name\": \"Orange\",\n        \"code\": \"48_361\"\n      },\n      {\n        \"type\": \"admin4\",\n        \"friendly_type\": \"state\",\n        \"name\": \"Texas\",\n        \"code\": \"us48\"\n      },\n      {\n        \"type\": \"admin5\",\n        \"friendly_type\": \"city\",\n        \"name\": \"Orange\",\n        \"code\": \"48_54132\"\n      },\n      {\n        \"type\": \"admin5\",\n        \"friendly_type\": \"city\",\n        \"name\": \"Pinehurst\",\n        \"code\": \"48_57608\"\n      },\n      {\n        \"type\": \"admin5\",\n        \"friendly_type\": \"city\",\n        \"name\": \"\",\n        \"code\": \"_\"\n      }\n    ],\n    \"location\": {\n      \"latitude\": 30.109684,\n      \"longitude\": -93.767675\n    }\n  }\n]"
), .Names = c("http://www.datasciencetoolkit.org/coordinates2politics/-23.682803%2c-46.5955455", 
"http://www.datasciencetoolkit.org/coordinates2politics/30.109684%2c-93.767675"
)), country = structure(c(1L, 1L), .Label = "Brazil", class = "factor")), .Names = c("place_lat", 
"place_lon", "location", "sentiment", "id", "politics", "country"
), row.names = 1:2, class = "data.frame")   
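
On questions 1 and 2, here is a minimal sketch, assuming each element of the politics column is one JSON string shaped like the dput output above. Inside mutate(), fromJSON() receives the whole column rather than one string per row, which may be why country does not vary by row. Parsing each string separately and picking the entry whose friendly_type is "country" avoids that. extract_country is a hypothetical helper name, not part of RDSTK or rjson.

```r
library(rjson)

# Hypothetical helper: return the country name from ONE
# coordinates2politics() JSON string, or NA if no country entry exists.
extract_country <- function(json_str) {
  parsed <- fromJSON(json_str)
  # The response is an array with one element per queried point.
  politics <- parsed[[1]][["politics"]]
  is_country <- vapply(
    politics,
    function(p) identical(p[["friendly_type"]], "country"),
    logical(1)
  )
  if (!any(is_country)) {
    return(NA_character_)
  }
  politics[[which(is_country)[1]]][["name"]]
}

# Shortened, ASCII-only copy of the first politics value in the question.
sample_json <- '[{"politics": [
  {"type": "admin2", "friendly_type": "country", "name": "Brazil", "code": "bra"},
  {"type": "admin4", "friendly_type": "state", "name": "Sao Paulo", "code": "br32"}],
  "location": {"latitude": -23.682803, "longitude": -46.5955455}}]'

extract_country(sample_json)
# [1] "Brazil"

# One call per row, e.g. for the politics column of df2:
# df2$country <- vapply(df2$politics, extract_country, character(1),
#                       USE.NAMES = FALSE)
```

Selecting by friendly_type rather than by position means the lookup does not depend on the order in which the service lists countries, states, and cities.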

This is an alternate way (one of many) to get country names from lat/lon. It doesn't require API calls out to a server. (Save the GeoJSON file locally for real/production use):

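library(rgdal)     # also attaches sp (coordinates, SpatialPoints, CRS, %over%)
library(dplyr)     # needed for select()
library(magrittr)

# `places` is the question's data frame (df1)
places <- df1

world <- readOGR("https://raw.githubusercontent.com/AshKyd/geojson-regions/master/data/source/ne_50m_admin_0_countries.geo.json", "OGRGeoJSON")

# build points in the polygons' CRS, then look up the polygon containing each one
places %>%
  select(place_lon, place_lat) %>%
  coordinates %>%
  SpatialPoints(CRS(proj4string(world))) %over% world %>%
  select(iso_a2, name) %>%
  cbind(places, .)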

##    place_lat  place_lon   location sentiment id iso_a2          name
## 1  -23.68280  -46.59555 South West  positive  1     BR        Brazil
## 2   30.10968  -93.76767 North West  positive  2     US United States
## 3   36.23286 -115.22312 North West   neutral  3     US United States
## 4   26.67500  -81.81660 North West  positive  4     US United States
## 5   40.65514  -73.94878 North West   neutral  5     US United States
## 6   40.00134  -74.18803 North West  positive  6     US United States
## 7   44.07523 -103.23341 North West  positive  7     US United States
## 8   32.23099  -90.15802 North West   neutral  8     US United States
## 9   -9.53333  -35.68711 South West  positive  9     BR        Brazil
## 10  38.30456  -92.43677 North West   neutral 10     US United States
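
The lookup above is a point-in-polygon join. rgdal has since been retired from CRAN, so here is a hedged sketch of the same idea with the sf package, assuming sf is installed. lookup_polygon and the toy_world layer are hypothetical names used only for illustration; for real data the polygons would come from something like st_read() on a local copy of the GeoJSON file, and depending on the sf version invalid geometries in that file may first need st_make_valid().

```r
library(sf)

# Hypothetical helper: attach the attributes of the polygon containing each point.
lookup_polygon <- function(points_df, polygons,
                           lon = "place_lon", lat = "place_lat") {
  pts <- st_as_sf(points_df, coords = c(lon, lat),
                  crs = st_crs(polygons), remove = FALSE)
  # Left join keeps points outside every polygon (their attributes become NA).
  st_drop_geometry(st_join(pts, polygons, join = st_within, left = TRUE))
}

# Toy stand-in for a countries layer: one square polygon in WGS84.
square <- st_polygon(list(rbind(
  c(-50, -30), c(-40, -30), c(-40, -20), c(-50, -20), c(-50, -30)
)))
toy_world <- st_sf(name = "Toyland", geometry = st_sfc(square, crs = 4326))

places <- data.frame(place_lat = c(-23.682803, 30.109684),
                     place_lon = c(-46.5955455, -93.767675))

# The first point falls inside the square, the second does not.
lookup_polygon(places, toy_world)
```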

You can get more granular location data with the gadm2 shapefile, but it's huge and takes a while (even on my system) to load:

# this takes _forever_
big_world <- readOGR("gadm2.shp", "gadm2")

# this part takes a while, too, so best save off temp results
big_res <- places %>%
  select(place_lon, place_lat) %>%
  coordinates %>%
  SpatialPoints(CRS(proj4string(big_world))) %over% big_world

big_res %>%
  select(iso_a2=ISO, name=NAME_0, name_1=NAME_1, name_2=NAME_2) %>%
  cbind(places, .)

##    place_lat  place_lon   location sentiment id iso_a2          name       name_1           name_2
## 1  -23.68280  -46.59555 South West  positive  1    BRA        Brazil    São Paulo          Diadema
## 2   30.10968  -93.76767 North West  positive  2    USA United States        Texas           Orange
## 3   36.23286 -115.22312 North West   neutral  3    USA United States       Nevada            Clark
## 4   26.67500  -81.81660 North West  positive  4    USA United States      Florida              Lee
## 5   40.65514  -73.94878 North West   neutral  5    USA United States     New York            Kings
## 6   40.00134  -74.18803 North West  positive  6    USA United States   New Jersey            Ocean
## 7   44.07523 -103.23341 North West  positive  7    USA United States South Dakota       Pennington
## 8   32.23099  -90.15802 North West   neutral  8    USA United States  Mississippi           Rankin
## 9   -9.53333  -35.68711 South West  positive  9    BRA        Brazil      Alagoas Maceió (capital)
## 10  38.30456  -92.43677 North West   neutral 10    USA United States     Missouri           Miller
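
The "save off temp results" comment above can be sketched with base R's saveRDS() and readRDS(). This is only one way to do it; cached, slow_lookup, and the file path are hypothetical names, and the slow step here is a stand-in for the readOGR()/%over% calls.

```r
# Hypothetical helper: compute a result once, then reuse it from disk.
cached <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()
  saveRDS(result, path)
  result
}

cache_file <- tempfile(fileext = ".rds")
calls <- 0

# Stand-in for the expensive spatial lookup; counts how often it runs.
slow_lookup <- function() {
  calls <<- calls + 1
  data.frame(id = 1:2, name = c("Brazil", "United States"))
}

first  <- cached(cache_file, slow_lookup)  # runs slow_lookup() and writes the file
second <- cached(cache_file, slow_lookup)  # reads the file, no recomputation
calls
# [1] 1
```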

If you can use the geonames package, you can query that service instead.

> require(geonames) 
> options(geonamesUsername="myusername")

You need a vectorised version of the GNcountryCode function:

> vg = Vectorize(GNcountryCode)

Then, with dplyr:

> df1 %>% mutate(cc=unlist(vg(place_lat, place_lon)["countryCode",]))
   place_lat  place_lon   location sentiment id cc
1  -23.68280  -46.59555 South West  positive  1 BR
2   30.10968  -93.76767 North West  positive  2 US
3   36.23286 -115.22312 North West   neutral  3 US
4   26.67500  -81.81660 North West  positive  4 US
5   40.65514  -73.94878 North West   neutral  5 US
6   40.00134  -74.18803 North West  positive  6 US
7   44.07523 -103.23341 North West  positive  7 US
8   32.23099  -90.15802 North West   neutral  8 US
9   -9.53333  -35.68711 South West  positive  9 BR
10  38.30456  -92.43677 North West   neutral 10 US

Use "countryName" if you want the name, but you will get "Federative Republic of Brazil" for what everyone else calls "Brasil" (or "Brazil").
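
The ["countryCode", ] indexing above relies on how Vectorize() simplifies its results. A minimal sketch, assuming each single-point call returns a named list; fake_country_code is a hypothetical stand-in that makes no network call, and its fields are not the full GNcountryCode() result:

```r
# Hypothetical stand-in for a per-point lookup returning a named list.
fake_country_code <- function(lat, lng) {
  code <- if (lat < 0) "BR" else "US"
  list(countryCode = code, lat = lat, lng = lng)
}

vf <- Vectorize(fake_country_code)
res <- vf(c(-23.68, 30.11), c(-46.60, -93.77))

# One column per point, one row per list element.
dim(res)
# [1] 3 2
rownames(res)
# [1] "countryCode" "lat"         "lng"

# Pulling one row yields a list, so unlist() turns it into a plain vector.
unlist(res["countryCode", ])
# [1] "BR" "US"
```

Swapping "countryCode" for another row name picks a different field in the same way.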
