
Find maximum point within each polygon for a set of polygons in R

I'm sure this question has been answered elsewhere, but I have not been able to find it by searching.

I have points representing cities within a country, along with the population of each city. I also have a polygon file of counties. I want to find the location of the largest city within each county.

How can this be done?

Here is some data:

structure(list(
  Country = c("us", "us", "us", "us", "us", "us", "us", "us", "us", "us", "us", "us", "us",
    "us", "us", "us", "us", "us", "us", "us", "us", "us", "us", "us", "us"),
  City = c("cabarrus", "cox store", "cal-vel", "briarwood townhouses", "barker heights",
    "davie crossroads", "crab point village", "azalea", "chesterfield", "charlesmont",
    "connor", "clover garden", "corriher heights", "callisons", "crestview acres", "clegg",
    "canaan park", "chantilly", "belgrade", "brices crossroads", "bluff", "butner", "bottom",
    "bandy", "bostian heights"),
  AccentCity = c("Cabarrus", "Cox Store", "Cal-Vel", "Briarwood Townhouses", "Barker Heights",
    "Davie Crossroads", "Crab Point Village", "Azalea", "Chesterfield", "Charlesmont",
    "Connor", "Clover Garden", "Corriher Heights", "Callisons", "Crestview Acres", "Clegg",
    "Canaan Park", "Chantilly", "Belgrade", "Brices Crossroads", "Bluff", "Butner", "Bottom",
    "Bandy", "Bostian Heights"),
  Region = c("NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC",
    "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC"),
  Population = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
    NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
    NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
    NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
    NA_integer_, NA_integer_),
  Latitude = c(35.2369444, 35.275, 36.4291667, 35.295, 35.3111111, 35.8319444, 34.7602778,
    35.58, 35.81, 5.9341667, 35.7419444, 36.1883333, 35.5605556, 35.0841667, 35.0213889,
    35.8655556, 36.2761111, 36.3016667, 34.88, 34.8186111, 35.8377778, 36.1319444,
    36.4747222, 35.6419444, 35.7544444),
  Longitude = c(-80.5419444, -82.0352778, -78.9694444, -81.5238889, -82.4441667, -80.535,
    -76.7305556, -82.4713889, -81.6611111, -81.5127778, -78.1486111, -79.4630556, -80.635,
    -76.7255556, -80.5427778, -78.8497222, -79.7852778, -76.1711111, -77.2352778,
    -78.1016667, -82.8580556, -78.7569444, -80.7741667, -81.09, -80.9294444)),
  .Names = c("Country", "City", "AccentCity", "Region", "Population", "Latitude", "Longitude"),
  row.names = c(544L, 889L, 551L, 434L, 190L, 975L, 894L, 147L, 717L, 700L, 831L, 773L, 862L,
    559L, 915L, 753L, 584L, 695L, 262L, 437L, 372L, 537L, 406L, 178L, 02L),
  class = "data.frame")

And some code to read in North Carolina:

library(maptools)   # for readShapePoly(...); loads package sp as well
xx <- readShapePoly(system.file("shapes/sids.shp", package="maptools")[1],
                IDvar="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))

plot(xx)

I want to find the city with the maximum population within each county. I'm sorry I don't have a reproducible example. If I did, I would have the answer!

The short answer is that you should use gContains(...) in package rgeos.

Here is the long answer.

In the code below, we grab a high resolution shapefile of North Carolina counties from the GADM database, and a geocoded dataset of North Carolina cities from the US Geological Survey database. The latter already has county information but we ignore that. Then we map cities to their appropriate county using gContains(...), add that information to the cities data frame, and identify the largest city in each county using the data.table package. Most of the work is in 4 lines of code near the end.

library(raster)   # for getData(...);   you may not need this
library(foreign)  # for read.dbf(...);  you may not need this
library(rgeos)    # for gContains(...); loads package sp as well

setwd("< directory for downloaded data >")
# get North Carolina Counties shapefile from GADM database
USA         <- getData("GADM",country="USA",level=2)   # level 2 is counties
NC.counties <- USA[USA$NAME_1=="North Carolina",]      # North Carolina Counties
# get North Carolina Cities data from USGS database
url <- "http://dds.cr.usgs.gov/pub/data/nationalatlas/citiesx010g_shp_nt00962.tar.gz"
download.file(url,"cities.tar.gz")
untar("cities.tar.gz")
data      <- read.dbf("citiesx010g.dbf",as.is=TRUE)
NC.data   <- data[data$STATE=="NC",c("NAME","COUNTY","LATITUDE","LONGITUDE","POP_2010")]
## --- everything up to here is just to set up the example

# convert cities data.frame to SpatialPointsDataFrame
NC.cities <- SpatialPointsDataFrame(NC.data[,c("LONGITUDE","LATITUDE")],
                                    data=NC.data,
                                    proj4string=CRS(proj4string(NC.counties)))
# map cities to counties
city.cnty   <- gContains(NC.counties,NC.cities,byid=TRUE)
# add county information to cities data
NC.data$county <- apply(city.cnty,1,function(cnty)ifelse(any(cnty),NC.counties@data[cnty,]$NAME_2,NA))
# identify largest city in each county
library(data.table)
result <- setDT(NC.data)[,.SD[which.max(POP_2010)],by="county"]
head(result)
#      county             NAME   COUNTY LATITUDE LONGITUDE POP_2010
# 1:  Jackson        Cullowhee  Jackson 35.31371 -83.17653     6228
# 2:   Graham     Robbinsville   Graham 35.32287 -83.80740      620
# 3:   Wilkes North Wilkesboro   Wilkes 36.15847 -81.14758     4245
# 4:    Rowan        Salisbury    Rowan 35.67097 -80.47423    33662
# 5: Buncombe        Asheville Buncombe 35.60095 -82.55402    83393
# 6:    Wayne        Goldsboro    Wayne 35.38488 -77.99277    36437
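
The final "largest city per county" step can be tried on toy data without any packages. The sketch below is a base R analogue of the data.table line, not the same call, and every name and value in it is made up for illustration. It also shows a detail that matters for the data in the question, where every Population is NA: which.max() ignores NA values, so a county whose cities all have missing populations contributes no row to the result.

```r
# Toy stand-in for NC.data with the matched county column (values are made up)
cities <- data.frame(
  county   = c("Alpha", "Alpha", "Beta", "Beta", "Gamma"),
  NAME     = c("city1", "city2", "city3", "city4", "city5"),
  POP_2010 = c(100L, 250L, NA, NA, 40L),
  stringsAsFactors = FALSE
)

# Split by county, then keep the row with the largest population in each piece
largest <- do.call(rbind, lapply(split(cities, cities$county), function(d) {
  d[which.max(d$POP_2010), ]   # which.max() skips NA; an all-NA group yields zero rows
}))

print(largest)   # "Beta" is absent because both of its populations are NA
```

Note that split() also drops rows whose county is NA, so cities that did not map to any county are left out of this base R version.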

The workhorse here is the line:

city.cnty   <- gContains(NC.counties,NC.cities,byid=TRUE)

This compares every point in the SpatialPointsDataFrame NC.cities to every polygon in the SpatialPolygonsDataFrame NC.counties and returns a logical matrix in which the rows represent cities and the columns represent counties; the [i,j] element is TRUE if city i is in county j, and FALSE otherwise. We process the matrix row-wise in the next statement:

NC.data$county <- apply(city.cnty,1,function(cnty)ifelse(any(cnty),NC.counties@data[cnty,]$NAME_2,NA))

using each row in succession to index the attributes table of NC.counties and extract the county name.
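
To see the row-wise indexing in isolation, here is a minimal sketch on a hand-built logical matrix. The matrix and the attribute table are toy stand-ins for the gContains(...) result and for NC.counties@data, with made-up names. It uses if/else and takes the first match instead of ifelse(...), which is a small variation on the statement above rather than a copy of it.

```r
# Toy stand-in for the gContains(..., byid=TRUE) result:
# rows are cities, columns are counties (all names are made up)
city.cnty <- matrix(c(TRUE,  FALSE, FALSE,
                      FALSE, FALSE, TRUE,
                      FALSE, FALSE, FALSE),   # third city falls in no county
                    nrow = 3, byrow = TRUE,
                    dimnames = list(c("city1", "city2", "city3"),
                                    c("cnty1", "cnty2", "cnty3")))

# Toy stand-in for the attributes table NC.counties@data
county.attr <- data.frame(NAME_2 = c("Alpha", "Beta", "Gamma"),
                          stringsAsFactors = FALSE)

# Each row is a logical vector over counties; use it to index the county names
county <- apply(city.cnty, 1, function(cnty)
  if (any(cnty)) county.attr$NAME_2[cnty][1] else NA_character_)

print(county)   # one county name per city, NA where no polygon contains the city
```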

The data you provided in your question has some problems which are nevertheless instructive. First, the NC shapefile in the maptools package is relatively low resolution. In particular this means that some of the coastal islands are completely missing, so any city on one of those islands will not map to a county. You might have the same problem with your real data, so watch out for it.

Second, comparing the COUNTY column in the original USGS dataset with the county column which we added, there are 3 cities (out of 865) for which the two do not agree. It turns out that, in those cases, the USGS database was wrong (or out of date). You might have the same problem, so watch out for that too.

Third, an additional three cities did not map to any county. These were all coastal cities and probably reflect small inaccuracies in the North Carolina shapefile. You might have this problem as well.
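
Both checks are easy to run once the county column exists. The sketch below assumes a data frame shaped like NC.data, with the recorded COUNTY column next to the matched county column; the rows are made up for illustration.

```r
# Toy stand-in for NC.data after the county column has been added
# (all values are made up for illustration)
cities <- data.frame(
  NAME   = c("city1", "city2", "city3", "city4"),
  COUNTY = c("Alpha", "Beta",  "Gamma", "Gamma"),  # county recorded in the source data
  county = c("Alpha", "Gamma", NA,      "Gamma"),  # county found by the spatial match
  stringsAsFactors = FALSE
)

# Cities that fell inside no polygon at all
unmapped <- cities[is.na(cities$county), ]

# Cities where the recorded and the matched county disagree
mismatch <- cities[!is.na(cities$county) & cities$COUNTY != cities$county, ]

print(unmapped$NAME)
print(mismatch$NAME)
```

The !is.na(...) guard keeps unmapped cities out of the mismatch set, since comparing against NA would otherwise produce NA rows in the subset.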
