
Unable to construct choropleth map in R

I have some demographic data I would like to use to make a choropleth map of US counties. My workflow doesn't run into any errors and I'm able to create the final map; however, the data it's mapping is incorrect. My workflow makes use of two data sources: a shapefile and a data.frame. The shapefile is a counties shapefile that can be found at this link: https://www.dropbox.com/s/4ujxidyx42793j7/cb_2015_us_county_500k.zip?dl=1 The data.frame file can be found at this link: https://www.dropbox.com/s/qys6s6ikrs1g2xb/data.dem.csv?dl=1

Here is my code:

#Load dependencies
library(sp)
library(spatialEco)
library(rgdal)
library(dplyr)
library(maptools)
library(taRifx.geo)
library(ggplot2)
library(USAboundaries)
library(splitstackshape)
library(maps)
library(cowplot)

#Read in shape and csv files
county.track<-readOGR("/path", "filename")
county.track@data$id = rownames(county.track@data)
data<-read.csv("/path/filename.csv")

#Convert data.frame (data) to points polygon file
data$y<-data$lat
data$x<-data$long
coordinates(data) <- ~ x + y
proj4string(data) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
proj4string(county.track) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")

#Overlay points onto polygons
county.track.data<-point.in.poly(data, county.track)

#Summarize point data by county
count<-select(as.data.frame(county.track.data), id, count)
count<-count %>%
  group_by(id) %>%
  summarize(count=sum(count))

#Merge with shape file data
county.track@data<-merge(county.track@data, count, by="id", all.x=T)

#Replace NA values with zeroes 
county.track@data$count[is.na(county.track@data$count)]<-0
county.track.points = fortify(county.track, region="id")
map.plot<-merge(county.track.points, county.track@data, by="id")

#Get rid of Hawaii and Alaska
map.plot<-map.plot %>%
  filter(lat<50 & lat>25) %>%
  filter(long>-130)

#Create choropleth map using ggplot2
 ggplot(map.plot) +
  geom_polygon(aes(long, lat, group=group, fill=log(count))) +
  coord_map()

The output looks like the following: [image]

But this is just wrong, which is apparent for a number of reasons. First, and most obviously, much of the data is not mapped. The grey areas on the map signify NA. But I removed the NAs in one of the steps above, and when examining the data used to map (map.plot), there are no NAs in the fill variable (count). Second, the distribution of the values that are mapped is off. Los Angeles County should have the highest count value at 793 (log value of 6.675823), yet on the map numerous lighter-colored counties indicate that the values of other spatial units are higher, and some of the top-ranking counties, such as San Diego, are not filled in at all (bottom left of map).

When I examine the data I used to map (map.plot), everything seems OK. Los Angeles County is still the highest-valued county for the "count" variable, yet the map suggests otherwise (see this image here). [image] I'm hoping someone can do some forensics here and identify the problem. I've done my best to go through all my steps, but I can't seem to identify the issue. Thanks in advance.

UPDATE: I tried using a different shapefile from the same source. The shapefile in the link above is the same as the one labeled "cb_2015_us_county_500k.zip" at the following (https://www.census.gov/geo/maps-data/data/cbf/cbf_counties.html). When I choose a different shapefile (such as cb_2015_us_county_5m.zip) I get a different map but the same problems. See the following map as an example:

[image]

I'm not sure what is going on! In this new map, LA County is no longer even colored in but Orange County is! Any help is much appreciated.

Not really sure what's going on with your merging, but this worked for me:

library(albersusa) # devtools::install_github("hrbrmstr/albersusa")
library(readr)
library(dplyr)
library(rgeos)
library(maptools)
library(ggplot2)
library(ggalt)
library(ggthemes)
library(viridis)

df <- read_csv("data.dem.csv")

counties_composite() %>% 
  subset(state %in% unique(df$state)) -> usa

pts <- df[,2:1]
coordinates(pts) <- ~long+lat
proj4string(pts) <- CRS(proj4string(usa))

bind_cols(df, select(over(pts, usa), -state)) %>% 
  count(fips, wt=count) -> df

You have 942 total counties:

glimpse(df)
## Observations: 942
## Variables: 2
## $ fips <chr> "01001", "01003", "01013", "01015", "01043", "01055", "01061", ...
## $ n    <int> 1, 2, 1, 3, 1, 3, 1, 1, 19, 6, 12, 7, 7, 1, 4, 4, 1, 5, 67, 19,...

There are over 3K counties in the US.

However, there aren't many NAs:

filter(df, is.na(fips))
## # A tibble: 1 x 2
##    fips     n
##   <chr> <int>
## 1  <NA>    10
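
Since NAs are this rare, the grey areas in the question's map probably have another source. One possibility, offered as a guess rather than a confirmed diagnosis: the question replaces missing counts with 0 and then maps `fill=log(count)`. In R, `log(0)` is `-Inf`, which `is.na()` does not flag, and a continuous fill scale would typically draw such non-finite values in the NA colour. The toy vector below is made up for illustration; only the value 793 comes from the question.

```r
# Toy counts (hypothetical, apart from 793, which the question reports for Los Angeles)
count <- c(0, 1, 19, 793)

log_count <- log(count)

# log(0) is -Inf: not NA, but not finite either
any(is.na(log_count))      # no NA values are reported
all(is.finite(log_count))  # yet not every value is finite

# One way to keep zero counts on the scale: log1p(x) computes log(1 + x)
safe_log <- log1p(count)
all(is.finite(safe_log))
```

Checking `is.finite()` rather than `is.na()` on the fill variable in map.plot would show whether this applies to the original data.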

usa_map <- fortify(usa, region="fips")

gg <- ggplot()
gg <- gg + geom_map(data=usa_map, map=usa_map,
                    aes(long, lat, map_id=id),
                    color="#b2b2b2", size=0.05, fill="white")
gg <- gg + geom_map(data=df, map=usa_map,
                    aes(fill=n, map_id=fips),
                    color="#b2b2b2", size=0.05)
gg <- gg + scale_fill_viridis(name="Count", trans="log10")
gg <- gg + coord_proj(us_aeqd_proj)
gg <- gg + theme_map()
gg <- gg + theme(legend.position=c(0.85, 0.2))
gg

[image]
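
As for the merging in the question, one likely culprit (an assumption based on reading the code, not something verified against the shapefile) is `merge()`. By default it sorts its result on the `by` column, and because `id` holds row names as character strings, the order becomes "0", "1", "10", ... instead of the original row order. Assigning that result back to `county.track@data` would then pair attributes with the wrong polygons, since sp matches them by position. The small data frames below are made up to show the reordering and an order-preserving alternative with `match()`.

```r
# Hypothetical stand-in for county.track@data: ids are row names stored as character
attrs <- data.frame(id = as.character(0:11),
                    name = paste0("county_", 0:11),
                    stringsAsFactors = FALSE)

# Hypothetical per-county summary, covering only some ids
counts <- data.frame(id = c("2", "10"),
                     count = c(5, 7),
                     stringsAsFactors = FALSE)

# merge() returns rows in a different order than attrs
merged <- merge(attrs, counts, by = "id", all.x = TRUE)
identical(merged$id, attrs$id)  # FALSE: row order has changed

# Order-preserving alternative: look counts up by id, leave row order untouched
attrs$count <- counts$count[match(attrs$id, counts$id)]
attrs$count[is.na(attrs$count)] <- 0
```

With the `match()` lookup, the attribute rows stay in the order the polygons expect, so the same idea could be applied to `county.track@data` directly instead of replacing it with the merged table.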
