简体   繁体   English

R中的ggplot2在Choropleth映射上的数据绘制不正确

[英]Incorrect plotting of data on Choropleth map with ggplot2 in R

I am trying to make a Choropleth map of Germany using ggplot2. 我正在尝试使用ggplot2制作德国的Choropleth地图。 My data is a .csv file with 2 rows ( RS= containing numbers 1 to 16 for each German state & Tariff= 16 random positive and negative numbers ). 我的数据是一个带有2行的.csv文件( RS =每个德国州包含1到16的数字,Tariff = 16个随机的正负数字 )。

    RS  Tariff
1   01  -5.25
2   02  7.16
3   03  6.65
4   04  3.10
5   05  3.69
6   06  2.49
7   07  1.89
8   08  3.93
9   09  -5.84
10  10  -2.61
11  11  -0.21
12  12  2.35
13  13  -5.94
14  14  -7.54
15  15  -3.27
16  16  -8.75

Also I have a shape file Germany shape file . 我也有一个形状文件德国形状文件 What I want to do is to plot this positive and negative numbers onto map of germany for each state, with 2 colors (positive=Green and Negative=Red). 我想要做的是将每个国家的正负数字绘制到德国地图上,并使用两种颜色(正=绿色和负=红色)。 Below is my code 下面是我的代码

library(XLConnect)
library(sp)
library(rgdal)
library(ggplot2)
library(plyr)
library(RColorBrewer)
library(DataCombine)
library(rgeos)
library(maptools)

#### EEG Data Read ####
eeg<-read.csv(file = "data/testdata1.csv", skip = 0, sep = ",", dec=".",       header=TRUE)
colnames(eeg)<-c("RS", "Tariff")
eeg$RS<-   c("01","02","03","04","05","06","07","08","09","10","11","12","13","14","15","16")
eeg$RS<-as.factor(eeg$RS)
eeg$Tariff<- as.numeric(eeg$Tariff)

#### Shape Data Read ####
bundesl<-readOGR("data/new_shape/vg2500_bld.shp", "vg2500_bld")
bundesl@data<- bundesl@data[order(bundesl$RS, na.last=NA),]

### Rearrange shape data for better merging ###
levels(bundesl$GEN)<- c("Schleswig-Holstein", "Mecklenburg-Vorpommern",   "Hamburg", "Bremen", "Niedersachsen", "Sachsen-Anhalt", "Brandenburg", 
                    "Berlin", "Nordrhein-Westfalen", "Hessen","Thüringen","Sachsen", "Rheinland-Pfalz", "Saarland", "Baden-  Württemberg", "Bayern")
bundesl$GEN<- c("Schleswig-Holstein", "Mecklenburg-Vorpommern", "Hamburg", "Bremen", "Niedersachsen", "Sachsen-Anhalt", "Brandenburg", 
            "Berlin", "Nordrhein-Westfalen", "Hessen","Thüringen","Sachsen",  "Rheinland-Pfalz", "Saarland", "Baden-Württemberg", "Bayern")
bundesl$SHAPE_LENG<-  c("1217255.7","1780980.8","175253.8","154971.6","2016496.4","949096.8",
                   "1295460.4","180751.2","1352108","1105092.8","961942.7","979294.3","910650.4",
                   "282910.8","1298891.7","2046039.3")
bundesl$SHAPE_AREA<- c("15857425536","23044684847","760539820","405480872","47716406483","20494982327","29653902483","886480139","34047269991","21092318103","16178531941","18401642456","19834907486","2578541706","35801397076","70550070623")

# #### Shape Data und EEG Data join ####
bundesl@data<-merge(bundesl@data, eeg, by="RS",  all=TRUE)

# #### Shapes Plot ####
bundesl@data$id <- (as.numeric(rownames(bundesl@data))-1)
bundesl.df<-fortify(bundesland)
bundesl.df <- join(bundesl.df, bundesl@data, by="id")


 ggp <- ggplot(data=bundesl.df, aes(x=long, y=lat, group=group))
 ggp <- ggp + geom_polygon(aes(fill=Tariff), col="black") 
 ggp <- ggp + coord_map() 
 ggp <- ggp + scale_fill_continuous(name=expression(Tariff), low = "red", high = "green", space = "Lab", na.value = "white", guide = "colourbar")
ggp <- ggp + theme_minimal()
ggp <- ggp + theme(axis.title=element_blank(), axis.ticks=element_blank(),   axis.text=element_blank())
 ggp

So far I manage to plot the map, but with wrong data mapping . 到目前为止,我设法绘制了地图,但是数据映射错误 I mean state with positive Tariff like Schleswig-Holstein should be green but are red and Bavaria should be red but is green. 我的意思是,像石勒苏益格-荷尔斯泰因州(Schleswig-Holstein)这样具有积极关税的州应为绿色,但应为红色,巴伐利亚应为红色,但应为绿色。

My guess is that there is problem with fortify function . 我的猜测是强化功能存在问题。 My data is only 16 rows but after fortify it print 1000+ rows. 我的数据只有16行,但是经过强化之后,它会打印1000余行。 Why?? 为什么?? and this is something which is causing mismatching of data. 这是导致数据不匹配的原因。 I did all the search on internet I possibly can for the solution. 我在互联网上进行了所有可能的解决方案搜索。 I would appreciate if anybody can give me an answer as to why this problem is occurring. 如果有人能回答我为什么会出现此问题,我将不胜感激。

Thank you for your help in advance! 提前谢谢你的帮助!

fortify puts the polygons from the shapefile into something ggplot can plot, hence the 1,000+ rows. fortify将shapefile中的多边形放到ggplot可以绘制的东西中,因此有1000多个行。 While you can attach values to the fortified polygons, it's not necessary. 虽然可以将值附加到加固的多边形上,但这不是必需的。

So, you don't really have to go through all that trouble for the choropleth. 因此,您不必真正为麻烦做所有的麻烦。 Take a look at the following. 看一下以下内容。 I added in some extra bits to show which values get mapped to which RS : 我添加了一些额外的位以显示哪些值映射到了哪个RS

library(rgdal)
library(ggplot2)
library(gridExtra)

egg <- read.table(text="RS  Tariff
01  -5.25
02  7.16
03  6.65
04  3.10
05  3.69
06  2.49
07  1.89
08  3.93
09  -5.84
10  -2.61
11  -0.21
12  2.35
13  -5.94
14  -7.54
15  -3.27
16  -8.75", header=TRUE, colClasses=c("character", "numeric"))

bundesl <- readOGR("vg2500_geo84/vg2500_bld.shp", "vg2500_bld")
bundesl@data<- bundesl@data[order(bundesl$RS, na.last=NA),]

# good projection for germany but if you intende to draw additional
# lines or points you'll have to project them before plotting so this
# may be more trouble than it's worth and you can just use 
# coord_map("mollweide") or something else that works for you besides mercator

bundesl <- spTransform(bundesl, CRS("+proj=utm +zone=33 +ellps=WGS84 +datum=WGS84 +units=m +no_defs "))

bundesl_map <- fortify(bundesl, region="RS")

# only doing this bit to plot the RS # at the center of each polygon
# totally not necessary for the choropleth

egg <- cbind(egg, data.frame(gCentroid(bundesl, byid=TRUE)))

gg <- ggplot()

# this bit ensures you have the outlines

gg <- gg + geom_map(data=bundesl_map, map=bundesl_map,
                    aes(x=long, y=lat, map_id=id),
                    color="#7f7f7f", size=0.15)

# this bit here does your choropleth

gg <- gg + geom_map(data=egg, map=bundesl_map,
                    aes(fill=Tariff, map_id=RS),
                    color="#7f7f7f", size=0.15)
gg <- gg + geom_text(data=egg, aes(x=x, y=y, label=RS), size=3)
gg <- gg + coord_equal() # we already projected it
gg <- gg + scale_fill_continuous(name=expression(Tariff), 
                                 low="red", high="green", space="Lab", 
                                 na.value="white", guide="colourbar")
gg <- gg + labs(x=NULL, y=NULL)

# decent map theme 

gg <- gg + theme_bw()
gg <- gg + theme(panel.grid=element_blank())
gg <- gg + theme(panel.border=element_blank())
gg <- gg + theme(axis.ticks=element_blank())
gg <- gg + theme(axis.text=element_blank())


gt <- tableGrob(cbind(bundesl@data[,c(2,4)], egg[,2]))

grid.arrange(gg, gt, ncol=2)

在此处输入图片说明

08 & 16 have unicode in them, hence the lack of display without conversion. 0816包含unicode,因此缺少显示而没有转换。 I also realize the plotting of the RS number on the centroid is problematic for Berlin & Brandenburg, but it was to give a general idea, not be perfect. 我也意识到在质心上绘制RS编号对于柏林和勃兰登堡来说是有问题的,但这只是给出一个总体思路,而不是完美的。

I'd highly suggest using cut to define 5 or 6 standardized breaks for the values vs use a continuous scale. 我强烈建议使用cut为值定义5或6个标准化的中断,而不是使用连续刻度。

This is a bit old, but since it is the best question about germany choropleth maps, I wanted to add a few things i learned while following @hrbrmstr's great answer. 这有点老了,但是由于这是关于德国Choropleth映射的最佳问题,因此我想补充一些我在@hrbrmstr的出色答案时学到的东西。

As you can see in the map he plotted, Berlin is just getting the color of Brandenburg. 正如您在他绘制的地图中看到的那样,柏林只是获得勃兰登堡州的颜色。 To fix this, the order in the bundesl_map has to be edited, to make sure the Berlin (10) comes after Brandenburg (11). 要解决此问题,必须编辑bundesl_map中的顺序,以确保柏林(10)在勃兰登堡(11)之后。 So the full processing of the map should look like this: 因此,地图的完整处理应如下所示:

library(rgdal)
library(rgeos)
library(maptools)

bundesl <- readOGR("vg2500_geo84/vg2500_bld.shp", "vg2500_bld")
bundesl@data<- bundesl@data[order(bundesl$RS, na.last=NA),]
bundesl_map <- fortify(bundesl, region="RS")
bundesl_map <- rbind(
    bundesl_map[bundesl_map$id != 10, ], 
    bundesl_map[bundesl_map$id == 10, ]
)
saveRDS(bundesl_map, "bundesland")

In the last step we save the map for future use ( bundesl_map <- readRDS("budesland") ). 在最后一步中,我们保存地图以供将来使用( bundesl_map <- readRDS("budesland") )。 Here is a copy of the file I created with named ids. 这是我使用命名ID创建的文件的副本。

The plotting can also be made a bit more concise as follows: 也可以使绘制更加简洁,如下所示:

library(magrittr)
library(ggplot2)
library(viridis)

egg %>% ggplot(aes(fill=Tariff, map_id=RS)) +
    geom_map(map=bundesl_map, color="white", size=0.2) + 
    geom_text(aes(x=x, y=y, label=egg$RS), size=2, color="white") +
    coord_map("mercator") +
    expand_limits(x=bundesl_map$long, y=bundesl_map$lat) +
    scale_fill_viridis(begin=0.4, end=0.9, breaks=-8:7, guide=guide_legend(reverse=T)) + 
    theme_map(base_size=8)

where theme_map is defined as: 其中theme_map定义为:

theme_map <- function(...) {
    theme_classic(...) %+replace% 
    theme(
        axis.ticks = element_blank(), 
        axis.text = element_blank(),
        axis.title = element_blank(), 
        line=element_blank()
    )
}

This will yield a map like this: 这将产生如下图:

地图

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM