简体   繁体   English

R:用不同的颜色映射正负数

[英]R: Mapping positive and negative numbers with different colors

I'm a journalist working to map the counties where the number of black farmers increased or decreased between 2002 and 2012. I am using R (3.2.3) to process and map the data. 我是一名记者,负责绘制2002年至2012年间黑人农民人数增加或减少的县的地图。我使用R(3.2.3)处理和绘制数据。

I've been able to map the whole range of county-level gains and losses—which goes from negative 40 to positive 165—in a single color, but this makes it hard to see the pattern of gains and losses. 我已经能够用一种颜色绘制整个县级损益的范围(从负40到正165),但这很难看到损益的模式。 What I'd like to do is make the losses all variations of a single color (say, blue), and render gains in variations of a second color (say, red). 我想做的是使单一颜色(例如蓝色)的所有变化都损失,而在第二种颜色(例如红色)的变化中获得收益。

The following code generates two separate (very simplified) maps for counties that saw positive and negative changes. 以下代码为看到正面和负面变化的县生成两个单独的地图(非常简化)。 Anyone know how to capture this information in two colors on a single map? 有人知道如何在一张地图上以两种颜色捕获此信息吗? Ideally, counties with a "Difference" value of 0 would appear in grey. 理想情况下,“差异”值为0的县将以灰色显示。 Thanks for looking at this! 感谢您的关注!

  df <- data.frame(GEOID = c("45001", "22001", "51001", "21001", "45003"), 
                        Difference = c(-10, -40, 150, 95, 20))

#Second part: built a shapefile and join.
counties <- readOGR(dsn="Shapefile", layer="cb_2015_us_county_5m")

#Join the data about farmers to the spatial data. 
counties@data <- left_join(counties@data, df)

#NAs are not permitted in qtm method, so let's replace them with zeros.  
counties$Difference[is.na(counties$Difference)] <- 0

#Here are the counties that lost black farmers.
loss.counties <- counties[counties$Difference < 0, ]
qtm(loss.counties, "Difference")

#Here are the counties that gained black farmers.
gain.counties <- counties[counties$Difference > 0, ]
qtm(gain.counties, "Difference")

Using the source data from your original post, here is a solution using ggplot as suggested in my comment above. 使用您原始帖子中的源数据,这是我上面的评论中建议的使用ggplot的解决方案。

library(ggplot2)
library(ggmap)
library(maps)
library(dplyr)

# get data from 
# https://quickstats.nass.usda.gov/results/A68E27D5-E9B2-3621-8F1E-58829A551F32
df <- read.csv("nass_data.csv")
df$County <- tolower(df$County)
df$State <- tolower(df$State)

#Calculate the difference between the 2002 and 2012 census95, 
df <- df %>%
  filter(Domain == "TOTAL", Year == 2002 | Year == 2012) %>%
  group_by(County) %>%
  mutate(Difference = ifelse(is.na(Value-lag(Value)), 0, Value-lag(Value)))  %>%
  select(County, State, Difference)

#get map data for US counties and states
county_map <- map_data("county")
county_map$County <- county_map$subregion
county_map$State <- county_map$region

#Join the data about farmers to the spatial data. 
county_map <- left_join(county_map, df)

#plot using ggplot
ggplot(county_map, aes(x = long, y = lat, group=group)) +
  geom_polygon(aes(fill = Difference)) + 
  scale_fill_gradient2(midpoint = 0, mid="#eee8d5", high="#dc322f", low="#268bd2")

在此处输入图片说明 I'll note that your source data appear to be missing several counties throughout the country. 我会注意到,您的源数据似乎在全国多个县中缺失。 Nonetheless, I think this gets you what you want. 尽管如此,我认为这可以满足您的需求。

It's probably better to bin this data. 最好将这些数据进行装箱。 I made a snap judgment for what the bins should be, you should look at the data to see if it should be different. 我对存储箱应该是什么做出了迅速的判断,您应该查看数据以查看是否应该不同。 I also did the binning very manually to try to show what's going on. 我还非常手动地进行了分箱,以尝试显示正在发生的情况。

Using FIPS code (the combo of the "ANSI" columns) can help in situations where county names are hard to match, hence why I did that here. 使用FIPS代码(“ ANSI”列的组合)可以在县名难以匹配的情况下提供帮助,因此,为什么在这里这样做。

Folks tend to leave out AK & HI but there are some farms there it seems. 人们倾向于忽略AK和HI,但似乎那里有一些农场。

Also, red/blue are loaded colors and really should be avoided. 另外,红色/蓝色是已加载的颜色,因此应避免使用。

library(ggplot2)
library(maps)
library(maptools)
library(rgeos)
library(albersusa) # devtools::install_github("hrbrmstr/albersusa")
library(ggalt)
library(ggthemes)
library(dplyr)

df <- read.csv("347E31A8-7257-3AEE-86D3-4BE3D08982A3.csv")

df <- df %>%
  filter(Domain == "TOTAL", Year == 2002 | Year == 2012) %>%
  group_by(County) %>%
  mutate(delta=Value-lag(Value),
         delta=ifelse(is.na(delta), 0, delta),
         fips=sprintf("%02d%03d", State.ANSI, County.ANSI)) 

df$delta <- cut(df$delta, include.lowest=FALSE,
                breaks=c(-400, -300, -200, -100, -1, 1, 100, 200, 300, 400),
                labels=c("301 to 400 (losses)", "201 to 300", "101 to 200", "1 to 100",
                         "no gains/losses", 
                         "+1 to 100", "+101 to 200", "+201 to 300", "301 to 400 (gains)"))

counties <- counties_composite()
counties_map <- fortify(counties, region="fips")

gg <- ggplot()
gg <- gg + geom_map(data=counties_map, map=counties_map,
                    aes(x=long, y=lat, map_id=id),
                    color="#b3b3b3", size=0.15, fill="white")
gg <- gg + geom_map(data=df, map=counties_map,
                    aes(fill=delta, map_id=fips),
                    color="#b3b3b3", size=0.15)
gg <- gg + scale_fill_manual(name="Change since 2002\n(white = no data)",
                            values=c("#543005", "#8c510a", "#bf812d", "#dfc27d",
                                     "#e0e0e0",
                                     "#80cdc1", "#35978f", "#01665e", "#003c30"),
                            guide=guide_legend(reverse=TRUE))
gg <- gg + coord_proj(us_laea_proj)
gg <- gg + labs(x="Grey == no data", y=NULL)
gg <- gg + theme_map()
gg <- gg + theme(legend.position=c(0.85, 0.2))
gg <- gg + theme(legend.key=element_blank())
gg

在此处输入图片说明

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM