Visualizing the difference between two points with ggplot2
I want to visualize the difference between two points with a line/bar in ggplot2.
Suppose we have some data on income and spending as a time series. We would like to visualize not only these two series but also the balance (= income - spending). Furthermore, we would like to indicate whether the balance was positive (= surplus) or negative (= deficit).
I have tried several approaches, but none of them produced a satisfying result. Here is a reproducible example.
# Load libraries and create LONG data example data.frame
library(dplyr)
library(ggplot2)
library(tidyr)
df <- data.frame(year = rep(2000:2009, times=3),
                 var = rep(c("income","spending","balance"), each=10),
                 value = c(0:9, 9:0, rep(c("deficit","surplus"), each=5)))
df
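One thing worth noting about the data frame above: because `c()` mixes numbers and strings, the whole `value` column is coerced to character. Below is a minimal sketch of an alternative layout, assuming we are free to keep the amounts numeric and derive the balance label from them in wide form. The names `df_num` and `df_wide` are hypothetical and not part of the original post, and `pivot_wider()` assumes tidyr 1.0.0 or later (it supersedes `spread()`).

```r
library(dplyr)
library(tidyr)

# Long data holding only the numeric amounts (hypothetical name: df_num)
df_num <- data.frame(year  = rep(2000:2009, times = 2),
                     var   = rep(c("income", "spending"), each = 10),
                     value = c(0:9, 9:0))

# Wide data: one row per year, with the balance label derived from the amounts
# (hypothetical name: df_wide)
df_wide <- df_num %>%
  pivot_wider(names_from = var, values_from = value) %>%
  mutate(balance = ifelse(income - spending >= 0, "surplus", "deficit"))

str(df_wide)
```

With this layout `income` and `spending` stay numeric, so they map to a continuous y scale, while `balance` remains a label that can be mapped to colour.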
1. Approach with LONG data
Unsurprisingly, it doesn't work with LONG data, because the geom_linerange arguments ymin and ymax cannot be specified correctly. ymin=value, ymax=value is definitely the wrong way to go (expected behaviour). ymin=income, ymax=spending is obviously wrong, too (expected behaviour), since the LONG data has no income or spending columns.
df %>%
  ggplot() +
    geom_point(aes(x=year, y=value, colour=var)) +
    geom_linerange(aes(x=year, ymin=value, ymax=value, colour=net))
#>Error in function_list[[i]](value) : could not find function "spread"
2. Approach with WIDE data
I almost got it working with WIDE data. The plot looks good, but the legend for the geom_point layers is missing (expected behaviour). Simply adding show.legend = TRUE to the two geom_point layers doesn't solve the problem, as it overprints the geom_linerange legend. Besides, I would rather have the two geom_point lines of code combined into one (see 1. Approach).
df %>%
  spread(var, value) %>%
  ggplot() +
    geom_linerange(aes(x=year, ymin=spending, ymax=income, colour=balance)) +
    geom_point(aes(x=year, y=spending), colour="red", size=3) +
    geom_point(aes(x=year, y=income), colour="green", size=3) +
    ggtitle("income (green) - spending (red) = balance")
3. Approach using LONG and WIDE data
Combining the 1. Approach with the 2. Approach results in yet another unsatisfying plot. The legend does not differentiate between balance and var (expected behaviour).
ggplot() +
  geom_point(data=(df %>% filter(var=="income" | var=="spending")),
             aes(x=year, y=value, colour=var)) +
  geom_linerange(data=(df %>% spread(var, value)),
                 aes(x=year, ymin=spending, ymax=income, colour=balance))
Should I use a different geom instead of geom_linerange?
Try
ggplot(df[df$var != "balance", ]) +
  geom_point(
    aes(x = year, y = value, fill = var),
    size=3, pch = 21, colour = alpha("white", 0)) +
  geom_linerange(
    aes(x = year, ymin = income, ymax = spending, colour = balance),
    data = spread(df, var, value)) +
  scale_fill_manual(values = c("green", "red"))
The main idea is that we use two different aesthetics for the colours (fill for the points, with an appropriate pch that supports a fill, and colour for the lines) so that we get a separate legend for each.
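As a rough illustration of the same idea, here is a self-contained sketch that maps `fill` on the points and `colour` on the lines. It is not the answer's exact code: it assumes the amounts are kept numeric and the balance label is derived in wide form, it uses `pivot_wider()` (tidyr 1.0.0 or later) in place of `spread()`, and the names `df_num`, `df_wide` and `p` are hypothetical.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Long data with numeric amounts only (hypothetical name: df_num)
df_num <- data.frame(year  = rep(2000:2009, times = 2),
                     var   = rep(c("income", "spending"), each = 10),
                     value = c(0:9, 9:0))

# Wide data with a derived balance label (hypothetical name: df_wide)
df_wide <- df_num %>%
  pivot_wider(names_from = var, values_from = value) %>%
  mutate(balance = ifelse(income - spending >= 0, "surplus", "deficit"))

p <- ggplot() +
  # Lines are drawn first so the points sit on top; `colour` gets the balance legend
  geom_linerange(data = df_wide,
                 aes(x = year, ymin = spending, ymax = income, colour = balance)) +
  # Shape 21 is a fillable circle, so `fill` gets its own legend for var
  geom_point(data = df_num,
             aes(x = year, y = value, fill = var),
             shape = 21, size = 3, colour = "transparent") +
  # Named values tie each colour to its level explicitly
  scale_fill_manual(values = c(income = "green", spending = "red"))

p
```

Because each layer maps a different aesthetic, ggplot2 builds one legend for `fill` (var) and another for `colour` (balance) instead of merging them.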