Alpha in ggplot geom_linerange determined by number of observations on Mac

I am plotting some data using the geom_linerange function. The data are daily observations spanning 5-10 years, depending on the dataset.

When I run the script on my Mac, the linerange alpha changes with the number of observations in each plot. However, I want all plots to have alpha=1. Explicitly setting alpha within the geom_linerange function has no effect on the plot: the colours are still transparent when a large number of observations are plotted.

When I used the exact same script on my Windows laptop, the plot was correct, with the default alpha of 1.

Below is a minimal working example:

library(ggplot2)
library(gridExtra)

df1 = data.frame(name = c("A","B","C"),
                Date = rep(seq(as.Date("2010-01-01"),as.Date("2018-01-01"),by=1),each=3),
                value = runif(8769,-1,1))

df2 = data.frame(name = c("A","B","C"),
                 Date = rep(seq(as.Date("2010-01-01"),as.Date("2014-01-01"),by=1),each=3),
                 value = runif(4386,-1,1))

df3 = data.frame(name = c("A","B","C"),
                 Date = rep(seq(as.Date("2010-01-01"),as.Date("2011-01-01"),by=1),each=3),
                 value = runif(1098,-1,1))

Plot1 = ggplot() +
  geom_linerange(data=df1,aes(x=name,ymin=Date,ymax=Date+1,colour=value),size=15) +
  scale_colour_gradient2(low="red",mid="white",high="blue",midpoint=0,name = "Value") +
  theme_bw() +
  coord_flip() + 
  xlab("Driver") +
  ylab("")

Plot2 = ggplot() +
  geom_linerange(data=df2,aes(x=name,ymin=Date,ymax=Date+1,colour=value),size=15) +
  scale_colour_gradient2(low="red",mid="white",high="blue",midpoint=0,name = "Value") +
  theme_bw() +
  coord_flip() + 
  xlab("Driver") +
  ylab("")

Plot3 = ggplot() +
  geom_linerange(data=df3,aes(x=name,ymin=Date,ymax=Date+1,colour=value),size=15) +
  scale_colour_gradient2(low="red",mid="white",high="blue",midpoint=0,name = "Value") +
  theme_bw() +
  coord_flip() + 
  xlab("Driver") +
  ylab("")


grid.arrange(Plot1,Plot2,Plot3)

Below is the output on my Mac. The top plot, with the most observations, has the lowest alpha:

[Image: Mac alpha plot]

Below is the output on my Windows machine. As you can see, all plots have alpha=1:

[Image: Windows alpha plot]

The code is transferred between the two machines via a GitHub repo.

Unfortunately, I am absolutely stumped as to why this is occurring. Is this expected behaviour on a Mac, or is there something I am doing wrong?

Many thanks!

This is a result of the interaction between your high-frequency data and your graphics device, in particular its anti-aliasing setting/capability. In this case, we are trying to plot about 2,900 days of data using (in my examples below) only about 600 pixels of plot width. With each pixel representing roughly 4 to 5 days of data, antialiasing gives a more "blurred" look, while plotting without antialiasing shows the range of the data better (at the cost of showing less of the data; I'm guessing we're effectively seeing only every fourth or fifth day's data).
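The days-per-pixel figure can be checked with a quick calculation. This is only a sketch: the date range is taken from df1 in the question, while the 600-pixel width is the rough figure quoted above, not a measured value.

```r
# Number of distinct days in df1 from the question (inclusive range)
n_days <- length(seq(as.Date("2010-01-01"), as.Date("2018-01-01"), by = 1))

# Assumed plot width in pixels (the approximate figure used in this answer)
panel_px <- 600

# How many days of data have to share a single pixel column
days_per_px <- n_days / panel_px

print(n_days)       # 2923
print(days_per_px)  # between 4 and 5
```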

On Windows, I believe the default graphics device for the Plots pane draws without antialiasing (Quartz is the macOS device, and it antialiases by default). Plot1+Plot2 look like this with that setting:

[Image: Plot1 and Plot2 rendered without antialiasing]

If I enable antialiasing in the RStudio global settings, I get a result similar to your Mac result, since the Mac's default graphics device uses antialiasing.

[Images: the same plots rendered with antialiasing enabled]
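To take the operating system's default device out of the picture, one option is to render to a file with antialiasing set explicitly. The following is a minimal sketch, assuming R was built with cairo support; the file name and the 800 x 300 pixel size are arbitrary choices for illustration, and the plot is a trimmed-down version of Plot1 from the question.

```r
library(ggplot2)

set.seed(1)
dates <- seq(as.Date("2010-01-01"), as.Date("2018-01-01"), by = 1)
df <- data.frame(name  = c("A", "B", "C"),
                 Date  = rep(dates, each = 3),
                 value = runif(3 * length(dates), -1, 1))

# Same layers as Plot1 in the question (newer ggplot2 versions prefer
# `linewidth` over `size` for lines and will warn about `size`)
p <- ggplot(df) +
  geom_linerange(aes(x = name, ymin = Date, ymax = Date + 1, colour = value),
                 size = 15) +
  scale_colour_gradient2(low = "red", mid = "white", high = "blue",
                         midpoint = 0, name = "Value") +
  theme_bw() +
  coord_flip()

# Hypothetical output path, used only for this example
out_file <- file.path(tempdir(), "plot_no_aa.png")

# Only attempt the cairo device if this R build supports it
if (capabilities("cairo")) {
  png(out_file, width = 800, height = 300, type = "cairo", antialias = "none")
  print(p)
  dev.off()
}
```

On a Mac, opening the screen device with quartz(antialias = FALSE) before printing the plot should have a comparable effect, though I have not compared the two outputs side by side.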

The simplest way to get what you're going for would be to increase the resolution enough to give each day at least one pixel; that way you can represent 100% of the data and use the full range of your colour scale. You could also output to a vector format like svg to achieve a much higher effective resolution.
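As a rough sketch of both options using ggsave: the 12 x 4 inch size, the 300 dpi setting, and the file names are assumptions chosen so that the image is about 3600 pixels wide, comfortably more than the 2923 days in df1. The svg branch only runs if the svglite package is installed, since ggsave relies on it for that format.

```r
library(ggplot2)

set.seed(1)
dates <- seq(as.Date("2010-01-01"), as.Date("2018-01-01"), by = 1)
df <- data.frame(name  = c("A", "B", "C"),
                 Date  = rep(dates, each = 3),
                 value = runif(3 * length(dates), -1, 1))

p <- ggplot(df) +
  geom_linerange(aes(x = name, ymin = Date, ymax = Date + 1, colour = value),
                 size = 15) +
  scale_colour_gradient2(low = "red", mid = "white", high = "blue",
                         midpoint = 0, name = "Value") +
  theme_bw() +
  coord_flip()

# Raster output: 12 in x 300 dpi = 3600 px wide, so each day can get
# at least one pixel even after the axis labels and legend take their share
png_file <- file.path(tempdir(), "plot_hires.png")
ggsave(png_file, p, width = 12, height = 4, dpi = 300)

# Vector output: resolution is decided by the viewer, not the file
if (requireNamespace("svglite", quietly = TRUE)) {
  svg_file <- file.path(tempdir(), "plot.svg")
  ggsave(svg_file, p, width = 12, height = 4)
}
```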

Alternatively, depending on the nature of your data and what you're trying to show, you might take a rolling average across your days (I expect the result would be similar to the antialiased outputs), or grab a rolling max, min, or SD, or some other summary measure that captures what you want more directly but at a more digestible time granularity. You might also consider other geometries (like a line chart or a horizon plot) which are easier for a reader to map to values.
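One simple way to coarsen the time granularity is a weekly mean per series. Note that this is a binned average rather than a true rolling average, swapped in here because it needs only base R; the column name `week` and the choice of a 7-day bin are assumptions for illustration.

```r
set.seed(1)
dates <- seq(as.Date("2010-01-01"), as.Date("2018-01-01"), by = 1)
df <- data.frame(name  = c("A", "B", "C"),
                 Date  = rep(dates, each = 3),
                 value = runif(3 * length(dates), -1, 1))

# Label each observation with the start date of its calendar week
df$week <- as.Date(cut(df$Date, breaks = "week"))

# One mean value per series per week
weekly <- aggregate(value ~ name + week, data = df, FUN = mean)

print(nrow(df))      # daily rows
print(nrow(weekly))  # far fewer weekly rows
```

The `weekly` data frame can then go into the same geom_linerange call with ymin = week and ymax = week + 7. Swapping `mean` for `max`, `min`, or `sd` gives the other summaries mentioned above.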
