Incorrect Legend Ordering with scale_color_grey()

I am running a Monte Carlo simulation in which I have to display the density of coefficient estimates for simulations with different sample sizes on the same plot, using scale_color_grey. I have put my coefficient estimates in the same data frame, with the sample size as a factor. If I query the factor with levels(), it is in the correct order (from smallest to largest sample size). However, the following code gives a scale in which the order is correct in the legend, but the color moves from light grey to darker grey in a seemingly random order:

montecarlo <- function(N, nsims, nsamp){
  set.seed(8675309)
  coef.mc <- vector()
    for(i in 1:nsims){
      access <- rnorm(N, 0, 1) 
      health <- rnorm(N, 0, 1)
      doctorpop <- (access*1) + rnorm(N, 0, 1)
      sick <- (health*-0.4) + rnorm(N, 0, 1)
      insurance <- (access*1) + (health*1) + rnorm(N, 0, 1)
      healthcare <- (insurance*1) + (doctorpop*1) + (sick*1) + rnorm(N, 0, 1)
      data <- as.data.frame(cbind(healthcare, insurance, sick, doctorpop))
      sample.data <- data[sample(nrow(data), nsamp), ]
      model <- lm(data=sample.data, healthcare ~ insurance + sick + doctorpop)
      coef.mc[i] <- coef(model)["insurance"]
    }
  return(as.data.frame(cbind(coef.mc, nsamp)))
}

sample30.df <- montecarlo(N=1000, nsims=1000, nsamp=30)
sample100.df <- montecarlo(1000,1000,100)
sample200.df <- montecarlo(1000, 1000, 200)
sample500.df <- montecarlo(1000, 1000, 500)
sample1000.df <- montecarlo(1000, 1000, 1000)
montecarlo.df <- rbind(sample30.df, sample100.df, sample200.df, sample500.df, sample1000.df)
montecarlo.df$nsamp <- as.factor(montecarlo.df$nsamp)
levels(montecarlo.df$nsamp) <- c("30", "100", "200", "500", "1000")

library(ggplot2)

## Create the plot
montecarlo.plot <- ggplot(data=montecarlo.df, aes(x=coef.mc, color=nsamp))+
  geom_line(data = subset(montecarlo.df, nsamp==30), stat="density")+
  geom_line(data = subset(montecarlo.df, nsamp==100), stat="density")+
  geom_line(data = subset(montecarlo.df, nsamp==200), stat="density")+
  geom_line(data = subset(montecarlo.df, nsamp==500), stat="density")+
  geom_line(data = subset(montecarlo.df, nsamp==1000), stat="density")+
  scale_color_grey(breaks=c("30", "100","200", "500", "1000"))+
  labs(x=NULL, y="Density of Coefficient Estimate: Insurance", color="Sample Size")+
  theme_bw()
montecarlo.plot 

If I omit the breaks argument to scale_color_grey, the legend shows the shades in the right order, but the entries do not run from the smallest to the largest sample size.

What is going on here? As far as I understand it, ggplot2 should follow the factor's order (which is correct) when assigning colors and creating the legend. How can I make both the legend and the shades of grey increase from the smallest to the largest sample size?

You should let ggplot handle drawing the separate lines for each level of nsamp: because you have mapped nsamp to the colour aesthetic, ggplot will automatically draw a different line for each level, so you can do:

montecarlo.plot <- ggplot(data=montecarlo.df, aes(x=coef.mc, color=nsamp))+
    geom_line(stat = "density", size = 1.2) +
    scale_color_grey() +
    labs(x=NULL, y="Density of Coefficient Estimate: Insurance", color="Sample Size")+
    theme_bw()
montecarlo.plot

There is no need to subset the data manually.
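A side note on the factor itself: the question builds nsamp with as.factor() and then assigns to levels(). That works here because nsamp is numeric, but assigning to levels() renames levels by position and does not reorder them, so the same pattern would mislabel the data if the sample sizes were stored as text. The sketch below uses only base R and made-up vectors (nsamp_chr, f_sorted, f_renamed and f_explicit are illustrative names, not from the question) to show the difference, assuming the values arrive as character strings.

```r
# Hypothetical sample sizes stored as text, e.g. after reading a CSV.
nsamp_chr <- c("30", "100", "200", "500", "1000")

# as.factor() on character data sorts the levels as strings, not as numbers.
f_sorted <- as.factor(nsamp_chr)
print(levels(f_sorted))

# Assigning to levels() renames the existing levels by position;
# it does not reorder them, so the values end up relabelled.
f_renamed <- f_sorted
levels(f_renamed) <- c("30", "100", "200", "500", "1000")
print(as.character(f_renamed))

# Stating the order in factor() keeps both the values and the level order intact.
f_explicit <- factor(nsamp_chr, levels = c("30", "100", "200", "500", "1000"))
print(levels(f_explicit))
```

With the level order fixed this way, a single geom_line(stat = "density") layer should pick up that order for both the legend and the grey shades.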


 