How to order ggplot legend of plot with numeric data?

I am trying to make a plot with ggplot2 and have difficulty ordering the legend the way I want. For debugging I made some example data, and the issue doesn't occur there even though the data and code are similar! I am confused about how ggplot sorts its data.

The legend should be sorted by value, since the values are numeric. That is what happens with the example data, but not with my working data.

Here's my data...

allEst <- structure(list(n = c(150000, 15000, 3000, 1500, 750), estimate = c(0.0485706666666667, 
0.0454933333333333, 0.0604, 0.0413333333333334, 0.0402666666666666
), se = c(0.00230392190029327, 0.00727258789388646, 0.0163963824219692, 
0.0229426160506936, 0.0324210840623078), t.value = c(21.0811338041398, 
6.25147550973637, 3.67790396107066, 1.80003749831851, 1.23787262678884
)), .Names = c("n", "estimate", "se", "t.value"), row.names = c("150000", 
"15000", "3000", "1500", "750"), class = "data.frame")

...and the code I used to make the df

# allEst <- data.frame(rbind(est1, est2, est3, est4, est5))  
# names(allEst) <- names(est1.tmp)
# rownames(allEst) <- c(150000, 15000, 3000, 1500, 750)

Code for the plot

# confidence intervals
interval1 <- - qnorm((1 - .95) / 2)  # 5% 
interval2 <- - qnorm((1 - .99) / 2)  # 1% 

# Plot
library(ggplot2)
ep <- ggplot(allEst, aes(colour=rownames(allEst)))
ep <- ep + geom_hline(yintercept=0.05, colour=gray(1/2), lty=2)
ep <- ep + geom_linerange(aes(x=n,
                                ymin=estimate - se*interval1, 
                                ymax=estimate + se*interval1),
                            lwd=2)
ep <- ep + geom_pointrange(aes(x=n, y=estimate, 
                                 ymin=estimate - se*interval2, 
                                 ymax=estimate + se*interval2),
                             lwd=1, shape=21, fill="WHITE")
ep <- ep + scale_x_log10()
ep <- ep + coord_flip() 

print(ep)

Which gives me:

[image: resulting plot, legend not in numeric order]

And here is the toy example I created:

est1 <- c(1e5, 0.0485, 0.0023, 21.08)
est2 <- c(1e4, 0.0454, 0.0072, 6.25)
est3 <- c(1e3, 0.0604, 0.0163, 3.67)
est4 <- c(1e2, 0.0402, 0.0324, 1.23)

df <- data.frame(rbind(est1, est2, est3, est4))
rownames(df) <- c(100000, 10000, 1000, 100)
df

interval1 <- - qnorm(0.025); interval2 <- - qnorm(0.005)

library(ggplot2)
ep1 <- ggplot(df, aes(colour=rownames(df)))
ep1 <- ep1 + geom_hline(yintercept=0.05, colour=gray(1/2), lty=2)
ep1 <- ep1 + geom_linerange(aes(x=X1, ymin= X2 -  X3*interval1,
                                ymax= X2 +  X3*interval1),
                            lwd=2, position=position_dodge(width=1/2))
ep1 <- ep1 + geom_pointrange(aes(x=X1, y= X2, ymin= X2 - 
                                    X3*interval2, ymax= X2 + 
                                    X3*interval2),
                             lwd=1, position=position_dodge(width=1/2),
                             shape=21, fill="WHITE")
ep1 <- ep1 + scale_x_log10()
ep1 <- ep1 + coord_flip() 

print(ep1)

...and the plot with a perfectly ordered (!) legend:

[image: toy example plot, legend in numeric order]

So, what's going on? What am I overlooking?

Reorder the colour legend with the breaks argument of scale_color_discrete:

# Adding sorted rownames to breaks
ep + scale_color_discrete(breaks = sort(as.numeric(rownames(allEst))))

Result plot:

[image: plot with reordered legend]

You pass the character vector rownames(allEst) to the colour aesthetic, and its values are sorted as character strings:

sort(rownames(allEst))
[1] "1500"   "15000"  "150000" "3000"   "750" 

If sort is used on characters, it sorts alphabetically, so '1' and '12' come before '2', just as 'a' and 'ab' come before 'ba'. If you look at the difference between

sort(c(1:3, 10, 20, 30)) 

which returns

[1]  1  2  3 10 20 30

and

sort(as.character(c(1:3, 10, 20, 30))) 

which returns

[1] "1"  "10" "2"  "20" "3"  "30"

it might be clearer.
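The same comparison also suggests why the toy example looked fine. The sketch below is only an illustration, assuming the row names are created as in the question, where a numeric vector assigned to rownames() is converted to character. The variable names are made up for this example.

```r
# Row names as they end up in the working data and in the toy data.
# Assigning a numeric vector to rownames() converts it to character,
# which is mimicked here with as.character().
working <- as.character(c(150000, 15000, 3000, 1500, 750))
toy     <- as.character(c(100000, 10000, 1000, 100))

# method = "radix" sorts in the C locale, so the result does not
# depend on the local collation settings.
working_sorted <- sort(working, method = "radix")
toy_sorted     <- sort(toy, method = "radix")

# Does the character order agree with the numeric order?
working_matches <- identical(working_sorted,
                             as.character(sort(as.numeric(working))))
toy_matches     <- identical(toy_sorted,
                             as.character(sort(as.numeric(toy))))

print(toy)              # note how 100000 is written as a string
print(working_matches)
print(toy_matches)
```

For the toy row names the character order happens to coincide with the numeric order, while for the working row names it does not, which would explain why only the working plot shows a scrambled legend.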

One way of changing that would be to use factor(as.numeric(rownames(allEst))). A factor built from numbers gets its levels in numeric order, and leaving out sort() keeps each label attached to its own row (sorting the values themselves would pair the labels with the wrong estimates). Integrated in your code:

library(ggplot2)
ep <- ggplot(allEst, aes(colour=factor(as.numeric(rownames(allEst)))))
ep <- ep + geom_hline(yintercept=0.05, colour=gray(1/2), lty=2)
ep <- ep + geom_linerange(aes(x=n,
                              ymin=estimate - se*interval1, 
                              ymax=estimate + se*interval1),
                          lwd=2)
ep <- ep + geom_pointrange(aes(x=n, y=estimate, 
                               ymin=estimate - se*interval2, 
                               ymax=estimate + se*interval2),
                           lwd=1, shape=21, fill="WHITE")
ep <- ep + scale_x_log10()
ep <- ep + coord_flip() 

print(ep)

Returns:

[image: output_ggplot]
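An alternative that keeps labels tied to their rows is to store the grouping as a factor column with explicit levels. This is a minimal sketch, not part of the original answer. The column name size is hypothetical, and the data frame below is a stand-in that keeps only the n column of allEst.

```r
# Hypothetical stand-in for allEst, keeping only the n column.
allEst <- data.frame(n = c(150000, 15000, 3000, 1500, 750))

# Build the grouping factor from the numeric column, row by row.
# The levels are listed in numeric order; the values themselves
# are not reordered, so each row keeps its own label.
allEst$size <- factor(allEst$n, levels = sort(unique(allEst$n)))

print(levels(allEst$size))        # legend order
print(as.character(allEst$size))  # label of each row, in row order
```

Mapping aes(colour = size) in the ggplot() call would then be expected to list the legend entries in level order, since discrete scales follow factor levels.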
