Adding legend manually to R/ggplot2 plot without interfering with the plot

Question: Is it possible to add a legend to a plot that has nothing to do with the plot itself and, crucially, will not interfere with the colors in the plot?

Explanation

I have all the information I should need for the legend. In particular, I have the hex codes of the colors and I have the labels. I do not care what shapes are shown (lines, points, whichever is easiest).

I was hoping this should do the trick (this is a very simplified minimal working example):

library(ggplot2)

the_colors <- c("#e6194b", "#3cb44b", "#ffe119", "#0082c8", "#f58231", "#911eb4", "#46f0f0", "#f032e6", 
                "#d2f53c", "#fabebe", "#008080", "#e6beff", "#aa6e28", "#fffac8", "#800000", "#aaffc3", 
                "#808000", "#ffd8b1", "#000080", "#808080", "#ffffff", "#000000")
the_labels <- c("01", "02", "03", "04", "05", "06", "07", "08", "09", "10")

the_df <- data.frame("col1"=c(1, 2, 2, 1), "col2"=c(2, 2, 1, 1), "col3"=c(1, 2, 3, 4))

the_plot <- ggplot() + geom_point(data=the_df, aes(x=col1, y=col2), color=the_colors[[4]])

the_plot <- the_plot +
  scale_color_manual("Line.Color", values=the_colors[1:length(the_labels)],
                      labels=the_labels)

Unfortunately, it will not even show the legend.
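A likely reason, sketched here as an illustration rather than taken from the post: ggplot2 builds a legend only for aesthetics that some layer maps inside aes(). A color passed outside aes() is stored as a fixed layer parameter and never reaches the scale, so scale_color_manual has nothing to describe. The names demo_df, p_const and p_mapped below are hypothetical.

```r
library(ggplot2)

# Hypothetical two-point data set, used only for this illustration
demo_df <- data.frame(x = c(1, 2), y = c(1, 2))

# Color set outside aes(): a constant parameter, invisible to any color scale
p_const <- ggplot(demo_df) +
  geom_point(aes(x = x, y = y), color = "#0082c8")

# Color set inside aes(): a mapped aesthetic, so a color scale (and legend) applies
p_mapped <- ggplot(demo_df) +
  geom_point(aes(x = x, y = y, color = "#0082c8"))

# Inspect where each layer stores the color information
names(p_const$layers[[1]]$mapping)     # mapped aesthetics of the first plot
names(p_const$layers[[1]]$aes_params)  # fixed parameters of the first plot
names(p_mapped$layers[[1]]$mapping)    # mapped aesthetics of the second plot
```

Only the second plot has a mapped colour aesthetic, which is why only it can carry a legend.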

Playing by the rules, and including the color argument inside the aesthetics element, I can get it to show a legend.

the_plot <- ggplot() + geom_point(data=the_df, aes(x=col1, y=col2, color=the_colors[[4]]))

But then, of course, it no longer takes the value passed as the color argument seriously; instead it interprets it as some kind of label and changes the color of these data points to the first color in the the_colors list. At the same time, it will include only this one entry in the legend, and there does not seem to be a way in hell to convince it to also include the others.

In other languages, this is unbelievably easy. In R/ggplot2, this seems unbelievably hard.

Reason why I want to do this: I want a legend that does not interfere with the colors in my plot. This is sometimes very inconvenient. There is also no deeper reason that the legend must mess with the colors in the plot, just that this is how it is implemented in R/ggplot2.

Approach: I was hoping that there is a way to easily do this by still treating this as a legend. Failing that, it might be possible to add a box with some colored points and some text, thereby constructing a legend from scratch.

Other questions: There have been various other questions asking the same thing. The answers instead suggested workarounds to solve the concrete problem of the OP (usually by applying melt() or so) without providing a solution to the question that was asked (how to add a legend manually without messing with the plot). E.g. here and here. This is not what I am interested in. I would like to know if I can add an arbitrary legend to an arbitrary plot, and, if yes, how.

Software: R 3.6.3, ggplot2 3.2.1

Edit (March 30 2020):

Solution: As described in @Tjebo's answer below, a legend that is reasonably independent from the plot and defines additional data series not shown in the plot can be created with scale_color_identity. With option #1 in the answer by @Tjebo, I could solve my immediate problem:

the_colors <- sort(c("#e6194b", "#3cb44b", "#ffe119", "#0082c8", "#f58231", "#911eb4", "#46f0f0", "#f032e6", 
            "#d2f53c", "#fabebe", "#008080", "#e6beff", "#aa6e28", "#fffac8", "#800000", "#aaffc3", 
            "#808000", "#ffd8b1", "#000080", "#808080"))

color_df <- data.frame(the_colors=the_colors[1:length(the_labels)], the_labels=the_labels)

the_df <- data.frame("col1"=c(1, 2, 2, 1), "col2"=c(2, 2, 1, 1), "col3"=c(1, 2, 3, 4))

the_plot <- ggplot() + 
    geom_point(data = color_df, aes(x = the_df$col1[[1]], y = the_df$col2[[1]], color = the_colors)) +
    scale_color_identity(guide = 'legend', labels = color_df$the_labels) 

the_plot <- the_plot +
  geom_point(data=the_df, aes(x=col1, y=col2), color=the_colors[[4]]) 

print(the_plot)

Explanation of the solution: More generally, as Tjebo explains, it separates the plot from the legend. The legend still needs a plot. This is built first with:

the_plot <- ggplot() + 
  geom_point(data = color_df, aes(x = the_df$col1[[1]], y = the_df$col2[[1]], color = the_colors)) +
  scale_color_identity(guide = 'legend', labels = color_df$the_labels)

The plot this creates still has the wrong color, but the points are chosen so that they are hidden by adding the plot I actually want to show in its appropriate color:

the_plot <- the_plot + 
  geom_point(data=the_df, aes(x=col1, y=col2), color=the_colors[[4]])

It is also flexible in that further data series can be added in any of the colors that are predefined in the the_colors variable:

the_plot <- the_plot +
  geom_point(data=the_df, aes(x=col1, y=col3), color=the_colors[[6]]) 

(Note: The data series can also be plotted at once if the color is defined as a third column in the data frame. I just wanted to point out that the solution is flexible and the plot can be modified at a later time without interfering with the legend or the colors of the data points that are already arranged in the plot.)
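A minimal sketch of the "color as a third column" variant mentioned in the note, shown on its own without the legend layer. The data frame series_df and its column point_color are hypothetical names introduced for this illustration; the hex codes are two of the colors listed in the question.

```r
library(ggplot2)

# Hypothetical long-format data: both series in one data frame,
# with the desired hex color stored per row
series_df <- data.frame(
  x = c(1, 2, 2, 1, 1, 2, 2, 1),
  y = c(2, 2, 1, 1, 1, 2, 3, 4),
  point_color = rep(c("#0082c8", "#911eb4"), each = 4),
  stringsAsFactors = FALSE
)

# scale_color_identity() uses the column values directly as colors
# instead of treating them as category labels
p <- ggplot(series_df, aes(x = x, y = y, color = point_color)) +
  geom_point() +
  scale_color_identity()

print(p)
```

If this layer is combined with the legend layer from the solution, the mapped colors share the same identity scale, so they should be chosen from the colors already present in the legend.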

Edit 2 (March 30 2020), Additional Note: With this solution, the legend will sort the colors by their hex codes. I cannot begin to fathom why it would do that, but it does. So, in order for the colors in the legend to match the intended colors, the vector of hex codes should be sorted beforehand (as is done in the code above).
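A possible explanation, offered as an assumption rather than something confirmed in the post: a discrete scale orders its values alphabetically, while a plain labels vector is applied by position, so unsorted hex codes end up paired with the wrong labels. One way that may avoid the pre-sorting step is to pass breaks together with labels, which pairs each color with its label explicitly. The names legend_df, legend_color and legend_label are hypothetical.

```r
library(ggplot2)

# Data actually shown in the plot (same shape as in the question)
the_df <- data.frame(col1 = c(1, 2, 2, 1), col2 = c(2, 2, 1, 1))

# Hypothetical legend table: colors deliberately left unsorted.
# x and y place all legend points on the first data point so they get overplotted.
legend_df <- data.frame(
  legend_color = c("#e6194b", "#3cb44b", "#ffe119", "#0082c8"),
  legend_label = c("01", "02", "03", "04"),
  x = the_df$col1[1],
  y = the_df$col2[1],
  stringsAsFactors = FALSE
)

p <- ggplot() +
  # Layer that exists only to generate the legend entries
  geom_point(data = legend_df, aes(x = x, y = y, color = legend_color)) +
  # breaks and labels are matched by position, independent of alphabetical order
  scale_color_identity(
    guide = "legend",
    breaks = legend_df$legend_color,
    labels = legend_df$legend_label
  ) +
  # The real data, drawn on top in a fixed color
  geom_point(data = the_df, aes(x = col1, y = col2), color = "#0082c8")

print(p)
```

With explicit breaks, the legend entries should follow the order of legend_df rather than the sorted hex codes.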

Unexpected behaviors like this would not be a concern in normal use of R and ggplot2 (where you let ggplot2 do the legend for you and restrict yourself strictly to the designs that are intended to be used). This solution is basically a hack around how the legend is expected to be used in ggplot2 (unfortunately quite restrictive). As such, it is possible that this hack will break in future versions of ggplot2 or R.

Maybe this is what you want. Plot 1 is definitely not a clever and ggplot-y way of plotting (essentially, you are not visualising dimensions of your data). Below is another option (plot 2).

Below: creating a new data frame and plotting it with scale_color_identity. Place the legend points on a data point of your second plot; the second plot is drawn afterwards and overplots the first, so the legend point disappears.

library(tidyverse)
the_colors <- c("#e6194b", "#3cb44b", "#ffe119", "#0082c8", "#f58231", "#911eb4", "#46f0f0", "#f032e6", 
                "#d2f53c", "#fabebe", "#008080", "#e6beff", "#aa6e28", "#fffac8", "#800000", "#aaffc3", 
                "#808000", "#ffd8b1", "#000080", "#808080", "#ffffff", "#000000")

color_df <- data.frame(the_colors, the_labels = seq_along(the_colors))

the_df <- data.frame("col1"=c(1, 2, 2, 1), "col2"=c(2, 2, 1, 1), "col3"=c(1, 2, 3, 4))

#the_plot <- 
  ggplot() + 
    geom_point(data = color_df, aes(x = the_df$col1[[1]], y = the_df$col2[[1]], color = the_colors)) +
    scale_color_identity(guide = 'legend', labels = color_df$the_labels) +
    geom_point(data=the_df, aes(x=col1, y=col2), color=the_colors[[4]]) 

A more ggplot-y way

Now, the last plot is really peculiar, because as you say, "the colors have nothing to do with the plot" [with the data] and thus, showing them is absolutely pointless. What you actually want is to visualise dimensions of your data.

So what I believe and hope is that you have some link of those values to your plotted data. I am considering col3 to be the variable of choice, that will be represented by color.

First, create a named vector, so you can pass this as your values argument in scale_color_manual. The names should be the values of your column which will be represented by color, in this case col3.

names(the_colors) <- str_pad(seq_along(the_colors), width = 2, pad = '0')

the_df <- data.frame("col1"=c(1, 2, 2, 1), "col2"=c(2, 2, 1, 1), "col3"=str_pad(c(1, 2, 3, 4), width = 2,pad='0'))

ggplot() + 
  geom_point(data=the_df, aes(x=col1, y=col2, color = col3))  +
  scale_color_manual(limits = names(the_colors), values = the_colors)

Created on 2020-03-28 by the reprex package (v0.3.0)
