
R ggplot: Apply label only to last N data points in plot

I have created a line chart (plot) in R with labels on each data point. Due to the large number of data points, the plot becomes very crowded with labels. I would like to apply the labels only to the last N (say 4) data points. I have tried subset and tail in the geom_label_repel function, but either could not figure them out or got an error message. My data set consists of 99 values, spread over 3 groups (KPI).

I have the following code in R:

library(ggplot2)
library(ggrepel)

data.trend <- read.csv(file=....)

plot.line <- ggplot(data=data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +

  geom_line(aes(group = KPI), size = 1) +
  geom_point(size = 2.5) +


  # Labels defined here
  geom_label_repel(
    aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
    box.padding = unit(0.35, "lines"),
    point.padding = unit(0.4, "lines"),
    segment.color = 'grey50',
    show.legend = FALSE
  )

In all fairness, I am quite new to R. Maybe I am missing something basic.

Thanks in advance.

The simplest approach is to set the data = parameter in geom_label_repel to only include the points you want labeled.

Here's a reproducible example:

library(ggplot2)
library(ggrepel)

set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25),
                         group = sample(1:2, 25, TRUE),
                         KPI = sample(1:2, 25, TRUE))

ggplot(data=data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
  geom_line(aes(group = KPI), size = 1) +
  geom_point(size = 2.5) +
  geom_label_repel(aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
    data = tail(data.trend, 4),                 
    box.padding = unit(0.35, "lines"),
    point.padding = unit(0.4, "lines"),
    segment.color = 'grey50',
    show.legend = FALSE)

[Figure: the plot with labels on only the last 4 rows of data.trend]

Unfortunately, this interferes slightly with the repel algorithm: label placement is suboptimal with respect to the other points, which are not labelled (in the figure above, some points are covered by labels).

So, a better approach is to keep all points in the label layer and use color and fill to make the unwanted labels invisible, by setting both to NA for the labels you want to hide:

ggplot(data=data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
  geom_line(aes(group = KPI), size = 1) +
  geom_point(size = 2.5) +
  geom_label_repel(aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
                   box.padding = unit(0.35, "lines"),
                   point.padding = unit(0.4, "lines"),
                   show.legend = FALSE,
                   color = c(rep(NA,21), rep('grey50',4)),
                   fill = c(rep(NA,21), rep('lightblue',4)))

[Figure: the plot with only the last 4 labels visible]
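The counts 21 and 4 in the call above are hard-coded for the 25-row example. Below is a minimal sketch of deriving the same vectors from the data, assuming the points to label are simply the last n_last rows of data.trend. The names n_last, is_last, label_color and label_fill are made up for illustration and do not come from the answer.

```r
# Same example data as in the answer above.
set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25),
                         group = sample(1:2, 25, TRUE),
                         KPI = sample(1:2, 25, TRUE))

n_last <- 4  # assumption: how many trailing rows should keep a visible label

# TRUE for the last n_last rows, FALSE for every row before them.
is_last <- seq_len(nrow(data.trend)) > nrow(data.trend) - n_last

# NA hides a label; the colors are the ones used in the answer's call.
label_color <- ifelse(is_last, "grey50", NA)
label_fill  <- ifelse(is_last, "lightblue", NA)
```

These vectors could then be passed as color = label_color and fill = label_fill in the geom_label_repel call, so the counts follow the data if the number of rows changes.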

If you want to show just the last label, using group_by and filter may work:

data = data.trend %>% group_by(KPI) %>% filter(Version == max(Version))

Full example:

suppressPackageStartupMessages(library(dplyr))
library(ggplot2)
library(ggrepel)

set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25), 
                         group = sample(1:2,25,T), 
                         KPI = sample(1:2,25,T))

ggplot(data = data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
  geom_line(aes(group = KPI), size = 1) +
  geom_point(size = 2.5) +

  # Labels defined here
  geom_label_repel(
    data = data.trend %>% group_by(KPI) %>% filter(Version == max(Version)), 
    aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
    color = "black",
    fill = "white")
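The filter above keeps one row per KPI. For the question's case (the last N points in each of 3 KPI groups), here is a rough base R sketch, assuming Version is numeric so that "last" means the largest Version values. The data below is a made-up stand-in with 3 KPIs and 33 versions each (99 rows), and n_last, rank_from_end and last_n are hypothetical names.

```r
# Hypothetical stand-in for the question's data: 3 KPIs x 33 versions = 99 rows.
data.trend <- expand.grid(Version = 1:33, KPI = c("A", "B", "C"),
                          stringsAsFactors = FALSE)
data.trend$Value <- (data.trend$Version * 7) %% 100  # made-up values

n_last <- 4  # assumption: number of trailing points to label per KPI

# Within each KPI, rank rows by descending Version (1 = latest version).
rank_from_end <- ave(-data.trend$Version, data.trend$KPI, FUN = rank)

# Keep only the n_last latest rows of every KPI.
last_n <- data.trend[rank_from_end <= n_last, ]
```

Passing data = last_n to geom_label_repel would then label only those rows. With dplyr 1.0.0 or later, data.trend %>% group_by(KPI) %>% slice_max(Version, n = 4) should select the same rows when there are no ties in Version.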

Or, if you want to show 4 random labels per KPI, use data.trend %>% group_by(KPI) %>% sample_n(4):

suppressPackageStartupMessages(library(dplyr))
library(ggplot2)
library(ggrepel)

set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25), 
                         group = sample(1:2,25,T), 
                         KPI = as.factor(sample(1:2,25,T)))

ggplot(data = data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
  geom_line(aes(group = KPI), size = 1) +
  geom_point(size = 2.5) +
  
  # Labels defined here
  geom_label_repel(
    data = data.trend %>% group_by(KPI) %>% sample_n(4), 
    aes(Version, Value, label = sprintf('%0.1f%%', Value), fill = KPI),
    color = "black", show.legend = FALSE
    )

Created on 2021-08-27 by the reprex package (v2.0.1)
