How to avoid a zig-zag plot when using geom_line with colour and linetype

Question

I have a relatively large dataset that I can share here.

I am trying to plot all the lines (not just a single summary such as a mean or a median) of y against x = G, with the data grouped by I and P, so that the levels of I appear in different colours and the levels of P appear in different line types.

The problem is that the graph I get is a zig-zag line along the x-axis. The aim is to have one line for each combination of the data, without the zig-zag. I have read that this problem can be related to the way the data is grouped. I have tried several combinations of grouping with the group aesthetic, but I can't solve the problem.

The code I use is as follows:

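#Loading packages
library(dplyr)         # provides %>%
library(ggplot2)
library(RColorBrewer)  # provides brewer.pal()

#Selecting colours
colours<-brewer.pal(n = 11, name = "Spectral")[c(9,11,1)]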

#Creating plot
data %>%
      ggplot(aes(x = G, y = y, color = I, linetype=P)) +
      geom_line(aes(linetype=P,color=I),size=0.2)+
      scale_linetype_manual(values=c("solid", "dashed")) +
      scale_color_manual(values=colours) +
      scale_x_continuous(breaks = seq(0,100, by=25), limits=c(0,100)) +
      scale_y_continuous(breaks = seq(0,1, by=0.25), limits=c(0,1)) +
      labs(x = "Time", y = "Value") +
      theme_classic()

I also tried adding group=interaction(I, P) inside ggplot(aes()), as suggested in other forums, but without success.

Answer 1

Following @JonSpring's point:

dd2 <- (filter(dd,G %in% c(16,17))
    %>% group_by(P,I,G)
    %>% summarise(n=length(unique(y)))
)

shows that you have many different values of y for each combination of G/I/P:

# A tibble: 12 x 4
# Groups:   P, I [6]
   P             I         G     n
   <chr>         <chr> <dbl> <int>
 1 heterogeneity I005     16    34
 2 heterogeneity I005     17    37
 3 heterogeneity I010     16    34
... [etc.]
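
To see why repeated y values at the same G produce a zig-zag, here is a minimal sketch on made-up data (the tibble toy and its values are hypothetical, not taken from the real data set). geom_line() connects observations in order of x within each group, so when a group holds two y values per G the path has to jump between them at every step:

```r
library(dplyr)

# Hypothetical toy data: one I/P combination, two y values at each G
toy <- tibble(
  G = c(1, 1, 2, 2, 3, 3),
  y = c(0.1, 0.9, 0.2, 0.8, 0.3, 0.7),
  I = "I005",
  P = "heterogeneity"
)

# More than one row per G within a group is the warning sign
per_x <- count(toy, I, P, G)

# Mimic the order in which the line is drawn: sorted by x within the group
path <- arrange(toy, I, P, G)

# Successive changes in y alternate in sign, i.e. the line zig-zags
steps <- sign(diff(path$y))
```

Adding group=interaction(I, P) does not help in this situation, because the repeated values all belong to the same I/P group.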

One way around this, if you so choose, is to use stat_summary() to have R collapse the y values in each group to their mean:

(dd %>%
 ggplot(aes(x = G, y = y, color = I, linetype=P)) +
 stat_summary(fun=mean, geom="line",
              aes(linetype=P,color=I,group=interaction(I,P)),size=0.2) +
 scale_linetype_manual(values=c("solid", "dashed")) +
 scale_color_manual(values=colours) +
 labs(x = "Time", y = "Value") +
 theme_classic()
)

You could also do this yourself with group_by() + summarise() before calling ggplot.
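
A minimal sketch of that group_by() + summarise() route, run on a made-up stand-in for dd (dd_toy, its values, the replicate_id column and the second P level "other" are all hypothetical; only the column names G, y, I and P come from the question):

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for dd: 5 replicate y values per G/I/P combination
set.seed(1)
dd_toy <- expand.grid(
  G = 0:10,
  replicate_id = 1:5,
  I = c("I005", "I010"),
  P = c("heterogeneity", "other"),
  stringsAsFactors = FALSE
) %>%
  mutate(y = runif(n()))

# Collapse to one mean y per G/I/P combination before plotting
dd_mean <- dd_toy %>%
  group_by(P, I, G) %>%
  summarise(y = mean(y), .groups = "drop")

# With a single y per x in each group, geom_line() draws one clean line per I/P
p_mean <- ggplot(dd_mean,
                 aes(x = G, y = y, color = I, linetype = P,
                     group = interaction(I, P))) +
  geom_line() +
  labs(x = "Time", y = "Value") +
  theme_classic()
```

Summarising first also makes it easy to swap mean() for median() or another statistic, and to inspect the collapsed values before plotting.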

There's not enough information in the data set as presented to identify individual lines. If we are willing to assume that the order of the values within a given I/G/P group is an appropriate indexing variable, then we can do this:

## add index variable
dd3 <- dd %>% group_by(P,I,G) %>% mutate(index=seq(n()))
(dd3 %>%
 ggplot(aes(x = G, y = y, color = I, linetype=P)) +
 geom_line(aes(group=interaction(index,I,P)), size=0.2) +
 scale_linetype_manual(values=c("solid", "dashed")) +
 scale_color_manual(values=colours) +
 labs(x = "Time", y = "Value") +
 theme_classic()
)
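
One way to check that such an index really separates the series is to confirm that every index/I/P combination has at most one row per G. A sketch on hypothetical toy data (toy and its values are made up; row_number() is used here as an equivalent of seq(n()), and the same row-order assumption applies):

```r
library(dplyr)

# Hypothetical toy data: two interleaved series within one I/P combination
toy <- tibble(
  G = rep(c(1, 2, 3), each = 2),
  y = c(0.1, 0.9, 0.2, 0.8, 0.3, 0.7),
  I = "I005",
  P = "heterogeneity"
)

# Number the rows within each P/I/G cell; the n-th row is taken as series n
toy_indexed <- toy %>%
  group_by(P, I, G) %>%
  mutate(index = row_number()) %>%
  ungroup()

# Each index/I/P/G cell should now hold exactly one row
rows_per_point <- count(toy_indexed, index, I, P, G)
max_rows <- max(rows_per_point$n)

# Series 1 follows the first value at each G
series_1 <- toy_indexed %>% filter(index == 1) %>% pull(y)
```

If max_rows is greater than 1, the grouping variables still do not identify a single line and the zig-zag will persist.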

If this isn't what you had in mind, then you need to provide more information...

Solution 1
2 · Accepted · 2021-02-27 19:42:24
