
How to avoid zig-zag plot when using geom_line with color and linetype

I have a relatively large dataset that I can share here.

I am trying to plot every line (not just a single summary such as a mean or a median) of y against x = G, with the data grouped by I and P, so that each level of I gets its own colour and each level of P gets its own line type.

The problem is that the graph I get zig-zags along the x-axis. The aim is to have one line for each combination of the data, without the zig-zag. I have read that this problem could be related to the way the data is grouped. I have tried several groupings using group, but I can't solve the problem.

The code I use is as follows:

# Packages used below
library(RColorBrewer)  # brewer.pal()
library(dplyr)         # %>%
library(ggplot2)

# Selecting colours
colours <- brewer.pal(n = 11, name = "Spectral")[c(9, 11, 1)]

# Creating plot
data %>%
      ggplot(aes(x = G, y = y, color = I, linetype=P)) +
      geom_line(aes(linetype=P,color=I),size=0.2)+
      scale_linetype_manual(values=c("solid", "dashed")) +
      scale_color_manual(values=colours) +
      scale_x_continuous(breaks = seq(0,100, by=25), limits=c(0,100)) +
      scale_y_continuous(breaks = seq(0,1, by=0.25), limits=c(0,1)) +
      labs(x = "Time", y = "Value") +
      theme_classic()

I also tried, unsuccessfully, adding group=interaction(I, P) inside ggplot(aes()), as suggested in other forums.

Following @JonSpring's point:

dd2 <- (filter(dd,G %in% c(16,17))
    %>% group_by(P,I,G)
    %>% summarise(n=length(unique(y)))
)

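shows that you have many different values of y for each combination of G/I/P, so a line grouped only by I and P has to pass through all of them at every x value, which is what produces the zig-zag: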

# A tibble: 12 x 4
# Groups:   P, I [6]
   P             I         G     n
   <chr>         <chr> <dbl> <int>
 1 heterogeneity I005     16    34
 2 heterogeneity I005     17    37
 3 heterogeneity I010     16    34
... [etc.]

One way around this, if you so choose, is to use stat_summary() to have R collapse the y values in each group to their mean:

(dd %>%
 ggplot(aes(x = G, y = y, color = I, linetype=P)) +
 stat_summary(fun=mean, geom="line",
              aes(linetype=P,color=I,group=interaction(I,P)),size=0.2) +
 scale_linetype_manual(values=c("solid", "dashed")) +
 scale_color_manual(values=colours) +
 labs(x = "Time", y = "Value") +
 theme_classic()
)

You could also do this yourself with group_by() + summarise() before calling ggplot().
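A minimal sketch of that pre-summarising route, assuming the same column names (G, I, P, y). The small `dd` built here is made-up toy data standing in for the real dataset so that the block runs on its own, and `dd_mean` and `y_mean` are names chosen purely for illustration:

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the real data (assumption): 3 y values per G/I/P combination
set.seed(1)
dd <- expand.grid(G = 0:10,
                  I = c("I005", "I010"),
                  P = c("heterogeneity", "homogeneity"),  # second level is invented
                  rep = 1:3,
                  stringsAsFactors = FALSE)
dd$y <- runif(nrow(dd))

# Collapse to one mean per G/I/P combination
dd_mean <- dd %>%
  group_by(P, I, G) %>%
  summarise(y_mean = mean(y), .groups = "drop")

# Each I/P group now has a single y per x value, so geom_line() cannot zig-zag
p <- ggplot(dd_mean, aes(x = G, y = y_mean, color = I, linetype = P,
                         group = interaction(I, P))) +
  geom_line() +
  labs(x = "Time", y = "Value") +
  theme_classic()
p
```

Summarising first also leaves you with a plain data frame of the plotted values, which is convenient if you want to inspect or export them.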

Figure: time series with one line per I/P combination

There's not enough information in the data set as presented to identify individual lines. If we are willing to assume that the order of the values within a given I/G/P group is an appropriate indexing variable, then we can do this:

## add index variable
dd3 <- dd %>% group_by(P,I,G) %>% mutate(index=seq(n()))
(dd3 %>%
 ggplot(aes(x = G, y = y, color = I, linetype=P)) +
 geom_line(aes(group=interaction(index,I,P)), size=0.2) +
 scale_linetype_manual(values=c("solid", "dashed")) +
 scale_color_manual(values=colours) +
 labs(x = "Time", y = "Value") +
 theme_classic()
)

Figure: time series with 600 distinct curves

If this isn't what you had in mind, then you need to provide more information...
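For instance, the missing piece would typically be a column saying which curve each row belongs to. The sketch below assumes such a column exists; `id` is a hypothetical name, it is not present in the data as shared, and `toy` is made-up data used only so the example runs:

```r
library(ggplot2)

# Toy data (assumption): `id` is a hypothetical per-curve identifier
set.seed(1)
toy <- expand.grid(G = 0:10, id = 1:4)
toy$I <- ifelse(toy$id <= 2, "I005", "I010")
toy$P <- ifelse(toy$id %% 2 == 0, "heterogeneity", "homogeneity")  # invented levels
toy$y <- runif(nrow(toy))

# Grouping by the identifier draws exactly one line per curve
p <- ggplot(toy, aes(x = G, y = y, color = I, linetype = P, group = id)) +
  geom_line() +
  labs(x = "Time", y = "Value") +
  theme_classic()
p
```

With a real identifier there is no need to assume that row order within each I/G/P group is meaningful.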
