I have a relatively large dataset that I can share here.
I am trying to plot all the lines (not just one, e.g. a mean or a median) for the values of y against x = G, with the data grouped by I and P, so that the levels of I appear in different colours and the levels of P appear in different line types.
The problem is that the graph I get is a zig-zag line along the x-axis. The aim is to have one line for each combination of data, without the zig-zag. I have read that this problem could be related to the way the data is grouped. I have tried several combinations of data grouping using group, but I can't solve the problem.
The code I use is as follows:
#Loading packages
library(dplyr)
library(ggplot2)
library(RColorBrewer)
#Selecting colours
colours <- brewer.pal(n = 11, name = "Spectral")[c(9, 11, 1)]
#Creating plot
data %>%
  ggplot(aes(x = G, y = y, color = I, linetype = P)) +
  geom_line(aes(linetype = P, color = I), size = 0.2) +
  scale_linetype_manual(values = c("solid", "dashed")) +
  scale_color_manual(values = colours) +
  scale_x_continuous(breaks = seq(0, 100, by = 25), limits = c(0, 100)) +
  scale_y_continuous(breaks = seq(0, 1, by = 0.25), limits = c(0, 1)) +
  labs(x = "Time", y = "Value") +
  theme_classic()
I also tried, unsuccessfully, adding group=interaction(I, P) inside ggplot(aes()), as suggested in other forums.
Following @JonSpring's point:
dd2 <- dd %>%
  filter(G %in% c(16, 17)) %>%
  group_by(P, I, G) %>%
  summarise(n = n_distinct(y))
shows that you have many different values of y for each combination of G/I/P:
# A tibble: 12 x 4
# Groups:   P, I [6]
  P             I         G     n
  <chr>         <chr> <dbl> <int>
1 heterogeneity I005     16    34
2 heterogeneity I005     17    37
3 heterogeneity I010     16    34
... [etc.]
One way around this, if you so choose, is to use stat_summary() to have R collapse the y values in each group to their mean:
(dd %>%
  ggplot(aes(x = G, y = y, color = I, linetype = P)) +
  stat_summary(fun = mean, geom = "line",
               aes(linetype = P, color = I, group = interaction(I, P)), size = 0.2) +
  scale_linetype_manual(values = c("solid", "dashed")) +
  scale_color_manual(values = colours) +
  labs(x = "Time", y = "Value") +
  theme_classic()
)
You could also do this yourself with group_by() + summarise() before calling ggplot.
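A minimal sketch of the group_by() + summarise() route, assuming a data frame with the same column names (G, y, I, P). The toy data frame below is invented purely for illustration (the real dd is not reproduced here, and the level names "I020", "P1" and "P2" are made up). It pre-computes one mean per G/I/P combination, so that geom_line() sees a single y value per x within each line.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for `dd` (assumption: same column names, invented values)
set.seed(1)
toy <- expand.grid(G = 0:10,
                   I = c("I005", "I010", "I020"),
                   P = c("P1", "P2"),
                   rep = 1:5,
                   stringsAsFactors = FALSE)
toy$y <- runif(nrow(toy))

# Collapse the replicate y values to one mean per G/I/P combination
toy_means <- toy %>%
  group_by(P, I, G) %>%
  summarise(y = mean(y), .groups = "drop")

# One row per x value within each I/P line, so geom_line() has nothing to zig-zag over
p_means <- ggplot(toy_means,
                  aes(x = G, y = y, color = I, linetype = P,
                      group = interaction(I, P))) +
  geom_line() +
  theme_classic()
```

Printing p_means draws the plot. Summarising beforehand gives the same picture as stat_summary(), but leaves the collapsed values in a data frame that can be inspected or reused.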
There's not enough information in the data set as presented to identify individual lines. If we are willing to assume that the order of the values within a given I/G/P group is an appropriate indexing variable, then we can do this:
## add index variable
dd3 <- dd %>% group_by(P, I, G) %>% mutate(index = seq(n()))
(dd3 %>%
  ggplot(aes(x = G, y = y, color = I, linetype = P)) +
  geom_line(aes(group = interaction(index, I, P)), size = 0.2) +
  scale_linetype_manual(values = c("solid", "dashed")) +
  scale_color_manual(values = colours) +
  labs(x = "Time", y = "Value") +
  theme_classic()
)
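A quick way to check that the indexing assumption does what is intended is to confirm that every line (one index/I/P combination) has exactly one row per value of G. The sketch below runs that check on an invented toy data frame (not the real dd; the level names "P1" and "P2" and the number of replicates are made up), using the same indexing step as above.

```r
library(dplyr)

# Toy stand-in for `dd` (assumption: invented values, 3 replicate y values per G/I/P)
toy <- expand.grid(G = 0:4,
                   I = c("I005", "I010"),
                   P = c("P1", "P2"),
                   rep = 1:3,
                   stringsAsFactors = FALSE)
toy$y <- seq_len(nrow(toy)) / nrow(toy)

# Number the rows within each P/I/G group, in their existing order
toy3 <- toy %>%
  group_by(P, I, G) %>%
  mutate(index = seq(n())) %>%
  ungroup()

# Count rows per line and x value; every count should be 1
rows_per_point <- toy3 %>% count(index, I, P, G)
one_per_point <- all(rows_per_point$n == 1)
one_per_point
```

If the check fails on real data, some line still has several y values at the same G, and the plot will still zig-zag there.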
If this isn't what you had in mind, then you need to provide more information...