How to avoid zig-zag plot when using geom_line with color and linetype
I have a relatively large dataset that I can share here.
I am trying to plot all the lines (not just one, e.g. a mean or a median) of y against x = G, with the data grouped by I and P, so that the levels of I appear in different colours and the levels of P appear with different line types.
The problem is that the graph I get is a zig-zag line along the x-axis. The aim, obviously, is to have one line for each combination of the data, without the zig-zag. I have read that this problem could be related to the way the data are grouped. I have tried several grouping combinations using group, but I can't solve the problem.
The code I use is as follows:
library(dplyr)         # provides %>%
library(ggplot2)
library(RColorBrewer)  # provides brewer.pal()

# Selecting colours
colours <- brewer.pal(n = 11, name = "Spectral")[c(9, 11, 1)]
# Creating plot
data %>%
  ggplot(aes(x = G, y = y, color = I, linetype = P)) +
  geom_line(aes(linetype = P, color = I), size = 0.2) +
  scale_linetype_manual(values = c("solid", "dashed")) +
  scale_color_manual(values = colours) +
  scale_x_continuous(breaks = seq(0, 100, by = 25), limits = c(0, 100)) +
  scale_y_continuous(breaks = seq(0, 1, by = 0.25), limits = c(0, 1)) +
  labs(x = "Time", y = "Value") +
  theme_classic()
I also tried, unsuccessfully, adding group=interaction(I, P) inside ggplot(aes()), as suggested in other forums.
Following @JonSpring's point:
dd2 <- dd %>%
  filter(G %in% c(16, 17)) %>%
  group_by(P, I, G) %>%
  summarise(n = length(unique(y)))
shows that you have many different values of y for each combination of G/I/P:
# A tibble: 12 x 4
# Groups: P, I [6]
P I G n
<chr> <chr> <dbl> <int>
1 heterogeneity I005 16 34
2 heterogeneity I005 17 37
3 heterogeneity I010 16 34
... [etc.]
One way around this, if you so choose, is to use stat_summary() to have R collapse the y values in each group to their mean:
(dd %>%
  ggplot(aes(x = G, y = y, color = I, linetype = P)) +
  stat_summary(fun = mean, geom = "line",
               aes(linetype = P, color = I, group = interaction(I, P)), size = 0.2) +
  scale_linetype_manual(values = c("solid", "dashed")) +
  scale_color_manual(values = colours) +
  labs(x = "Time", y = "Value") +
  theme_classic()
)
You could also do this yourself with group_by() + summarise() before calling ggplot.
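A minimal sketch of that manual route, assuming the data have the columns G, y, I and P used above. The toy data frame and the names dd_toy, dd_mean, y_mean and p_mean are illustrative only and do not come from the original dataset:

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for dd: 5 made-up y values per G/I/P combination
set.seed(1)
dd_toy <- expand.grid(G = 0:3, I = c("i1", "i2"), P = c("p1", "p2"),
                      rep = 1:5, stringsAsFactors = FALSE)
dd_toy$y <- runif(nrow(dd_toy))

# Collapse to a single mean per G/I/P combination before plotting
dd_mean <- dd_toy %>%
  group_by(P, I, G) %>%
  summarise(y_mean = mean(y), .groups = "drop")

# One row per x value within each I/P pair, so geom_line() no longer zig-zags
p_mean <- ggplot(dd_mean,
                 aes(x = G, y = y_mean, color = I, linetype = P,
                     group = interaction(I, P))) +
  geom_line() +
  theme_classic()
# print(p_mean) draws the plot
```

Summarising first also leaves the collapsed table available for inspection, which stat_summary() does not.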
There's not enough information in the data set as presented to identify individual lines. If we are willing to assume that the order of the values within a given I/G/P group is an appropriate indexing variable, then we can do this:
## add index variable
dd3 <- dd %>% group_by(P, I, G) %>% mutate(index = seq(n()))
(dd3 %>%
  ggplot(aes(x = G, y = y, color = I, linetype = P)) +
  geom_line(aes(group = interaction(index, I, P)), size = 0.2) +
  scale_linetype_manual(values = c("solid", "dashed")) +
  scale_color_manual(values = colours) +
  labs(x = "Time", y = "Value") +
  theme_classic()
)
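The index approach only gives complete lines if every G/I/P group holds the same number of rows. The counts of distinct y values shown earlier differ between groups (34 and 37), which hints that group sizes may differ too. A quick check might look like the sketch below. It runs on toy data, and the names dd_toy, dd3_toy, group_sizes, n_rows and equal_sizes are hypothetical:

```r
library(dplyr)

# Toy stand-in for dd: 3 made-up y values per G/I/P combination
set.seed(1)
dd_toy <- expand.grid(G = 0:3, I = c("i1", "i2"), P = c("p1", "p2"),
                      rep = 1:3, stringsAsFactors = FALSE)
dd_toy$y <- runif(nrow(dd_toy))

# Index rows by their order within each G/I/P group
dd3_toy <- dd_toy %>%
  group_by(P, I, G) %>%
  mutate(index = row_number()) %>%
  ungroup()

# Rows per G/I/P group; unequal sizes mean some indexed lines skip some G values
group_sizes <- dd_toy %>% count(P, I, G, name = "n_rows")
equal_sizes <- length(unique(group_sizes$n_rows)) == 1
```

If equal_sizes is FALSE on the real data, lines with a high index will be missing points at some values of G.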
If this isn't what you had in mind, then you need to provide more information...