Plotting all data as geom_point and including lines showing means in ggplot2; issues with stat_summary
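I like to plot all the data points, with lines between them to represent participants. Here I have plotted each participant's ratings by condition and stimulus type:
What I would like is to add a mean line for each condition within each stimulus type, drawn in that condition's colour. Ideally it would look something like this:
I have been trying to use stat_summary and the stat_sum_df helper described on the ggplot2 documentation site, but I cannot get it to work. It either does nothing or draws a line for every participant.
The code I used to produce the first plot is as follows: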
ggplot(df, aes(x=StimulusType+jitterVal, y=Rating, group=ParticipantCondition)) +
geom_point(size=4.5, aes(colour=Condition), alpha=0.3)+
geom_line(size=1, alpha=0.05)+
scale_y_continuous(limits=c(0, 7.5), breaks=seq(0,7,by=1))+
scale_colour_manual(values=c("#0072B2", "#009E73", "#F0E442", "#D55E00"))+
xlab('Stimulus type') +
scale_x_continuous(limits=(c(0.5, 2.5)), breaks = c(0.9, 1.9), labels = levels(df$StimulusType))+
ylab('Mean Rating') +
guides(colour = guide_legend(override.aes = list(alpha = 1))) +
theme_bw()
...You can create an example data frame for the first 4 participants as follows:
Participant <- rep(c("01", "02", "03", "04"), 8)
StimulusType <- rep(rep(c(1, 2), each=4), 4)
Condition <- rep(c("A", "B", "C", "D"), each=8)
Rating <- c(5.20, 5.55, 3.10, 4.05, 5.05, 5.85, 3.90, 5.25, 4.70, 3.15, 3.40, 4.85, 4.90, 4.00, 3.95, 3.95, 3.00, 4.60, 3.95, 4.00, 3.15, 5.20,
5.05, 3.70, 2.75, 3.40, 4.80, 4.55, 2.35, 2.45, 5.45, 4.05)
jitterVal <- c(-0.19459509, -0.19571169, -0.17475060, -0.19599276, -0.17536634, -0.19429345, -0.17363951, -0.17446702, -0.13601392,
-0.14484280, -0.12328058, -0.12427593, -0.12913823, -0.12042329, -0.14703381, -0.12603936, -0.09125372, -0.08213296,
-0.09140868, -0.09728309, -0.08377205, -0.08514802, -0.08715795, -0.08932001, -0.02689549, -0.04717990, -0.03918013,
-0.03068255, -0.02826789, -0.02345827, -0.03473678, -0.03369023)
df <- data.frame(Participant, StimulusType, Condition, Rating, jitterVal)
ParticipantCondition <- paste(df$Participant, df$Condition)
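I think the problem may lie in the grouping variable I created, ParticipantCondition, which I made in order to get the lines connecting each participant's scores within each condition.
Any help would be much appreciated.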
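One possible explanation, offered as a sketch rather than a definitive diagnosis: stat_summary computes its summary separately for each distinct x value within each group, and in the plot above x is StimulusType + jitterVal, which differs on every row. The base R check below rebuilds only the two x-related columns from the example data and counts the distinct positions. The names x_raw and x_jittered are made up for this illustration.

```r
# Rebuild the two columns that define the x position in the question's plot.
StimulusType <- rep(rep(c(1, 2), each = 4), 4)
jitterVal <- c(-0.19459509, -0.19571169, -0.17475060, -0.19599276,
               -0.17536634, -0.19429345, -0.17363951, -0.17446702,
               -0.13601392, -0.14484280, -0.12328058, -0.12427593,
               -0.12913823, -0.12042329, -0.14703381, -0.12603936,
               -0.09125372, -0.08213296, -0.09140868, -0.09728309,
               -0.08377205, -0.08514802, -0.08715795, -0.08932001,
               -0.02689549, -0.04717990, -0.03918013, -0.03068255,
               -0.02826789, -0.02345827, -0.03473678, -0.03369023)

x_raw      <- StimulusType              # hypothetical name: un-jittered x
x_jittered <- StimulusType + jitterVal  # x as mapped in the question's aes()

# Number of distinct x positions a per-x summary would have to work with.
n_raw      <- length(unique(x_raw))
n_jittered <- length(unique(x_jittered))

print(c(raw = n_raw, jittered = n_jittered))
```

With the example data, each of the 32 rows sits at its own jittered x position, so a mean taken per x value is the mean of a single point. That would be consistent with stat_summary appearing to do nothing.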
I calculated the mean values outside of ggplot2, using dplyr. The means are shown as squares. What do you think of this?
library(dplyr)
library(ggplot2)
Participant <- rep(c("01", "02", "03", "04"), 8)
StimulusType <- rep(rep(c(1, 2), each=4), 4)
Condition <- rep(c("A", "B", "C", "D"), each=8)
Rating <- c(5.20, 5.55, 3.10, 4.05, 5.05, 5.85, 3.90, 5.25, 4.70, 3.15, 3.40, 4.85, 4.90, 4.00, 3.95, 3.95, 3.00, 4.60, 3.95, 4.00, 3.15, 5.20,
5.05, 3.70, 2.75, 3.40, 4.80, 4.55, 2.35, 2.45, 5.45, 4.05)
jitterVal <- c(-0.19459509, -0.19571169, -0.17475060, -0.19599276, -0.17536634, -0.19429345, -0.17363951, -0.17446702, -0.13601392,
-0.14484280, -0.12328058, -0.12427593, -0.12913823, -0.12042329, -0.14703381, -0.12603936, -0.09125372, -0.08213296,
-0.09140868, -0.09728309, -0.08377205, -0.08514802, -0.08715795, -0.08932001, -0.02689549, -0.04717990, -0.03918013,
-0.03068255, -0.02826789, -0.02345827, -0.03473678, -0.03369023)
df <- data.frame(Participant, StimulusType, Condition, Rating, jitterVal)
ParticipantCondition <- paste(df$Participant, df$Condition)
rm(Rating, StimulusType, Condition, jitterVal)
# Mean Rating and mean jitter offset for each StimulusType x Condition cell.
# across() needs dplyr >= 1.0.0; it replaces the deprecated summarise_each(funs(mean)).
mean_values <- df %>%
  group_by(StimulusType, Condition) %>%
  summarise(across(c(Rating, jitterVal), mean), .groups = "drop")
ggplot(df, aes(y=Rating, x = StimulusType + jitterVal)) +
geom_point(size=4.5, aes(colour = Condition), alpha=0.4) +
geom_line(size=1, alpha=0.05, aes(group = ParticipantCondition)) +
geom_rect(data = mean_values,
aes( xmin = ((StimulusType + jitterVal) - 0.05),
xmax = ((StimulusType + jitterVal) + 0.05),
ymin = Rating - 0.05,
ymax = Rating + 0.05,
fill = Condition)) +
scale_y_continuous(limits=c(0, 7.5), breaks=seq(0,7,by=1))+
scale_colour_manual(values=c("#0072B2", "#009E73", "#F0E442", "#D55E00"))+
scale_fill_manual(values=c("#0072B2", "#009E73", "#F0E442", "#D55E00"))+
xlab('Stimulus type') +
scale_x_continuous(limits=(c(0.5, 2.5)), breaks = c(0.9, 1.9), labels = levels(df$StimulusType))+
ylab('Mean Rating') +
guides(colour = guide_legend(override.aes = list(alpha = 1))) +
theme_bw()
The size of the rectangles can of course be adjusted easily.
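The cell means used for the squares can be cross-checked without dplyr. The following is a minimal base R sketch, assuming the same example data as above; the name cell_means is chosen only for this illustration.

```r
# Rebuild the example data (jitterVal is not needed for the means of Rating).
Participant  <- rep(c("01", "02", "03", "04"), 8)
StimulusType <- rep(rep(c(1, 2), each = 4), 4)
Condition    <- rep(c("A", "B", "C", "D"), each = 8)
Rating <- c(5.20, 5.55, 3.10, 4.05, 5.05, 5.85, 3.90, 5.25,
            4.70, 3.15, 3.40, 4.85, 4.90, 4.00, 3.95, 3.95,
            3.00, 4.60, 3.95, 4.00, 3.15, 5.20, 5.05, 3.70,
            2.75, 3.40, 4.80, 4.55, 2.35, 2.45, 5.45, 4.05)
df <- data.frame(Participant, StimulusType, Condition, Rating)

# One row per StimulusType x Condition cell, holding the mean Rating of that cell.
cell_means <- aggregate(Rating ~ StimulusType + Condition, data = df, FUN = mean)
print(cell_means)
```

This should give eight rows, one per combination of stimulus type and condition, and the Rating column should match the one in mean_values.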
To avoid the grouping problem, it may be necessary to generate the summary first. One option is:
library(dplyr)
summaryData <-
df %>%
group_by(StimulusType, Condition) %>%
summarise(meanRating = mean(Rating)
, jitterVal = mean(jitterVal)) %>%
mutate(xmin = StimulusType+jitterVal-0.04
, xend = StimulusType+jitterVal+0.04)
ggplot(df, aes(x=StimulusType+jitterVal, y=Rating, group=ParticipantCondition)) +
geom_point(size=4.5, aes(colour=Condition), alpha=0.3)+
geom_line(size=1, alpha=0.05)+
scale_y_continuous(limits=c(0, 7.5), breaks=seq(0,7,by=1))+
scale_colour_manual(values=c("#0072B2", "#009E73", "#F0E442", "#D55E00"))+
xlab('Stimulus type') +
scale_x_continuous(limits=(c(0.5, 2.5)), breaks = c(0.9, 1.9), labels = levels(df$StimulusType))+
ylab('Mean Rating') +
guides(colour = guide_legend(override.aes = list(alpha = 1))) +
geom_segment(data = summaryData
, mapping = aes(x=xmin
, xend=xend
, y=meanRating
, yend =meanRating
, group = NA
, colour = Condition)
, lwd = 3
, show.legend = FALSE
) +
theme_bw()
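Here is a solution that does not require you to aggregate the data first. Instead, you can use the original data set and easily add the individual data points as needed. The means are calculated with ggplot's stat_summary option.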
ggplot(df, aes(x=StimulusType, y = Rating, group=Condition, color=Condition)) +
# add individual lines + data points
geom_line (aes(group=interaction(Condition,Participant)), linetype = "dashed", size=.5) +
geom_point(size=.5) +
# add mean lines + datapoints
# (`fun` replaces `fun.y`, which is deprecated since ggplot2 3.3.0)
geom_line (stat="summary", fun="mean", size=1) +
geom_point(stat="summary", fun="mean", size=2) +
scale_colour_manual(values=c("#0072B2", "#009E73", "#F0E442", "#D55E00"))
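If the jittered points from the question should be kept, one option might be to override x for the summary layer only, so that means are taken per StimulusType and Condition rather than per jittered position. This is only a sketch, assuming ggplot2 3.3.0 or later (for the fun argument) and the example data from the question. The object name p is arbitrary.

```r
library(ggplot2)

# Example data from the question.
Participant  <- rep(c("01", "02", "03", "04"), 8)
StimulusType <- rep(rep(c(1, 2), each = 4), 4)
Condition    <- rep(c("A", "B", "C", "D"), each = 8)
Rating <- c(5.20, 5.55, 3.10, 4.05, 5.05, 5.85, 3.90, 5.25,
            4.70, 3.15, 3.40, 4.85, 4.90, 4.00, 3.95, 3.95,
            3.00, 4.60, 3.95, 4.00, 3.15, 5.20, 5.05, 3.70,
            2.75, 3.40, 4.80, 4.55, 2.35, 2.45, 5.45, 4.05)
jitterVal <- c(-0.19459509, -0.19571169, -0.17475060, -0.19599276,
               -0.17536634, -0.19429345, -0.17363951, -0.17446702,
               -0.13601392, -0.14484280, -0.12328058, -0.12427593,
               -0.12913823, -0.12042329, -0.14703381, -0.12603936,
               -0.09125372, -0.08213296, -0.09140868, -0.09728309,
               -0.08377205, -0.08514802, -0.08715795, -0.08932001,
               -0.02689549, -0.04717990, -0.03918013, -0.03068255,
               -0.02826789, -0.02345827, -0.03473678, -0.03369023)
df <- data.frame(Participant, StimulusType, Condition, Rating, jitterVal)

p <- ggplot(df, aes(x = StimulusType + jitterVal, y = Rating)) +
  # raw points at their jittered positions
  geom_point(aes(colour = Condition), size = 4.5, alpha = 0.3) +
  # faint line per participant within each condition
  geom_line(aes(group = interaction(Participant, Condition)), alpha = 0.05) +
  # mean line per condition; x is overridden with the un-jittered StimulusType
  stat_summary(aes(x = StimulusType, colour = Condition, group = Condition),
               fun = mean, geom = "line") +
  theme_bw()
print(p)
```

Note that the mean lines in this sketch run between x = 1 and x = 2, not between the jittered offsets of each condition. If the means need to line up with the jittered points, pre-computing the summary as in the earlier answers is probably the simpler route.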