Plotting mean values using ggplot2 line?
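As the shared dataset shows, the mean of $C1 is longer for the level "Geminate" of the factor $Consonant than for "Singleton".
I want to plot $Place on the x-axis and the mean of $C1 on the y-axis, with one line per level of the factor column $Consonant.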
Consonant Place C1 C1_xsampa
1 Singleton Bilabial 149.8670 tS
2 Geminate Bilabial 161.3066 tS
3 Singleton Retroflex 115.9713 f
4 Geminate Retroflex 143.3766 f
5 Singleton Dental 130.1839 k
6 Singleton Dental 118.7762 k
7 Geminate Dental 122.1802 k
8 Singleton Velar 112.3296 s
9 Geminate Velar 142.4654 s
10 Singleton Bilabial 245.7727 tS
11 Geminate Bilabial 288.2960 tS
12 Geminate Retroflex 128.9104 f
13 Singleton Dental 103.7978 k
14 Geminate Dental 135.6264 k
15 Singleton Dental 208.1685 k
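For your convenience, I have attached an image showing a similar plot. I have spent several days trying to figure this out. Any ideas would be very helpful.
############################ EDIT ############################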
Place C2_xsampa Consonant C1
1 Velar k Singleton 127
2 Velar k: Geminate 122
3 Bilabial p Singleton 129
4 Bilabial p: Geminate 171
5 Dental t_d Singleton 150
6 Dental t_d: Geminate 172
7 Dental t_d_h Singleton 121
8 Dental t_d_h: Geminate 123
9 Retroflex t` Singleton 109
10 Retroflex t`: Geminate 116
It is not clear what your expected output is, but perhaps this solution suits your use case:
library(tidyverse)
# summarySE func from
# http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/#Helper%20functions
## Gives count, mean, standard deviation, standard error of the mean, and confidence interval (default 95%).
## data: a data frame.
## measurevar: the name of a column that contains the variable to be summarized
## groupvars: a vector containing names of columns that contain grouping variables
## na.rm: a boolean that indicates whether to ignore NA's
## conf.interval: the percent range of the confidence interval (default is 95%)
summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=FALSE,
                      conf.interval=.95, .drop=TRUE) {
    # plyr functions are called as plyr::fun() instead of via library(plyr),
    # so that plyr does not mask dplyr verbs (e.g. summarise) from tidyverse

    # New version of length which can handle NA's: if na.rm==T, don't count them
    length2 <- function (x, na.rm=FALSE) {
        if (na.rm) sum(!is.na(x))
        else       length(x)
    }

    # This does the summary. For each group's data frame, return a vector with
    # N, mean, and sd
    datac <- plyr::ddply(data, groupvars, .drop=.drop,
        .fun = function(xx, col) {
            c(N    = length2(xx[[col]], na.rm=na.rm),
              mean = mean   (xx[[col]], na.rm=na.rm),
              sd   = sd     (xx[[col]], na.rm=na.rm)
            )
        },
        measurevar
    )

    # Rename the "mean" column
    datac <- plyr::rename(datac, c("mean" = measurevar))

    datac$se <- datac$sd / sqrt(datac$N)  # Calculate standard error of the mean

    # Confidence interval multiplier for standard error
    # Calculate t-statistic for confidence interval:
    # e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
    ciMult <- qt(conf.interval/2 + .5, datac$N-1)
    datac$ci <- datac$se * ciMult

    return(datac)
}
dat1 <- read.table(text = "rownumber Consonant Place C1 C1_xsampa
1 Singleton Bilabial 149.8670 tS
2 Geminate Bilabial 161.3066 tS
3 Singleton Retroflex 115.9713 f
4 Geminate Retroflex 143.3766 f
5 Singleton Dental 130.1839 k
6 Singleton Dental 118.7762 k
7 Geminate Dental 122.1802 k
8 Singleton Velar 112.3296 s
9 Geminate Velar 142.4654 s
10 Singleton Bilabial 245.7727 tS
11 Geminate Bilabial 288.2960 tS
12 Geminate Retroflex 128.9104 f
13 Singleton Dental 103.7978 k
14 Geminate Dental 135.6264 k
15 Singleton Dental 208.1685 k",
header = TRUE)
tgc <- summarySE(dat1, measurevar = "C1", groupvars = c("Consonant", "Place"))
ggplot(tgc, aes(x = Place, y = C1,
                colour = Consonant,
                group = Consonant)) +
  geom_errorbar(aes(ymin = C1 - se,
                    ymax = C1 + se),
                width = 0.2,
                position = position_dodge(width = 0.2)) +
  geom_line(position = position_dodge(width = 0.2)) +
  geom_point(position = position_dodge(width = 0.2))
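For readers who would rather not depend on plyr, the quantities that summarySE returns (N, mean, sd, standard error, and the confidence-interval half-width) can also be sketched with dplyr alone. This is a minimal sketch, not the cookbook's implementation: the names `dat_small`, `tgc_sketch` and `C1_mean` are made up for illustration, only the four Bilabial rows of the question's data are used, and the `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Illustrative subset: the four Bilabial rows from the question's data
dat_small <- data.frame(
  Consonant = c("Singleton", "Geminate", "Singleton", "Geminate"),
  Place     = "Bilabial",
  C1        = c(149.8670, 161.3066, 245.7727, 288.2960)
)

conf_interval <- 0.95  # same default as summarySE

tgc_sketch <- dat_small %>%
  group_by(Consonant, Place) %>%
  summarise(
    N       = n(),        # group size (no NA handling in this sketch)
    C1_mean = mean(C1),   # group mean
    C1_sd   = sd(C1),     # group standard deviation
    .groups = "drop"
  ) %>%
  mutate(
    se = C1_sd / sqrt(N),                             # standard error of the mean
    ci = se * qt(conf_interval / 2 + 0.5, df = N - 1) # CI half-width from the t distribution
  )

print(tgc_sketch)
```

Unlike summarySE, this sketch does not drop NA values; adding `na.rm = TRUE` to `mean()` and `sd()` and counting with `sum(!is.na(C1))` would be one way to get closer to that behaviour.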
The means should be calculated before the data is passed to ggplot. group_by() and summarise() from the dplyr package, along with the pipe operator %>% from the magrittr package, will manage this smoothly.
If the data above is stored in an object called "data", then
library(dplyr)
library(magrittr)
library(ggplot2)
avg_data <- data %>%
  group_by(Consonant, Place) %>%
  summarise(mean_C1 = mean(C1))
ggplot(avg_data, aes(Place, mean_C1, color = Consonant)) +
  geom_point()
should give what you are looking for: the means of C1 for each consonant and place.
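Since the question asks for lines rather than points only, the same group_by() and summarise() approach can be extended with geom_line(). The following is a sketch under stated assumptions: the object names `dat`, `avg_dat`, `p` and `p2` are illustrative, the data frame is retyped from the first table in the question, and the `fun` argument of stat_summary() assumes ggplot2 3.3.0 or later. Setting `group = Consonant` is what lets ggplot connect points across the discrete x-axis.

```r
library(dplyr)
library(ggplot2)

# Data retyped from the first table in the question
dat <- data.frame(
  Consonant = c("Singleton", "Geminate", "Singleton", "Geminate", "Singleton",
                "Singleton", "Geminate", "Singleton", "Geminate", "Singleton",
                "Geminate", "Geminate", "Singleton", "Geminate", "Singleton"),
  Place = c("Bilabial", "Bilabial", "Retroflex", "Retroflex", "Dental",
            "Dental", "Dental", "Velar", "Velar", "Bilabial",
            "Bilabial", "Retroflex", "Dental", "Dental", "Dental"),
  C1 = c(149.8670, 161.3066, 115.9713, 143.3766, 130.1839,
         118.7762, 122.1802, 112.3296, 142.4654, 245.7727,
         288.2960, 128.9104, 103.7978, 135.6264, 208.1685)
)

# Option 1: compute the means first, then draw one line per Consonant level
avg_dat <- dat %>%
  group_by(Consonant, Place) %>%
  summarise(mean_C1 = mean(C1), .groups = "drop")

p <- ggplot(avg_dat, aes(Place, mean_C1, colour = Consonant, group = Consonant)) +
  geom_line() +
  geom_point()

# Option 2: let ggplot2 compute the means from the raw data
p2 <- ggplot(dat, aes(Place, C1, colour = Consonant, group = Consonant)) +
  stat_summary(fun = mean, geom = "line") +
  stat_summary(fun = mean, geom = "point")

print(p)
print(p2)
```

Both options should draw the same lines; the first keeps the summary table available for inspection, while the second avoids the intermediate object.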