How to plot mean values as a line plot with ggplot2?

Question

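As the shared dataset shows, the mean of $C1 is longer for the factor level "Geminate" in $Consonant than for "Singleton".

I want to plot $Place on the x-axis and the mean of $C1 on the y-axis, grouped by the factor column $Consonant.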

   Consonant     Place       C1 C1_xsampa
1  Singleton  Bilabial 149.8670        tS
2   Geminate  Bilabial 161.3066        tS
3  Singleton Retroflex 115.9713         f
4   Geminate Retroflex 143.3766         f
5  Singleton    Dental 130.1839         k
6  Singleton    Dental 118.7762         k
7   Geminate    Dental 122.1802         k
8  Singleton     Velar 112.3296         s
9   Geminate     Velar 142.4654         s
10 Singleton  Bilabial 245.7727        tS
11  Geminate  Bilabial 288.2960        tS
12  Geminate Retroflex 128.9104         f
13 Singleton    Dental 103.7978         k
14  Geminate    Dental 135.6264         k
15 Singleton    Dental 208.1685         k

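For your convenience, I have attached an image showing a similar plot. I have spent several days trying to figure this out. Any ideas would be very helpful.

############################ EDIT ############################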

       Place C2_xsampa Consonant  C1
1      Velar         k Singleton 127
2      Velar        k:  Geminate 122
3   Bilabial         p Singleton 129
4   Bilabial        p:  Geminate 171
5     Dental       t_d Singleton 150
6     Dental      t_d:  Geminate 172
7     Dental     t_d_h Singleton 121
8     Dental    t_d_h:  Geminate 123
9  Retroflex        t` Singleton 109
10 Retroflex       t`:  Geminate 116

Answer 1

It is not clear what your expected output is, but perhaps this solution fits your use case:

library(tidyverse)
# summarySE func from
# http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/#Helper%20functions
## Gives count, mean, standard deviation, standard error of the mean, and confidence interval (default 95%).
##   data: a data frame.
##   measurevar: the name of a column that contains the variable to be summarized
##   groupvars: a vector containing names of columns that contain grouping variables
##   na.rm: a boolean that indicates whether to ignore NA's
##   conf.interval: the percent range of the confidence interval (default is 95%)

summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=FALSE,
                      conf.interval=.95, .drop=TRUE) {
  # plyr is called through its namespace rather than attached with library(plyr):
  # attaching it after the tidyverse masks dplyr verbs such as summarise()
  # New version of length which can handle NA's: if na.rm==T, don't count them
  length2 <- function (x, na.rm=FALSE) {
    if (na.rm) sum(!is.na(x))
    else       length(x)
  }
  
  # This does the summary. For each group's data frame, return a vector with
  # N, mean, and sd
  datac <- plyr::ddply(data, groupvars, .drop=.drop,
                       .fun = function(xx, col) {
                         c(N    = length2(xx[[col]], na.rm=na.rm),
                           mean = mean   (xx[[col]], na.rm=na.rm),
                           sd   = sd     (xx[[col]], na.rm=na.rm)
                         )
                       },
                       measurevar
  )
  
  # Rename the "mean" column
  datac <- plyr::rename(datac, c("mean" = measurevar))
  
  datac$se <- datac$sd / sqrt(datac$N)  # Calculate standard error of the mean
  
  # Confidence interval multiplier for standard error
  # Calculate t-statistic for confidence interval: 
  # e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
  ciMult <- qt(conf.interval/2 + .5, datac$N-1)
  datac$ci <- datac$se * ciMult
  return(datac)
}

dat1 <- read.table(text = "rownumber Consonant     Place       C1 C1_xsampa
1  Singleton  Bilabial 149.8670        tS
2   Geminate  Bilabial 161.3066        tS
3  Singleton Retroflex 115.9713         f
4   Geminate Retroflex 143.3766         f
5  Singleton    Dental 130.1839         k
6  Singleton    Dental 118.7762         k
7   Geminate    Dental 122.1802         k
8  Singleton     Velar 112.3296         s
9   Geminate     Velar 142.4654         s
10 Singleton  Bilabial 245.7727        tS
11  Geminate  Bilabial 288.2960        tS
12  Geminate Retroflex 128.9104         f
13 Singleton    Dental 103.7978         k
14  Geminate    Dental 135.6264         k
15 Singleton    Dental 208.1685         k",
                   header = TRUE)

tgc <- summarySE(dat1, measurevar = "C1", groupvars = c("Consonant", "Place"))

ggplot(tgc, aes(x=Place, y=C1,
                colour=Consonant,
                group = Consonant)) + 
  geom_errorbar(aes(ymin=C1-se,
                    ymax=C1+se),
                width = 0.2,
                position = position_dodge(width = 0.2)) +
  geom_line(position = position_dodge(width = 0.2)) +
  geom_point(position = position_dodge(width = 0.2))
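
The standard error and confidence interval that summarySE returns can be checked by hand for a single cell. The sketch below uses base R only and takes the four C1 values for Singleton / Dental from the question's data. The variable names (`x`, `n`, `m`, `se`, `ci`) are illustrative and not part of the original answer.

```r
# C1 values for Consonant == "Singleton", Place == "Dental" (rows 5, 6, 13, 15)
x <- c(130.1839, 118.7762, 103.7978, 208.1685)

n  <- length(x)                             # group size N
m  <- mean(x)                               # group mean
se <- sd(x) / sqrt(n)                       # standard error of the mean
ci <- se * qt(0.95 / 2 + 0.5, df = n - 1)   # half-width of the 95% confidence interval

c(N = n, mean = m, se = se, ci = ci)
```

Groups with a single observation (for example Singleton / Velar) have an undefined standard deviation, so their se is NA and ggplot2 will not be able to draw an error bar for them.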

Answer 2

The means should be computed before the data is passed to ggplot. group_by() and summarise() from the dplyr package, together with the pipe operator %>% from the magrittr package, handle this smoothly.

If the data above is stored in an object named "data", then

    library(dplyr)
    library(magrittr)
    library(ggplot2)

    avg_data <- data %>% group_by(Consonant, Place) %>%
        summarise(mean_C1 = mean(C1))

    ggplot(avg_data, aes(Place, mean_C1, color = Consonant)) +
        geom_point()

should give you what you are looking for: the mean of C1 for each Consonant and Place.
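
Since the question asks for lines, here is a minimal self-contained sketch of the same approach with the means connected per consonant type. It assumes the question's first dataset, read into an object called `dat` (a name chosen here for illustration), and adds `group = Consonant` together with `geom_line()`. The `.groups = "drop"` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)    # also provides the %>% pipe
library(ggplot2)

# The question's data, read from text so the example runs on its own
dat <- read.table(text = "rownumber Consonant Place C1 C1_xsampa
1  Singleton  Bilabial 149.8670 tS
2  Geminate   Bilabial 161.3066 tS
3  Singleton  Retroflex 115.9713 f
4  Geminate   Retroflex 143.3766 f
5  Singleton  Dental 130.1839 k
6  Singleton  Dental 118.7762 k
7  Geminate   Dental 122.1802 k
8  Singleton  Velar 112.3296 s
9  Geminate   Velar 142.4654 s
10 Singleton  Bilabial 245.7727 tS
11 Geminate   Bilabial 288.2960 tS
12 Geminate   Retroflex 128.9104 f
13 Singleton  Dental 103.7978 k
14 Geminate   Dental 135.6264 k
15 Singleton  Dental 208.1685 k", header = TRUE)

# One mean per Consonant x Place combination
avg_data <- dat %>%
  group_by(Consonant, Place) %>%
  summarise(mean_C1 = mean(C1), .groups = "drop")

# group = Consonant tells geom_line() which points to connect
# across the discrete x-axis
p <- ggplot(avg_data, aes(x = Place, y = mean_C1,
                          colour = Consonant, group = Consonant)) +
  geom_line() +
  geom_point()

print(p)
```

Without the `group` aesthetic, `geom_line()` on a discrete x-axis treats every point as its own group and draws no connecting lines.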

2 answers

Answer 1
0 2021-06-11 06:17:05

Answer 2
0 2021-06-11 06:25:23
