Mean value in ggplot grouped box plot (R)

Although this question was asked previously here, I have a new data frame, hence a new question. A sample of the data is shown below:

ID,Region,Dimension,BlogsInd.,BlogsNews,BlogsTech,Columns
1,PK,Dim1,-4.75,NA,NA,NA
2,PK,Dim1,-5.69,NA,NA,NA
3,PK,Dim1,-0.27,NA,NA,NA
4,PK,Dim1,-2.76,NA,NA,NA
5,PK,Dim1,-8.24,NA,NA,NA
6,PK,Dim1,-12.51,NA,NA,NA
7,PK,Dim1,-1.28,NA,NA,NA
8,PK,Dim1,0.95,NA,NA,NA
9,PK,Dim1,-5.96,NA,NA,NA
10,PK,Dim1,-8.81,NA,NA,NA
11,PK,Dim1,-8.46,NA,NA,NA
12,PK,Dim1,-6.15,NA,NA,NA
13,PK,Dim1,-13.98,NA,NA,NA
14,PK,Dim1,-16.43,NA,NA,NA
15,PK,Dim1,-4.09,NA,NA,NA
16,PK,Dim1,-11.06,NA,NA,NA
17,PK,Dim1,-9.04,NA,NA,NA
18,PK,Dim1,-8.56,NA,NA,NA
19,PK,Dim1,-8.13,NA,NA,NA
20,PK,Dim2,-14.46,NA,NA,NA
21,PK,Dim2,-4.21,NA,NA,NA
22,PK,Dim2,-4.96,NA,NA,NA
23,PK,Dim2,-5.48,NA,NA,NA
24,PK,Dim2,-4.53,NA,NA,NA
25,PK,Dim2,6.31,NA,NA,NA
26,PK,Dim2,-11.16,NA,NA,NA
27,PK,Dim2,-1.27,NA,NA,NA
28,PK,Dim2,-11.49,NA,NA,NA
29,PK,Dim2,-0.9,NA,NA,NA
30,PK,Dim2,-12.27,NA,NA,NA
31,PK,Dim2,6.85,NA,NA,NA
32,PK,Dim2,-5.21,NA,NA,NA
33,PK,Dim2,-1.06,NA,NA,NA
34,PK,Dim2,-2.6,NA,NA,NA
35,PK,Dim2,-0.95,NA,NA,NA
36,PK,Dim3,-0.82,NA,NA,NA
37,PK,Dim3,-7.65,NA,NA,NA
38,PK,Dim3,0.64,NA,NA,NA
39,PK,Dim3,-2.25,NA,NA,NA
40,PK,Dim3,-1.58,NA,NA,NA
41,PK,Dim3,-5.73,NA,NA,NA
42,PK,Dim3,0.37,NA,NA,NA
43,PK,Dim3,-5.46,NA,NA,NA
44,PK,Dim3,-3.48,NA,NA,NA
45,PK,Dim3,0.88,NA,NA,NA
46,PK,Dim3,-2.11,NA,NA,NA
47,PK,Dim3,-10.13,NA,NA,NA
48,PK,Dim3,-2.08,NA,NA,NA
49,PK,Dim3,-4.33,NA,NA,NA
50,PK,Dim3,1.09,NA,NA,NA
51,PK,Dim3,-4.23,NA,NA,NA
52,PK,Dim3,-1.46,NA,NA,NA
53,PK,Dim3,9.37,NA,NA,NA
54,PK,Dim3,5.84,NA,NA,NA
55,PK,Dim3,8.21,NA,NA,NA
56,PK,Dim3,7.34,NA,NA,NA
57,PK,Dim4,1.83,NA,NA,NA
58,PK,Dim4,14.39,NA,NA,NA
59,PK,Dim4,22.02,NA,NA,NA
60,PK,Dim4,4.83,NA,NA,NA
61,PK,Dim4,-3.24,NA,NA,NA
62,PK,Dim4,-5.69,NA,NA,NA
63,PK,Dim4,-22.92,NA,NA,NA
64,PK,Dim4,0.41,NA,NA,NA
65,PK,Dim4,-4.42,NA,NA,NA
66,PK,Dim4,-10.72,NA,NA,NA
67,PK,Dim4,-11.29,NA,NA,NA
68,PK,Dim4,-2.89,NA,NA,NA
69,PK,Dim4,-7.59,NA,NA,NA
70,PK,Dim4,-7.45,NA,NA,NA
71,US,Dim1,-12.49,NA,NA,NA
72,US,Dim1,-11.59,NA,NA,NA
73,US,Dim1,-4.6,NA,NA,NA
74,US,Dim1,-22.83,NA,NA,NA
75,US,Dim1,-4.83,NA,NA,NA
76,US,Dim1,-14.76,NA,NA,NA
77,US,Dim1,-15.93,NA,NA,NA
78,US,Dim1,-2.78,NA,NA,NA
79,US,Dim1,-16.39,NA,NA,NA
80,US,Dim1,-15.22,NA,NA,NA
81,US,Dim1,3.25,NA,NA,NA
82,US,Dim1,-2.73,NA,NA,NA
83,US,Dim1,0.96,NA,NA,NA
84,US,Dim1,-1.12,NA,NA,NA
85,US,Dim1,-0.33,NA,NA,NA
86,US,Dim1,-6.45,NA,NA,NA
87,US,Dim1,2.52,NA,NA,NA
88,US,Dim1,3.18,NA,NA,NA
89,US,Dim1,4.65,NA,NA,NA
90,US,Dim2,-1.75,NA,NA,NA
91,US,Dim2,-0.22,NA,NA,NA
92,US,Dim2,8.16,NA,NA,NA
93,US,Dim2,1.89,NA,NA,NA
94,US,Dim2,4.31,NA,NA,NA
95,US,Dim2,-0.41,NA,NA,NA
96,US,Dim2,-23.02,NA,NA,NA
97,US,Dim2,3.87,NA,NA,NA
98,US,Dim2,-4.76,NA,NA,NA
99,US,Dim2,4.95,NA,NA,NA
100,US,Dim2,4.78,NA,NA,NA
101,US,Dim2,-15.11,NA,NA,NA
102,US,Dim2,-3.74,NA,NA,NA
103,US,Dim2,-6.15,NA,NA,NA
104,US,Dim2,-8.33,NA,NA,NA
105,US,Dim2,-5.55,NA,NA,NA
106,US,Dim3,-5.1,NA,NA,NA
107,US,Dim3,-0.41,NA,NA,NA
108,US,Dim3,-8,NA,NA,NA
109,US,Dim3,-11.8,NA,NA,NA
110,US,Dim3,-10.39,NA,NA,NA
111,US,Dim3,-14.98,NA,NA,NA
112,US,Dim3,-13.14,NA,NA,NA
113,US,Dim3,-16.06,NA,NA,NA
114,US,Dim3,-16.75,NA,NA,NA
115,US,Dim3,-17.58,NA,NA,NA
116,US,Dim3,-13.12,NA,NA,NA
117,US,Dim3,-15.69,NA,NA,NA
118,US,Dim3,-9.29,NA,NA,NA
119,US,Dim3,-14.93,NA,NA,NA
120,US,Dim3,-18.75,NA,NA,NA
121,US,Dim3,-16.15,NA,NA,NA
122,US,Dim3,-14.38,NA,NA,NA
123,US,Dim3,-11.33,NA,NA,NA
124,US,Dim3,2.06,NA,NA,NA
125,US,Dim3,1.55,NA,NA,NA
126,US,Dim3,3.17,NA,NA,NA
127,US,Dim4,3.33,NA,NA,NA
128,US,Dim4,-3.31,NA,NA,NA
129,US,Dim4,5.67,NA,NA,NA
130,US,Dim4,-1.94,NA,NA,NA
131,US,Dim4,-4.2,NA,NA,NA
132,US,Dim4,-13.53,NA,NA,NA
133,US,Dim4,-10.84,NA,NA,NA
134,US,Dim4,-1.04,NA,NA,NA
135,US,Dim4,-8.02,NA,NA,NA
136,US,Dim4,-14.65,NA,NA,NA
137,US,Dim4,-6.39,NA,NA,NA
138,US,Dim4,-3.69,NA,NA,NA
139,US,Dim4,-11.62,NA,NA,NA
140,US,Dim4,-3.02,NA,NA,NA
141,US,Dim4,-28.84,NA,NA,NA

I am trying to create a grouped box plot (using a function) with the mean value shown in the box plot for each group. The code is below:

library(ggplot2)
library(reshape2)  # provides melt()

# min and max are only needed by the commented-out scale_y_continuous() call
plotgraph <- function(x, y, colour, min, max) {
  plot1 <- ggplot(dims_Blog, aes_string(x = x, y = y, fill = colour)) +
    geom_boxplot() +
    labs(color = colour) +
    #scale_y_continuous(breaks=c(seq(min,max,5)), limits = c(min, max))+
    labs(x = "Dimensions", y = "Dimension Score") +
    scale_fill_grey(start = 0.3, end = 0.7) +
    theme_grey() +
    theme(legend.justification = c(1, 1), legend.position = c(1, 1)) +
    # tapply() computes the group means; melt() reshapes them to one row per Dimension/Region pair
    geom_text(data = melt(with(dims_Blog, tapply(eval(parse(text = y)), list(eval(parse(text = x)), eval(parse(text = colour))), mean)), varnames = c("Dimension", "Region"), value.name = "med"),
              aes_string(y = "med", x = x, label = "round(med,3)"), position = position_dodge(width = 0.8), size = 3, vjust = -0.5, colour = "white")
  return(plot1)
}
plot1 <- plotgraph("Dimension", "BlogsInd.", "Region")

I am having trouble understanding the part starting with "geom_text", where the data for the mean values is passed in. The data frame is being melted (converted from wide to long format), which I think is not required here because the data is already in long format. I tried to use the 'stat_summary' function, with no success. Any help in finding a solution would be much appreciated.
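For anyone puzzling over the same "data =" argument, the tapply/melt step can be run on its own. The sketch below uses a made-up toy data frame (the name toy and the column Score are illustrative, not from the question) and assumes the reshape2 package is installed. It suggests that what gets melted is not the original data frame but the wide matrix of means returned by tapply().

```r
library(reshape2)  # provides melt()

# Toy data with made-up values (not taken from the question)
toy <- data.frame(
  Dimension = c("Dim1", "Dim1", "Dim2", "Dim2", "Dim1", "Dim1", "Dim2", "Dim2"),
  Region    = c("PK", "PK", "PK", "PK", "US", "US", "US", "US"),
  Score     = c(1, 3, 5, 7, 2, 4, 6, 10)
)

# Step 1: tapply() returns a wide matrix of means,
# with Dimension in the rows and Region in the columns
wide_means <- with(toy, tapply(Score, list(Dimension, Region), mean))
print(wide_means)

# Step 2: melt() turns that matrix into one row per Dimension/Region pair,
# which is the shape geom_text() needs
long_means <- melt(wide_means, varnames = c("Dimension", "Region"), value.name = "med")
print(long_means)
```

So the melt is there to reshape the summary table, not the raw observations; any other way of producing one row of means per group would serve the same purpose.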

Indeed, melting the data seems superfluous. Rather, you should summarise the data, for instance with dplyr:

library(ggplot2)
library(dplyr)
ggplot(dims_Blog, aes(x=Dimension, y=BlogsInd., fill=Region)) +
  geom_boxplot() +
  geom_text(data = dims_Blog %>% group_by(Dimension, Region) %>% summarise(mean = mean(BlogsInd.)), 
            aes(x = Dimension, y = mean, label = round(mean, 2)), 
            position = position_dodge(width = .7))

And then fine-tune your positioning / formatting.
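Since the question mentions trying stat_summary without success, here is a minimal sketch of that route as an alternative to pre-computing the means. It assumes ggplot2 3.3.0 or later (for the fun argument and after_stat()), and it uses a made-up toy data frame rather than the data from the question.

```r
library(ggplot2)

# Toy data with made-up values (not taken from the question)
toy <- data.frame(
  Dimension = c("Dim1", "Dim1", "Dim2", "Dim2", "Dim1", "Dim1", "Dim2", "Dim2"),
  Region    = c("PK", "PK", "PK", "PK", "US", "US", "US", "US"),
  Score     = c(1, 3, 5, 7, 2, 4, 6, 10)
)

p <- ggplot(toy, aes(x = Dimension, y = Score, fill = Region)) +
  geom_boxplot() +
  # stat_summary() computes the mean per Dimension/Region group and
  # draws it as text; after_stat(y) refers to the computed mean
  stat_summary(fun = mean, geom = "text",
               aes(label = round(after_stat(y), 2)),
               position = position_dodge(width = 0.75),
               vjust = -0.5)
print(p)
```

The dodge width of 0.75 is chosen to match the default dodging of geom_boxplot(), so each label should sit over its own box; adjust it if the boxes use a different width.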

Edit: I did not click through to your previous question, which already extends the above example to avoid non-standard evaluation (NSE) in a programming context. So use group_by_ and aes_string in your function.
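Note that group_by_() and aes_string() have since been deprecated in dplyr and ggplot2. A rough sketch of the same idea with the .data pronoun is shown below, assuming dplyr 1.0.0 or later and ggplot2 3.0.0 or later. The function name plot_group_means, its arguments, and the toy data are hypothetical, and na.rm = TRUE is an added assumption about how missing values should be handled.

```r
library(ggplot2)
library(dplyr)

# Hypothetical helper: column names are passed as strings
plot_group_means <- function(data, x, y, fill) {
  # One row of means per x/fill combination
  means <- data %>%
    group_by(across(all_of(c(x, fill)))) %>%
    summarise(mean = mean(.data[[y]], na.rm = TRUE), .groups = "drop")  # na.rm is an assumption

  ggplot(data, aes(x = .data[[x]], y = .data[[y]], fill = .data[[fill]])) +
    geom_boxplot() +
    # x and fill are inherited from the main aes(); only y and label change
    geom_text(data = means,
              aes(y = mean, label = round(mean, 2)),
              position = position_dodge(width = 0.75), vjust = -0.5) +
    labs(x = x, y = y, fill = fill)
}

# Usage with made-up toy data (not taken from the question)
toy <- data.frame(
  Dimension = c("Dim1", "Dim1", "Dim2", "Dim2", "Dim1", "Dim1", "Dim2", "Dim2"),
  Region    = c("PK", "PK", "PK", "PK", "US", "US", "US", "US"),
  Score     = c(1, 3, 5, 7, 2, 4, 6, 10)
)
p <- plot_group_means(toy, "Dimension", "Score", "Region")
print(p)
```

With the data from the question, the call would presumably be plot_group_means(dims_Blog, "Dimension", "BlogsInd.", "Region").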
