
Showing p-value on grouped boxplot in R

I want to show p-values above my data (possibly with arcs). My data is below:

ID  Blog    Region  Dimension   Score
1   Blog1   PK  Info. vs. P. Focus  -4.75
2   Blog1   PK  Info. vs. P. Focus  -5.69
3   Blog1   PK  Info. vs. P. Focus  -0.27
4   Blog1   PK  Info. vs. P. Focus  -2.76
5   Blog1   PK  Info. vs. P. Focus  -8.24
6   Blog1   PK  Addressee Focus -12.51
7   Blog1   PK  Addressee Focus -1.28
8   Blog1   PK  Addressee Focus 0.95
9   Blog1   PK  Addressee Focus -5.96
10  Blog1   PK  Addressee Focus -8.81
11  Blog1   PK  Thematic Variation  -8.46
12  Blog1   PK  Thematic Variation  -6.15
13  Blog1   PK  Thematic Variation  -13.98
14  Blog1   PK  Thematic Variation  -16.43
15  Blog1   PK  Narrative Style -4.09
16  Blog1   PK  Narrative Style -11.06
17  Blog1   PK  Narrative Style -9.04
18  Blog1   PK  Narrative Style -8.56
19  Blog1   PK  Narrative Style -8.13
20  Blog1   PK  Narrative Style -14.46
21  Blog1   PK  Info. vs. P. Focus  -4.21
22  Blog1   PK  Info. vs. P. Focus  -4.96
23  Blog1   PK  Info. vs. P. Focus  -5.48
24  Blog1   PK  Info. vs. P. Focus  -4.53
25  Blog1   PK  Info. vs. P. Focus  6.31
26  Blog1   PK  Addressee Focus -11.16
27  Blog1   PK  Addressee Focus -1.27
28  Blog1   PK  Addressee Focus -11.49
29  Blog1   PK  Addressee Focus -0.9
30  Blog1   PK  Addressee Focus -12.27
31  Blog1   PK  Thematic Variation  6.85
32  Blog1   PK  Thematic Variation  -5.21
33  Blog1   PK  Thematic Variation  -1.06
34  Blog1   PK  Thematic Variation  -2.6
35  Blog1   PK  Narrative Style -0.95
36  Blog1   PK  Narrative Style -0.82
37  Blog1   PK  Narrative Style -7.65
38  Blog1   PK  Narrative Style 0.64
39  Blog1   PK  Narrative Style -2.25
40  Blog1   PK  Narrative Style -1.58
41  Blog1   PK  Info. vs. P. Focus  -5.73
42  Blog1   PK  Info. vs. P. Focus  0.37
43  Blog1   PK  Info. vs. P. Focus  -5.46
44  Blog1   PK  Info. vs. P. Focus  -3.48
45  Blog1   PK  Info. vs. P. Focus  0.88
46  Blog1   PK  Addressee Focus -2.11
47  Blog1   PK  Addressee Focus -10.13
48  Blog1   PK  Addressee Focus -2.08
49  Blog1   PK  Addressee Focus -4.33
50  Blog1   PK  Addressee Focus 1.09
51  Blog1   US  Thematic Variation  -4.23
52  Blog1   US  Thematic Variation  -1.46
53  Blog1   US  Thematic Variation  9.37
54  Blog1   US  Thematic Variation  5.84
55  Blog1   US  Narrative Style 8.21
56  Blog1   US  Narrative Style 7.34
57  Blog1   US  Narrative Style 1.83
58  Blog1   US  Narrative Style 14.39
59  Blog1   US  Narrative Style 22.02
60  Blog1   US  Narrative Style 4.83

The code is below:

get_wraper <- function(width) {
  function(x) {
    lapply(strwrap(x, width = width, simplify = FALSE), paste, collapse="\n")
  }
}
plotgraph <- function(x, y, colour, min, max, incr, p_values)
{
  plot1 <- ggplot(dims_Blog, aes_string(x = x, y = y, fill = colour)) +
    geom_boxplot()+
    labs(color=colour) +
    labs(x="Dimensions", y="Score") +
    scale_fill_grey(start = 0.3, end = 0.6) +
    theme_grey()+
    theme(legend.justification = c(1, 1), legend.position = c(1, 1)) +
    scale_x_discrete(labels = get_wraper(10))+
    scale_y_continuous(breaks=c(seq(min,max,incr)), limits = c(min, max))+
    theme(panel.grid.minor.y = element_blank(), panel.grid.major.x = element_blank())+
    geom_text(data = dims_Blog %>% group_by_(x, colour) %>% summarise_(mean=paste("mean(",y,", na.rm=TRUE)")), aes_string(x=x, y="mean", label="round(mean,3)"), position=position_dodge(width=0.8), size = 3, vjust = -0.5, colour="white")+
    geom_text(data = p_values, aes_string(x="Dimension", y="height", label="val"))
  return(plot1)
}

Plotting the graph:

plot1 <- plotgraph("Dimension", "Blog1", "Region", -30, 50, 10, p_val1)
plot1

Data frame for the p-values:

Dimensions <- c("Info. vs. P. Focus", "Addressee Focus", "Thematic Variation", "Narrative Style")
val <- c("0.184", "0.079", "0.044", "\u003C.0001")
height <- c(48, 48, 48, 48)
p_val1 <-data.frame(Dimensions, val, height)

Unfortunately, I am not sure how to define geom_text for showing the p-values. I get this error:

Error: Aesthetics must be either length 1 or the same as the data (8): label, x, y, fill

I have tried going through a few similar questions, but my limited knowledge did not let me solve the problem. Any ideas?

It seems that you were very close in the original post: the error message says that you need to provide the label, x, y and fill aesthetics for every layer. (That is because you defined these aesthetics in the main ggplot call, so every layer inherits them.) The layer that you use for the p-values supplies only three of them, in aes_string(x="Dimensions", y="height", label="val"). Try adding a constant fill, like:

+ geom_text(data = p_values, aes_string(x="Dimensions", y="height", label="val"), fill="black")

Or you can move the aesthetics definitions out of the main call and into the layer that needs them, if you are not sharing them across multiple layers anyway:

ggplot(dims_Blog) +
    geom_boxplot(aes_string(x = x, y = y, fill = colour)) +
    ... +
    geom_text(data = p_values, aes_string(x="Dimensions", y="height", label="val"))

Secondly, there's a typo: you refer to Dimension in the plotting call, but to Dimensions when creating the p-value data frame.

I haven't tested this, since I don't have the full dataset, so something additional might come up.
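One more option that may be worth trying is the inherit.aes argument that ggplot2 layers accept: setting it to FALSE keeps the p-value layer from picking up the fill mapping defined in the main ggplot call. Below is a minimal sketch under stated assumptions: the toy data frame, its random scores and the label heights are made up for illustration and only stand in for dims_Blog, while the two p-value labels are taken from the question.

```r
library(ggplot2)

# Toy data standing in for dims_Blog (scores are random, for illustration only).
set.seed(1)
toy <- data.frame(
  Dimension = rep(c("Addressee Focus", "Narrative Style"), each = 20),
  Region    = rep(c("PK", "US"), times = 20),
  Score     = rnorm(40)
)

# One row per Dimension; note the column is named Dimension, as in the plot.
p_values <- data.frame(
  Dimension = c("Addressee Focus", "Narrative Style"),
  val       = c("0.079", "<.0001"),
  height    = c(4, 4)   # assumed label height, chosen by eye for the toy data
)

p <- ggplot(toy, aes(x = Dimension, y = Score, fill = Region)) +
  geom_boxplot() +
  # inherit.aes = FALSE: this layer uses only the aesthetics given here,
  # so it does not look for a Region column in p_values.
  geom_text(
    data = p_values,
    aes(x = Dimension, y = height, label = val),
    inherit.aes = FALSE
  )
p
```

With this approach the p-value data frame can keep one row per dimension, because the layer no longer has to match the grouping used for the boxes.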

I have used annotate instead of geom_text; it takes three separate vectors (?) instead of a data frame. The code is shown below:

# Define the vectors first, then add the annotate() layer to the plot
Dimension <- c("Info. vs. P. Focus", "Addressee Focus", "Thematic Variation", "Narrative Style")
height <- c(48, 48, 48, 48)
val <- c("p=0.184", "p=0.079", "p=0.044", "p\u003C.0001")
annotate("text", x = Dimension, y = height, label = val)

This is not a very good solution, but at least it printed the values I wanted. I didn't know how to expand these vectors to length 8 (the size of my previous data frames), which was part of the problem as well.
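Since the question also asked about arcs, the same annotate approach can probably be extended with a second annotate call that draws a flat bracket over each pair of boxes. This is only a sketch under assumptions: the toy data, the bracket height and the half-width of 0.19 (roughly where two dodged boxes sit with the default boxplot settings) are illustrative choices, and straight segments are used here rather than true arcs.

```r
library(ggplot2)

# Toy data for illustration only; the dimension names are already in sorted order.
set.seed(2)
dims <- c("Addressee Focus", "Narrative Style")
toy <- data.frame(
  Dimension = rep(dims, each = 20),
  Region    = rep(c("PK", "US"), times = 20),
  Score     = rnorm(40)
)

# On a discrete axis the categories sit at positions 1, 2, ... in sorted order.
centre <- seq_along(dims)
half   <- 0.19                      # assumed half-width of each bracket
top    <- 4                         # assumed bracket height for the toy data
labels <- c("p=0.079", "p<.0001")   # labels taken from the answer above

p <- ggplot(toy, aes(x = Dimension, y = Score, fill = Region)) +
  geom_boxplot() +
  # One horizontal segment per dimension, spanning its two boxes.
  annotate("segment", x = centre - half, xend = centre + half,
           y = top, yend = top) +
  # The p-value label sits just above the middle of each segment.
  annotate("text", x = centre, y = top, label = labels, vjust = -0.5)
p
```

Because annotate takes plain vectors, each vector only needs one entry per dimension; nothing has to be expanded to match the eight Dimension-by-Region groups.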
