繁体   English   中英

ggplot2 stat_summary 首先转换数据,在对数刻度中标记错误条

[英]ggplot2 stat_summary transforms data first, which labels errorbars wrong in logarithmic scale

我有一个匹配的“前”和“后”数据集,并希望 plot 几何平均值和 SD 在对数刻度的 plot 行中(见下图)。 由于stat_summary() function 转换数据然后进行计算,所以左图中绘制的几何平均值和 SD 不正确。 几何平均 SD 在对数尺度上应该是对称的,而在 plot(左图中的“pre”组)中则不是。

我知道coord_trans()不进行计算,应该完成这项工作。 然而,对数刻度中的连接线不是直的,这对于可视化来说看起来有点奇怪。

是否有针对 plot 几何平均值和 SD 的解决方案,该方法是根据原始数据以及对数刻度的直线连接线计算得出的?

data_raw = data.frame(ID=c(1,2,3,4,5,6,7,8,9,10,11,12), 
                      Group=c(rep("before",12),rep("post",12)),
                      Values=c(15,60,70,300,40,35,100,1520,102,172,141,103,1200,130,
                               118,158,199,5804,1258,4582,4052,3332,2202,5129))

data_sorted <- data_raw %>% arrange(ID, Group)

left=ggplot(data_sorted, aes(Group,Values))+
  geom_line(aes(group = ID),colour = "gray",linetype= 2,position = position_jitter(width = 0.25, seed = 1))+
  geom_point(size = 1.2, position = position_jitter(width = 0.25, seed = 1))+
  stat_summary(fun = function(x) {exp(mean(log(x)))}, geom="crossbar")+
  stat_summary(fun = function(x) {exp(mean(log(x)))*exp(sd(log(x)))}, geom="crossbar", width=0.4, size=0.1)+
  stat_summary(fun = function(x) {exp(mean(log(x)))/exp(sd(log(x)))}, geom="crossbar", width=0.4, size=0.1)+
  scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x), labels = trans_format("log10", math_format(10^.x)))+
  theme(text = element_text(size = 20))

right=ggplot(data_sorted, aes(Group,Values))+
  geom_line(aes(group = ID),colour = "gray",linetype= 2,position = position_jitter(width = 0.25, seed = 1))+
  geom_point(size = 1.2, position = position_jitter(width = 0.25, seed = 1))+
  stat_summary(fun = function(x) {exp(mean(log(x)))}, geom="crossbar")+
  stat_summary(fun = function(x) {exp(mean(log(x)))*exp(sd(log(x)))}, geom="crossbar", width=0.4, size=0.1)+
  stat_summary(fun = function(x) {exp(mean(log(x)))/exp(sd(log(x)))}, geom="crossbar", width=0.4, size=0.1)+
  coord_trans(y="log10")+
  scale_y_continuous(breaks = trans_breaks("log10", function(x) 10^x), labels = trans_format("log10", math_format(10^.x)))+
  theme(text = element_text(size = 20))

ggarrange(left,right)

在此处输入图像描述

只是为您指出错误,“post”组的实际几何平均值(粗横线)> 1000(右图)。 但是,它在左图中显示 <1000。

欧几里得空间中的几何平均值与对数空间中的算术平均值相同。 因为 y 尺度对数转换您的数据,所以您只需将exp(mean(log(x)))*exp(sd(log(x)))等函数转换为mean(x) + sd(x) (回想一下exp(log(A) + log(B)) == A * B )。

library(ggplot2)
library(ggpubr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(scales)

data_raw = data.frame(ID=c(1,2,3,4,5,6,7,8,9,10,11,12), 
                      Group=c(rep("before",12),rep("post",12)),
                      Values=c(15,60,70,300,40,35,100,1520,102,172,141,103,1200,130,
                               118,158,199,5804,1258,4582,4052,3332,2202,5129))

data_sorted <- data_raw %>% arrange(ID, Group)

left=ggplot(data_sorted, aes(Group,Values))+
  geom_line(aes(group = ID),colour = "gray",linetype= 2,position = position_jitter(width = 0.25, seed = 1))+
  geom_point(size = 1.2, position = position_jitter(width = 0.25, seed = 1))+
  stat_summary(fun = function(x) {mean(x)}, geom="crossbar")+
  stat_summary(fun = function(x) {mean(x) + sd(x)}, geom="crossbar", width=0.4, size=0.1)+
  stat_summary(fun = function(x) {mean(x) - sd(x)}, geom="crossbar", width=0.4, size=0.1)+
  scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x), labels = trans_format("log10", math_format(10^.x)))+
  theme(text = element_text(size = 20))

right=ggplot(data_sorted, aes(Group,Values))+
  geom_line(aes(group = ID),colour = "gray",linetype= 2,position = position_jitter(width = 0.25, seed = 1))+
  geom_point(size = 1.2, position = position_jitter(width = 0.25, seed = 1))+
  stat_summary(fun = function(x) {exp(mean(log(x)))}, geom="crossbar")+
  stat_summary(fun = function(x) {exp(mean(log(x)))*exp(sd(log(x)))}, geom="crossbar", width=0.4, size=0.1)+
  stat_summary(fun = function(x) {exp(mean(log(x)))/exp(sd(log(x)))}, geom="crossbar", width=0.4, size=0.1)+
  coord_trans(y="log10")+
  scale_y_continuous(breaks = trans_breaks("log10", function(x) 10^x), labels = trans_format("log10", math_format(10^.x)))+
  theme(text = element_text(size = 20))

ggarrange(left,right)

代表 package (v2.0.1) 于 2022 年 9 月 27 日创建

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM