ggplot with stat_summary for mean along time represented by days

I have this data representing the value of a variable Q1 over time. The time is not given as dates; it is given as the number of days since one event.

https://www.mediafire.com/file/yfzbx67yivvvkgv/dat.xlsx/file

I'm trying to plot the mean value of Q1 over time, as in this question:

Plotting average of multiple variables in time-series using ggplot

I'm using this code:

library(Hmisc)
ggplot(dat, aes(x = days, y = Q1, colour = type, group = type)) +
  stat_summary(fun.data = "mean_cl_boot", geom = "smooth")

Besides the code, which does not appear to work with the new ggplot2 version, you also have the problem that your data is not really suited to that kind of plot. This code does what you wanted:

dat <- rio::import("dat.xlsx")

library(ggplot2)
library(dplyr)

# mean_cl_boot is a ggplot2 wrapper that needs the Hmisc package installed
dat %>% 
  ggplot(aes(x = days, y = Q1, colour = type, group = type)) +
  geom_smooth(stat = 'summary', fun.data = mean_cl_boot)

But the plot doesn't really tell you anything, simply because there aren't enough values in your data. Most often there seems to be only one value per day, the values jump quickly up and down, and the gaps between days are sometimes quite big.
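One way to check how many observations each day actually has is to count rows per type and day. The sketch below uses a small made-up data frame (`toy` and its values are hypothetical, not taken from dat.xlsx) and base R only; with the real data you would pass `dat` instead.

```r
# Hypothetical toy data with the same column names as in the question
toy <- data.frame(
  type = c("a", "a", "a", "b", "b"),
  days = c(1, 1, 40, 3, 250),
  Q1   = c(2, 5, 1, 4, 3)
)

# Number of Q1 observations for every type/day combination
n_per_day <- aggregate(Q1 ~ type + days, data = toy, FUN = length)
names(n_per_day)[3] <- "n_obs"
n_per_day

# Share of type/day combinations that have only a single observation
share_single <- mean(n_per_day$n_obs == 1)
share_single
```

If most combinations have an `n_obs` of 1, the "mean" per day is just the raw value, and no interval can be estimated for it.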

You can see this when you group the values into timespans instead. Here I used round(days, -2), which rounds to the nearest 100 (e.g., 756 is turned into 800, 301 becomes 300, 49 becomes 0):

dat %>% 
  mutate(days = round(days, -2)) %>% 
  ggplot(aes(x = days, y = Q1, colour = type, group = type)) +
  geom_smooth(stat = 'summary', fun.data = mean_cl_boot)
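To see what the rounding step does on its own, here is a quick check with the three example values mentioned above (the vector name `example_days` is just for illustration):

```r
# A negative digits argument rounds to the left of the decimal point;
# -2 means "to the nearest hundred"
example_days <- c(756, 301, 49)
binned <- round(example_days, -2)
binned
# → 800 300 0
```

All days that round to the same hundred end up at the same x position, so each point on the line summarises several observations.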

This should be the same plot as the linked one, but with huge confidence intervals. That is not surprising since, as mentioned, the values alternate quickly between 1 and 5. I hope that helps.
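To illustrate why a handful of values jumping between 1 and 5 gives such wide intervals, here is a rough percentile bootstrap written in base R. It is only a sketch of the general idea behind a bootstrapped confidence interval of the mean, not Hmisc's implementation; the function `boot_mean_ci` and the sample values are made up for this example.

```r
set.seed(1)  # make the resampling reproducible

# Percentile bootstrap interval for the mean (illustrative helper)
boot_mean_ci <- function(x, B = 1000, conf = 0.95) {
  # Resample x with replacement B times and keep each resample's mean
  means <- replicate(B, mean(sample(x, replace = TRUE)))
  alpha <- (1 - conf) / 2
  c(mean  = mean(x),
    lower = unname(quantile(means, alpha)),
    upper = unname(quantile(means, 1 - alpha)))
}

few  <- c(1, 5, 2, 5, 1)  # a bin with only five values
many <- rep(few, 20)      # same spread, but 100 values

ci_few  <- boot_mean_ci(few)
ci_many <- boot_mean_ci(many)

# Width of each interval: the small bin is much less certain
width_few  <- unname(ci_few["upper"] - ci_few["lower"])
width_many <- unname(ci_many["upper"] - ci_many["lower"])
c(few = width_few, many = width_many)
```

Both samples have the same mean, but the interval for the five-value bin is several times wider, which is the situation in most bins of this data.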


 