
R and ggplot - plot distribution and line

I have a question that seems very simple, but I cannot figure it out. I have a dataset with treatments in a given year. There are 3 different treatments. I would like to create two plots:

One that looks like this:

[image: area chart]

And one that looks like this:

[image: scatter plot]

except that I would like to stack multiple treatments (three instead of just the one in the example).

Let's say we have the following df:

y=c(2001,2001,2001,2001,2002,2002,2002,2003,2003,2003,2003,2004,2004)
t=c("a","a","b","c","a","a","b","c","a","a","b","c","b")
df=data.frame(y,t)

I've tried using

geom_plot()

But it does not work. The closest I could get to having R do the proportions for me is the following stacked histogram, using code from another post:

p+geom_histogram(aes(y=..density.., color=t , fill=t))

For the types of charts you show, you'll need to compute the proportions before you plot. The table function can be used to count the occurrences of each t by year. ave with sum, grouped by y, then computes the annual totals used as the denominators for the proportions. Your first plot is made with geom_area, while the second is a standard line and point plot. The code could look like

library(ggplot2)
y=c(2001,2001,2001,2001,2002,2002,2002,2003,2003,2003,2003,2004,2004)
t=c("a","a","b","c","a","a","b","c","a","a","b","c","b")
df=data.frame(y, t)

# Count number of t's by year 
  df_tab <- as.data.frame(table(df), stringsAsFactors=FALSE)
# convert counts to percents
  df <-  data.frame(df_tab, p=df_tab$Freq/ave(df_tab$Freq, df_tab$y, FUN=sum))
  df$y <- as.numeric(df$y)
# Set plot colors and themes
  plot_colours <- c(a="red3", b = "orange", c = "blue")
  plot_theme <- theme(axis.title = element_text(size = 18 )) +
                 theme(axis.text = element_text(size = 18)) +
                 theme(legend.position="top", legend.text=element_text(size=18))
# make area plot
  sp <- ggplot(data=df, aes(x=y, y= 100*p, fill=t)) + geom_area()
  sp <- sp + scale_fill_manual(values=plot_colours)
  sp <- sp + labs(x="Year", y = "Percentage of Patients")
  sp <- sp + plot_theme
  plot(sp)

# make line plot
  sp <- ggplot(data=df, aes(x=y, y=p, colour=t))
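  # stack the series so the top line sits at 1; ymax is not a geom_line/geom_point aesthetic in current ggplot2
  # linewidth needs ggplot2 >= 3.4.0 (use size=1.05 on older versions)
  sp <- sp + geom_line(position="stack", linewidth=1.05) + geom_point(position="stack", size=4)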
  sp <- sp + scale_colour_manual(values=plot_colours)
  sp <- sp + labs(x="Year", y = "Proportion Receiving Treatment")
  sp <- sp + plot_theme
  plot(sp)

which produces the area plot and the stacked line plot.
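If you want to double-check the proportion step on its own, here is a minimal sketch that gets the same per-year proportions with base R's prop.table instead of ave. The names prop_tab and df_prop are made up for this illustration, and this is an alternative route rather than the code used for the plots above.

```r
# Same sample data as in the question
y <- c(2001,2001,2001,2001,2002,2002,2002,2003,2003,2003,2003,2004,2004)
t <- c("a","a","b","c","a","a","b","c","a","a","b","c","b")
df <- data.frame(y, t)

# Cross-tabulate year by treatment, then divide each row (year) by its total
prop_tab <- prop.table(table(df$y, df$t), margin = 1)

# Reshape to long format: one row per year/treatment pair
df_prop <- as.data.frame(prop_tab, stringsAsFactors = FALSE)
names(df_prop) <- c("y", "t", "p")   # hypothetical column names, chosen to match the answer
df_prop$y <- as.numeric(df_prop$y)

# Sanity check: the proportions within each year should add up to 1
year_totals <- tapply(df_prop$p, df_prop$y, sum)
print(year_totals)
```

Combinations that never occur (for example treatment "c" in 2002) show up with p = 0, which is what geom_area needs in order to draw a continuous band for every treatment.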
