Calculating the cumulative sum for stacked bars in ggplot2

Assume an R data frame (testData) that contains three columns (named DATE, FREQ_RECORDS, and CRITERION) and paired data (column CRITERION contains the values "positive" or "negative").

testData = structure(list(DATE = structure(c(18140, 18140, 18170, 18170, 18201, 18201), class = "Date"), FREQ_RECORDS = c(57L, 120L, 302L, 64L, 40L, 20L), CRITERION = structure(c(1L, 2L, 1L, 2L, 1L, 2L), .Label = c("positive", "negative"), class = "factor")), row.names = c(395L, 756L, 396L, 757L, 397L, 758L), class = "data.frame")

I would like to visualize the data via ggplot2 as dodged bars that are cumulative within (but not across) the pairing factor (i.e., the final bars should have a height of 57+302+40=399 for "positive" and 120+64+20=204 for "negative").

I incorrectly believed that the following code would produce such a plot:

ggplot(data=testData, aes(x=DATE, y=cumsum(testData[,"FREQ_RECORDS"]), fill=CRITERION), width=1) + 
    geom_bar(stat="identity", position="dodge", alpha=0.5) + 
    theme_minimal()


What is incorrect about the above code, and how would I need to correct it to obtain the desired result? Note: I believe the issue lies in how the cumulative sum is calculated (i.e., cumsum(testData[,"FREQ_RECORDS"])), but I am uncertain about the details.

When you call cumsum(testData[,"FREQ_RECORDS"]), the sum runs over the whole FREQ_RECORDS column, regardless of CRITERION. The grouping by fill=CRITERION only separates the already-computed x and y values into groups for plotting; it does not recompute the cumulative sum within each group.
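To make the difference concrete, here is a minimal sketch using the values from the question. It uses base R's ave() for the per-group sum purely for illustration; the variable names global_cumsum and grouped_cumsum are made up for this example.

```r
# Values copied from the question's testData (rows alternate positive/negative)
FREQ_RECORDS <- c(57L, 120L, 302L, 64L, 40L, 20L)
CRITERION <- factor(rep(c("positive", "negative"), times = 3),
                    levels = c("positive", "negative"))

# What the question's code computes: one running total over all six rows
global_cumsum <- cumsum(FREQ_RECORDS)
print(global_cumsum)
# 57 177 479 543 583 603

# What is wanted: a running total restarted within each CRITERION level
grouped_cumsum <- ave(FREQ_RECORDS, CRITERION, FUN = cumsum)
print(grouped_cumsum)
# 57 120 359 184 399 204
```

The last "positive" value is 399 and the last "negative" value is 204, which matches the bar heights asked for in the question.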

So maybe try the following. Unfortunately, I don't think you can compute it on the fly inside the plot call:

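library(dplyr)

# Cumulative sum within each CRITERION level (rows are assumed to be ordered by DATE)
df <- testData %>%
    group_by(CRITERION) %>%
    mutate(CUMFREQ = cumsum(FREQ_RECORDS))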

ggplot(data=df, aes(x=DATE, y=CUMFREQ, fill=CRITERION), width=1) + 
    geom_bar(stat="identity", position="dodge", alpha=0.5) + 
    theme_minimal()
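As a possible alternative to precomputing the column, the grouped sum can apparently be written directly inside aes(), since expressions there are evaluated against the data frame's columns. The sketch below assumes base R's ave() is acceptable for the grouping and that rows are already ordered by DATE within each group; the data frame is rebuilt here only so the example is self-contained, and the object name p is made up for illustration.

```r
library(ggplot2)

# Same values as the question's testData, rebuilt for a self-contained example
testData <- data.frame(
  DATE = as.Date(c(18140, 18140, 18170, 18170, 18201, 18201),
                 origin = "1970-01-01"),
  FREQ_RECORDS = c(57L, 120L, 302L, 64L, 40L, 20L),
  CRITERION = factor(rep(c("positive", "negative"), times = 3),
                     levels = c("positive", "negative"))
)

# ave() restarts cumsum within each CRITERION level, so no helper column is needed
p <- ggplot(testData,
            aes(x = DATE,
                y = ave(FREQ_RECORDS, CRITERION, FUN = cumsum),
                fill = CRITERION)) +
  geom_bar(stat = "identity", position = "dodge", alpha = 0.5) +
  ylab("Cumulative FREQ_RECORDS") +
  theme_minimal()

print(p)
```

The precomputed CUMFREQ column is probably still the clearer choice when the cumulative values are needed elsewhere, for example in labels or summaries.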

