Plotting year-month accumulative data with missing values

I have a data frame with date and count columns. I need to create a bar chart with the x axis displaying the year and month and the y axis displaying the sum of the rows that fall into each period.

library(ggplot2)

data <- data.frame(Date = as.Date(c("01/01/2014","02/01/2014","03/03/2014","07/08/2014","08/08/2014","09/08/2014","10/10/2014"), "%d/%m/%Y"))

x <- as.Date(data$Date)
y <- sample(10, length(x))
tmp <- data.frame(dt = format(x, "%Y-%m"), cnt = y, stringsAsFactors = FALSE)

# # Pre-Allocate the table
# minYr = min(as.numeric(strftime(data$Date, "%Y")))
# maxYr = max(as.numeric(strftime(data$Date, "%Y")))
# # The table will contain the number of months in a year.
# n <- (maxYr - minYr + 1) * 12
# dt <- character(n)
# cnt <- numeric(n)
# for (i in minYr:maxYr) {
#     for (j in c("01","02","03","04","05","06","07","08","09","10","11","12")) {
#         lev <- (i - minYr) * 12 + as.numeric(j)
#         dt[lev] <- paste0(as.character(i),"-",j,"-01")
#         cnt[lev] <- 0
#     }
# }
# dt = as.Date(dt, format="%Y-%m-%d")
# tmp <- data.frame(dt = format(dt, "%Y-%m"), cnt, stringsAsFactors = FALSE)
# tmp <- rbind(tmp, data.frame(dt = format(x, "%Y-%m"), cnt = y, stringsAsFactors = FALSE))
# 

tmp2 <- aggregate(cnt ~ dt, tmp, sum)

g <- ggplot(tmp2, (aes(x = dt, y = cnt)))
g + geom_bar(stat="identity")

The code above plots the data, but months with no transactions do not appear. I want the chart to show missing months with a value of zero.

The commented-out chunk of code preallocates every month in the period with zeroes and gives me the desired result, but I was wondering whether I can avoid it by tapping into built-in ggplot functionality.

You can use scale_x_date to achieve this, but you need to convert the x variable to the Date class.

library(scales)
g <- ggplot(tmp2, (aes(x = as.Date(paste0(dt, '-01')), y = cnt)))
g + geom_bar(stat="identity") + 
  scale_x_date(name='dt', breaks = date_breaks("month"), labels = date_format('%Y-%m'))
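
As a small illustration of why the conversion matters, here is a sketch in base R. The vector `dt` and the name `month_start` are made up for the example; only the `as.Date(paste0(dt, '-01'))` idiom comes from the code above. Once the "%Y-%m" strings become real dates, the distance between two bars reflects the time between them, so a skipped month leaves a visible gap on a continuous date axis.

```r
# Hypothetical year-month strings in the same "%Y-%m" format as tmp2$dt
dt <- c("2014-01", "2014-03", "2014-08")

# Append a day so as.Date() can parse them with its default "%Y-%m-%d" format
month_start <- as.Date(paste0(dt, "-01"))

# The result is a Date vector, which is what scale_x_date expects
class(month_start)

# Gaps between consecutive values, in days; unequal gaps mean missing months
diff(month_start)
```

As plain character strings, the same values would be treated as discrete categories placed side by side, which is why the missing months disappeared in the original plot.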

EDIT: To get the desired bar width, add the width argument to geom_bar (on a Date axis the width is measured in days):

g + geom_bar(stat="identity", width=28) + 
  scale_x_date(name='dt', breaks = date_breaks("month"), labels = date_format('%Y-%m'))

You can also add the limits argument to scale_x_date to make the axis start and end at the desired dates:

g + geom_bar(stat="identity", width=28) + 
  scale_x_date(name='dt', 
               breaks = date_breaks("month"), 
               labels = date_format('%Y-%m'), 
               limits=as.Date(c('2014-01-01', '2014-12-01')))
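
If explicit zero rows are still wanted in the data (for example, to label the empty months or to export the table), the preallocation loop from the question can probably be shortened. The following is only a sketch in base R, not part of the original answer: the sample values in `tmp2` and the names `month`, `all_months`, and `filled` are assumptions made for illustration.

```r
# Made-up data in the same shape as tmp2 from the question
tmp2 <- data.frame(dt  = c("2014-01", "2014-03", "2014-08", "2014-10"),
                   cnt = c(7, 3, 12, 5),
                   stringsAsFactors = FALSE)

# First day of each month as a Date
tmp2$month <- as.Date(paste0(tmp2$dt, "-01"))

# Every month between the first and last observed month
all_months <- data.frame(month = seq(min(tmp2$month), max(tmp2$month), by = "month"))

# Left join: months without data get NA, which is then replaced by zero
filled <- merge(all_months, tmp2[, c("month", "cnt")], by = "month", all.x = TRUE)
filled$cnt[is.na(filled$cnt)] <- 0

filled
```

`filled` could then be passed to the same ggplot call with `x = month`, and `range(filled$month)` could serve as the `limits` instead of hard-coded dates.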
