R ggplot: zeroes between bars and no comma as the big-number separator in count labels

I am trying to count instances by month, plot them as a bar chart, and add the monthly counts as labels on top of the bars. Below is a reproducible example of the problem:

library(scales)
library(ggplot2)

set.seed(1)

df <- data.frame(DueDate = as.Date(paste("2015", 
sample(1:6, 6000, replace=T), 
sample(1:30, 6000, replace=T), sep = "-")),
stringsAsFactors = F)

ggplot(df, aes(as.Date(cut(DueDate,
  breaks = "month")) )) + 
  geom_bar() +
  geom_text(stat = 'bin', 
            aes(label = ..count..),
            vjust = -1, 
            size = 2) +
  scale_y_continuous(labels = comma) +
 labs(x = "Month", y = "Frequency") + 
  theme_minimal()

The issue is that when I create the plot there are 0s between the bars and the numbers on top of the bars do not have commas as the big number separator.

Corrected a couple of errors that were in my comments above. Sampling from a sequence of real dates lets you count the 31st of each month and avoids the NAs produced by the nonexistent 29th and 30th of February.

set.seed(1)

# Sample from a sequence of valid dates, then keep only the abbreviated month name.
df <- data.frame(DueDate = format(
                   sample(seq(as.Date("2015-01-01"),
                              as.Date("2015-06-30"), by = "1 day"),
                          6000, replace = TRUE),
                   "%b"),
                 stringsAsFactors = FALSE)

# This does all the aggregation in one step.
# Could probably leave them as Dates and use `format` in the `aes` call.
ggplot(df, aes(DueDate)) +
  geom_bar() +
  # stat = "count" tallies rows per discrete x value (ggplot2 >= 2.0.0);
  # after_stat() needs ggplot2 >= 3.3.0.
  geom_text(stat = "count",
            aes(label = formatC(after_stat(count), format = "d", big.mark = ",")),
            vjust = -1,
            size = 2) +
  scale_y_continuous(labels = comma) +
  labs(x = "Month", y = "Frequency") +
  theme_minimal()

Multiplied the sample size by two to show that the `labels = comma` argument to the y-scale was working.
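
The code comment above suggests leaving the values as Dates and deriving the month inside `aes`. A minimal sketch of that idea follows, assuming ggplot2 >= 3.3.0 (for `after_stat`) and a recent scales; the names `df2` and `p` are made up for illustration. It builds the month from `as.POSIXlt()` and `month.abb` rather than `format(..., "%b")` so that the bars appear in calendar order regardless of locale.

```r
library(ggplot2)
library(scales)

set.seed(1)

# Hypothetical data frame: keep DueDate as a real Date column.
df2 <- data.frame(
  DueDate = sample(seq(as.Date("2015-01-01"), as.Date("2015-06-30"), by = "1 day"),
                   6000, replace = TRUE)
)

# Month as a factor with calendar-ordered levels (Jan, Feb, ...);
# as.POSIXlt()$mon is 0-based, hence the + 1.
p <- ggplot(df2, aes(factor(month.abb[as.POSIXlt(DueDate)$mon + 1],
                            levels = month.abb))) +
  geom_bar() +
  # Count rows per month and format the label with a comma separator.
  geom_text(stat = "count",
            aes(label = comma(after_stat(count))),
            vjust = -1,
            size = 2) +
  scale_y_continuous(labels = comma) +
  labs(x = "Month", y = "Frequency") +
  theme_minimal()

p
```

Unused factor levels (Jul to Dec here) are dropped by the default discrete x scale, so only the six sampled months are drawn.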

You can make a new column for the month, and then make the plot. I use the lubridate package to help deal with dates in R.

# Functions to help handle dates
library(lubridate)

# Make a new month column
df$month <- month(df$DueDate, label = TRUE)

# Plot with aes(month)
ggplot(df, aes(month)) + 
  geom_bar() +
  geom_text(stat = 'count', 
            aes(label = after_stat(count)),
            vjust = -1, 
            size = 2) +
  scale_y_continuous(labels = comma) +
  labs(x = "Month", y = "Frequency") + 
  theme_minimal()

There are some NAs in the data, indicated by the last bar in the plot. This is likely due to invalid dates created for February when you generated the data (e.g. there is no February 30).
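
To check where the NAs come from, something like the following sketch may help. It regenerates date strings the same way as the question (month 1 to 6, day 1 to 30) and parses them with an explicit `format`, so unparseable strings become NA instead of raising an error. The names `raw`, `parsed`, and `clean` are illustrative, not from the original post.

```r
set.seed(1)

# Build date strings as in the question: "2015-<month>-<day>".
raw <- paste("2015",
             sample(1:6, 6000, replace = TRUE),
             sample(1:30, 6000, replace = TRUE),
             sep = "-")

# With an explicit format, invalid calendar dates parse to NA.
parsed <- as.Date(raw, format = "%Y-%m-%d")

# How many values failed to parse, and which strings were they?
sum(is.na(parsed))
table(raw[is.na(parsed)])

# Drop the invalid rows before plotting so no NA bar appears.
clean <- data.frame(DueDate = parsed[!is.na(parsed)])
```

Since 2015 is not a leap year, the only strings that can fail here are February 29 and February 30.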
