
Sort BoxPlot on Date

How do I sort by Date?

My data is below:

EXPIRE_DATE is a character column, so I made another column (Date) from it using mutate.

I feel like I am close, but how do I sort it in either descending or ascending order?

       EXPIRE_DATE     mean        sd           Date
                 (chr)    (dbl)     (dbl)          (chr)
1             04/30/17 56.75132 103.75048     April 2017
2             08/30/17 30.36706  46.12009    August 2017
3             08/31/17 42.84366  67.79964    August 2017
4             12/30/17 26.88593  23.60440  December 2017
5             12/31/17 38.67540  58.72461  December 2017
6             02/28/18 42.50570  63.91448  February 2018
7             01/30/18 28.60205  44.85719   January 2018
8             01/31/18 70.80121 134.13060   January 2018
9             07/31/17 45.45389  77.15242      July 2017
10            06/30/17 47.73592  81.88312      June 2017
11            05/30/17 46.38233  53.73065       May 2017
12            05/31/17 52.25520  88.89367       May 2017
13            11/30/17 39.27158  66.40248  November 2017
14            10/31/17 40.43197  71.51545   October 2017
15            09/30/17 43.12762  79.27168 September 2017

The code that produces it is below:

list_mean_sd <- EXPIRING %>% 
                group_by(EXPIRE_DATE) %>% 
                summarize( mean = mean(TOTAL), sd = sd(TOTAL)  ) %>%
                mutate( Date = format(as.Date(EXPIRE_DATE, "%m/%d/%y"), format="%B %Y") )

My ultimate goal is to create a box plot with the dates in sorted order so it doesn't look weird.

boxplot(mean ~ Date, data = list_mean_sd, outline = FALSE) 

This is what I am getting:

[image: the resulting boxplot]

dput(head(EXPIRING, 15))
structure(list(KEY = c(9495, 9541, 9638, 9717, 9743, 
9921, 10048, 10053, 10061, 10067, 10254, 10343, 24825, 25016, 
25162), TOTAL = c(20, 240, 91.04, 20, 140, 100, 
301.2, 40, 540, 469.82, 40, 140, 133.09, 1700, 20), EXPIRE_DATE = c("11/30/17", 
"01/31/18", "01/31/18", "12/31/17", "12/31/17", "01/31/18", "04/30/17", 
"07/31/17", "01/31/18", "01/31/18", "01/31/18", "01/31/18", "01/31/18", 
"01/31/18", "06/30/17")), .Names = c("KEY", "TOTAL", 
"EXPIRE_DATE"), row.names = c(NA, 15L), class = "data.frame")

Added:

dput(head(list_mean_sd, 30))
structure(list(EXPIRE_DATE = c("01/30/18", "01/31/18", 
"02/28/18", "04/30/17", "05/30/17", "05/31/17", "06/30/17", "07/31/17", 
"08/30/17", "08/31/17", "09/30/17", "10/31/17", "11/30/17", "12/30/17", 
"12/31/17"), mean = c(28.6020454545455, 70.8012116673021, 42.5057014558283, 
56.751320667367, 46.3823270440252, 52.2552028540308, 47.7359164733179, 
45.4538902012763, 30.3670622064929, 42.843660721111, 43.1276177589063, 
40.4319721861389, 39.2715832825871, 26.8859251197214, 38.6753964550534
), sd = c(44.857189842357, 134.130597512432, 63.9144788499397, 
103.750483732426, 53.7306532607393, 88.8936749200348, 81.8831227378872, 
77.1524193002944, 46.1200886362958, 67.7996403857795, 79.2716764935199, 
71.5154547562237, 66.4024797158997, 23.6044043594643, 58.7246098554578
), Date = c("January 2018", "January 2018", "February 2018", 
"April 2017", "May 2017", "May 2017", "June 2017", "July 2017", 
"August 2017", "August 2017", "September 2017", "October 2017", 
"November 2017", "December 2017", "December 2017")), .Names = c("EXPIRE_DATE", 
"mean", "sd", "Date"), class = c("tbl_df", "data.frame"), row.names = c(NA, 
-15L))

Updated to match the discussion.

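boxplot draws the groups in the order of the factor levels, and a character column is converted to a factor whose levels are sorted alphabetically. You can force the order you want by converting Date to a factor yourself, with the levels listed in chronological order.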

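# Row positions that put the parsed expiry dates in chronological order
DateOrder = order(as.Date(list_mean_sd$EXPIRE_DATE, "%m/%d/%y"))
# Month labels in that order; unique() keeps the first occurrence of each
list_mean_sd$Date = factor(list_mean_sd$Date, 
    levels = unique(list_mean_sd$Date[DateOrder]))
# cex.axis shrinks the axis labels so more of them fit
boxplot(mean ~ Date, data = list_mean_sd, cex.axis=0.65)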

[image: ordered boxplot]
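The question also asks about descending order. As a minimal sketch of the same factor-levels idea, the snippet below builds a small stand-in for list_mean_sd from four rows of the question's dput output, sets the levels chronologically, and then reverses them. The names parsed, asc_levels and DateDesc are made up for this illustration and do not appear in the original code.

```r
# Small stand-in for list_mean_sd (four rows taken from the question's data)
list_mean_sd <- data.frame(
  EXPIRE_DATE = c("01/31/18", "04/30/17", "12/31/17", "06/30/17"),
  mean = c(70.80121, 56.75132, 38.67540, 47.73592),
  stringsAsFactors = FALSE
)

# Parse the character dates once and derive the month label from them
parsed <- as.Date(list_mean_sd$EXPIRE_DATE, "%m/%d/%y")
list_mean_sd$Date <- format(parsed, "%B %Y")

# Ascending: levels follow the chronological order of the parsed dates
asc_levels <- unique(list_mean_sd$Date[order(parsed)])
list_mean_sd$Date <- factor(list_mean_sd$Date, levels = asc_levels)

# Descending: same labels, levels reversed (hypothetical extra column)
list_mean_sd$DateDesc <- factor(list_mean_sd$Date, levels = rev(asc_levels))

# Inspect the resulting level order
levels(list_mean_sd$Date)
levels(list_mean_sd$DateDesc)
```

Using mean ~ DateDesc instead of mean ~ Date in the boxplot call should then draw the most recent month first. The month names produced by "%B" depend on the session locale, so the labels may differ on a non-English system, but the ordering logic is the same.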
