简体   繁体   中英

R return day of week position in month (first Tuesday, second month, fourth Friday, etc)

Here is my dataframe

structure(list(Date = structure(c(19016, 19017, 19018, 19019, 
19020, 19021, 19022, 19023, 19024, 19025, 19026, 19027, 19028, 
19029, 19030, 19031, 19032, 19033, 19034, 19035, 19036, 19037, 
19038, 19039, 19040, 19041, 19042, 19043, 19044, 19045, 19046, 
19047, 19048, 19049, 19050, 19051, 19052, 19053, 19054, 19055, 
19056, 19057, 19058, 19059, 19060, 19061, 19062, 19063, 19064, 
19065, 19066, 19067, 19068, 19069, 19070, 19071, 19072, 19073, 
19074, 19075, 19076, 19077, 19078, 19079, 19080, 19081, 19082
), class = "Date"), Sales = c(17230.95, 15356.85, 21123.41, 19811.86, 
24899.7, 18670.08, 14129.03, 18454.97, 20942.28, 23324.83, 20284.81, 
24456.1, 19884.73, 15473.03, 19809.02, 16871.57, 0, 25492.58, 
27801.39, 21106.03, 16089.1, 20173.34, 18964.01, 37685.96, 19639.33, 
22188.08, 18944.28, 18166.25, 21962.05, 22927.46, 22181.48, 22115.81, 
24914.86, 22782.64, 19192.26, 25431.16, 24078.59, 24288, 22021.68, 
28372.21, 20201.48, 20209.54, 23325.9, 21977.64, 23368.25, 21879.24, 
0, 21430.3, 18440.64, 24777.07, 22030.41, 25575.51, 23089.05, 
23139.19, 28243.81, 21652.69, 21952.06, 18940.91, 24881.95, 19458.29, 
25724.01, 19723.64, 20525.86, 22713.51, 0, 23620.43, 23755.2), 
    Day = structure(c(2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 
    5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 
    6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 
    7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
    1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L), .Label = c("Sunday", 
    "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"
    ), class = c("ordered", "factor")), Month = structure(c(18993, 
    18993, 18993, 18993, 18993, 18993, 18993, 18993, 19024, 19024, 
    19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024, 
    19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024, 
    19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024, 19052, 
    19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052, 
    19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052, 
    19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052, 
    19052, 19052, 19052), class = "Date")), row.names = c(NA, 
-67L), class = c("tbl_df", "tbl", "data.frame"))

And a screenshot so you can see it at a glance.

在此处输入图像描述

I originally had the Date and Sales columns. By using these lines of code I was able to get the Day and Month columns.

time_series_df_sample$Day <- wday(time_series_df_sample$Date, label=TRUE, abbr=FALSE)

time_series_df_sample <- time_series_df_sample %>%
  mutate(Month = floor_date(Date, 'month'))

I'm trying to create another column called Day of Week Position in Month , or some other name. For Date == 2022-02-01 , for example, Day of Week Position in Month should say 1 because this is the first Tuesday in the month and where Date == 2022-02-08 it should say 2 because this is the second Tuesday in the month. So this new column will always have values 1-5 .

The tally() function might not be the right approach, as least as I'm using it. It just says 4 or 5 all the way through, except for the first month where I don't have as much data. And the min() function isn't doing the job - it just returns the day of week value.

time_series_and_other_stories_fl %>%
  group_by(Month, Day) %>%
  tally()

time_series_and_other_stories_fl %>%
  group_by(Month, Day) %>%
  mutate(min = min(Day))
out <- time_series_df_sample%>%
  group_by(Month)%>%
  arrange(Date)%>%
  mutate(`Day of Week Position in Month`=ave(Day==Day,Day,FUN=cumsum))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM