
How to calculate the total amount depending on the date in R?

I am new to R and am having trouble calculating the bill total for each month. My data frame is as follows:

dat <- data.frame(
  time = factor(c("Breakfast","Breakfast","Breakfast","Breakfast","Breakfast","Breakfast"), levels=c("Breakfast")),
  date = c("2020-01-20","2020-01-21","2020-01-22","2020-02-10","2020-02-11","2020-02-12"),
  total_bill = c(12.7557,14.8,17.23,15.7,16.9,13.2)
)

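My goal is to calculate how much was spent on Breakfast in each month. There are two months here, so I would like to get separate totals for January and February.

Any help with this would be greatly appreciated. Thanks!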

Does this answer your question?

sums <- tapply(dat$total_bill, format(as.Date(dat$date), "%B"), sum)
February  January 
 45.8000  44.7857 

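sums is a named numeric vector (strictly, a one-dimensional array), not a list. Its names are sorted alphabetically, so February comes first. To access the February total, for example, you can do: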

sums[1]
February 
    45.8
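
Positional indexing such as sums[1] relies on the alphabetical order of the month names. As a minimal sketch (not part of the original answer), the version below assumes a year-month key built with "%Y-%m", which also sidesteps locale-dependent month names, and looks the total up by name. The names sums_ym and feb_total are made up for illustration.

```r
# Same data as in the question (the time column is omitted for brevity)
dat <- data.frame(
  date = c("2020-01-20", "2020-01-21", "2020-01-22",
           "2020-02-10", "2020-02-11", "2020-02-12"),
  total_bill = c(12.7557, 14.8, 17.23, 15.7, 16.9, 13.2),
  stringsAsFactors = FALSE
)

# Assumption: group on a "YYYY-MM" key instead of the month name
sums_ym <- tapply(dat$total_bill, format(as.Date(dat$date), "%Y-%m"), sum)

# Look up February 2020 by name rather than by position
feb_total <- as.numeric(sums_ym["2020-02"])
feb_total
```

Looking up by name keeps working if more months are added later, whereas a positional index would silently point at a different month.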

Alternatively, you can convert sums to a data frame and access each monthly total by month name:

sums <-  as.data.frame.list(tapply(dat$total_bill, format(as.Date(dat$date), "%B"), sum))
sums$February 
    45.8

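Addendum

Another (fun) solution uses a regular expression: you describe the date as a pattern and use sub with the backreference \\1 to recall the two digits between the dashes, which reduces each date to its month part: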

tapply(dat$total_bill, sub("\\d{4}-(\\d{2})-\\d{2}", "\\1", dat$date), sum)
     01      02 
44.7857 45.8000  
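
The pattern above keeps only the month, so the same month from different years would be added together. A small variation, offered as a sketch rather than as the answerer's method, moves the capture group so that it covers the year and the month. The names ym and sums_by_ym are hypothetical.

```r
# Same data as in the question (the time column is omitted for brevity)
dat <- data.frame(
  date = c("2020-01-20", "2020-01-21", "2020-01-22",
           "2020-02-10", "2020-02-11", "2020-02-12"),
  total_bill = c(12.7557, 14.8, 17.23, 15.7, 16.9, 13.2),
  stringsAsFactors = FALSE
)

# Capture "YYYY-MM" in group 1 and drop the trailing "-DD"
ym <- sub("^(\\d{4}-\\d{2})-\\d{2}$", "\\1", dat$date)

# Sum the bills within each year-month key
sums_by_ym <- tapply(dat$total_bill, ym, sum)
sums_by_ym
```

Note that sub leaves a string unchanged when the pattern does not match, so malformed dates would end up as their own groups instead of raising an error.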

We can convert "date" to the Date class, extract the month, use it as a grouping column, and sum "total_bill":

library(dplyr)
dat %>%
    group_by(time, Month = format(as.Date(date), "%B")) %>% 
    summarise(total_bill = sum(total_bill, na.rm = TRUE))
# A tibble: 2 x 3
# Groups:   time [1]
#  time      Month    total_bill
#  <fct>     <chr>         <dbl>
#1 Breakfast February       45.8
#2 Breakfast January        44.8

If needed, we can reshape the result to "wide" format:

library(tidyr)
out <- dat %>%
     group_by(time, Month = format(as.Date(date), "%B")) %>% 
     summarise(total_bill = sum(total_bill, na.rm = TRUE)) %>% 
     pivot_wider(names_from = Month, values_from = total_bill)

out
# A tibble: 1 x 3
# Groups:   time [1] 
#   time      February January
#  <fct>        <dbl>   <dbl>
# 1 Breakfast     45.8    44.8

If we also need to group by "year":

out <- dat %>%
     mutate(date = as.Date(date)) %>%
     group_by(time, Year = format(date, "%Y"), Month = format(date, "%B")) %>% 
     summarise(total_bill = sum(total_bill, na.rm = TRUE)) 
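
For readers who do not have dplyr installed, the same year-and-month grouping can be sketched in base R with aggregate. This is an assumed equivalent, not part of the original answer; it uses numeric month keys ("%m") so the result does not depend on the locale, and the names d and by_month are made up for illustration.

```r
# Same data as in the question
dat <- data.frame(
  time = factor(rep("Breakfast", 6), levels = "Breakfast"),
  date = c("2020-01-20", "2020-01-21", "2020-01-22",
           "2020-02-10", "2020-02-11", "2020-02-12"),
  total_bill = c(12.7557, 14.8, 17.23, 15.7, 16.9, 13.2)
)

# Derive the grouping columns from the parsed dates
d <- as.Date(dat$date)
dat$Year  <- format(d, "%Y")
dat$Month <- format(d, "%m")

# One row per (time, Year, Month) combination with the summed bill
by_month <- aggregate(total_bill ~ time + Year + Month, data = dat, FUN = sum)
by_month
```
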
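
Alternatively, group by the year-month prefix of the date string:

library(dplyr)
d_sum <- dat %>% 
  # substr() is 1-indexed in R; characters 1 to 7 are "YYYY-MM"
  group_by(month = substr(date, 1, 7)) %>%
  summarise(sum = sum(total_bill))

d_sum
# A tibble: 2 x 2
  month     sum
  <chr>   <dbl>
1 2020-01  44.8
2 2020-02  45.8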
