
Count number of observations per day, month and year in R

I have a data frame in the following form (it's too big to post here in its entirety):

      listing_id    date    city    type    host_id availability
1   703451  25/03/2013  amsterdam   Entire home/apt 3542621 245
2   703451  20/04/2013  amsterdam   Entire home/apt 3542621 245
3   703451  28/05/2013  amsterdam   Entire home/apt 3542621 245
4   703451  15/07/2013  amsterdam   Entire home/apt 3542621 245
5   703451  30/07/2013  amsterdam   Entire home/apt 3542621 245
6   703451  19/08/2013  amsterdam   Entire home/apt 3542621 245

and so on...

I would like three new data frames: one counting the number of observations per year (2013, 2012, 2011 and so on), another per month (07/2013, 06/2013 and so on), and another per day (28/05/2013, 29/05/2013 and so on). I just want to count how many occurrences there are per unit of time.

How would I do that?

Using data.table, this is pretty straightforward:

library(data.table)
dt <- fread("listing_id    date    city    type    host_id availability
703451  25/03/2013  amsterdam   Entire_home/apt 3542621 245
703451  20/04/2013  amsterdam   Entire_home/apt 3542621 245
703451  28/05/2013  amsterdam   Entire_home/apt 3542621 245
703451  15/07/2013  amsterdam   Entire_home/apt 3542621 245
703451  30/07/2013  amsterdam   Entire_home/apt 3542621 245
703451  19/08/2013  amsterdam   Entire_home/apt 3542621 245")
dt$date <- as.Date(dt$date, "%d/%m/%Y")

dt[, .N, by=year(date)] 
#    year N
# 1: 2013 6

dt[, .N, by=.(year(date), month(date))] 
#    year month N
# 1: 2013     3 1
# 2: 2013     4 1
# 3: 2013     5 1
# 4: 2013     7 2
# 5: 2013     8 1

dt[, .N, by=date] # or: dt[, .N, by=.(year(date), month(date), mday(date))]
#          date N
# 1: 2013-03-25 1
# 2: 2013-04-20 1
# 3: 2013-05-28 1
# 4: 2013-07-15 1
# 5: 2013-07-30 1
# 6: 2013-08-19 1
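
The question asks for labels such as 07/2013 and 28/05/2013. As a minimal sketch (not part of the original answer), the same .N grouping can be keyed on a formatted string instead of on year() and month(). The column names month, day and year in the results are chosen here for illustration, and the table is rebuilt from the sample dates so the block runs on its own.

```r
library(data.table)

# Sample rows from the question; only the date column matters for counting.
dt <- data.table(
  listing_id = 703451L,
  date = as.Date(c("25/03/2013", "20/04/2013", "28/05/2013",
                   "15/07/2013", "30/07/2013", "19/08/2013"),
                 format = "%d/%m/%Y")
)

# One row per calendar month, labelled mm/yyyy as in the question.
per_month <- dt[, .N, by = .(month = format(date, "%m/%Y"))]

# One row per day, labelled dd/mm/yyyy.
per_day <- dt[, .N, by = .(day = format(date, "%d/%m/%Y"))]

# One row per year.
per_year <- dt[, .N, by = .(year = format(date, "%Y"))]
```

Each result is a separate data.table with one row per period and the count in column N. The labels are character strings, so they sort alphabetically rather than chronologically unless the grouping is done on the Date column first.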

We can convert the 'date' column to Date class, extract the year using year from library(lubridate), and get the month-year using as.yearmon from library(zoo). We place 'dates', 'yr' and 'monyr' in a list, loop through it (lapply), and create a count-of-occurrences column in the original dataset ('df1') using ave. It is better to keep the datasets in the list. However, if you insist, we can populate the global environment with multiple objects using list2env.

library(zoo)
library(lubridate)
dates <- as.Date(df1$date, '%d/%m/%Y')
yr <- year(dates)
monyr <- as.yearmon(dates)
lst <- lapply(list(dates, yr, monyr), function(x) 
       transform(df1, Count=ave(seq_along(x), x, FUN= length)))
names(lst) <- paste0('newdf', seq_along(lst))
list2env(lst, envir=.GlobalEnv)
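
To see what the ave step produces, here is a small self-contained sketch. It assumes a df1 rebuilt from the question's sample rows, and it swaps lubridate and zoo for base R format() purely to avoid package dependencies; the list names by_day, by_year and by_month are made up for this example. The results stay in the list rather than being pushed into the global environment.

```r
# Hypothetical df1 built from the sample rows in the question.
df1 <- data.frame(
  listing_id = 703451,
  date = c("25/03/2013", "20/04/2013", "28/05/2013",
           "15/07/2013", "30/07/2013", "19/08/2013"),
  stringsAsFactors = FALSE
)

dates <- as.Date(df1$date, "%d/%m/%Y")

# Grouping keys as character vectors: day, year, and month-year.
keys <- list(
  by_day   = format(dates, "%Y-%m-%d"),
  by_year  = format(dates, "%Y"),
  by_month = format(dates, "%Y-%m")
)

# ave() returns, for every row, the size of the group that row belongs to,
# so each data frame keeps all original rows and gains a Count column.
lst <- lapply(keys, function(k)
  transform(df1, Count = ave(seq_along(k), k, FUN = length)))

lst$by_month$Count
```

Note that this approach annotates every row with its group size instead of collapsing to one row per period; unique(lst$by_month[, c("date", "Count")]) or a table() call would be needed to get a summary table.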

Get your index into POSIXct format, then:

counts <- data.frame(table(as.Date(index(my_data_frame))))

Change as.Date as necessary, depending on the time unit you want to count by.
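
One way to read that advice, sketched under assumptions: the timestamps below live in a plain POSIXct vector called stamps (a hypothetical name and hypothetical values; the answer's index(my_data_frame) presumably refers to a zoo or xts index), and count_by is a helper invented for this example. Swapping as.Date for format() with a different pattern changes the unit being counted.

```r
# Hypothetical timestamps, parsed as POSIXct in UTC.
stamps <- as.POSIXct(c("2013-03-25 10:00:00", "2013-07-15 09:30:00",
                       "2013-07-30 18:45:00", "2013-08-19 12:00:00"),
                     tz = "UTC")

# table() tallies identical labels; as.data.frame() turns the tally into
# two columns, which are renamed here for readability.
count_by <- function(x, fmt) {
  out <- as.data.frame(table(format(x, fmt)), stringsAsFactors = FALSE)
  names(out) <- c("period", "n")
  out
}

per_day   <- count_by(stamps, "%Y-%m-%d")
per_month <- count_by(stamps, "%Y-%m")
per_year  <- count_by(stamps, "%Y")
```

Because table() only reports labels that occur, periods with no observations are absent from the result rather than listed with a zero.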
