I want to create some summary statistics of a data.table
, aggregating by month and year of the date column. Here's what I start with:
> head(monthly)
betnr persnr idnum frau gebjahr te_med month tentgelt status
1: 50536344 62181514 40442 1 1960 76.52142 1993-12-01 0.5777598 fire
2: 50536344 62744472 40442 0 1963 76.52142 1993-08-01 0.5777598 fire
3: 50536344 63071749 40442 0 1947 76.52142 1993-12-01 0.5777598 fire
4: 50536344 63385685 40442 1 1946 76.52142 1993-07-01 0.5777598 fire
5: 50536344 63918388 40442 0 1952 76.52142 1993-12-01 0.5777598 fire
6: 50536344 61961225 40442 0 1980 71.90094 1994-12-01 23.1001672 fire
To create my statistics, I then run
statistics2 <- monthly[, list(NOBS = .N, MWAGE=mean(tentgelt)), by=list(status, month=format(month, '%m-%Y'))]
This creates the correct statistics, but the month
column now contains a string. I try to change the type to date by fixing the days to be 01
always:
x <-apply(statistics2, 1, function(x) paste('01-',x['month'], sep=''))
statistics2[, month:= as.Date(x, '%d-%m-%Y')]
Which gives me the desired output:
> head(statistics2)
status month NOBS MWAGE
1: hire 1993-01-01 37914 0.5820961
2: normal 1993-01-01 790 0.5787695
3: hire 1994-01-01 6471 15.1267445
4: normal 1994-01-01 23931 22.8101928
5: hire 1993-02-01 435 0.5946736
6: normal 1993-02-01 38661 0.5820226
However, my whole approach feels a bit cloddy. Is there a cleaner way of getting the desired output?
Yes, you can make it simpler and do everything in one go. Just do the whole conversion to Date
class in the aggregation process
statistics2 <- monthly[, list(NOBS = .N,
MWAGE = mean(tentgelt)),
by = list(status, month = as.Date(format(month, '%Y-%m-01')))]
statistics2
# status month NOBS MWAGE
# 1: fire 1993-12-01 3 0.5777598
# 2: fire 1993-08-01 1 0.5777598
# 3: fire 1993-07-01 1 0.5777598
# 4: fire 1994-12-01 1 23.1001672
Some side notes:
Your apply
method is not how you should have done this with data.table
. You could accomplish your last step simply by doing:
statistics2[, month := as.Date(paste0("01-", month), "%d-%m-%Y")]
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.