简体   繁体   中英

Aggregate data.table on conditions but add columns

I have a data.table in R that looks like the following:

> head(t)
   ITEM COUNTRY YEAR MONTH QUANTITY
1:    1       1 2014     1        2
2:    1       2 2014     1      133
3:    2       3 2014     1        2
4:    3       4 2014     1       20
5:    3       2 2014     1      142
6:    4       4 2014     1        5

There is for the same combination of ITEM-COUNTRY also data for other years/months; I have data from Jan 2014-Dec 2015 so 24 months. It might be that not all items/countries have data for all points, ie some have only for 19 months etc.

I know how to aggregate the quantity over the entire period:

agg = t[,list("TOT Q"=sum(QUANTITY)),by='ITEM,COUNTRY']

However, now I want something like this:

ITEM - COUNTRY - Q 1-2014 - Q 2-2014 - .... - Q 12-2015

In case there is no data for a specific item/country, the Q should be zero, and otherwise it should be the data in the corresponding row to the year and month. Ie I want to aggregate on item and country, however, keep all the quantities and put them in different columns.

Edit:

Tried the comment of Jaap, which dit not work in the first place. But I solved it. I was getting 0 and 1 as output and a warning of missing an aggregate function. Searching for this warning I found stackoverflow.com/questions/30463591/… which solved my problem. As I said, I have more variables, so for some item/country/month/year combinations I had duplicates, as they differed on other variables, so i needed an aggregate function (sum).

What I use now:

dcast(t, ITEM + COUNTRY ~ MONTH + YEAR, value.var="QUANTITY", fun.aggregate=sum)

Tried the comment of Jaap, which dit not work in the first place. But I solved it. I was getting 0 and 1 as output and a warning of missing an aggregate function. Searching for this warning I found stackoverflow.com/questions/30463591/… which solved my problem. As I said, I have more variables, so for some item/country/month/year combinations I had duplicates, as they differed on other variables, so i needed an aggregate function (sum).

What I use now:

dcast(t, ITEM + COUNTRY ~ MONTH + YEAR, value.var="QUANTITY", fun.aggregate=sum)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM