I have the data such as below
| Year | Qrtrs |Type |Freq |
| 1950 | JAS | BS | 1 |
| 1950 | OND | BS | 1 |
| 1950 | OND | BS | 1 |
| 1950 | OND | BY | 2 |
I would like to produce like this
+------+------+-----+----+----+
| Year | Qrts | All | BS | BY |
+------+------+-----+----+----+
| 1950 | JAS | 1 | 1 | 0 |
| 1950 | OND | 4 | 2 | 2 |
+------+------+-----+----+----+
I tried using tapply, but didn't get what I wanted.
Thank you!
Your sample data, copyable and usable:
dat <- read.table(header=TRUE, stringsAsFactors=FALSE, text='
Year Qrtrs Month Type freq
1 1950 JAS 9 BS 1
2 1950 OND 10 BS 1
3 1950 OND 11 BS 1
4 1950 OND 12 BY 1 ')
Using dplyr
:
library(dplyr)
dat %>%
group_by(Year, Qrtrs, Type) %>%
summarize(count = sum(freq)) %>%
tidyr::spread(Type, count, fill=0) %>%
mutate(All = sum(BS:BY))
# # A tibble: 2 x 5
# # Groups: Year, Qrtrs [2]
# Year Qrtrs BS BY All
# <int> <chr> <dbl> <dbl> <int>
# 1 1950 JAS 1 0 1
# 2 1950 OND 2 1 3
Using data.table
:
library(data.table)
DT <- as.data.table(dat)
dcast(DT[,.(count = sum(freq)), by=c("Year", "Qrtrs", "Type")],
Year + Qrtrs ~ Type, value.var = "count", fun=sum)[
, All := BS+BY,][]
# Year Qrtrs BS BY All
# 1: 1950 JAS 1 0 1
# 2: 1950 OND 2 1 3
I think your expected output above is incorrect, its counts for BS
and BY
are inconsistent.
There's a base-R solution somewhere starting with
aggregate(freq ~ Year + Qrtrs + Type , data=dat, FUN=sum)
but I've run out of time ...
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.