I currently have a dataset called "DT" that looks like:
Name
A11
B16
B16
B16
B16
B98
B98
M88
K99
K99
K99
This is a subset of the real dataset, which has around 5 million rows. I want to find the average number of occurrences per name. That is, if I could create a new dataset that looks like:
Count
1
4
2
1
3
then it would be trivial to take the column sum and divide by the length. I am working with the data.table package and have been experimenting with the .N feature, but haven't come close. The best I've done is:
DT[,`:=` .N, by = Name]
I feel like I am missing just a little something. Can anyone point me in the right direction? Thanks!
You can count the rows in each group with .N:
DT[,.N,by=Name]
#> DT
# Name N
# 1: A11 1
# 2: B16 4
# 3: B98 2
# 4: M88 1
# 5: K99 3
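To get from those per-name counts to the average the question asks for, one option is to chain a second step onto the grouped result. The following is a minimal sketch, assuming the sample data from the question; the variable names counts, avg_chained and avg_direct are made up for illustration.

```r
library(data.table)

# Rebuild the sample data from the question
DT <- data.table(Name = c("A11", "B16", "B16", "B16", "B16",
                          "B98", "B98", "M88", "K99", "K99", "K99"))

# One row per Name with its count in column N, then average those counts
counts <- DT[, .N, by = Name]
avg_chained <- counts[, mean(N)]

# Equivalent shortcut: total rows divided by the number of distinct names
avg_direct <- nrow(DT) / uniqueN(DT$Name)

print(avg_chained)  # 2.2
print(avg_direct)   # 2.2
```

The shortcut works because the counts always sum to the number of rows, so it may be the cheaper choice on 5 million rows since no intermediate table is built. Note also that `:=` adds a column by reference and keeps every original row (for example DT[, Count := .N, by = Name]), which is why it does not collapse the data to one row per name.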