Cumulative count within a group using dplyr

Question

I am trying to create a column that contains a cumulative count of the values in another column.

My data:

df <- data.frame(brand = c("A","B","C","A","A","B","A","A","B","C"))

And this is my expected output:

    |brand |    Count|
    |:-----|--------:|
    |A     |        1|
    |B     |        1|
    |C     |        1|
    |A     |        2|
    |A     |        3|
    |B     |        2|
    |A     |        4|
    |A     |        5|
    |B     |        3|
    |C     |        2|

I have tried cumsum, but it doesn't accept strings or factors:

df %>%
  group_by(brand) %>%
  mutate(Count = cumsum(brand))

Edit: For bonus points, it would be great if the solution could also be used on database tables (SQL Server).

Answer 1

We can create the column with rowid (from data.table) applied to 'brand':

library(dplyr)
library(data.table)
df %>%
    mutate(Count = rowid(brand))

Or use row_number after grouping by 'brand':

df %>%
    group_by(brand) %>%
    mutate(Count = row_number())

Or using data.table:

library(data.table)
setDT(df)[, Count := rowid(brand)]
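
For comparison, the same per-group running count can be sketched in base R, without dplyr or data.table. This is an illustrative alternative rather than part of the original answer; it assumes the df from the question and uses ave() to apply seq_along within each brand.

```r
# Sample data from the question
df <- data.frame(brand = c("A","B","C","A","A","B","A","A","B","C"))

# ave() splits the row positions by brand and applies seq_along() to each
# group, so every row receives its running position within its own brand
df$Count <- ave(seq_along(df$brand), df$brand, FUN = seq_along)

print(df)
```

The row order of df is left unchanged, so Count lines up with the expected output shown in the question.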

Answer 1
1 2019-08-20 16:00:31
