Loop over all observations with a unique value of a variable/column in R
I have a dataset that looks like this:
Chain | Product | Week | Sales |
---|---|---|---|
Chain1 | Prod1 | 1 | 0 |
Chain1 | Prod1 | 2 | 0 |
... | ... | ... | ... |
Chain1 | Prod1 | 51 | 10 |
Chain1 | Prod1 | 52 | 14 |
Chain2 | Prod1 | 1 | 10 |
Chain2 | Prod1 | 2 | 11 |
... | ... | ... | ... |
Chain2 | Prod1 | 51 | 12 |
Chain2 | Prod1 | 52 | 15 |
Chain1 | Prod2 | 1 | 3 |
Chain1 | Prod2 | 2 | 4 |
... | ... | ... | ... |
Chain1 | Prod2 | 51 | 8 |
Chain1 | Prod2 | 52 | 10 |
Chain2 | Prod2 | 1 | 11 |
Chain2 | Prod2 | 2 | 12 |
... | ... | ... | ... |
Chain2 | Prod2 | 51 | 15 |
Chain2 | Prod2 | 52 | 7 |
This means I have weekly sales observations for different products and chains. I would like to create an innovation dummy variable that equals 1 in the week a new product is launched. This is the case for Product 1 in Chain 1 in week 51: sales go from 0 to 10 in that week (assuming sales are 0 from week 2 through week 50). I then want my dummy, I, to be 1:
Chain | Product | Week | Sales | I |
---|---|---|---|---|
Chain1 | Prod1 | 1 | 0 | 0 |
Chain1 | Prod1 | 2 | 0 | 0 |
... | ... | ... | ... | ... |
Chain1 | Prod1 | 51 | 10 | 1 |
Chain1 | Prod1 | 52 | 14 | 0 |
Chain2 | Prod1 | 1 | 10 | 0 |
Chain2 | Prod1 | 2 | 11 | 0 |
... | ... | ... | ... | ... |
Chain2 | Prod1 | 51 | 12 | 0 |
Chain2 | Prod1 | 52 | 15 | 0 |
Chain1 | Prod2 | 1 | 3 | 0 |
Chain1 | Prod2 | 2 | 4 | 0 |
... | ... | ... | ... | ... |
Chain1 | Prod2 | 51 | 8 | 0 |
Chain1 | Prod2 | 52 | 10 | 0 |
Chain2 | Prod2 | 1 | 11 | 0 |
Chain2 | Prod2 | 2 | 12 | 0 |
... | ... | ... | ... | ... |
Chain2 | Prod2 | 51 | 15 | 0 |
Chain2 | Prod2 | 52 | 7 | 0 |
I would guess I should create a loop that runs over the weekly sales observations for each product in each chain and detects when sales start at 0 and then change to some positive value. How should this be done in R?
Thank you.
For each Product in each Chain, we can find the first row where Sales is greater than 0 and set I to 1 in that row (this assumes the rows are sorted by Week within each group). If your data is called df:
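library(dplyr)

# Within each Chain/Product group, flag the first row with positive Sales.
# nomatch = 0L leaves I at 0 for groups that never have a positive sale.
df %>%
  group_by(Chain, Product) %>%
  mutate(I = as.integer(row_number() == match(TRUE, Sales > 0, nomatch = 0L))) -> result
result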
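Note that the question's expected output keeps I at 0 for series that already have sales in week 1 (for example Chain2/Prod1), whereas a first-positive-row rule would flag week 1 there. Below is a minimal sketch of one possible variant. It assumes the rows can be sorted by Week and that a launch means positive sales right after a zero week. The four-week toy data and the helper column launch are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data: four weeks instead of 52, same column names as the question
df <- data.frame(
  Chain   = rep(c("Chain1", "Chain2"), each = 4),
  Product = "Prod1",
  Week    = rep(1:4, times = 2),
  Sales   = c(0, 0, 10, 14,    # Chain1: sales start in week 3
              10, 11, 12, 15)  # Chain2: already selling in week 1
)

result <- df %>%
  arrange(Chain, Product, Week) %>%
  group_by(Chain, Product) %>%
  mutate(
    # TRUE only when this week has sales and the previous week had none;
    # default = 1 means the first week of a group is never treated as a launch
    launch = Sales > 0 & lag(Sales, default = 1) == 0,
    # keep only the first such week per group
    I = as.integer(launch & cumsum(launch) == 1)
  ) %>%
  ungroup() %>%
  select(-launch)

result
```

With this rule, a product that drops to 0 and later recovers is still flagged only once, at its first 0-to-positive transition. Drop the cumsum() condition if every such transition should count.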
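We can also use base R. ave() recycles a single value over the whole group, so FUN has to return one value per row rather than the bare index from which.max:
df$I <- as.integer(with(df, ave(Sales > 0, Chain, Product, FUN = function(x) seq_along(x) == match(TRUE, x, nomatch = 0L))))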
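To check the base R approach on something small, here is a sketch on made-up data. The toy data frame and the helper name first_true are assumptions for illustration; the column names follow the table in the question. Because ave() hands each group's values to FUN and writes the result back in place, FUN should return a vector as long as its input.

```r
# Hypothetical toy data with the question's column names
df <- data.frame(
  Chain   = rep(c("Chain1", "Chain2"), each = 4),
  Product = "Prod1",
  Week    = rep(1:4, times = 2),
  Sales   = c(0, 0, 10, 14,  # Chain1: first positive sale in week 3
              0, 0, 0, 0)    # Chain2: never sold
)

# Returns a logical vector as long as x, TRUE only at the first TRUE of x.
# nomatch = 0L means a group with no positive sales gets no flag at all.
first_true <- function(x) seq_along(x) == match(TRUE, x, nomatch = 0L)

# Apply first_true within each Chain/Product group, then convert to 0/1
df$I <- as.integer(with(df, ave(Sales > 0, Chain, Product, FUN = first_true)))

df
```

As with the dplyr version, this relies on the rows already being ordered by Week within each group.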