
Assign a value of 1 or 0 based on a condition

I have monthly reports from October through April and have stacked all of the data. I sorted the data by UniqueID and then by Date.

I want to create a dummy variable that meets the following condition:

1.) If the last occurrence of a specific UniqueID is not in the last month (April), then I want the variable to equal 1 on the row of that last occurrence, and 0 otherwise.

The Freq column counts how many times the UniqueID shows up in the entire dataset of stacked monthly reports.

UniqueID Date        Freq
XX343_1  02/01/2019  3
XX343_1  03/01/2019  3  
XX343_1  04/01/2019  3
SD229_1  11/01/2018  4 
SD229_1  12/01/2018  4
SD229_1  01/01/2019  4
SD229_1  02/01/2019  4
WE321_1  10/01/2018  1

Basically, I would want the following output:

UniqueID Date        Freq Dummy
XX343_1  02/01/2019  3    0
XX343_1  03/01/2019  3    0
XX343_1  04/01/2019  3    0
SD229_1  11/01/2018  4    0
SD229_1  12/01/2018  4    0
SD229_1  01/01/2019  4    0
SD229_1  02/01/2019  4    1
WE321_1  10/01/2018  1    1

The following code is what I have attempted:

 data$Dummy=ifelse(data$Date=="2018-10-01" & data$Freq==1,1,ifelse(
                   data$Date=="2018-10-01" & data$Freq>=2,0,ifelse(
                   data$Date=="2018-11-01" & data$Freq<=2,1,ifelse(
                   data$Date=="2018-11-01" & data$Freq >2,0,ifelse(
                   data$Date=="2018-12-01" & data$Freq<=3,1,ifelse(
                   data$Date=="2018-12-01" & data$Freq >3,0,ifelse(
                   data$Date=="2019-01-01" & data$Freq<=4,1,ifelse(
                   data$Date=="2019-01-01" & data$Freq >4,0,ifelse(
                   data$Date=="2019-02-01" & data$Freq<=5,1,ifelse(
                   data$Date=="2019-02-01" & data$Freq >5,0,ifelse(
                   data$Date=="2019-03-01" & data$Freq<=6,1,ifelse(
                   data$Date=="2019-03-01" & data$Freq >6,0,0
               ))))))))))))

I keep getting errors and I'm not sure how to fix my problems. I get a lot of situations where, if the first occurrence of a UniqueID is not in October, the Dummy equals 0 in the second-to-last month. Can someone point me in the right direction?
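The mismatch described above seems to come from the Freq thresholds, which only line up with the calendar when every UniqueID first appears in October. A minimal sketch in base R, using a hypothetical two-row data frame `demo` retyped from the sample data, applies just the February branch of the attempt to two IDs that both appear in February:

```r
# Hypothetical sample: two IDs that both have a row in February 2019.
# XX343_1 continues through April; SD229_1 ends in February.
demo <- data.frame(
  UniqueID = c("XX343_1", "SD229_1"),
  Date     = as.Date(c("2019-02-01", "2019-02-01")),
  Freq     = c(3, 4),
  stringsAsFactors = FALSE
)

# February branch of the nested ifelse(): Freq <= 5 yields 1.
demo$Dummy_attempt <- ifelse(demo$Date == "2019-02-01" & demo$Freq <= 5, 1, 0)

# Values taken from the desired output in the question.
demo$Dummy_wanted <- c(0, 1)

print(demo)
```

Both rows are flagged by the Freq rule, although only SD229_1 actually ends in February. Freq says how many reports an ID appears in, not which month it was last seen, so the comparison probably needs to be made on the dates themselves.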

library(dplyr); library(lubridate)

# Read the sample data and parse Date (month/day/year) into Date objects.
data <- read.table(header = TRUE, stringsAsFactors = FALSE,
  text = "UniqueID Date        Freq
  XX343_1  02/01/2019  3
  XX343_1  03/01/2019  3
  XX343_1  04/01/2019  3
  SD229_1  11/01/2018  4
  SD229_1  12/01/2018  4
  SD229_1  01/01/2019  4
  SD229_1  02/01/2019  4
  WE321_1  10/01/2018  1"
) %>%
  mutate(Date = mdy(Date))

# Latest Date observed for each UniqueID.
ID_dummy <- data %>%
  group_by(UniqueID) %>%
  summarize(last_Date = max(Date))

# Flag the row of each ID's last occurrence, unless that occurrence is in April.
data %>%
  left_join(ID_dummy, by = "UniqueID") %>%
  mutate(Dummy = if_else(last_Date == Date & month(last_Date) != 4, 1, 0))
#  UniqueID       Date Freq  last_Date Dummy
#1  XX343_1 2019-02-01    3 2019-04-01     0
#2  XX343_1 2019-03-01    3 2019-04-01     0
#3  XX343_1 2019-04-01    3 2019-04-01     0
#4  SD229_1 2018-11-01    4 2019-02-01     0
#5  SD229_1 2018-12-01    4 2019-02-01     0
#6  SD229_1 2019-01-01    4 2019-02-01     0
#7  SD229_1 2019-02-01    4 2019-02-01     1
#8  WE321_1 2018-10-01    1 2018-10-01     1
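The April check above is hard-coded as month 4. As a variation, here is a sketch in base R (swapping the dplyr join for `ave()`) that treats the latest Date in the stacked data as the "last month". That reading is an assumption, and the names `final_month` and `last_seen` are introduced here for illustration only:

```r
# Sample rows from the question, with Date already parsed.
data <- data.frame(
  UniqueID = c(rep("XX343_1", 3), rep("SD229_1", 4), "WE321_1"),
  Date = as.Date(c("2019-02-01", "2019-03-01", "2019-04-01",
                   "2018-11-01", "2018-12-01", "2019-01-01", "2019-02-01",
                   "2018-10-01")),
  Freq = c(3, 3, 3, 4, 4, 4, 4, 1),
  stringsAsFactors = FALSE
)

# Assumption: the "last month" is the latest Date anywhere in the data.
final_month <- as.numeric(max(data$Date))

# Latest date per UniqueID, repeated on every row of that ID.
# Dates are converted to numbers (days since 1970-01-01) for ave().
last_seen <- ave(as.numeric(data$Date), data$UniqueID, FUN = max)

# 1 only on an ID's last row, and only if that row is before the last month.
data$Dummy <- as.integer(as.numeric(data$Date) == last_seen &
                           last_seen < final_month)

print(data)
```

On the sample rows this flags the same two rows as the desired output (the February row of SD229_1 and the October row of WE321_1), and it should keep working if more months are stacked later, provided at least one ID is present in the true final month.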


 