Match against the previous row and mutate a new column in R

I have the following data:

Cust <- c(1,1,1,1,1,2,2,2,2,3)
Date <- c("2017-07-10","2017-07-10","2017-07-10","2017-07-10","2017-07-11","2017-07-15","2017-07-15","2017-07-15","2017-06-19","2017-07-19")
TCode <- c(123,123,125,125,124,231,231,234,236,332)
H <- c("A","B","C","D","E","FF","G","H","J","GG")
df <- data.frame(Cust,Date,TCode,H)

Now I need to create a new column, df$New_col, such that if a row has the same Cust and the same TCode as the previous row (for example Cust[1] == Cust[2] and TCode[1] == TCode[2]), then df$New_col keeps the previous row's value; otherwise it is the previous value plus 1. The counter restarts at 1 whenever Cust changes.

NOTE: For every new value in df$Cust, the first occurrence in df$New_col will always be 1. The total number of rows is greater than 1M, so the solution needs to be dynamic.

The output needed is as follows:

[Image: expected output table]

Using base R

df$New_col <- with(df, ave(TCode, Cust, FUN = function(x) match(x, unique(x))))
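
One caveat: match(x, unique(x)) numbers distinct values, not runs. On the sample data both give the same result, but if a TCode reappeared later within the same customer it would get its old number back instead of "previous value plus 1". A minimal base R sketch of the run-based rule is shown here; run_id is a hypothetical helper name introduced only for this illustration, and the Date and H columns are left out for brevity.

```r
# Sample data from the question (only the columns the rule depends on)
Cust  <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3)
TCode <- c(123, 123, 125, 125, 124, 231, 231, 234, 236, 332)
df <- data.frame(Cust, TCode)

# run_id() is a hypothetical helper: it starts at 1 and adds 1 each time
# a value differs from the one immediately before it
run_id <- function(x) cumsum(c(TRUE, x[-1] != x[-length(x)]))

# Restart the counter for every customer
df$New_col <- with(df, ave(TCode, Cust, FUN = run_id))

# A value that reappears after a gap shows where the two approaches differ
x <- c(123, 125, 123)
match(x, unique(x))  # 1 2 1: the third element reuses the id of the first
run_id(x)            # 1 2 3: the third element starts a new run
```

Both versions are vectorised within each group, so either should be usable on a table with more than 1M rows; which one is right depends on whether a repeated TCode after a gap should count as a new run.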

Using data.table

library(data.table)
setDT(df)
df[, New_col := rleid(TCode), by = Cust]

Using dplyr with rleid from data.table

library(dplyr)
library(data.table)  # provides rleid()
df %>%
  group_by(Cust) %>%
  mutate(New_col = rleid(TCode))

Gives us:

    Cust       Date TCode  H New_col
 1:    1 2017-07-10   123  A       1
 2:    1 2017-07-10   123  B       1
 3:    1 2017-07-10   125  C       2
 4:    1 2017-07-10   125  D       2
 5:    1 2017-07-11   124  E       3
 6:    2 2017-07-15   231 FF       1
 7:    2 2017-07-15   231  G       1
 8:    2 2017-07-15   234  H       2
 9:    2 2017-06-19   236  J       3
10:    3 2017-07-19   332 GG       1
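
If loading data.table only for rleid() is not desirable, the same run counter can probably be written with dplyr alone. The following is a sketch under that assumption rather than part of the original answer: it compares each TCode with the previous one inside the group using lag(), and the result name res is chosen only for this example.

```r
library(dplyr)

# Sample data from the question (only the columns the rule depends on)
df <- data.frame(
  Cust  = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3),
  TCode = c(123, 123, 125, 125, 124, 231, 231, 234, 236, 332)
)

res <- df %>%
  group_by(Cust) %>%
  # TRUE wherever TCode differs from the previous row of the same customer;
  # the first row is compared with itself, so every group starts at 1
  mutate(New_col = cumsum(TCode != lag(TCode, default = first(TCode))) + 1) %>%
  ungroup()

res$New_col
```

dplyr 1.1.0 and later also provide consecutive_id(), which should give the same numbering when used inside the grouped mutate(). Note that the comparison above propagates NA if TCode contains missing values, so those would need handling first.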
