Creating multiple new columns based on existing columns

This is a continuation of my older question, "Creating new columns based on conditions of the existing columns". I have a similar case as before, but now I have an additional column in the input, 3 additional columns in the output, and 6 conditions to check.

The dataset is now like this

library(data.table)

# dput() output with the session-specific .internal.selfref pointer removed
dt <- as.data.table(structure(list(Item = c("P_6000_1", "P_6000_1", "P_6000_1", "P_6000_3", 
                    "P_6000_3", "P_6000_3", "P_6000_5", "P_6000_5", "P_6000_5"), 
           Customer = c("Customer_4", "Customer_4", "Customer_4", "Customer_4", 
                        "Customer_4", "Customer_4", "Customer_1", "Customer_1", "Customer_1"), 
           DemandID = c("Order_175", "Order_176", "Order_177", "Order_186", 
                           "Order_187", "Order_188", "Order_195", "Order_196", "Order_197"),
           Order = c(450L, 479L, 365L, 2890L, 3450L, 2500L, 234L, 443L, 321L), 
           Forecast = c(3300L, 3300L, 3300L, 3846L, 3846L, 3846L, 3070L, 3070L, 3070L), 
           RTF = c(3113L, 3113L, 3113L, 0L, 0L, 0L, 3200L, 3200L, 3200L)), 
      row.names = c(NA, -9L), class = "data.frame"))

I have 2 new columns, DemandID and Order.

For each Item/Customer combination there is only one value of Forecast and one value of RTF; they are repeated on every row of that combination.

My conditions for the new columns are

1. Forecast < Order < RTF -> COM_O = Forecast, NEW_O = Order - Forecast, UNF_O = 0, COM_F = 0, NEW_F = 0

2. Forecast < RTF < Order -> COM_O = Forecast, NEW_O = RTF - Forecast, UNF_O = Order - RTF, COM_F = 0, NEW_F = 0

3. RTF < Order < Forecast -> COM_O = RTF, NEW_O = Order - RTF, UNF_O = 0, COM_F = 0, NEW_F = Forecast - Order

4. RTF < Forecast < Order -> COM_O = RTF, NEW_O = Forecast - RTF, UNF_O = Order - Forecast, COM_F = 0, NEW_F = 0

5. Order < Forecast < RTF -> COM_O = Order, NEW_O = 0, UNF_O = 0, COM_F = Forecast - Order, NEW_F = 0

6. Order < RTF < Forecast -> COM_O = Order, NEW_O = 0, UNF_O = 0, COM_F = RTF - Order, NEW_F = Forecast - RTF
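The six conditions above can be written out literally with data.table's fcase(). This is only a sketch under assumptions: the values in x are made up (one row per condition), the case column is a helper introduced here for illustration, and ties such as Order == RTF are not covered by the strict inequalities, so such rows would come out as NA.

```r
library(data.table)

# Made-up rows, one per condition 1..6 (not taken from the real dataset)
x <- data.table(
  Order    = c(20L, 30L, 20L, 30L, 10L, 10L),
  Forecast = c(10L, 10L, 30L, 20L, 20L, 30L),
  RTF      = c(30L, 20L, 10L, 10L, 30L, 20L)
)

# Helper column: which of the six orderings applies (NA when there is a tie)
x[, case := fcase(
  Forecast < Order & Order < RTF,    1L,
  Forecast < RTF & RTF < Order,      2L,
  RTF < Order & Order < Forecast,    3L,
  RTF < Forecast & Forecast < Order, 4L,
  Order < Forecast & Forecast < RTF, 5L,
  Order < RTF & RTF < Forecast,      6L
)]

# One fcase() per output column, following the list of conditions
x[, `:=`(
  COM_O = fcase(case %in% 1:2, Forecast, case %in% 3:4, RTF, case %in% 5:6, Order),
  NEW_O = fcase(case == 1L, Order - Forecast, case == 2L, RTF - Forecast,
                case == 3L, Order - RTF,      case == 4L, Forecast - RTF,
                case %in% 5:6, 0L),
  UNF_O = fcase(case == 2L, Order - RTF, case == 4L, Order - Forecast,
                case %in% c(1L, 3L, 5L, 6L), 0L),
  COM_F = fcase(case == 5L, Forecast - Order, case == 6L, RTF - Order,
                case %in% 1:4, 0L),
  NEW_F = fcase(case == 3L, Forecast - Order, case == 6L, Forecast - RTF,
                case %in% c(1L, 2L, 4L, 5L), 0L)
)]
```

fcase() needs a reasonably recent data.table; nested fifelse() calls would express the same thing on older versions.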

I know I can get the Forecast and RTF for an Item Customer combination by using

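cols = c('Item','Customer')
# Build a combined Item+Customer key row by row (note: := also adds ItemCust to dt itself)
tempDT <- dt[, ItemCust := (paste0(unlist(.SD), collapse="")), .SDcols = cols, by = .(row = seq_len(nrow(dt)))]
tempDT1 = tempDT[, .(ItemCust, Forecast, RTF)]
# Forecast and RTF are constant within a key, so mean() just collapses the repeats
tempDT1 <- tempDT1[, .(Forecast = mean(Forecast), RTF = mean(RTF)), by = .(ItemCust)]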
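As a side note, the same one-row-per-combination table can probably be obtained without pasting a key together, by grouping on Item and Customer directly. The sketch below assumes, as stated above, that Forecast and RTF are constant within each combination; the name per_group is made up for illustration.

```r
library(data.table)

dt <- data.table(
  Item     = rep(c("P_6000_1", "P_6000_3", "P_6000_5"), each = 3),
  Customer = rep(c("Customer_4", "Customer_4", "Customer_1"), each = 3),
  Forecast = rep(c(3300L, 3846L, 3070L), each = 3),
  RTF      = rep(c(3113L, 0L, 3200L), each = 3)
)

# Forecast and RTF are repeated within a group, so the distinct rows
# are exactly one row per Item/Customer combination
per_group <- unique(dt[, .(Item, Customer, Forecast, RTF)])
```

Unlike mean(), unique() keeps the columns as integers and would reveal a group whose values are not actually constant, because that group would appear more than once.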

But I am stuck after this.

My question is: how do I loop through the DemandIDs for each Item/Customer combination and apply the conditions above? The RTF and Forecast values should also be updated after the new columns are generated for each DemandID row.
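One possible reading of "looping through the DemandIDs" is a running allocation within each group, which can be done with cumsum() instead of an explicit loop. The sketch below is an interpretation, not a confirmed requirement: it assumes orders are served in row order, that the committed quantity is capped at min(Forecast, RTF), and the column names Forecast_left and RTF_left are made up here.

```r
library(data.table)

dt <- data.table(
  Item     = rep("P_6000_1", 3),
  Customer = rep("Customer_4", 3),
  DemandID = c("Order_175", "Order_176", "Order_177"),
  Order    = c(450L, 479L, 365L),
  Forecast = rep(3300L, 3),
  RTF      = rep(3113L, 3)
)

# Running total of orders, capped at min(Forecast, RTF); each row's COM_O
# is the increase of that capped total contributed by the row
dt[, COM_O := {
  cap <- pmin(Forecast[1], RTF[1])
  committed <- pmin(cumsum(Order), cap)
  committed - shift(committed, fill = 0L)
}, by = .(Item, Customer)]

# Forecast and RTF still available after each DemandID row (hypothetical columns)
dt[, `:=`(
  Forecast_left = Forecast - cumsum(COM_O),
  RTF_left      = RTF - cumsum(COM_O)
), by = .(Item, Customer)]
```

With these three rows every order fits under the cap, so COM_O equals Order on each row and the last row is left with Forecast 2006 and RTF 1819.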

Edit: added input and expected output

input = data.frame(Item = c("P_6000_1", "P_6000_1", "P_6000_1"), 
              Customer = c("Customer_4", "Customer_4", "Customer_4"), 
              DemandID = c("Order_175", "Order_176", "Order_177"),
              Order = c(450L, 479L, 365L), 
              Forecast = c(3300L, 3300L, 3300L), 
              RTF = c(3113L, 3113L, 3113L)) 

output1 = data.frame(Item = c("P_6000_1", "P_6000_1", "P_6000_1"), 
                Customer = c("Customer_4", "Customer_4", "Customer_4"), 
                DemandID = c("Order_175", "Order_176", "Order_177"),
                Order = c(450L, 479L, 365L), 
                COM_O = c(450, 479, 365)) 

output2 = data.frame(Item = c("P_6000_1"), 
                 Customer = c("Customer_4"),
                 Forecast = c(2006),
                 RTF = c(1819))

Output 2 can be derived from the tempDT1 I created. It holds whatever remains of Forecast and RTF after the orders have been subtracted. On this data.table I will apply the last 2 conditions, on the basis that Order = 0:

tempDT1[, c("New_F","Com_F") := 
          .(fifelse(Forecast > RTF, Forecast - RTF, 0), 
            fifelse(Forecast > RTF, RTF, Forecast))] 

This gives the required New_F and Com_F columns; I then join the tables back on Item and Customer.
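The plan just described (subtract the orders, derive New_F and Com_F from what remains, join back) might look roughly like the sketch below. It uses the three-row input from the edit above; the names remaining and result are made up, and it assumes the total of the orders does not exceed Forecast or RTF, which holds for this group.

```r
library(data.table)

input <- data.table(
  Item     = rep("P_6000_1", 3),
  Customer = rep("Customer_4", 3),
  DemandID = c("Order_175", "Order_176", "Order_177"),
  Order    = c(450L, 479L, 365L),
  Forecast = rep(3300L, 3),
  RTF      = rep(3113L, 3)
)

# What is left of Forecast and RTF once all orders of the group are subtracted
remaining <- input[, .(Forecast = Forecast[1] - sum(Order),
                       RTF      = RTF[1] - sum(Order)),
                   by = .(Item, Customer)]

# Last two conditions with Order = 0
remaining[, c("New_F", "Com_F") :=
            list(fifelse(Forecast > RTF, Forecast - RTF, 0L),
                 fifelse(Forecast > RTF, RTF, Forecast))]

# Join the per-group values back onto the order rows
result <- remaining[input[, .(Item, Customer, DemandID, Order)],
                    on = .(Item, Customer)]
```

Here remaining reproduces output2 (Forecast 2006, RTF 1819), and result repeats those values on each of the three DemandID rows.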

I have tried several specifications. I think the procedure below yields the output closest to what you want.

dt[, 
   .(Order = sum(Order)),
   by = .(Item, Customer, Forecast, RTF)
][,
  COM_O := pmin(Forecast, RTF, Order)
][, `:=`(
  NEW_O = pmin(Order, pmax(Forecast, RTF)) - COM_O,
  UNF_O = pmax(Order - pmax(Forecast, RTF), 0), 
  COM_F = pmax(pmin(Forecast, RTF) - Order, 0), 
  NEW_F = pmax(Forecast - pmax(Order, RTF), 0), 
  Forecast = Forecast - COM_O, 
  RTF = RTF - COM_O
)]

Output

       Item   Customer Forecast  RTF Order COM_O NEW_O UNF_O COM_F NEW_F
1: P_6000_1 Customer_4     2006 1819  1294  1294     0     0  1819   187
2: P_6000_3 Customer_4     3846    0  8840     0  3846  4994     0     0
3: P_6000_5 Customer_1     2072 2202   998   998     0     0  2072     0
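To see how the pmin()/pmax() expressions relate to the six conditions in the question, they can be applied to one made-up row per condition and compared with values worked out by hand. This is a sketch for checking only: chk and its values are invented, the aggregation step and the Forecast/RTF updates are left out, and ties are not examined.

```r
library(data.table)

# Made-up rows, one per condition 1..6 from the question
chk <- data.table(
  Order    = c(20L, 30L, 20L, 30L, 10L, 10L),
  Forecast = c(10L, 10L, 30L, 20L, 20L, 30L),
  RTF      = c(30L, 20L, 10L, 10L, 30L, 20L)
)

# Same expressions as in the procedure above
chk[, COM_O := pmin(Forecast, RTF, Order)]
chk[, `:=`(
  NEW_O = pmin(Order, pmax(Forecast, RTF)) - COM_O,
  UNF_O = pmax(Order - pmax(Forecast, RTF), 0L),
  COM_F = pmax(pmin(Forecast, RTF) - Order, 0L),
  NEW_F = pmax(Forecast - pmax(Order, RTF), 0L)
)]

# Values obtained by applying the six conditions by hand
expected <- data.table(
  COM_O = rep(10L, 6),
  NEW_O = c(10L, 10L, 10L, 10L, 0L, 0L),
  UNF_O = c(0L, 10L, 0L, 10L, 0L, 0L),
  COM_F = c(0L, 0L, 0L, 0L, 10L, 10L),
  NEW_F = c(0L, 0L, 10L, 0L, 0L, 10L)
)
matches <- all(chk[, .(COM_O, NEW_O, UNF_O, COM_F, NEW_F)] == expected)
```

On these six rows the expressions agree with the conditions, which suggests the branch-free form covers all six orderings.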
