
sumif in ifelse condition R

I have a data.table (DT) with multiple columns, and I need to apply a condition with ifelse and calculate a value accordingly. I want to compute Count/sum(Count), grouped by Segment. Here is the DT:

Segment  Count  Flag
A        23     Y
B        45     N
A        56     N 
B        212    Y

I want a new column that depends on the flag; the output should look something like the table below. For flag N, it is the row's count as a share of the count in its segment. For flag Y, it is a revenue percentage: the revenue that could be earned if the No (N) became a Yes (Y). Sorry if this is unclear; please ask in the comments if you have any questions.

Segment  Count  Flag  Rev  Value
A        23     Y     34   ((56/23)*34)/(34+69)
B        45     N     48   45/(45+212)
A        56     N     23   56/(56+23)
B        212    Y     67   ((45/212)*67)/(67+12)
A        65     Y     69   ...
B        10     Y     12   ...

Any help is appreciated. Thanks!

We can do this with data.table. Convert the 'data.frame' to a 'data.table' with setDT(DT); then, grouped by 'Segment', create the 'Value' column by dividing 'Count' by the sum of 'Count'; finally, update 'Value' for the rows where 'Flag' is 'N'.

library(data.table)
setDT(DT)[, Value := Count/sum(Count), Segment
              ][Flag == "N", Value := Count/sum(Count), Segment]


DT
#   Segment Count Flag      Value
#1:       A    23    Y 0.18852459
#2:       B    45    N 1.00000000
#3:       A    56    N 1.00000000
#4:       B   212    Y 0.78810409
#5:       A    43    Y 0.35245902
#6:       B    12    Y 0.04460967

Checking against the OP's expected 'Value' output:

> 23/122
#[1] 0.1885246
> 212/269
#[1] 0.7881041
> 43/122
#[1] 0.352459
> 12/269
#[1] 0.04460967
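
The N rows in the output above come out as exactly 1 because each segment in this sample has a single N row, so within the Flag == "N" subset Count/sum(Count) reduces to Count/Count. A minimal sketch that reproduces the two steps on a self-contained copy of the data and makes this visible (the names dt and n_rows are illustrative and not from the original answer):

```r
library(data.table)

# Self-contained copy of the sample data (same values as DT in the data section)
dt <- data.table(
  Segment = c("A", "B", "A", "B", "A", "B"),
  Count   = c(23L, 45L, 56L, 212L, 43L, 12L),
  Flag    = c("Y", "N", "N", "Y", "Y", "Y")
)

# Step 1: share of each row's Count within its Segment
dt[, Value := Count / sum(Count), by = Segment]

# Step 2: for N rows only, recompute the share within the N subset of each Segment
dt[Flag == "N", Value := Count / sum(Count), by = Segment]

# Number of N rows per Segment; with one N row per group the share is Count/Count
n_rows <- dt[Flag == "N", .N, by = Segment]
print(n_rows)
print(dt)
```

If a segment had several N rows, step 2 would instead give each N row its share of that segment's N total.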

Update 3

Based on update no. 3 in the OP's post:

s1 <-  setDT(DT1)[, .(rn = .I[Flag == "Y"], Value = (Rev[Flag=="Y"] *
    (Count[Flag == "N"]/Count[Flag=="Y"]))/sum(Rev[Flag == "Y"])), Segment]
s2 <-  DT1[, .(rn = .I[Flag == "N"], Value = Count[Flag == "N"]/(Count[Flag == "N"] + 
               Count[Flag=="Y"][1])), Segment]

DT1[, Value := rbind(s1, s2)[order(rn)]$Value]
DT1
#   Segment Count Flag Rev     Value
#1:       A    23    Y  34 0.8037146
#2:       B    45    N  48 0.1750973
#3:       A    56    N  23 0.7088608
#4:       B   212    Y  67 0.1800215
#5:       A    65    Y  69 0.5771471
#6:       B    10    Y  12 0.6835443


>((56/23)*34)/(34+69)
#[1] 0.8037146
> 45/(45+212)
#[1] 0.1750973
>  56/(56+23)
#[1] 0.7088608
> ((45/212)*67)/(67+12)
#[1] 0.1800215
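
The same formulas can also be written as a single grouped assignment, without building s1 and s2 and binding them back together. This is only a sketch, not the method used in the answer above: it assumes every segment has exactly one N row and at least one Y row, as in the sample data, and the names dt1, n_count and first_y are illustrative.

```r
library(data.table)

# Self-contained copy of the sample data (same values as DT1 in the data section)
dt1 <- data.table(
  Segment = c("A", "B", "A", "B", "A", "B"),
  Count   = c(23L, 45L, 56L, 212L, 65L, 10L),
  Flag    = c("Y", "N", "N", "Y", "Y", "Y"),
  Rev     = c(34L, 48L, 23L, 67L, 69L, 12L)
)

dt1[, Value := {
  n_count <- Count[Flag == "N"][1]   # Count of the (assumed single) N row
  first_y <- Count[Flag == "Y"][1]   # Count of the first Y row in the segment
  ifelse(
    Flag == "N",
    Count / (Count + first_y),                         # N rows: share against the first Y count
    Rev * (n_count / Count) / sum(Rev[Flag == "Y"])    # Y rows: scaled revenue over total Y revenue
  )
}, by = Segment]

print(dt1)
```

Because the values are assigned within each group in row order, no row index (rn) or reordering is needed.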

data

DT <- structure(list(Segment = c("A", "B", "A", "B", "A", "B"), Count = c(23L, 
45L, 56L, 212L, 43L, 12L), Flag = c("Y", "N", "N", "Y", "Y", 
"Y")), .Names = c("Segment", "Count", "Flag"), row.names = c(NA, 
-6L), class = "data.frame")

DT1 <- structure(list(Segment = c("A", "B", "A", "B", "A", "B"), Count = c(23L, 
45L, 56L, 212L, 65L, 10L), Flag = c("Y", "N", "N", "Y", "Y", 
"Y"), Rev = c(34L, 48L, 23L, 67L, 69L, 12L)), .Names = c("Segment", 
"Count", "Flag", "Rev"), class = "data.frame", row.names = c(NA, 
-6L))

Alternatively, we could use the dplyr package for this:

Updating based on the suggestions provided by @Aramis7d - thanks!

library(data.table)
df <- fread("Segment  Count  Flag
A        23     Y
B        45     N
A        56     N
B        212    Y
A        43     Y
B        12     Y")

library(dplyr)

df %>% 
      group_by(Segment) %>% 
      mutate(Value = Count/sum(Count)) %>%
      group_by(Segment, Flag) %>%
      mutate(Value = if_else( Flag == "N", Count/sum(Count), Value))
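
The dplyr code above covers only the first version of the question. A possible extension to the 'Rev' case from Update 3 is sketched below. It assumes exactly one N row and at least one Y row per segment, as in the sample data, and the names df1 and res are illustrative rather than taken from the thread.

```r
library(dplyr)

# Self-contained copy of the Update 3 sample data
df1 <- data.frame(
  Segment = c("A", "B", "A", "B", "A", "B"),
  Count   = c(23L, 45L, 56L, 212L, 65L, 10L),
  Flag    = c("Y", "N", "N", "Y", "Y", "Y"),
  Rev     = c(34L, 48L, 23L, 67L, 69L, 12L),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  group_by(Segment) %>%
  mutate(Value = if_else(
    Flag == "N",
    # N rows: Count over (Count + Count of the first Y row in the segment)
    Count / (Count + first(Count[Flag == "Y"])),
    # Y rows: Rev scaled by N count / own count, over total Y revenue in the segment
    Rev * (first(Count[Flag == "N"]) / Count) / sum(Rev[Flag == "Y"])
  )) %>%
  ungroup()

print(res)
```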
