
Assigning column values based on which value is highest in a row of three other columns in R

I would like to do the following in R:

I have purchase frequencies for three types of products: A, B, and C (customer is the identifier here). I would like to create a fourth column that, for each row, has a value of 0 if product A has the highest purchase frequency, 1 if product B has the highest purchase frequency, and 2 if product C has the highest purchase frequency. Ideally, if two or even all three products are tied for the highest frequency, one of the tied values should be assigned at random. Of course, when only two products are tied for the highest frequency, the random choice should be made between those two only, not among all three.

So say I have the following table:

customer    A    B    C
1           2    3    4
2           4    6    5
3           4    2    4
4           4    2    4
5           4    4    4
6           2    2    2
7           4    4    4

I would like to create (for example) the following column:

highest_purchase_freq
2
1
0
2
0
1
2

It would be amazing if someone could help me.

Thanks in advance!

The which.is.max function from the nnet package could be what you are looking for.
It returns the position of the maximum in a vector, breaking ties at random.

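# Example data from the question
df <- data.frame(customer = 1:7, A = c(2, 4, 4, 4, 4, 2, 4), B = c(3, 6, 2, 2, 4, 2, 4), C = c(4, 5, 4, 4, 4, 2, 4))
set.seed(12345)  # make the random tie-breaking reproducible
library(nnet)
# which.is.max returns a 1-based position, so subtract 1 to get 0/1/2
df$highest_purchase_freq <- apply(df[2:4], 1, which.is.max) - 1
df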

  customer A B C highest_purchase_freq
1        1 2 3 4                     2
2        2 4 6 5                     1
3        3 4 2 4                     2
4        4 4 2 4                     2
5        5 4 4 4                     2
6        6 2 2 2                     2
7        7 4 4 4                     1
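
If loading nnet is not an option, base R's max.col may be worth a look as an alternative to the approach above. The following is only a sketch: it rebuilds the example table from the question, the seed value and the names freq and chosen are arbitrary choices for illustration, and the particular values picked for tied rows depend on the random number generator.

```r
# Rebuild the example table from the question.
df <- data.frame(
  customer = 1:7,
  A = c(2, 4, 4, 4, 4, 2, 4),
  B = c(3, 6, 2, 2, 4, 2, 4),
  C = c(4, 5, 4, 4, 4, 2, 4)
)

set.seed(12345)  # arbitrary seed, only so the random tie-breaks are repeatable
freq <- as.matrix(df[, c("A", "B", "C")])

# max.col() returns the 1-based column index of each row's maximum;
# ties.method = "random" picks among the tied columns at random.
df$highest_purchase_freq <- max.col(freq, ties.method = "random") - 1

# Sanity check: the chosen column must hold the row maximum.
chosen <- freq[cbind(seq_len(nrow(freq)), df$highest_purchase_freq + 1)]
stopifnot(all(chosen == apply(freq, 1, max)))
df
```

Rows without ties (customers 1 and 2) always get 2 and 1 respectively. Customers 3 and 4 can only get 0 or 2, since B is not among their maxima, while fully tied rows can get any of the three values.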
