Assigning column values based on which value is highest in a row of three other columns in R
I would like to do the following in R:
I have purchase frequencies of three types of products: A, B, and C (customer being the identifier here). Now I would like to create a fourth column that, per row, has a value of 0 if product A has the highest purchase frequency, 1 if product B has the highest purchase frequency, and 2 if product C has the highest purchase frequency.
Ideally, if the purchase frequency is equally high for two or even all three products, I would like to assign one of the tied values at random. Of course, the random assignment should be made between only two of the three categories if only two categories (and not three) are tied for the highest value.
So say I have the following table:
customer A B C
1 2 3 4
2 4 6 5
3 4 2 4
4 4 2 4
5 4 4 4
6 2 2 2
7 4 4 4
I would like to create (for example) the following column:
highest_purchase_freq
2
1
0
2
0
1
2
It would be amazing if someone could help me.
Thanks in advance!
The which.is.max function from the nnet package could be what you are looking for. It finds the position of the maximum in a vector, breaking ties at random.
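To make "breaking ties at random" concrete, here is a rough base-R sketch of the idea. The helper name `pick_max` is hypothetical and written only for illustration; it is not the nnet implementation, just the same logic spelled out under the assumption that the input is a plain numeric vector without missing values.

```r
# Hypothetical helper: position of the maximum, with ties broken at random.
pick_max <- function(x) {
  idx <- which(x == max(x))   # all positions that hold the maximum
  if (length(idx) > 1) {
    sample(idx, 1)            # several maxima: pick one of them at random
  } else {
    idx                       # a single maximum: return it directly
  }
}

pick_max(c(2, 3, 4))  # only the third element is the maximum
pick_max(c(4, 2, 4))  # first and third are tied, so either may be returned
```

The explicit length check matters: `sample(idx, 1)` on a single number `n` would sample from `1:n` instead of returning `n`.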
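set.seed(12345)  # makes the random tie-breaks repeatable
library(nnet)
df <- data.frame(customer = 1:7, A = c(2, 4, 4, 4, 4, 2, 4),  # data from the question
                 B = c(3, 6, 2, 2, 4, 2, 4), C = c(4, 5, 4, 4, 4, 2, 4))
# which.is.max returns 1, 2 or 3 per row; subtract 1 to get the 0/1/2 coding
df$highest_purchase_freq <- apply(df[2:4], 1, which.is.max) - 1
df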
customer A B C highest_purchase_freq
1 1 2 3 4 2
2 2 4 6 5 1
3 3 4 2 4 2
4 4 4 2 4 2
5 5 4 4 4 2
6 6 2 2 2 2
7 7 4 4 4 1
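If you would rather not load nnet, base R's `max.col` with `ties.method = "random"` should give the same kind of result. The following is a minimal sketch, assuming the data frame is called `df` and holds the columns from the question's table; the seed is there only so that repeated runs agree.

```r
# Example data copied from the question
df <- data.frame(
  customer = 1:7,
  A = c(2, 4, 4, 4, 4, 2, 4),
  B = c(3, 6, 2, 2, 4, 2, 4),
  C = c(4, 5, 4, 4, 4, 2, 4)
)

set.seed(12345)  # only so that repeated runs give the same tie-breaks

# max.col() returns, for each row, the column index (1 to 3) of the maximum;
# ties.method = "random" picks one of the tied maxima at random.
m <- as.matrix(df[c("A", "B", "C")])
df$highest_purchase_freq <- max.col(m, ties.method = "random") - 1
df
```

Rows with a unique maximum always get the same value (customer 1 gets 2, customer 2 gets 1), while the tied rows vary with the seed. `max.col` works on the whole matrix at once, so it avoids the row-by-row `apply` call.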