How to determine the most frequent class label across multiple columns in each row

Question

I am trying to determine the label name with the highest occurrence across multiple columns and set another pandas column to that label.

For example, given this dataframe:

    Class_1     Class_2     Class_3
0   versicolor  setosa      setosa
1   virginica   versicolor  virginica
2   virginica   setosa      setosa
3   versicolor  setosa      setosa
4   versicolor  versicolor  virginica

I want to add a column called Predictions, filled in according to the rule above:

    Class_1     Class_2     Class_3    Predictions
0   versicolor  setosa      setosa     setosa
1   virginica   versicolor  virginica  virginica
2   virginica   setosa      setosa     setosa
3   versicolor  setosa      setosa     setosa
4   versicolor  versicolor  virginica  versicolor

Answer 1

Use value_counts with apply and axis=1, then take the first index entry, which is the most common value in each row:

df['Predictions'] = df.apply(lambda x: x.value_counts().index[0], axis=1)
print (df)
      Class_1     Class_2    Class_3 Predictions
0  versicolor      setosa     setosa      setosa
1   virginica  versicolor  virginica   virginica
2   virginica      setosa     setosa      setosa
3  versicolor      setosa     setosa      setosa
4  versicolor  versicolor  virginica  versicolor
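
For readers who want to run this end to end, here is a minimal, self-contained sketch of the value_counts approach. The frame is rebuilt from the question's sample data, and the class_cols list is a hypothetical helper (not part of the original answer) that limits the vote to the three class columns, so re-running the line after Predictions already exists does not count that column too.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Class_1": ["versicolor", "virginica", "virginica", "versicolor", "versicolor"],
    "Class_2": ["setosa", "versicolor", "setosa", "setosa", "versicolor"],
    "Class_3": ["setosa", "virginica", "setosa", "setosa", "virginica"],
})

# Hypothetical helper list: only these columns take part in the vote.
class_cols = ["Class_1", "Class_2", "Class_3"]

# value_counts() orders labels by descending frequency,
# so index[0] is the most frequent label in the row.
df["Predictions"] = df[class_cols].apply(
    lambda row: row.value_counts().index[0], axis=1
)
print(df)
```

Every row in the sample has a clear majority. If a row could contain a tie (for example three different labels), it is safer not to rely on which of the tied labels comes first.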

Alternative with Counter.most_common:

from collections import Counter

# index=False keeps the row index out of the values being counted
df['Predictions'] = [Counter(x).most_common(1)[0][0] for x in df.itertuples(index=False)]
print (df)
      Class_1     Class_2    Class_3 Predictions
0  versicolor      setosa     setosa      setosa
1   virginica  versicolor  virginica   virginica
2   virginica      setosa     setosa      setosa
3  versicolor      setosa     setosa      setosa
4  versicolor  versicolor  virginica  versicolor
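
The Counter variant can be sketched as a standalone script as well. As above, the data comes from the question and class_cols is a hypothetical helper name. The sketch passes index=False to itertuples, because by default each tuple starts with the row index, and that value would otherwise be counted alongside the labels.

```python
from collections import Counter

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Class_1": ["versicolor", "virginica", "virginica", "versicolor", "versicolor"],
    "Class_2": ["setosa", "versicolor", "setosa", "setosa", "versicolor"],
    "Class_3": ["setosa", "virginica", "setosa", "setosa", "virginica"],
})

# Hypothetical helper list: only these columns take part in the vote.
class_cols = ["Class_1", "Class_2", "Class_3"]

# Each row becomes a plain tuple of labels; most_common(1) returns a
# one-element list of (label, count), so [0][0] extracts the label.
df["Predictions"] = [
    Counter(row).most_common(1)[0][0]
    for row in df[class_cols].itertuples(index=False)
]
print(df)
```

On Python 3.7 and later, most_common lists elements with equal counts in the order they were first encountered, so a row with a tie would resolve to the label in the leftmost column.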

Answer 1
2 Accepted 2018-03-30 06:46:13
