I need help with a pandas DataFrame

Question

I have a large DataFrame of items, simplified below. I am looking for a good way to find the item (A, B, or C) in each row that is repeated two or more times.
For example, in row 1 the result is A, and in row 2 it is B.

Simplified df:

import pandas as pd

df = pd.DataFrame({'C1': ['A', 'B', 'A', 'A', 'C'],
                   'C2': ['B', 'A', 'A', 'C', 'B'],
                   'C3': ['A', 'B', 'A', 'C', 'C']},
                  index=['ro1', 'ro2', 'ro3', 'ro4', 'ro5'])

Answer 1

As long as every row contains a repeated value (as in your three-column example), you can conveniently use mode along the rows. The result is a DataFrame, and its column 0 holds the first mode of each row.

df.mode(axis=1)[0]

Output:

ro1    A
ro2    B
ro3    A
ro4    C
ro5    C
Name: 0, dtype: object

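If a row might contain only unique values (e.g. A/B/C), you need to check that the mode actually occurs more than once in that row:

m = df.mode(axis=1)[0]                     # first mode of each row
m2 = df.eq(m, axis=0).sum(axis=1).le(1)    # True where that value occurs at most once
m.mask(m2)                                 # replace those rows with NaN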
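
To see the masking step in action, here is a minimal sketch on a variant of the question's frame. The extra row 'ro6' and the names is_unique and result are assumptions added for illustration; they are not part of the original data or answer.

```python
import pandas as pd

# Hypothetical variant of the question's frame: row 'ro6' has no repeated item.
df = pd.DataFrame({'C1': ['A', 'B', 'A', 'A', 'C', 'A'],
                   'C2': ['B', 'A', 'A', 'C', 'B', 'B'],
                   'C3': ['A', 'B', 'A', 'C', 'C', 'C']},
                  index=['ro1', 'ro2', 'ro3', 'ro4', 'ro5', 'ro6'])

# Column 0 of the row-wise mode holds the first mode of each row.
m = df.mode(axis=1)[0]

# Count how often that mode appears in its own row; a count of 1 means
# every value in the row is unique, so there is no repeated item.
is_unique = df.eq(m, axis=0).sum(axis=1).le(1)

# Rows without a repeated item become NaN instead of an arbitrary value.
result = m.mask(is_unique)
print(result)
```

Without the mask, 'ro6' would report the first of its tied modes even though nothing in that row repeats.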

Answer 2

As mozway suggested, we don't know what your output should look like. I will assume you need a list.

You can try something like this.

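import pandas as pd
from collections import Counter

holder = []
for index in range(len(df)):
    # Count how often each item appears in the current row
    temp = Counter(df.iloc[index, :].values)
    # Keep the items seen at least twice, joined into one string per row
    holder.append(','.join([key for key, value in temp.items() if value >= 2]))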
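
If you would rather keep the row labels than work with a bare list, the same Counter idea can be applied row by row with DataFrame.apply. This is only a sketch of one possible variant: the helper name repeated_items and its min_count parameter are hypothetical, not something from the answer above.

```python
import pandas as pd
from collections import Counter

df = pd.DataFrame({'C1': ['A', 'B', 'A', 'A', 'C'],
                   'C2': ['B', 'A', 'A', 'C', 'B'],
                   'C3': ['A', 'B', 'A', 'C', 'C']},
                  index=['ro1', 'ro2', 'ro3', 'ro4', 'ro5'])


def repeated_items(row, min_count=2):
    """Return the items that occur at least min_count times in a row,
    joined with commas (hypothetical helper, for illustration only)."""
    counts = Counter(row)
    return ','.join(key for key, value in counts.items() if value >= min_count)


# axis=1 passes each row to the helper; the result is a Series that
# shares the DataFrame's index.
result = df.apply(repeated_items, axis=1)
print(result)
```

A row with no repeated item yields an empty string here, which matches what the loop above appends to holder.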

Answer 1
0 2022-04-11 20:47:35

Answer 2
0 Accepted 2022-04-11 20:49:37
