How to find the most common combination of two columns in a data frame in Python

Question

I have my data in a pandas DataFrame as follows:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 3, 4, 4, 4], 'b': [2, 3, 4, 4, 5, 5, 5]})

So the dataframe looks like this:

   a  b
0  1  2
1  2  3
2  3  4
3  3  4
4  4  5
5  4  5
6  4  5

The combinations of columns 'a' and 'b' here are (1, 2) once, (2, 3) once, (3, 4) twice, and (4, 5) three times. I am trying to select 4 and 5 and print them out because their combination occurs most often (3 times).

My code is:

counts = df.groupby(['a','b']).size().sort_values(ascending=False)
print(counts)

Output:

a  b
4  5    3
3  4    2
2  3    1
1  2    1
dtype: int64

But this only gives me a column of values, [3, 2, 1, 1]. These are the counts for each combination. How can I access the elements 4 and 5 individually so I can print them out?

Thanks in advance!

Answer 1

The result of a pandas GroupBy aggregation is indexed by the grouper key(s). With multiple keys, that index is a MultiIndex. Because counts is sorted in descending order, you can extract the first index label of your result to get a tuple representing the most common combination:

counts.index[0]  # (4, 5)
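
Since the question asks for 4 and 5 individually, here is a minimal sketch that unpacks the tuple. It reuses the df and counts from the question; the names top_a, top_b and top_count are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 3, 4, 4, 4], 'b': [2, 3, 4, 4, 5, 5, 5]})

# Count each (a, b) pair and sort so the most frequent pair comes first.
counts = df.groupby(['a', 'b']).size().sort_values(ascending=False)

# Each label of the MultiIndex is a tuple (a, b), so it can be unpacked.
top_a, top_b = counts.index[0]

# The matching count is the first value of the sorted Series.
top_count = counts.iloc[0]

print(top_a, top_b, top_count)  # 4 5 3
```

Indexing the tuple directly, as in counts.index[0][0] and counts.index[0][1], should work just as well if unpacking is not convenient.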

Answer 2

Use idxmax: whether or not the result is sorted, you can still find the index of the maximum value.

df.groupby(['a','b']).size().idxmax()
Out[15]: (4, 5)
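
One thing idxmax does not report is a tie: if several combinations share the highest count, it returns only the first one. The sketch below, using the question's df and the illustrative names sizes, best and tied, shows one possible way to list every combination that reaches the maximum.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 3, 4, 4, 4], 'b': [2, 3, 4, 4, 5, 5, 5]})

# Unsorted counts per (a, b) pair.
sizes = df.groupby(['a', 'b']).size()

# Label of the first maximum; a tuple because the index is a MultiIndex.
best = sizes.idxmax()

# Every label whose count equals the maximum, in case of ties.
tied = sizes[sizes == sizes.max()].index.tolist()

print(best)  # (4, 5)
print(tied)  # [(4, 5)]
```

For this data there is no tie, so tied holds the single pair that idxmax returns.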

Answer 3

The simplest approach is to use mode on the pandas DataFrame. It returns the most frequent value(s) in each column (or in each row with axis=1):

>>> df
   a  b
0  1  2
1  2  3
2  3  4
3  3  4
4  4  5
5  4  5
6  4  5

>>> df.mode()
   a  b
0  4  5
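
A caveat worth checking before relying on this: mode is computed for each column independently, so it matches the most common combination for the question's data, but that need not hold in general. The sketch below uses a hypothetical frame df2, which is not from the question, where the two results differ.

```python
import pandas as pd

# Hypothetical data: pairs are (1, 9) twice, and (2, 7), (2, 8), (2, 9) once each.
df2 = pd.DataFrame({'a': [1, 1, 2, 2, 2], 'b': [9, 9, 7, 8, 9]})

# Column-wise modes: 2 is most frequent in 'a', 9 is most frequent in 'b'.
column_modes = tuple(df2.mode().iloc[0])

# Most common (a, b) combination, counted as pairs.
top_pair = df2.groupby(['a', 'b']).size().idxmax()

print(column_modes)
print(top_pair)
```

Here the column-wise modes are 2 and 9, while the most frequent pair is (1, 9), so counting pairs with groupby is the safer choice when the combination itself matters.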

3 answers

Answer 1
2 2018-10-29 01:44:00

Answer 2
2 2018-10-29 01:53:05

Answer 3
1 2018-10-29 03:37:33