简体   繁体   中英

how do I find count of unique combination in 2 columns of dataframe in pandas

df = pd.DataFrame({'col1': [1,2,4,3], 'col2': [2,1,3,4]})
   col1 col2
0   1     2
1   2     1
2   4     3
3   3     4

Desired outcome

  col1 col2 count
0   1     2     2
1   4     3     2

I tried

(df.groupby(['team1','team2']).size()
   .sort_values(ascending=False)
   .reset_index(name='Count')
)

but this is not giving me unique combination

IIUC, you can first compute afrozenset from your two columns, then use named aggregation :

# compute unordered grouper
group = df[['col1', 'col2']].agg(frozenset, axis=1)

# craft a dictionary of expected output
# first rows for the existing columns + new column for count
d = {c: (c, 'first') for c in df}
d.update({'count': ('col1', 'count')})
# {'col1': ('col1', 'first'),
#  'col2': ('col2', 'first'),
#  'count': ('col1', 'count')}

# perform the aggregation
df.groupby(group, as_index=False).agg(**d)

output:

   col1  col2  count
0     1     2      2
1     4     3      2

You do something like this also,

df.apply(set, axis=1).value_counts()

Output:

{1, 2}    2
{3, 4}    2
dtype: int64

Let us check

df[:] = np.sort(df.to_numpy(),axis=1)
df.value_counts()
Out[132]: 
col1  col2
1     2       2
3     4       2
dtype: int64

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM