
Counting mode occurrences for all columns in a dataframe

I have a dataframe that looks like the one below.

dataframe1 =

In  AA   BB  CC
0   10   1   0
1   11   2   3
2   10   6   0
3   9    1   0
4   10   3   1
5   1    2   0

I want to create a dataframe that gives the count of the mode for each column. For column AA the count is 3 (mode 10), and for column CC the count is 4 (mode 0). Column BB has two modes, 1 and 2, so for BB I want the sum of the counts of both modes: 2 + 2 = 4.

The final dataframe I want therefore looks like this:

Columns  Counts
AA        3
BB        4
CC        4

How can I do this?

Another, slightly more scalable solution using a list comprehension:

pd.concat([df.eq(x) for _, x in df.mode().iterrows()]).sum()

[out]

AA    3
BB    4
CC    4
dtype: int64
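
To make the one-liner easier to follow, here is a minimal sketch that rebuilds the sample data from the question and splits the computation into steps. It assumes that `In` is the index rather than a regular column; the names `modes` and `counts` are only illustrative.

```python
import pandas as pd

# Sample data from the question; 'In' is assumed to be the index here.
df = pd.DataFrame(
    {
        "AA": [10, 11, 10, 9, 10, 1],
        "BB": [1, 2, 6, 1, 3, 2],
        "CC": [0, 3, 0, 0, 1, 0],
    },
    index=pd.Index(range(6), name="In"),
)

# One row per mode; columns with fewer modes are padded with NaN.
modes = df.mode()
print(modes)

# Compare the whole frame against each row of modes, stack the resulting
# boolean frames, and sum the True values per column.
counts = pd.concat([df.eq(row) for _, row in modes.iterrows()]).sum()
print(counts)
```

The NaN padding is harmless here because a comparison with NaN is always False, so the padded rows add nothing to the sum.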

You can compare each column with its modes and count the matches with sum:

# Build the result under a new name so the original df is not overwritten.
result = pd.DataFrame({'Columns': df.columns,
                       'Val': [df[x].isin(df[x].mode()).sum() for x in df]})
print(result)
  Columns  Val
0      AA    3
1      BB    4
2      CC    4

First we get the modes of the columns with DataFrame.mode.

Then we compare each column to its modes with Series.isin, which marks every value that equals one of the modes, and sum the matches.

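# Skip the first column ('In') and get the modes of the remaining columns.
modes = df.iloc[:, 1:].mode()
# For each column, count the values that match any of its modes.
data = {col: df[col].isin(modes[col]).sum() for col in modes.columns}
result = pd.DataFrame.from_dict(data, orient='index', columns=['Counts'])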

    Counts
AA       3
BB       4
CC       4
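
As an illustration of what Series.isin does in this approach, the sketch below prints the boolean mask for column BB and then builds the result in the Columns/Counts shape asked for in the question. It assumes `In` is an ordinary first column; `value_cols`, `mask`, and `result` are illustrative names.

```python
import pandas as pd

# Sample data from the question, with 'In' as a regular column.
df = pd.DataFrame(
    {
        "In": range(6),
        "AA": [10, 11, 10, 9, 10, 1],
        "BB": [1, 2, 6, 1, 3, 2],
        "CC": [0, 3, 0, 0, 1, 0],
    }
)

# Every column except the first one ('In').
value_cols = df.columns[1:]

# BB has two modes (1 and 2); isin marks every row that equals either one.
mask = df["BB"].isin(df["BB"].mode())
print(mask.tolist())  # [True, True, False, True, False, True]

# Summing the mask counts the True values; repeat this for each column.
result = pd.DataFrame(
    {
        "Columns": value_cols,
        "Counts": [int(df[c].isin(df[c].mode()).sum()) for c in value_cols],
    }
)
print(result)
```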

This uses the pyjanitor module's groupby_agg method, which works like a groupby transform and returns a dataframe:

import janitor  # registers groupby_agg on DataFrame

(df.melt(id_vars='In')
 .groupby('variable')
 .agg(numbers=('value','value_counts'))
 .groupby_agg(by='variable',
              # subtract each group's max of 'numbers' from every
              # number in the group, so the modes end up at 0
              agg=lambda x: x - x.max(),
              agg_column_name='numbers',
              new_column_name='test'
             )
 .query('test==0')
 .groupby('variable')
 .agg(count=('numbers','sum'))
)

          count
variable    
AA          3
BB          4
CC          4
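
The same idea (count each value, keep only the counts equal to the per-column maximum, then sum them) can probably be written without pyjanitor. The following is a rough plain-pandas sketch and not the answerer's code; the helper name `mode_count` is hypothetical, and `In` is assumed to be a regular column.

```python
import pandas as pd

# Sample data from the question, with 'In' as a regular column.
df = pd.DataFrame(
    {
        "In": range(6),
        "AA": [10, 11, 10, 9, 10, 1],
        "BB": [1, 2, 6, 1, 3, 2],
        "CC": [0, 3, 0, 0, 1, 0],
    }
)


def mode_count(col: pd.Series) -> int:
    """Sum the frequencies of all values tied for most frequent."""
    freq = col.value_counts()
    # Keep only the frequencies equal to the maximum (the modes) and add them up.
    return int(freq[freq == freq.max()].sum())


# Apply the helper column by column, skipping 'In'.
counts = df.drop(columns="In").apply(mode_count)
print(counts)
```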
