Compare elements in dataframe columns for each row - Python

Question

I have a really huge dataframe (thousands of rows), but let's assume it looks like this:

   A  B  C  D  E  F
0  2  5  2  2  2  2
1  5  2  5  5  5  5
2  5  2  5  2  5  5
3  2  2  2  2  2  2
4  5  5  5  5  5  5

I need to find which value appears most frequently within a group of columns for each row, for instance the most frequent value in columns A, B, C and in columns D, E, F, and put the results in new columns. In this example I expect one new column per group (ABC and DEF), each holding that group's most frequent value for the row.

How can I do it in Python??? Thanks!!

Answer 1

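Here is one way, grouping the columns with groupby:

# Map each column to the group it belongs to.
mapperd={'A':'ABC','B':'ABC','C':'ABC','D':'DEF','E':'DEF','F':'DEF'}
# groupby(..., axis=1) is deprecated since pandas 2.1, so group the transposed frame and transpose back.
df.T.groupby(mapperd).agg(lambda x : x.mode()[0]).T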
   ABC  DEF
0    2    2
1    5    5
2    5    5
3    2    2
4    5    5
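
Since the question asks for the result in extra columns, here is a minimal, self-contained sketch of the same idea that joins the per-group modes back onto the original frame. The frame is rebuilt from the question's sample; the names `mapper`, `modes` and `result` are only illustrative, and the transposed groupby is an assumption made to avoid `axis=1` on recent pandas versions.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "A": [2, 5, 5, 2, 5],
    "B": [5, 2, 2, 2, 5],
    "C": [2, 5, 5, 2, 5],
    "D": [2, 5, 2, 2, 5],
    "E": [2, 5, 5, 2, 5],
    "F": [2, 5, 5, 2, 5],
})

# Column label -> group label (illustrative name).
mapper = {"A": "ABC", "B": "ABC", "C": "ABC",
          "D": "DEF", "E": "DEF", "F": "DEF"}

# Group the transposed frame by the mapping, take the first mode of each
# original row within each group, then transpose back to one row per record.
modes = df.T.groupby(mapper).agg(lambda x: x.mode()[0]).T

# Attach the per-group modes as new columns next to the original data.
result = df.join(modes)
print(result)
```

Note that `x.mode()[0]` picks the smallest value when several values are tied for most frequent, because `Series.mode` returns its modes in sorted order.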

Answer 2

For good performance, you can work with the underlying NumPy array and use scipy.stats.mode to compute the mode:

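import pandas as pd
from scipy import stats
cols = ['ABC','DEF']
# Stack each group of 3 contiguous columns as its own row (assumes two equal-width groups).
a = df.values.reshape(-1, df.shape[1]//2)
# Take the mode of every stacked row, then fold back to one row per original row.
pd.DataFrame(stats.mode(a, axis=1).mode.reshape(-1,2), columns=cols)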

   ABC  DEF
0    2    2
1    5    5
2    5    5
3    2    2
4    5    5
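
The reshape trick above relies on the groups being contiguous and of equal width. As a rough alternative under looser assumptions, the sketch below computes a row-wise mode with NumPy only, by counting each distinct value per row, and applies it to explicitly sliced column groups. The helper `rowwise_mode` and the `groups` mapping are hypothetical names, not part of any library, and the approach is only sensible when the data holds few distinct values, since it builds a rows x columns x values boolean array.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "A": [2, 5, 5, 2, 5],
    "B": [5, 2, 2, 2, 5],
    "C": [2, 5, 5, 2, 5],
    "D": [2, 5, 2, 2, 5],
    "E": [2, 5, 5, 2, 5],
    "F": [2, 5, 5, 2, 5],
})

def rowwise_mode(a):
    """Most frequent value in each row of a 2-D array (hypothetical helper).

    Ties go to the smallest value, because np.unique sorts the candidates
    and argmax returns the first maximum.
    """
    candidates = np.unique(a)
    # counts[i, k] = number of times candidates[k] occurs in row i
    counts = (a[:, :, None] == candidates).sum(axis=1)
    return candidates[counts.argmax(axis=1)]

# Group label -> positional slice of columns; widths may differ.
groups = {"ABC": slice(0, 3), "DEF": slice(3, 6)}

a = df.to_numpy()
out = pd.DataFrame(
    {name: rowwise_mode(a[:, cols]) for name, cols in groups.items()},
    index=df.index,
)
print(out)
```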

Answer 3

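You can try selecting each group's columns by their header labels:

grp = ['ABC','DEF']
# [*g] splits a group name such as 'ABC' into the column labels ['A', 'B', 'C'].
# set_axis no longer accepts inplace (removed in pandas 2.0); it returns a new object by default.
pd.concat([df.loc[:,[*g]].mode(axis=1).set_axis([g], axis=1) for g in grp], axis=1)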

Output:

   ABC  DEF
0    2    2
1    5    5
2    5    5
3    2    2
4    5    5
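
One caveat with the approach above: `DataFrame.mode(axis=1)` returns one column per tied mode, so relabelling it with a single name only works when every row has a unique most frequent value, as in the sample data. The sketch below, on a small made-up frame called `tied`, shows what happens with ties and one possible way to keep just the first mode.

```python
import pandas as pd

# Made-up data: row 0 has a clear mode (1), row 1 has three tied values.
tied = pd.DataFrame({"A": [1, 1], "B": [1, 2], "C": [2, 3]})

# With ties, mode(axis=1) yields several columns, padded with NaN.
modes = tied.mode(axis=1)
print(modes)

# Keeping column 0 selects the smallest of the tied modes for each row.
first = modes[0].rename("ABC")
print(first)
```

Whether taking the smallest tied value is appropriate depends on the data; it is simply what selecting column 0 gives.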

3 answers

solution1
8 2019-04-30 17:45:03

solution2
4 2019-04-30 17:46:23

solution3
3 2019-04-30 17:58:30
