
How to see the highest value between two columns of a dataframe and how many values in one column are greater than the other?

I need to compare two columns (a, b) of a pandas DataFrame to see how many values of "a" are greater than the corresponding values of "b".

I tried the following, but I don't know whether it's the best option:

def result(y, z):
    # True when y is greater than z, otherwise False
    return y > z

df_filtered.apply(lambda y: result(y['a'],y['b']), axis = 1)

This gives me a Series of True and False values, but I need the count of each.

You can check value_counts:

df['a'].gt(df['b']).value_counts()
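As a rough sketch of what that call returns, here it is on a small made-up frame (the values are hypothetical and chosen only for illustration, including one tie to show that equal values count as False):

```python
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({'a': [5, 3, 8, 1, 9], 'b': [2, 7, 8, 0, 4]})

# gt() compares element-wise and returns a boolean Series;
# value_counts() then tallies how many True and False values it holds
counts = df['a'].gt(df['b']).value_counts()
print(counts)
```

The result is a Series indexed by True and False, so both counts come back from a single call.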

You need:

(df['a'] > df['b']).sum()

Consider the following example:

import pandas as pd

df = pd.DataFrame({
    'a': [10, 20, 30, 40],
    'b': [1, 200, 300, 4]
})

Output:

    a   b
0   10  1
1   20  200
2   30  300
3   40  4

Then:

(df['a'] > df['b']).sum()

Output:

2
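This works because True counts as 1 when a boolean Series is summed. If the count of the remaining rows or the proportion is also needed, the same mask can be reused. A minimal sketch on the frame above (the variable names are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'a': [10, 20, 30, 40],
    'b': [1, 200, 300, 4]
})

mask = df['a'] > df['b']             # boolean Series, one entry per row

n_greater = int(mask.sum())          # rows where a > b
n_not_greater = int((~mask).sum())   # rows where a <= b
share_greater = float(mask.mean())   # fraction of rows where a > b

print(n_greater, n_not_greater, share_greater)
```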

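Your approach works; just append value_counts(). Make sure result returns False rather than None when "a" is not greater, because value_counts() skips missing values by default: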

df_filtered.apply(lambda y: result(y['a'],y['b']), axis = 1).value_counts()

Better yet, since the function is trivial, you can drop it and write the comparison inline:

df.apply(lambda x: x['a']>x['b'], axis=1).value_counts()
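The title also asks for the highest value between the two columns. One possible reading is the larger of "a" and "b" in each row, and another is the single largest value across both columns. A hedged sketch of both, reusing the sample values from the example earlier in the thread:

```python
import pandas as pd

df = pd.DataFrame({
    'a': [10, 20, 30, 40],
    'b': [1, 200, 300, 4]
})

# Larger of a and b in each row
row_max = df[['a', 'b']].max(axis=1)

# Single largest value across both columns
overall_max = int(df[['a', 'b']].max().max())

print(row_max.tolist(), overall_max)
```

Both calls are vectorized, so they avoid the per-row Python overhead that apply with axis=1 incurs.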
