
How to compare two dataframes in pandas?

I have a dataframe like this:

import pandas as pd

df = pd.DataFrame([[1,'aaa',50],[0,'aaa',1000],[0,'aba',30],[1,'aaa',50],[1,'aba',10]],
                  columns=['A','B','C'])
df



    A   B   C
0   1   aaa 50
1   0   aaa 1000
2   0   aba 30
3   1   aaa 50
4   1   aba 10

For each item in 'B' (items can repeat), I want to check its value in 'A'. For the rows where 'A' is 1, it should calculate the sum of the values in 'C' for that item. For the rows where 'A' is 0, it should count them. The final result is then sum / count.

In the end, I want to show the result like this:

    ID  Value
0   aaa 100
1   aba 10

For example, 'aaa' has two rows with A = 1, whose C values sum to 50 + 50 = 100, and one row with A = 0, so its count is 1. The result is 100 / 1 = 100.

How can I do something like that in an efficient way? I tried to use groupby() and have the sum and count in different dataframes, but I don't know how to compare them and get this result.

Try a groupby aggregate on columns A and B, taking the sum and the size of the C column. Then divide the A == 1 'sum' by the A == 0 'count':

new_df = df.groupby(['A', 'B']).aggregate(sum=('C', 'sum'), count=('C', 'size'))
new_df = (new_df.loc[1, 'sum'] / new_df.loc[0, 'count']).reset_index()
new_df.columns = ['ID', 'Value']  # Rename Columns

new_df :

    ID  Value
0  aaa  100.0
1  aba   10.0

*Beware of missing groups. If a B value has no rows with A == 0, there is no count to divide by, and the aligned division returns NaN for that item.
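A minimal sketch of that edge case, assuming you simply want to drop items that have no A == 0 rows. The extra 'abc' row is made up for illustration and is not part of the question's data:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 'aaa', 50], [0, 'aaa', 1000], [0, 'aba', 30],
     [1, 'aaa', 50], [1, 'aba', 10],
     [1, 'abc', 70]],  # hypothetical item with no A == 0 row
    columns=['A', 'B', 'C'],
)

agg = df.groupby(['A', 'B']).aggregate(sum=('C', 'sum'), count=('C', 'size'))

# Select each A group from the MultiIndex; both results are indexed by B.
sum_a1 = agg['sum'].loc[1]
count_a0 = agg['count'].loc[0]

# Division aligns on B; 'abc' has no count, so its ratio is NaN.
ratio = sum_a1 / count_a0

# Assumption: items without a count are dropped rather than filled.
result = ratio.dropna().rename_axis('ID').reset_index(name='Value')
print(result)
```

Whether to drop such items, keep them as NaN, or fill them with a default depends on what the missing count should mean in your data.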

In [90]: df[df['A'] == 1].groupby('B')['C'].sum() /  df[df['A'] == 0].groupby('B').size()
Out[90]:
B
aaa    100.0
aba     10.0
dtype: float64

This divides correctly because both Series are indexed by column 'B' after the grouping, so pandas aligns them by label.
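To illustrate the alignment, here is a small sketch with two hand-built Series whose labels are in different orders. The numbers are made up and are not the question's data:

```python
import pandas as pd

# Hypothetical per-item sums and counts, deliberately in different orders.
sums = pd.Series({'aba': 10, 'aaa': 100})
counts = pd.Series({'aaa': 4, 'aba': 2})

# Division matches values by index label, not by position.
ratio = sums / counts

print(ratio['aaa'])  # 100 / 4
print(ratio['aba'])  # 10 / 2
```

Because the match is by label, the order in which groupby returns the groups does not affect the result.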

You can do a groupby and select the right group:

import pandas as pd

df_grouped = df.groupby(['A', 'B']).sum().loc[1]

B      C
aaa  100
aba   10
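Note that this returns only the sum for A == 1. It matches the expected output here because each item has exactly one row with A == 0. A sketch of the same selection extended with the count, using one made-up extra row so the two results differ:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 'aaa', 50], [0, 'aaa', 1000], [0, 'aba', 30],
     [1, 'aaa', 50], [1, 'aba', 10],
     [0, 'aaa', 5]],  # hypothetical second A == 0 row for 'aaa'
    columns=['A', 'B', 'C'],
)

grouped = df.groupby(['A', 'B'])

# Sum of C where A == 1, indexed by B.
sums = grouped['C'].sum().loc[1]

# Number of rows where A == 0, indexed by B.
counts = grouped.size().loc[0]

# Both Series share the B index, so the division lines up per item.
value = sums / counts
print(value)
```

With the extra row, 'aaa' has a count of 2, so its value is 100 / 2 rather than the bare sum of 100.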


 