
Pandas dataframe column match and group by

I have two DataFrames, A and B, as below:

A = [1,2,3,2,1,3]
B = [1,3,3,1,1,3]

I want to compare A with B row by row and count the matches for each value of A. For example, there are 2 rows where both hold the value 1, 0 rows where both hold the value 2, and 2 rows where both hold the value 3.

I want output as below:
Value -> Count
1 -> 2
2 -> 0
3 -> 2

I have tried the following code, but it only displays True or False for each row.

print(A.isin(B))
Output:
True
False
True
False
True
True

I tried using .count() and .value_counts() but couldn't get the expected result. How can I do this?
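For context, isin tests whether each value of A appears anywhere in B, while eq (or ==) compares the two objects row by row. With the data in the question both happen to return the same booleans, so the sketch below uses small hypothetical Series (not from the question) where the two differ:

```python
import pandas as pd

# Hypothetical data, chosen so that membership and row-wise equality disagree
A = pd.Series([1, 2, 3])
B = pd.Series([3, 2, 1])

# isin: is each value of A present anywhere in B?
membership = A.isin(B)

# eq: does A hold the same value as B in the same row?
same_row = A.eq(B)

print(membership.tolist())  # [True, True, True]
print(same_row.tolist())    # [False, True, False]
```

Counting same-row matches therefore needs the row-wise comparison, not isin.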

Try boolean indexing with an == condition, then use value_counts, reindex and to_dict:

import pandas as pd

A = pd.DataFrame([1,2,3,2,1,3])
B = pd.DataFrame([1,3,3,1,1,3])

d = A[A == B][0].value_counts().reindex(A[0].unique(), fill_value=0).to_dict()
print(d)

[out]

{1: 2, 2: 0, 3: 2}
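To see what each step of that one-liner contributes, here is the same chain split into intermediate variables. This is only a sketch; the names mask, matched and counts are introduced for illustration and are not part of the answer.

```python
import pandas as pd

A = pd.DataFrame([1, 2, 3, 2, 1, 3])
B = pd.DataFrame([1, 3, 3, 1, 1, 3])

# Row-wise comparison: True only where A and B hold the same value in the same row
mask = A == B

# Boolean indexing keeps the matched values and turns every other cell into NaN
matched = A[mask][0]

# value_counts skips NaN by default, so only matched rows are counted
counts = matched.value_counts()

# reindex restores the values of A that never matched, with a count of 0
counts = counts.reindex(A[0].unique(), fill_value=0)

d = counts.to_dict()
print(d)  # {1: 2, 2: 0, 3: 2}
```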

Use:

import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 2, 1, 3]})
df2 = pd.DataFrame({'B': [1, 3, 3, 1, 1, 3]})

result = (
    df1.assign(Count=df1['A'].eq(df2['B']))
    .groupby('A')['Count'].sum().astype(int)
    .reset_index().rename(columns={'A': "Value"})
)

print(result)

After executing the code, the result is:

   Value  Count
0      1      2
1      2      0
2      3      2
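If the intermediate Count column is not needed, a similar result can be reached by grouping the boolean match Series directly by the values of A. This is a variant sketch rather than the answer's own code; the names matches and counts are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 2, 1, 3]})
df2 = pd.DataFrame({'B': [1, 3, 3, 1, 1, 3]})

# True where the two columns agree in the same row
matches = df1['A'].eq(df2['B'])

# Summing booleans per group counts the True values for each value of A
counts = matches.groupby(df1['A']).sum().astype(int)

print(counts.to_dict())  # {1: 2, 2: 0, 3: 2}
```

This returns a Series indexed by the values of A; calling reset_index on it would give a two-column frame like the one above.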

Data

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 2, 1, 3]})
df1 = pd.DataFrame({'B': [1, 3, 3, 1, 1, 3]})

Solve it using groupby and sort_values, then drop duplicates, keeping the highest count in each group:

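# Compare row by row with eq; isin would only test membership anywhere in B
df['count'] = df[df.A.eq(df1.B)].groupby('A')['A'].transform('count')
# Fill unmatched rows with 0 before sorting so that keep="last" picks the highest count
df2 = df.fillna(0).sort_values(by='count', ascending=True).drop_duplicates(subset='A', keep="last")
df2['count'] = df2['count'].astype(int)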

