Create new column based on number of rows matching value in another dataframe

Question

I want to add one new column per year to df1, where each value is the number of rows in df2 in which that fruit appears for that year.

Expected Output of df1

No  | Fruit_Name | 2018 | 2019 | 2020 
1   | Apple      |  2   |   1  | 0
2   | Banana     |  0   |   0  | 1
3   | Cherries   |  0   |   0  | 1

     df1                                       df2
No | Fruit_Name |                year   | farmer | fruit_farmed
1  | Apple      |                2018   | John   |   Apple
2  | Banana     |                2019   | Timo   |   Apple
3  | Cherries   |                2020   | Eva    |   Cherries
                                 2020   | Frey   |   Banana
                                 2018   | Ali    |   Apple

The code that doesn't work:

i=0
for i in range(3):
    df1['2018'] = len(df2.loc[df2['fruit_farmed'] == df1['Fruit_Name'][i]])
    df1['2019'] = len(df2.loc[df2['fruit_farmed'] == df1['Fruit_Name'][i]])
    df1['2020'] = len(df2.loc[df2['fruit_farmed'] == df1['Fruit_Name'][i]])
    i=i+1

Output:
    No  Fruit_Name  2018    2019    2020
0   1      Apple     1        1      1
1   2      Banana    1        1      1
2   3     Cherries   1        1      1

Answer 1

You can build the fruit-by-year counts with pd.crosstab, reorder them to match df1, and then join them onto df1:

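# count the rows in df2 for each (fruit, year) pair
s = pd.crosstab(df2.fruit_farmed, df2.year)
# put the counts in df1's fruit order; fruits missing from df2 get 0
s = s.reindex(df1.Fruit_Name, fill_value=0)
# reuse df1's index so the join lines up row by row
s.index = df1.index
df1 = df1.join(s)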
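
For reference, here is a self-contained sketch of the crosstab approach using the sample data from the question. The variable names counts and result are only illustrative, and converting the year labels to strings is an assumption made so the new columns are named '2018', '2019', '2020' as in the expected output.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"No": [1, 2, 3],
                    "Fruit_Name": ["Apple", "Banana", "Cherries"]})
df2 = pd.DataFrame({
    "year": [2018, 2019, 2020, 2020, 2018],
    "farmer": ["John", "Timo", "Eva", "Frey", "Ali"],
    "fruit_farmed": ["Apple", "Apple", "Cherries", "Banana", "Apple"],
})

# Rows = fruit, columns = year, values = number of matching rows in df2
counts = pd.crosstab(df2["fruit_farmed"], df2["year"])

# Follow df1's fruit order; a fruit that never appears in df2 gets 0
counts = counts.reindex(df1["Fruit_Name"], fill_value=0)

# Align on df1's index and use string column labels (assumption, see above)
counts.index = df1.index
counts.columns = counts.columns.astype(str)

result = df1.join(counts)
print(result)
```

With the question's data, the Apple row should come out as 2, 1, 0 and the Banana and Cherries rows as 0, 0, 1.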

Answer 2

Another way is to group df2 by fruit_farmed and year, count the rows in each group, and then unstack year into columns.

import pandas as pd
df2 = pd.DataFrame([[2018,'John','Apple'],[2019,'Timo','Apple'],
                   [2020,'Eva','Cherries'],[2020,'Frey','Banana'],
                   [2018,'Ali','Apple']],
                   columns=['year','farmer','fruit_farmed'])

# count rows per (fruit, year), pivot year into columns, fill missing pairs with 0
df1 = df2.groupby(['fruit_farmed','year']).count().unstack('year').reset_index().fillna(0)

# rename the columns
df1.columns = ['fruit_farmed','2018','2019','2020']
print(df1)

  fruit_farmed  2018  2019  2020
0        Apple   2.0   1.0   0.0
1       Banana   0.0   0.0   1.0
2     Cherries   0.0   0.0   1.0
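
The groupby result above replaces df1 and holds floats. If you want to keep df1 (with its No column) and get integer counts, one possible variant is sketched below. It swaps count() for size() with unstack(fill_value=0) and then merges onto df1. The names counts, year_cols and merged are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"No": [1, 2, 3],
                    "Fruit_Name": ["Apple", "Banana", "Cherries"]})
df2 = pd.DataFrame([[2018, "John", "Apple"], [2019, "Timo", "Apple"],
                    [2020, "Eva", "Cherries"], [2020, "Frey", "Banana"],
                    [2018, "Ali", "Apple"]],
                   columns=["year", "farmer", "fruit_farmed"])

# size() counts rows per group; fill_value=0 keeps the result as integers
counts = (
    df2.groupby(["fruit_farmed", "year"])
       .size()
       .unstack("year", fill_value=0)
)
counts.columns = counts.columns.astype(str)
year_cols = list(counts.columns)

# Left merge keeps every row of df1, in df1's order
merged = (
    df1.merge(counts.reset_index(), how="left",
              left_on="Fruit_Name", right_on="fruit_farmed")
       .drop(columns="fruit_farmed")
)

# A fruit in df1 that is absent from df2 would be NaN after the merge
merged[year_cols] = merged[year_cols].fillna(0).astype(int)
print(merged)
```

The left merge is a design choice: it guarantees df1 keeps all its rows even when a fruit has no entries in df2.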

2 answers

solution1
2 ACCEPTED 2020-08-21 14:53:04

solution2
0 2020-08-21 15:31:56