I have recently started using dataframes in Python and I don't know how to do the following exercise.
I have two dataframes, both with the same columns (a `Type` column and a `Count` column), like this:
`main_df`:
Index | Type | Count |
---|---|---|
0 | Album | 12 |
1 | Book | 4 |
2 | Person | 3 |
`df2`:
Index | Type | Count |
---|---|---|
0 | Album | 9 |
1 | Person | 4 |
2 | Film | 4 |
The same `Type` value can have a different `Index` value in each dataframe, as you can see with `Type = Person` (`Index = 2` in `main_df` and `Index = 1` in `df2`).
I want to have all the data in `main_df`. In this case the result will be:
`main_df`:
Index | Type | Count |
---|---|---|
0 | Album | 21 |
1 | Book | 4 |
2 | Person | 7 |
3 | Film | 4 |
If a `Type` value in `df2` is already in `main_df`, simply add the corresponding `Count` value from `df2` to it. If a `Type` value in `df2` is not in `main_df`, add that row (`Type` and `Count` values) at the end of `main_df`.
Hope you can help me with this. Thanks in advance.
You can try `pd.concat` and then use `groupby`:
main_df = pd.concat([main_df, df2]).groupby('Type', sort=False).agg({'Count': 'sum'}).reset_index()
Type Count
0 Album 21
1 Book 4
2 Person 7
3 Film 4
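For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two frames from the question and applies the same concat-then-groupby idea. The frame construction and the `merged` name are my own assumptions, and `Index` is treated as the default DataFrame index rather than a column.

```python
import pandas as pd

# Sample data copied from the question; "Index" is assumed to be the
# default RangeIndex, not a real column.
main_df = pd.DataFrame({"Type": ["Album", "Book", "Person"], "Count": [12, 4, 3]})
df2 = pd.DataFrame({"Type": ["Album", "Person", "Film"], "Count": [9, 4, 4]})

# Stack both frames, then sum Count per Type.
# sort=False keeps the order in which each Type first appears,
# so new types from df2 end up after the rows of main_df.
merged = (
    pd.concat([main_df, df2], ignore_index=True)
    .groupby("Type", sort=False)
    .agg({"Count": "sum"})
    .reset_index()
)

print(merged)
```

Because of `sort=False`, `Film` lands after the rows that were already in `main_df`, which matches the "add at the end" requirement.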
Try `pd.concat()`, `groupby()` and `sum()` (`DataFrame.append()` did the same job here, but it was removed in pandas 2.0):
out = (pd.concat([main_df, df2])
       .groupby('Type', as_index=False, sort=False)
       .sum())
Output of `out` (here `Index` is a regular column rather than the DataFrame index, so it gets summed as well):
Type Index Count
0 Album 0 21
1 Book 1 4
2 Person 3 7
3 Film 2 4
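If `Index` really is an ordinary column, as the output above suggests, summing it is probably not what you want. One possible way around that, sketched here with `pd.concat` and with frames I reconstructed from the question (so the construction itself is an assumption), is to select only `Count` before calling `sum()`:

```python
import pandas as pd

# Hypothetical reconstruction: "Index" stored as a regular column.
main_df = pd.DataFrame(
    {"Index": [0, 1, 2], "Type": ["Album", "Book", "Person"], "Count": [12, 4, 3]}
)
df2 = pd.DataFrame(
    {"Index": [0, 1, 2], "Type": ["Album", "Person", "Film"], "Count": [9, 4, 4]}
)

# Selecting ["Count"] after groupby restricts the aggregation to that
# column, so the "Index" column is dropped instead of being summed.
out = (
    pd.concat([main_df, df2], ignore_index=True)
    .groupby("Type", as_index=False, sort=False)["Count"]
    .sum()
)

print(out)
```

The result gets a fresh 0..n-1 index, which lines up with the `Index` values shown in the expected result of the question.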