I have a Pandas DataFrame that contains some values, and I want to sum those values according to the date column.
The DataFrame looks like the following:
When I run pandas.DataFrame.groupby(['date']).sum(), I get:
As you can see, this isn't the result that I want, because I want all of the columns summed up, not just polarity and subjectivity.
Does anybody know why it's only summing up these two, and how I might get the desired result?
Thank you.
We need numeric columns to be able to do calculations on them, in this case sum:
# Example DataFrame: replies_count holds strings, polarity holds integers
import pandas as pd

df = pd.DataFrame({'date': ['2019-01-04', '2019-01-04', '2019-01-03', '2018-12-22', '2018-08-31'],
                   'replies_count': ['46', '143', '64', '154', '50'],
                   'polarity': [10, 20, 30, 40, 50]})
print(df)
         date  replies_count  polarity
0  2019-01-04             46        10
1  2019-01-04            143        20
2  2019-01-03             64        30
3  2018-12-22            154        40
4  2018-08-31             50        50
Check the types of the columns:
print(df.dtypes)
date             object
replies_count    object
polarity          int64
dtype: object
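On a wider DataFrame, reading the dtypes by eye gets tedious. A minimal sketch, assuming the same example data as above, that lists only the non-numeric columns with select_dtypes (the variable name non_numeric is just for illustration):

```python
import pandas as pd

df = pd.DataFrame({'date': ['2019-01-04', '2019-01-04', '2019-01-03', '2018-12-22', '2018-08-31'],
                   'replies_count': ['46', '143', '64', '154', '50'],
                   'polarity': [10, 20, 30, 40, 50]})

# Keep only the columns that are NOT numeric; these are the candidates
# that a numeric sum would leave out.
non_numeric = df.select_dtypes(exclude='number').columns.tolist()
print(non_numeric)
```

The date column shows up in this list too, which is fine here because it is the grouping key rather than a column we want to sum.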
Apply groupby with sum:
print(df.groupby('date').sum())
            polarity
date
2018-08-31        50
2018-12-22        40
2019-01-03        30
2019-01-04        30
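One caveat: whether non-numeric columns are dropped silently depends on the pandas version. As far as I know, from pandas 2.0 on sum no longer drops them by default and may concatenate the strings instead. A sketch, assuming the same example data, that asks for the numeric-only behaviour explicitly:

```python
import pandas as pd

df = pd.DataFrame({'date': ['2019-01-04', '2019-01-04', '2019-01-03', '2018-12-22', '2018-08-31'],
                   'replies_count': ['46', '143', '64', '154', '50'],
                   'polarity': [10, 20, 30, 40, 50]})

# numeric_only=True restricts the sum to numeric columns, so the
# string column replies_count is excluded from the result.
numeric_sums = df.groupby('date').sum(numeric_only=True)
print(numeric_sums)
```

Being explicit here makes the result independent of the default, but it still leaves replies_count out, so the real fix is the type conversion.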
Now change the type of the replies_count column to int and do the same groupby with sum:
df['replies_count'] = df['replies_count'].astype(int)
print(df.groupby('date').sum())
            replies_count  polarity
date
2018-08-31             50        50
2018-12-22            154        40
2019-01-03             64        30
2019-01-04            189        30
As we can see, the replies_count column is included now.
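Note that astype(int) raises an error if any value cannot be parsed as an integer. If the real data might contain such entries, pd.to_numeric with errors='coerce' is one alternative. A minimal sketch; the 'n/a' value below is made up for illustration and does not come from the question's data:

```python
import pandas as pd

# Hypothetical data: one replies_count entry is not a valid number
df = pd.DataFrame({'date': ['2019-01-04', '2019-01-04', '2019-01-03'],
                   'replies_count': ['46', 'n/a', '64'],
                   'polarity': [10, 20, 30]})

# Unparseable strings become NaN instead of raising an error
df['replies_count'] = pd.to_numeric(df['replies_count'], errors='coerce')

# sum skips NaN by default, so the 'n/a' row adds nothing to replies_count
result = df.groupby('date').sum()
print(result)
```

The trade-off is that bad values are silently ignored in the sum, and the column becomes float because of the NaN, so it is worth checking how many values were coerced before relying on the totals.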