I have the following dataframe (indexed by hourly timestamps):
                     relative_humidity                 condition  fid
2017-08-02 10:00:00               0.49  Chance of a Thunderstorm    1
2017-08-02 11:00:00               0.50  Chance of a Thunderstorm    1
2017-08-02 12:00:00               0.54             Partly Cloudy    1
2017-08-02 13:00:00               0.58             Partly Cloudy    2
2017-08-02 14:00:00               0.68             Partly Cloudy    2
How can I compute the condition that occurs most often each day and put it in a dataframe with the date as the index? The result also needs to be separated by fid.
I tried:
df.groupby(['fid', pd.Grouper(freq='D')])['condition']
You need value_counts with index[0]. value_counts sorts its result by count in descending order, so the first index label is the most frequent value:
# After reset_index() the unnamed date level becomes 'level_1'; rename it to 'date'
d = {'level_1': 'date'}
df1 = df.groupby(['fid', pd.Grouper(freq='D')])['condition'] \
        .apply(lambda x: x.value_counts().index[0]).reset_index().rename(columns=d)
print(df1)
   fid       date                 condition
0    1 2017-08-02  Chance of a Thunderstorm
1    2 2017-08-02             Partly Cloudy
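For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and applies the same value_counts idea. The helper name most_common and the use of rename_axis to name the date level are my own choices for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data reconstructed from the question
idx = pd.to_datetime([
    "2017-08-02 10:00:00",
    "2017-08-02 11:00:00",
    "2017-08-02 12:00:00",
    "2017-08-02 13:00:00",
    "2017-08-02 14:00:00",
])
df = pd.DataFrame(
    {
        "relative_humidity": [0.49, 0.50, 0.54, 0.58, 0.68],
        "condition": [
            "Chance of a Thunderstorm",
            "Chance of a Thunderstorm",
            "Partly Cloudy",
            "Partly Cloudy",
            "Partly Cloudy",
        ],
        "fid": [1, 1, 1, 2, 2],
    },
    index=idx,
)


def most_common(s):
    # Hypothetical helper: value_counts() sorts by count (descending),
    # so the first index label is the most frequent value.
    return s.value_counts().index[0]


df1 = (
    df.groupby(["fid", pd.Grouper(freq="D")])["condition"]
      .agg(most_common)
      .rename_axis(["fid", "date"])  # name the daily level instead of renaming 'level_1'
      .reset_index()
)
print(df1)
```

If the date should be the index, as the question asks, df1.set_index('date') can be applied afterwards.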
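Alternatively, count every (fid, day, condition) combination with size(), sort the counts in descending order, and keep the first row of each (fid, day) group. The sort is needed because size() orders its result by the group keys, so head(1) alone would return the alphabetically first condition rather than the most frequent one:
df.groupby(['fid', pd.Grouper(freq='D'), 'condition']).size() \
  .sort_values(ascending=False).groupby(level=[0, 1]).head(1).sort_index()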
Output:
fid condition
1 2017-08-02 Chance of a Thunderstorm 2
2 2017-08-02 Partly Cloudy 2
dtype: int64
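One thing to watch with the size() approach: the counts come back ordered by the group keys, not by count, so taking head(1) on the raw counts only gives the right answer when the most frequent condition also happens to sort first. A small sketch with made-up data (the "Clear" rows are hypothetical, chosen so the alphabetical and frequency orders disagree) shows the difference:

```python
import pandas as pd

# Hypothetical data: "Clear" sorts first alphabetically,
# but "Partly Cloudy" is the more frequent condition.
idx = pd.to_datetime([
    "2017-08-02 10:00",
    "2017-08-02 11:00",
    "2017-08-02 12:00",
])
df = pd.DataFrame(
    {
        "condition": ["Clear", "Partly Cloudy", "Partly Cloudy"],
        "fid": [1, 1, 1],
    },
    index=idx,
)

# Count each (fid, day, condition) combination; result is ordered by key
counts = df.groupby(["fid", pd.Grouper(freq="D"), "condition"]).size()

# First row per (fid, day) without sorting: picks by key order
unsorted_top = counts.groupby(level=[0, 1]).head(1)

# Sort by count first, then take the first row per (fid, day)
sorted_top = counts.sort_values(ascending=False).groupby(level=[0, 1]).head(1)

print(unsorted_top)
print(sorted_top)
```

Here unsorted_top keeps the "Clear" row while sorted_top keeps "Partly Cloudy", which is the one the question is after.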