I am trying to add a column to a pandas DataFrame that inserts Morning, Evening, or Afternoon, based on the time slots that I choose.
The code I am trying is as follows:
df_agg['timeOfDay'] = df_agg.apply(lambda _: '', axis=1)
for i in range(len(df_agg)):
    if df_agg['time_stamp'].iloc[i][0].hour < 12:
        df_agg['timeOfDay'].iloc[i] = 'Morning'
    elif df_agg['time_stamp'].iloc[i][0].hour < 17 & df_agg['time_stamp'].iloc[i][0].hour > 12:
        df_agg['timeOfDay'].iloc[i] = 'Afternoon'
    else:
        df_agg['timeOfDay'].iloc[i] = 'Evening'
When I return df_agg, the timeOfDay column is empty. Does anyone know what I am doing wrong when trying to insert these values into each row based on the time of day?
pandas
Use pd.cut to split the hours into bins and give each bin a label. This method also makes it trivial to create more granular time slots.
df_agg.assign(
    timeOfDay=pd.cut(
        df_agg.time_stamp.dt.hour,
        [-1, 12, 17, 24],
        labels=['Morning', 'Afternoon', 'Evening']))
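To see exactly where the bin edges fall, here is a self-contained sketch of the pd.cut approach. The sample frame mirrors the setup used in this answer, and the name `labeled` is introduced here only for illustration. Note that pd.cut uses right-closed intervals by default, so hour 12 lands in Morning and hour 17 lands in Afternoon.

```python
import pandas as pd

# Sample data: 12 hourly timestamps starting at 10:00.
rng = pd.date_range('2015-02-24 10:00:00', periods=12, freq='1h')
df_agg = pd.DataFrame({'time_stamp': rng, 'a': range(len(rng))})

# Bins are right-closed by default: (-1, 12], (12, 17], (17, 24].
labeled = df_agg.assign(
    timeOfDay=pd.cut(
        df_agg.time_stamp.dt.hour,
        bins=[-1, 12, 17, 24],
        labels=['Morning', 'Afternoon', 'Evening']))

print(labeled[['time_stamp', 'timeOfDay']])
```

The resulting column is categorical; if plain strings are needed, `labeled['timeOfDay'].astype(str)` should convert it.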
numpy
Using searchsorted:
hours = df_agg.time_stamp.dt.hour.values
times = np.array(['Morning', 'Afternoon', 'Evening'])
df_agg.assign(timeOfDay=times[np.array([12, 17]).searchsorted(hours)])
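A self-contained sketch of the searchsorted approach on the same sample data may make the indexing step clearer. For each hour, searchsorted returns the position at which it would be inserted into the sorted boundary array [12, 17], and that position is then used to index the label array. The names `boundaries`, `idx` and `result` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data: 12 hourly timestamps starting at 10:00.
rng = pd.date_range('2015-02-24 10:00:00', periods=12, freq='1h')
df_agg = pd.DataFrame({'time_stamp': rng, 'a': range(len(rng))})

hours = df_agg.time_stamp.dt.hour.values
times = np.array(['Morning', 'Afternoon', 'Evening'])
boundaries = np.array([12, 17])

# With the default side='left':
#   hour <= 12 -> 0, 12 < hour <= 17 -> 1, hour > 17 -> 2
idx = boundaries.searchsorted(hours)

# Use the positions to pick one label per row.
result = df_agg.assign(timeOfDay=times[idx])
print(result)
```

Passing `side='right'` to searchsorted would instead push hours 12 and 17 into the later slot, which is one way to adjust the boundary behaviour.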
Both approaches assign the same labels.
time test
[Timing figures: small data set, large data set]
The large data set was built as follows:
start = pd.to_datetime('2015-02-24 10:00:00')
rng = pd.date_range(start, periods=10000, freq='1h')
df_agg = pd.DataFrame({'time_stamp': rng, 'a': range(len(rng))})
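One way to run such a comparison yourself is the standard-library timeit module. This is only a rough sketch: the helper names `use_cut` and `use_searchsorted` are hypothetical wrappers around the two snippets above, and the numbers will vary with the machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd

# Large data set: 10000 hourly timestamps.
rng = pd.date_range('2015-02-24 10:00:00', periods=10000, freq='1h')
df_agg = pd.DataFrame({'time_stamp': rng, 'a': range(len(rng))})


def use_cut(df):
    # pandas version: bin the hours with pd.cut.
    return df.assign(
        timeOfDay=pd.cut(
            df.time_stamp.dt.hour,
            [-1, 12, 17, 24],
            labels=['Morning', 'Afternoon', 'Evening']))


def use_searchsorted(df):
    # numpy version: look up each hour's slot with searchsorted.
    hours = df.time_stamp.dt.hour.values
    times = np.array(['Morning', 'Afternoon', 'Evening'])
    return df.assign(timeOfDay=times[np.array([12, 17]).searchsorted(hours)])


runs = 20
for fn in (use_cut, use_searchsorted):
    seconds = timeit.timeit(lambda: fn(df_agg), number=runs)
    print(f'{fn.__name__}: {seconds / runs * 1e3:.3f} ms per call')
```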
setup
The df_agg setup is borrowed from @jezrael's answer:
import numpy as np  # needed for the searchsorted version
import pandas as pd

start = pd.to_datetime('2015-02-24 10:00:00')
rng = pd.date_range(start, periods=12, freq='1h')
df_agg = pd.DataFrame({'time_stamp': rng, 'a': range(len(rng))})
print(df_agg)
I think you can use a nested numpy.where. Please check whether you need to change < to <= or > to >= for your slot boundaries:
import numpy as np
import pandas as pd

start = pd.to_datetime('2015-02-24 10:00:00')
rng = pd.date_range(start, periods=12, freq='1h')
df_agg = pd.DataFrame({'time_stamp': rng, 'a': range(12)})
print(df_agg)
     a          time_stamp
0    0 2015-02-24 10:00:00
1    1 2015-02-24 11:00:00
2    2 2015-02-24 12:00:00
3    3 2015-02-24 13:00:00
4    4 2015-02-24 14:00:00
5    5 2015-02-24 15:00:00
6    6 2015-02-24 16:00:00
7    7 2015-02-24 17:00:00
8    8 2015-02-24 18:00:00
9    9 2015-02-24 19:00:00
10  10 2015-02-24 20:00:00
11  11 2015-02-24 21:00:00
hours = df_agg.time_stamp.dt.hour.values
# Morning up to and including 12, Evening from 17 onwards, Afternoon otherwise.
df_agg['timeOfDay'] = np.where(hours <= 12, 'Morning',
                               np.where(hours >= 17, 'Evening', 'Afternoon'))
print(df_agg)
     a          time_stamp  timeOfDay
0    0 2015-02-24 10:00:00    Morning
1    1 2015-02-24 11:00:00    Morning
2    2 2015-02-24 12:00:00    Morning
3    3 2015-02-24 13:00:00  Afternoon
4    4 2015-02-24 14:00:00  Afternoon
5    5 2015-02-24 15:00:00  Afternoon
6    6 2015-02-24 16:00:00  Afternoon
7    7 2015-02-24 17:00:00    Evening
8    8 2015-02-24 18:00:00    Evening
9    9 2015-02-24 19:00:00    Evening
10  10 2015-02-24 20:00:00    Evening
11  11 2015-02-24 21:00:00    Evening
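If more than three slots are needed, nested np.where calls get hard to read. One alternative, which is not part of the answer above, is np.select: it takes a list of conditions and a matching list of choices, plus a default for rows that match none. The sketch below reproduces the same boundaries as the np.where version; the names `conditions` and `choices` are just illustrative. Each comparison is wrapped in parentheses because & binds more tightly than < and >, which is one of the problems with the elif line in the question.

```python
import numpy as np
import pandas as pd

# Sample data: 12 hourly timestamps starting at 10:00.
rng = pd.date_range('2015-02-24 10:00:00', periods=12, freq='1h')
df_agg = pd.DataFrame({'time_stamp': rng, 'a': range(len(rng))})

hours = df_agg.time_stamp.dt.hour.values

# Conditions are checked in order; the first one that matches wins.
# Parenthesize each comparison: & has higher precedence than < and >.
conditions = [
    hours <= 12,
    (hours > 12) & (hours < 17),
]
choices = ['Morning', 'Afternoon']

# Rows matching no condition (hour >= 17) fall through to the default.
df_agg['timeOfDay'] = np.select(conditions, choices, default='Evening')
print(df_agg)
```

Assigning the whole column in one step also avoids the row-by-row `df_agg['timeOfDay'].iloc[i] = ...` pattern from the question, which is chained assignment and is not guaranteed to write back to the original frame.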