I am trying to create a binary column in Python based on whether a given date in the gameDate column falls within a set of date ranges.
Here is what I tried:
df5['pre_season'] = df5.gameDate.apply(lambda x: 1 if pd.to_datetime('2021-02-28') <= x <= pd.to_datetime('2021-03-30'):
elif pd.to_datetime('2017-02-22') <= x <= pd.to_datetime('2017-04-01'):
elif pd.to_datetime('2018-02-21') <= x <= pd.to_datetime('2018-03-27'):
elif pd.to_datetime('2019-02-21') <= x <= pd.to_datetime('2019-03-26'):
elif pd.to_datetime('2020-02-21') <= x <= pd.to_datetime('2020-07-22'):
else 0).astype(int)
With this I was hoping to create a binary column called pre_season that holds 1 when gameDate falls in one of these ranges and 0 otherwise.
You can either use to_pydatetime (called on each Timestamp) with a list comprehension:
import numpy as np
import pandas as pd
dates = pd.date_range(start='01/01/2017', end='01/01/2022')
df5 = pd.DataFrame({'gameDate': dates}).sample(10, ignore_index=True)
date_ranges = list(map(pd.to_datetime, [('2021-02-28', '2021-03-30'),
('2017-02-22', '2017-04-01'),
('2018-02-21', '2018-03-27'),
('2019-02-21', '2019-03-26'),
('2020-02-21', '2020-07-22')]))
# Use <= so the boundary dates count, as in the question
df5['pre_season'] = [1 if any(start <= x.to_pydatetime() <= end
                              for start, end in date_ranges) else 0
                     for x in df5['gameDate']]
Or use numpy.where with pandas.Series.isin:
df5['pre_season'] = np.where(df5['gameDate'].isin([x for start, end in date_ranges
for x in pd.date_range(start, end)]), 1, 0)
#array([1, 0, 0, 1, 0, 0, 0, 0, 0, 0])
Output:
     gameDate  pre_season
0  2020-04-20           1
1  2021-12-28           0
2  2020-01-04           0
3  2018-03-07           1
4  2018-05-22           0
5  2017-01-10           0
6  2019-10-20           0
7  2018-10-27           0
8  2017-11-27           0
9  2017-05-24           0
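A third option, offered only as a sketch and not part of the answer above, is to build one boolean mask per range with pandas.Series.between and combine the masks with numpy.logical_or.reduce. The sample dates below are made up for illustration; the column name gameDate and the ranges come from the question. Unlike the isin approach, this should also match values that carry a time of day, since it compares against the range bounds instead of a list of midnight timestamps.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name and ranges come from the question.
df5 = pd.DataFrame({'gameDate': pd.to_datetime(
    ['2020-04-20', '2021-12-28', '2018-03-07', '2017-02-22', '2019-03-27'])})

date_ranges = [('2021-02-28', '2021-03-30'),
               ('2017-02-22', '2017-04-01'),
               ('2018-02-21', '2018-03-27'),
               ('2019-02-21', '2019-03-26'),
               ('2020-02-21', '2020-07-22')]

# One boolean mask per range; between() includes both endpoints by default.
masks = [df5['gameDate'].between(pd.Timestamp(start), pd.Timestamp(end))
         for start, end in date_ranges]

# A row is pre-season if it falls inside any of the ranges.
df5['pre_season'] = np.logical_or.reduce(masks).astype(int)
print(df5)
```

With these sample dates, 2017-02-22 is flagged because it sits exactly on a range start, while 2019-03-27 is not because it is one day past a range end.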