I have a dataframe:
entity response date
p a1 1-Feb-14
p a2 2-Feb-14
p a3 3-Feb-14
p a4 4-Feb-14
p a5 5-Feb-14
p a6 6-Feb-14
p a7 7-Feb-14
p a8 8-Feb-14
p a9 9-Feb-14
p a10 10-Feb-14
p a11 11-Feb-14
p a12 12-Feb-14
p a13 13-Feb-14
p a14 14-Feb-14
p a15 15-Feb-14
and another dataframe:
entity start_date end_date
p 2-Feb-14 4-Feb-14
p 6-Feb-14 7-Feb-14
p 9-Feb-14 12-Feb-14
q 1-Feb-14 7-Feb-14
Based on the second dataframe, I have to create a True/False column in the first dataframe: for entity p, if the date lies within any of the start_date/end_date windows, the value should be True, otherwise False.
What would be the fastest (and shortest) way of doing this? I tried iterating over the whole dataframe, but that takes time and makes the code long as well.
Maybe I'm overthinking, but
def f(s):
    f2 = lambda d, n: ((d >= df2[df2.entity == n].start_date) & (d <= df2[df2.entity == n].end_date)).any()
    return s.transform(f2, n=s.name)

df.groupby('entity').date.transform(f)
0 False
1 True
2 True
3 True
4 False
5 True
6 True
7 False
8 True
9 True
10 True
11 True
12 False
13 False
14 False
15 False
Name: date, dtype: bool
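For anyone who wants to reproduce this, here is a minimal sketch of the same groupby idea on the question's data. It assumes the frames are called df and df2 as above and that the date strings are parsed with pd.to_datetime; the format string, the helper name in_any_window, and the new column name in_window are my own choices, not part of the original answer.

```python
import pandas as pd

# Assumption: dates are parsed into datetimes. Comparing the raw strings
# ("10-Feb-14" vs "2-Feb-14") would be lexicographic and give wrong results.
fmt = "%d-%b-%y"
df = pd.DataFrame({
    "entity": ["p"] * 15,
    "response": [f"a{i}" for i in range(1, 16)],
    "date": pd.to_datetime([f"{i}-Feb-14" for i in range(1, 16)], format=fmt),
})
df2 = pd.DataFrame({
    "entity": ["p", "p", "p", "q"],
    "start_date": pd.to_datetime(
        ["2-Feb-14", "6-Feb-14", "9-Feb-14", "1-Feb-14"], format=fmt),
    "end_date": pd.to_datetime(
        ["4-Feb-14", "7-Feb-14", "12-Feb-14", "7-Feb-14"], format=fmt),
})

def in_any_window(dates):
    # Inside groupby(...).transform, dates.name holds the group key (the entity).
    windows = df2[df2["entity"] == dates.name]
    # For each date, test it against every window of that entity (inclusive).
    return dates.apply(
        lambda d: ((windows["start_date"] <= d) & (d <= windows["end_date"])).any()
    )

df["in_window"] = df.groupby("entity")["date"].transform(in_any_window)
```

Filtering df2 once per group, rather than once per date, is the only real change from the version above.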
You can also do some preprocessing first to speed up the process
df2['j'] = df2.agg(lambda k: pd.Interval(k.start_date, k.end_date), 1)
dic = df2.groupby('entity').agg(lambda k: list(k)).to_dict()['j']
df[['entity', 'date']].apply(lambda x: any(x['date'] in z for z in dic[x['entity']]), axis=1)
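Note that pd.Interval is closed only on the right by default, so a date equal to start_date is not matched; pass closed='both' to pd.Interval if both endpoints should count. This should be around 20x faster than the chained transforms.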
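A self-contained sketch of the interval lookup with inclusive endpoints might look like this. The date parsing, the names windows_by_entity and in_window, and the use of .get with an empty default (so entities without any window come out False) are assumptions on my part.

```python
import pandas as pd

# Assumption: dates are parsed into datetimes with this format.
fmt = "%d-%b-%y"
df = pd.DataFrame({
    "entity": ["p"] * 15,
    "response": [f"a{i}" for i in range(1, 16)],
    "date": pd.to_datetime([f"{i}-Feb-14" for i in range(1, 16)], format=fmt),
})
df2 = pd.DataFrame({
    "entity": ["p", "p", "p", "q"],
    "start_date": pd.to_datetime(
        ["2-Feb-14", "6-Feb-14", "9-Feb-14", "1-Feb-14"], format=fmt),
    "end_date": pd.to_datetime(
        ["4-Feb-14", "7-Feb-14", "12-Feb-14", "7-Feb-14"], format=fmt),
})

# closed="both" makes start_date and end_date themselves count as inside.
df2["window"] = [
    pd.Interval(start, end, closed="both")
    for start, end in zip(df2["start_date"], df2["end_date"])
]

# Preprocessing step: entity -> list of its intervals.
windows_by_entity = df2.groupby("entity")["window"].agg(list).to_dict()

# A date is flagged if it falls in any interval of its own entity.
df["in_window"] = [
    any(date in window for window in windows_by_entity.get(entity, []))
    for entity, date in zip(df["entity"], df["date"])
]
```

Building the dictionary once means each row only scans the handful of intervals belonging to its own entity.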
IMHO, depending on your data, sometimes it's acceptable to expand the date ranges first:
df2 = pd.concat([
    pd.DataFrame(pd.date_range(start_date, end_date), columns=['date']).assign(entity=entity)
    for _, (entity, start_date, end_date) in df2.iterrows()
]).drop_duplicates()

df.merge(df2, on=['entity', 'date'], how='left', indicator=True)['_merge'] == 'both'
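As a runnable sketch of the expand-then-merge idea on the question's data: this version keeps df2 intact and writes the expanded rows to a separate frame. The name covered, the column name in_window, and the date parsing are assumptions for illustration. It is only reasonable when the windows are short enough that one row per covered day stays small.

```python
import pandas as pd

# Assumption: dates are parsed into datetimes with this format.
fmt = "%d-%b-%y"
df = pd.DataFrame({
    "entity": ["p"] * 15,
    "response": [f"a{i}" for i in range(1, 16)],
    "date": pd.to_datetime([f"{i}-Feb-14" for i in range(1, 16)], format=fmt),
})
df2 = pd.DataFrame({
    "entity": ["p", "p", "p", "q"],
    "start_date": pd.to_datetime(
        ["2-Feb-14", "6-Feb-14", "9-Feb-14", "1-Feb-14"], format=fmt),
    "end_date": pd.to_datetime(
        ["4-Feb-14", "7-Feb-14", "12-Feb-14", "7-Feb-14"], format=fmt),
})

# One row per (entity, day) covered by some window. itertuples gives access
# by column name, so the result does not depend on column order.
covered = pd.concat(
    [
        pd.DataFrame({
            "entity": row.entity,
            "date": pd.date_range(row.start_date, row.end_date),
        })
        for row in df2.itertuples(index=False)
    ],
    ignore_index=True,
).drop_duplicates()

# The left merge keeps df's row order; "_merge" is "both" where a match exists.
merged = df.merge(covered, on=["entity", "date"], how="left", indicator=True)
df["in_window"] = (merged["_merge"] == "both").to_numpy()
```

Dropping duplicates before the merge matters: overlapping windows would otherwise multiply rows in the merged result.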