I'm a Python beginner.
I'm encountering a problem in my loop when filling an absence matrix.
In the absence matrix, the index represents the dates from the beginning of 2020 to today, and the columns represent the user IDs.
The dataframe is as follows:
ID_USER  NOM  PRENOM  DATE_first        DATE_last
1        X    X       30/05/2020 00:00  01/06/2020 23:59
1        X    X       01/06/2020 00:00  02/06/2020 23:59
2        X    X       01/06/2020 00:00  03/06/2020 23:59
and the result I want:
DATE user1 user2
29/05/2020 0 0
30/05/2020 1 0
01/06/2020 1 1
02/06/2020 1 1
03/06/2020 0 1
The objective is to fill the absence matrix with 1 and 0: 1 when the ID is absent between DATE_DEBUT_ABSENCE and DATE_FIN_ABSENCE, 0 otherwise.
Example: ID_USER=1 was absent between 2020/01/01 and 2020/01/05, so column 1 is set to 1 on those dates.
Here is the code I started with:
for i in agenda.columns:
    for j in absence_df.ID_USER:
        if i==j and agenda.index[i]==absence_df.iloc[j,4]:
            agenda.index[i]==1
        else :
            print('false')
        j=j+1
    i= i+1
    break
print(agenda)
I'm assuming here that your dates are in datetime format, though I'm not sure this will work on the first try (dates are tricky in Python). It would be better if you could share a sample of the dataset instead of just a snapshot...
import datetime
import pandas as pd

# Assumes df.DATE_first and df.DATE_last are already datetime columns
start = datetime.date(2020, 1, 1)
end = datetime.date(2020, 1, 5)
daterange = pd.date_range(start, end)
users = sorted(set(df.ID_USER))
# One row per day, one column per user, initialised to 0 (present)
agenda = pd.DataFrame(0, index=daterange, columns=users)
for date in daterange:
    # Rows whose absence interval contains this day (bounds inclusive)
    ix = df[
        (df.DATE_first <= date) & (date <= df.DATE_last)
    ].index
    users_absent = df.loc[ix, 'ID_USER'].unique()
    agenda.loc[date, users_absent] = 1
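For reference, here is a minimal self-contained sketch of the same idea, run on the three sample rows from the question. It assumes the date columns arrive as day-first strings like "30/05/2020 00:00" (the format string is a guess based on the sample), and build_agenda is a hypothetical helper name, not something from the original code. Instead of testing every day against every row, it marks each absence as a slice of the date index.

```python
import pandas as pd

# Sample rows copied from the question (NOM/PRENOM left out, they are not used).
df = pd.DataFrame({
    "ID_USER": [1, 1, 2],
    "DATE_first": ["30/05/2020 00:00", "01/06/2020 00:00", "01/06/2020 00:00"],
    "DATE_last": ["01/06/2020 23:59", "02/06/2020 23:59", "03/06/2020 23:59"],
})

# Assumption: dates are day-first strings in this exact format.
fmt = "%d/%m/%Y %H:%M"
df["DATE_first"] = pd.to_datetime(df["DATE_first"], format=fmt)
df["DATE_last"] = pd.to_datetime(df["DATE_last"], format=fmt)


def build_agenda(df, start, end):
    """Return a day x user matrix: 1 if the user is absent that day, else 0."""
    days = pd.date_range(start, end, freq="D")
    users = sorted(df["ID_USER"].unique())
    agenda = pd.DataFrame(0, index=days, columns=users)
    for row in df.itertuples(index=False):
        # Truncate both bounds to midnight so every touched day is included.
        first = row.DATE_first.normalize()
        last = row.DATE_last.normalize()
        # Label slicing on a DatetimeIndex is inclusive on both ends.
        agenda.loc[first:last, row.ID_USER] = 1
    return agenda


agenda = build_agenda(df, "2020-05-29", "2020-06-03")
print(agenda)
```

On this sample, 31/05/2020 also gets a 1 for user 1, because the first absence runs from 30/05 to 01/06. To cover the beginning of 2020 up to today, you could pass "2020-01-01" and pd.Timestamp.today().normalize() as the bounds.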