Filter time from attendance log report using csv file
I have an employee attendance log report in a CSV file, and I need to filter out all the attendances that were late (after 9:30).
I created a function that records an attendance. The employee enters his ID and the attendance is marked. The program gets the date and time from the computer clock and stores the attendance in a log file.
import datetime
import pandas as pd

# Function that records one attendance entry
def attandance_log():
    dnt = datetime.datetime.now()
    dnt_string = dnt.strftime("%d/%m/%Y %H:%M:%S")
    empid = input("Enter Your ID :")
    empname = input("Enter Your Name :")
    df1 = pd.DataFrame(data=[[dnt_string, empid, empname]], columns=["Today's Date & Time", "Employee's ID", "Employee's Name"])
    # Append the new row to the log file; no header row is written
    with open('/Users/sapir/Documents/python/final project- employee attandance log/attandance_log.csv', 'a') as f:
        df1.to_csv(f, header=False)
    return df1

attandance_df = attandance_log()
# The function that should filter all late attendances:
def late_emp_report():
    df = pd.read_csv('/Users/sapir/Documents/python/final project- employee attandance log/attandance_log.csv', index_col=0)
    #df[1] = pd.to_datetime(df[1], unit='s')
    # Add to employees list existing file
    #df.loc['29/07/2019 09:30:00': ].head()------->???
    #df_filtered = df[(df[1] <= datetime.time(9,30))]------>???
    print(df_filtered)
    with open('/Users/sapir/Documents/python/final project- employee attandance log/emplist.csv', 'w') as f:
        df.to_csv(f, header=False)
    return df

late_emp_report()
I have no idea how to create a file that shows all attendances after 9:30...
You can apply a filter of this form to the whole dataframe at once:
filtered_df = original_df[original_df[column_to_filter_on] > somevalue]
This returns a dataframe containing every row of original_df whose value in column_to_filter_on is greater than somevalue.
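As a minimal sketch of that pattern, here it is on a small made-up dataframe. The column names and values are purely illustrative and are not taken from the attendance log:

```python
import pandas as pd

# Hypothetical sample data, for illustration only
original_df = pd.DataFrame({"emp_id": [101, 102, 103], "minutes_late": [0, 12, 45]})

# The comparison produces a boolean Series; indexing with it keeps the True rows
mask = original_df["minutes_late"] > 10
filtered_df = original_df[mask]
print(filtered_df)
```

The same idea carries over to timestamps once the column holds real datetime values rather than strings.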
I prefer not to use 1 as a column header and to give the column a name instead, to prevent later confusion with indexing.
You'll run into an issue when trying to compare a recurring time (9:30) against a datetime, so instead you can introduce a late_flag using .apply() to compare each datetime against 9:30 on that same date.
# Parse the timestamp strings in column 1 (assumes the log was read with header=None,
# so the columns are labelled 1, 2, 3); the format matches the strftime call that wrote them
df['datetime'] = pd.to_datetime(df[1], format='%d/%m/%Y %H:%M:%S')
# Calculate late flag - compare datetime vs 9:30 on the same date for each row
df['late_flag'] = df['datetime'].apply(lambda x: 1 if x > x.replace(hour=9, minute=30, second=0, microsecond=0) else 0)
# Keep only the rows where late_flag is 1
df_filtered = df[df['late_flag'] == 1]
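Putting the pieces together, the following is one possible end-to-end sketch rather than a definitive implementation. It assumes the log has no header row and holds the index, timestamp, employee ID and employee name in that order, as the writing function above produces. The function name late_rows, the column names, the sample rows and the output path are all hypothetical. Instead of .apply(), it compares each timestamp against midnight of the same day plus 9 hours 30 minutes, which pandas evaluates on the whole column at once:

```python
import io

import pandas as pd

# Hypothetical log contents in the layout the writing function appends:
# index, timestamp, employee ID, employee name, with no header row
sample_log = io.StringIO(
    "0,29/07/2019 09:12:05,101,Dana\n"
    "0,29/07/2019 09:30:00,102,Noa\n"
    "0,29/07/2019 09:47:31,103,Avi\n"
    "0,30/07/2019 10:02:00,101,Dana\n"
)

def late_rows(source, cutoff=pd.Timedelta(hours=9, minutes=30)):
    """Return the rows whose time of day is strictly after the cutoff."""
    df = pd.read_csv(
        source,
        header=None,  # the log is written with header=False
        names=["index", "datetime", "emp_id", "emp_name"],  # assumed column names
        index_col=0,
    )
    df["datetime"] = pd.to_datetime(df["datetime"], format="%d/%m/%Y %H:%M:%S")
    # Midnight of each row's own date plus the cutoff gives that day's 9:30
    deadline = df["datetime"].dt.normalize() + cutoff
    return df[df["datetime"] > deadline]

late_df = late_rows(sample_log)
print(late_df)
# late_df.to_csv("late_report.csv", header=False)  # hypothetical output path
```

With this sample data, the 09:30:00 entry is not counted as late because the comparison is strict; use >= if arriving exactly at 9:30 should count.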