Filtering times from an attendance log report using a CSV file

Question

I have an employee attendance log report in a CSV file, and I need to select all the attendance records that were late (after 9:30).

I created a function that generates an attendance record. The employee enters their ID and the attendance is marked. The program gets the date and time from the computer clock and stores the attendance in a log file.

import datetime
import pandas as pd

# function that generates an attendance record
def attandance_log():

    dnt =  datetime.datetime.now()
    dnt_string = dnt.strftime("%d/%m/%Y %H:%M:%S")
    empid = input("Enter Your ID :")
    empname=input("Enter Your Name :")
    df1 = pd.DataFrame(data=[[dnt_string,empid,empname]],columns=["Today's Date & Time", "Employee's ID", "Employee's Name"])
    with open('/Users/sapir/Documents/python/final project- employee attandance log/attandance_log.csv', 'a') as f:
        df1.to_csv(f, header=False)
    return df1
attandance_df= attandance_log()

# the function that should filter all late attendances:
def late_emp_report():

    df = pd.read_csv('/Users/sapir/Documents/python/final project- employee attandance log/attandance_log.csv',index_col=0)
    #df[1] = pd.to_datetime(df[1], unit='s')
    # Add to employees list existing file
    #df.loc['29/07/2019 09:30:00': ].head()------->???
    #df_filtered = df[(df[1] <= datetime.time(9,30))]------>???

    print (df_filtered)
    with open('/Users/sapir/Documents/python/final project- employee attandance log/emplist.csv', 'w') as f:
        df.to_csv(f, header=False)
    return df


late_emp_report()

I have no idea how to create a file that shows all attendances after 9:30.

Answer 1

You can apply a filter of this form to the whole dataframe at once:

filtered_df = original_df[original_df[column_to_filter_on] > somevalue]

This returns a dataframe with all rows from original_df where the value in column_to_filter_on is greater than somevalue.
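
To make the pattern concrete, here is a minimal sketch on made-up data. The column names employee_id and minutes_late and the threshold of 10 are hypothetical and are not part of the original log file.

```python
import pandas as pd

# Hypothetical data, for illustration only
original_df = pd.DataFrame(
    {"employee_id": [101, 102, 103], "minutes_late": [0, 12, 45]}
)

# The comparison produces a boolean Series (the mask), one value per row
mask = original_df["minutes_late"] > 10

# Indexing with the mask keeps only the rows where it is True
filtered_df = original_df[mask]
print(filtered_df)
```

The mask is computed for the whole column in one step, so no explicit loop over rows is needed.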

I prefer not to use 1 as a column header and to give the column a name instead, which prevents later confusion with indexing.

You'll run into an issue when comparing a recurring time of day (9:30) with a full datetime. Instead, you can introduce a late_flag using .apply() that compares each datetime with 9:30 on the same date.

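# Parse the timestamp strings in column 1 using the format the log was written with
# (assumes the CSV was read with header=None so that the columns are numbered)
df['datetime'] = pd.to_datetime(df[1], format='%d/%m/%Y %H:%M:%S')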

# Calculate late flag - compare datetime vs 9:30 on the same date for each row
df['late_flag'] = df['datetime'].apply(lambda x: 1 if x > x.replace(hour=9, minute=30, second=0, microsecond=0) else 0)

# Filter out just where late_flag is 1
df_filtered = df[df['late_flag'] == 1]
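
As an alternative sketch, the report can also be built without a flag column by comparing only the time-of-day part of each timestamp against the cutoff. This assumes the file has the layout that attandance_log() writes (an index column, a dd/mm/YYYY HH:MM:SS timestamp, the ID, the name, and no header row). The sample rows, the column names, and the output filename are hypothetical.

```python
import datetime
import io

import pandas as pd

# Hypothetical sample in the layout attandance_log() appends:
# index, timestamp, employee ID, employee name (no header row)
sample_csv = io.StringIO(
    "0,29/07/2019 08:56:17,101,Dana\n"
    "0,29/07/2019 09:30:00,102,Omer\n"
    "0,29/07/2019 09:45:10,103,Noa\n"
    "0,30/07/2019 10:02:00,101,Dana\n"
)


def late_emp_report(source, cutoff=datetime.time(9, 30)):
    """Return the rows whose time of day is strictly after the cutoff."""
    # header=None because the log is written without a header row;
    # names= gives the columns readable labels instead of 0, 1, 2, 3
    df = pd.read_csv(
        source, header=None, names=["idx", "datetime", "emp_id", "emp_name"]
    )
    # Parse the strings with the same format used when writing the log
    df["datetime"] = pd.to_datetime(df["datetime"], format="%d/%m/%Y %H:%M:%S")
    # .dt.time drops the date, so one cutoff works for every day in the log
    late = df[df["datetime"].dt.time > cutoff]
    return late.drop(columns="idx")


late_df = late_emp_report(sample_csv)
print(late_df)

# To save the report, pass a real path (hypothetical filename shown):
# late_df.to_csv("late_attendances.csv", index=False)
```

In real use, the path to attandance_log.csv would be passed in place of sample_csv. An arrival at exactly 09:30:00 is not counted as late here because the comparison is strict.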

Answer 1
0 2019-08-07 08:56:17