
Python iterate through list with pointer

I am trying to create a session template of dates for a dataframe in pandas based on the start and end day of the week of my given dataframe. I have the start and end day abbreviations (Mo, Tu, We, etc.) and the start/end time (8:30 AM, 5:30 PM, etc.).

What I want to create is a template that gives the start day abbreviation, the times over the days that it spans, and the end day. For example, my dataframe currently looks like the following:

Start Time  End Time    Start/End            Namestart Nameend    Days           Session Template   
Mo 8:30 AM  Th 5:30 PM  Mo 8:30 AM-Th 5:30 PM    Mo      Th       4 Day       4 Day Mo 8:30 AM-Th 5:30 PM
We 8:30 AM  Fr 12:30 PM We 8:30 AM-Fr 12:30 PM   We      Fr       3 Day     3 Day We 8:30 AM-Fr 12:30 PM

The current session template gives me the day count, start time, end time, and the day of week it begins/ends on. However, I would like for it to give each individual day that the item spans. For the examples above it should yield:

4 Day Mo 8:30 AM-5:30 PM, Tu 8:30 AM-5:30 PM, We 8:30 AM-5:30 PM, Th 8:30 AM-5:30 PM. 
3 Day We 8:30 AM-5:30 PM, Th 8:30 AM-5:30 PM, Fr 8:30 AM-12:30 PM

Here is how you can do it:

import pandas as pd
import re
import itertools
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 1000)

df = pd.read_csv("data.csv")
print(df, "\n")

days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su']

for index, row in df.iterrows():
    # get the start and end days
    start_day = row['Namestart']
    end_day = row['Nameend']
    # get the start and end times
    start_time = re.findall(r'\s(\d+\:\d{2}\s?(?:AM|PM|am|pm))',
                           row['Start Time'])[0]
    end_time = re.findall(r'\s(\d+\:\d{2}\s?(?:AM|PM|am|pm))',
                         row['End Time'])[0]
    # get the indices corresponding to the start and end days
    start_index = days.index(start_day)
    end_index = days.index(end_day)+1
    # count the number of days
    cnt = end_index - start_index
    print(cnt, "days\t", end='')
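    # iterate over the days from start_index up to (not including) end_index
    for day in itertools.islice(days, start_index, end_index):
        if day != end_day:
            # every day before the last is assumed to end at 5:30 PM
            print(day, start_time, "- 5:30 PM\t", end='')
        else:
            # the last day ends at the row's own end time
            print(day, start_time, "-", end_time, end='')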
    print() # to start a new line before printing each row

Output:

   Start Time     End Time               Start/End Namestart Nameend   Days              Session Template
0  Mo 8:30 AM   Th 5:30 PM   Mo 8:30 AM-Th 5:30 PM        Mo      Th  4 Day   4 Day Mo 8:30 AM-Th 5:30 PM
1  We 8:30 AM  Fr 12:30 PM  We 8:30 AM-Fr 12:30 PM        We      Fr  3 Day  3 Day We 8:30 AM-Fr 12:30 PM 

4 days  Mo 8:30 AM - 5:30 PM    Tu 8:30 AM - 5:30 PM    We 8:30 AM - 5:30 PM    Th 8:30 AM - 5:30 PM
3 days  We 8:30 AM - 5:30 PM    Th 8:30 AM - 5:30 PM    Fr 8:30 AM - 12:30 PM

The comments in the code explain each step. The regular expressions I have used are explained in this answer: https://stackoverflow.com/a/49217300/6590393
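As a quick illustration of what that pattern captures, here is a minimal sketch. The helper name extract_time is hypothetical (it is not part of the answer's code), and the pattern drops the backslash before the colon, which is not needed:

```python
import re

# Same idea as the pattern in the answer: one whitespace character, then
# H:MM or HH:MM, an optional space, and an AM/PM marker.
TIME_RE = re.compile(r'\s(\d+:\d{2}\s?(?:AM|PM|am|pm))')

def extract_time(cell):
    """Return the time part of a cell such as 'Mo 8:30 AM', or None if there is none."""
    match = TIME_RE.search(cell)
    return match.group(1) if match else None

print(extract_time("Mo 8:30 AM"))   # 8:30 AM
print(extract_time("Fr 12:30 PM"))  # 12:30 PM
print(extract_time("Mo"))           # None
```

Using search and checking for None avoids the IndexError that findall(...)[0] would raise on a cell with no time in it.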

Also, please note that the above code assumes the end day comes after the start day in the list, i.e. the span never wraps past Su. A span such as Sa-Mo would not yield the expected result: the slice would be empty and the day count would be negative. I leave it to you to handle such boundary cases if you need them.
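One possible way to handle the wrap-around case is to index the day list modulo 7. The sketch below is only an illustration under a few assumptions: the function names span_days and session_template and the default_end parameter are made up for this example, every day except the last is assumed to end at 5:30 PM (as in the code above), and a span whose start and end day are the same is treated as a single day. It also returns the template as a string instead of printing it:

```python
DAYS = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su']

def span_days(start_day, end_day):
    """List the day abbreviations from start_day to end_day, wrapping past Su."""
    start = DAYS.index(start_day)
    # Modulo arithmetic gives a count of 1..7 even when end_day comes
    # earlier in DAYS than start_day (e.g. Sa-Mo).
    count = (DAYS.index(end_day) - start) % 7 + 1
    return [DAYS[(start + i) % 7] for i in range(count)]

def session_template(start_day, start_time, end_day, end_time,
                     default_end="5:30 PM"):
    """Build a string such as '3 Day We 8:30 AM-5:30 PM, ..., Fr 8:30 AM-12:30 PM'."""
    days = span_days(start_day, end_day)
    parts = []
    for i, day in enumerate(days):
        # Only the last day uses the row's own end time; earlier days
        # fall back to default_end (an assumption, not taken from the data).
        end = end_time if i == len(days) - 1 else default_end
        parts.append(f"{day} {start_time}-{end}")
    return f"{len(days)} Day " + ", ".join(parts)

print(session_template("We", "8:30 AM", "Fr", "12:30 PM"))
# 3 Day We 8:30 AM-5:30 PM, Th 8:30 AM-5:30 PM, Fr 8:30 AM-12:30 PM
print(session_template("Sa", "8:30 AM", "Mo", "5:30 PM"))
# 3 Day Sa 8:30 AM-5:30 PM, Su 8:30 AM-5:30 PM, Mo 8:30 AM-5:30 PM
```

To fill the Session Template column, a function like this could be applied row by row (for example with df.apply and axis=1), passing in Namestart, Nameend and the times extracted from Start Time and End Time.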
