How do you model something-over-time in Python?

I'm looking for a data type to help me model resource availability over fluid time.

  • We're open from 9 til 6 and can handle 5 parallel jobs. In my imaginary programming land, I've just initialised an object with that range with a value of 3 across the board.
  • We have appointments on the books, each with start and end times.
  • I need to punch each of those out of the day.
  • That leaves me with a graph of sorts where the availability goes up and down, but ultimately allows me to quickly find time ranges where there is remaining availability.

I've come at this problem from many directions but always come back to the fundamental problem of not knowing a data type to model something as simple as an integer over time.

I could convert my appointments into time series events (e.g. appointment arrives means -1 availability, appointment leaves means +1) but I still don't know how to manipulate that data so that I can distil out periods where the availability is greater than zero.


Somebody's left a close-vote citing lack of focus, but my goal here seems pretty singular so I'll try to explain the problem graphically. I'm trying to infer the periods of time where the number of active jobs falls below a given capacity.

[image]

Turning a range of known parallel capacity (e.g. 3 between 9-6) and a list of jobs with variable start/ends, into a list of time ranges of available time.

My approach would be to build the time series, but include the availability object with a value set to the availability in that period.

availability:
[
  {
    "start": "09:00",
    "end": "12:00",
    "value": 4
  },
  {
    "start": "12:00",
    "end": "13:00",
    "value": 3
  }
]
data: [
  {
    "start": "10:00",
    "end": "10:30"
  }
]

Build the time series indexed on the start/end times, with the change in availability as the value. A start time for availability is +value, an end time is -value. For an event, it'd be -1 at the start and +1 at the end, as you said.

"09:00" 4
"10:00" -1
"10:30" 1
"12:00" -4
"12:00" 3
"13:00" -3

Then group by index, sum and cumulative sum.

getting:

"09:00" 4
"10:00" 3
"10:30" 4
"12:00" 3
"13:00" 0

Example code in pandas:

import numpy as np
import pandas as pd


data = [
  {
    "start": "10:00",
    "end": "10:30",
  }
]

breakpoints = [
  {
    "start": "00:00",
    "end": "09:00",
    "value": 0
  },
  {
    "start": "09:00",
    "end": "12:00",
    "value": 4
  },
  {
    "start": "12:00",
    "end": "12:30",
    "value": 4
  },
  {
    "start": "12:30",
    "end": "13:00",
    "value": 3
  },
  {
    "start": "13:00",
    "end": "00:00",
    "value": 0
  }
]

df = pd.DataFrame(data, columns=['start', 'end'])

print(df.head(5))

starts = pd.DataFrame(data, columns=['start'])
starts["value"] = -1
starts = starts.set_index("start")

ends = pd.DataFrame(data, columns=['end'])
ends["value"] = 1
ends = ends.set_index("end")

breakpointsStarts = pd.DataFrame(breakpoints, columns=['start', 'value']).set_index("start")

breakpointsEnds = pd.DataFrame(breakpoints, columns=['end', 'value'])
breakpointsEnds["value"] = breakpointsEnds["value"].transform(lambda x: -x)
breakpointsEnds = breakpointsEnds.set_index("end")

countsDf = pd.concat([starts, ends, breakpointsEnds, breakpointsStarts]).sort_index()
countsDf = countsDf.groupby(countsDf.index).sum().cumsum()

print(countsDf)

# Periods that are available

df = countsDf
df["available"] = df["value"] > 0

# Indexes where the value of available changes
# Alternatively swap out available for the value.
time_changes = df["available"].diff()[df["available"].diff() != 0].index.values
newDf = pd.DataFrame(time_changes, columns= ["start"])

# Setting the end column to the value of the next start
newDf['end'] = newDf.transform(np.roll, shift=-1)
print(newDf)

# Join this back in to get the actual value of available
mergedDf = newDf.merge(df, left_on="start", right_index=True)

print(mergedDf)

returning at the end:

   start    end  value  available
0  00:00  09:00      0      False
1  09:00  13:00      4       True
2  13:00  00:00      0      False
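
The group-by, sum and cumulative-sum steps can also be sketched in plain Python. This is only an illustration of the same idea, not the code from the answer above: the function name availability_steps and the tuple layout are assumptions, and it relies on zero-padded "HH:MM" strings sorting in chronological order.

```python
from collections import defaultdict
from itertools import accumulate


def availability_steps(capacity_periods, jobs):
    """Return sorted (time, availability) pairs, one per distinct event time.

    capacity_periods: iterable of (start, end, value) tuples (assumed layout).
    jobs: iterable of (start, end) tuples (assumed layout).
    """
    deltas = defaultdict(int)
    for start, end, value in capacity_periods:
        deltas[start] += value   # capacity becomes available
        deltas[end] -= value     # capacity goes away again
    for start, end in jobs:
        deltas[start] -= 1       # a job takes one unit
        deltas[end] += 1         # and hands it back when it ends
    times = sorted(deltas)       # "group by index": one entry per time
    totals = accumulate(deltas[t] for t in times)  # cumulative sum
    return list(zip(times, totals))


steps = availability_steps(
    capacity_periods=[("09:00", "12:00", 4), ("12:00", "13:00", 3)],
    jobs=[("10:00", "10:30")],
)
print(steps)
```

With the small availability/data example from the start of this answer, this reproduces the cumulative table shown there.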

I'd approach it the same way you did with the appointments. Model the free time as appointments of its own. For each ending appointment, check if there's another one ongoing; if so, skip it. If not, find the next starting appointment (one with a start date greater than this one's end date).

After you have iterated over all of your appointments, you should have an inverted mask of them.
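
A minimal sketch of that inversion, assuming a single lane (capacity 1), appointments given as (start, end) pairs that fall within the opening hours, and plain numbers standing in for times. The name free_gaps is hypothetical.

```python
def free_gaps(appointments, opens, closes):
    """Invert (start, end) appointments into the free gaps between them."""
    gaps = []
    cursor = opens  # end of the busy stretch seen so far
    for start, end in sorted(appointments):
        if start > cursor:
            # Nothing is ongoing between cursor and this start: a free gap.
            gaps.append((cursor, min(start, closes)))
        # An overlapping appointment only extends the busy stretch.
        cursor = max(cursor, end)
    if cursor < closes:
        gaps.append((cursor, closes))
    return gaps


gaps = free_gaps([(10, 12), (11, 13), (15, 16)], opens=9, closes=18)
print(gaps)  # [(9, 10), (13, 15), (16, 18)]
```

The same loop works with datetime.time or datetime.datetime values, since it only compares and takes min/max.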

To me, this problem would be well-represented by a list of boolean values. For ease of explanation, let's assume the length of every potential job is a multiple of 15 minutes. So, from 9 to 6, we have 36 "time slots" (9 hours at 4 slots per hour) that we want to track availability for. We represent a queue's availability in a time slot with boolean variables: False if the queue is processing a job, True if the queue is available.

First, we create a list of time slots for every queue as well as the output. So, every queue and the output has time slots t_k, 1 <= k <= 36.

Then, given five job queues, q_j, 1 <= j <= 5, we say that t_k is "open" at time k if there exists at least one q_j where the time slot list at index k is True.

We can implement this in standalone Python as follows:

N_SLOTS = 36   # 9 hours in 15-minute slots
N_QUEUES = 5

# One independent list per queue; [slots] * 5 would alias a single list.
queues = [[True] * N_SLOTS for _ in range(N_QUEUES)]
output = [False] * N_SLOTS

def available(k):
    # Slot k is open if at least one queue is free in that slot.
    for q in queues:
        if q[k]:
            return True
    return False

We can then assume there exists some function dispatch(length) that assigns a job to an available queue, setting the appropriate slots in queues[q] to False.

Finally, to update the output, we simply call:

def update():
    for k in range(N_SLOTS):
        output[k] = available(k)

Or, for increased efficiency:

def update(i, j):
    for k in range(i, j):
        output[k] = available(k)

Then, you could simply call update(i, j) whenever dispatch() updates time slots i up to (but not including) j for a new job. In this way, dispatching and updating is an O(n) operation, where n is how many time slots are being changed, regardless of how many time slots there are.
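
The dispatch function is only assumed to exist above. One possible sketch is a first-fit strategy: take the first queue that is free for the whole span. The signature dispatch(queues, start, length), the helpers make_queues and open_slots, and the toy sizes (2 queues, 8 slots) are all assumptions made for illustration.

```python
def make_queues(n_queues, n_slots):
    # One independent list per queue; [slots] * n would alias a single list.
    return [[True] * n_slots for _ in range(n_queues)]


def dispatch(queues, start, length):
    """Book slots start..start+length-1 on the first queue that is free
    for the whole span. Returns the queue index, or None if nothing fits."""
    if start < 0 or start + length > len(queues[0]):
        return None  # span falls outside the tracked day
    for index, queue in enumerate(queues):
        if all(queue[start:start + length]):
            queue[start:start + length] = [False] * length
            return index
    return None


def open_slots(queues):
    """Output list: True where at least one queue is still free."""
    return [any(column) for column in zip(*queues)]


queues = make_queues(n_queues=2, n_slots=8)
first = dispatch(queues, start=2, length=3)   # lands on queue 0
second = dispatch(queues, start=3, length=2)  # queue 0 is busy, so queue 1
third = dispatch(queues, start=4, length=1)   # both busy in slot 4
print(first, second, third)  # 0 1 None
print(open_slots(queues))
```

First-fit is the simplest choice; a different policy (for example, least-loaded queue) would only change the loop inside dispatch.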

This would allow you to make a simple function that maps human-readable time onto the range of time slot values, which would allow for making time slots larger or smaller as you wish.
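
Such a mapping could look like the sketch below. The names time_to_slot and slot_to_time, the "HH:MM" string format, and the defaults (opening at 09:00, 15-minute slots) are assumptions for illustration.

```python
def _minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, mins = map(int, hhmm.split(":"))
    return hours * 60 + mins


def time_to_slot(hhmm, opens="09:00", slot_minutes=15):
    """Index of the slot containing the given time (slot 0 starts at opening)."""
    return (_minutes(hhmm) - _minutes(opens)) // slot_minutes


def slot_to_time(slot, opens="09:00", slot_minutes=15):
    """Start time of the given slot as an 'HH:MM' string."""
    total = _minutes(opens) + slot * slot_minutes
    return f"{total // 60:02d}:{total % 60:02d}"


print(time_to_slot("10:30"))                  # 6
print(slot_to_time(6))                        # 10:30
print(time_to_slot("10:30", slot_minutes=5))  # 18
```

Changing slot_minutes is all it takes to make the slots coarser or finer, as long as the slot lists are sized to match.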

You could also easily extend this idea to use a pandas data frame where each column is one queue, allowing you to use Series.any() on every row at once to quickly update the output column.

Would love to hear suggestions regarding this approach! Perhaps there's a complexity of the problem I've missed, but I think this is a nice solution.

You can use (datetime, increment) tuples to keep track of the changes in availability. A job-start event has increment = -1 (one lane becomes busy) and a job-end event has increment = +1, matching the code below. Then itertools.accumulate allows for computing the cumulative availability as jobs start and end over time. Here's an example implementation:

from datetime import time
import itertools as it

def compute_availability(jobs, opening_hours, capacity):
    jobs = [((x, -1), (y, +1)) for x, y in jobs]
    opens, closes = opening_hours
    events = [[opens, capacity]] + sorted(t for job in jobs for t in job) + [(closes, 0)]
    availability = list(it.accumulate(events,
                                      lambda x, y: [y[0], x[1] + y[1]]))
    for x, y in zip(availability, availability[1:]):
        # If multiple events happen at the same time, only yield the last one.
        if y[0] > x[0]:
            yield x

This adds artificial (opens, capacity) and (closes, 0) events to initialize the computation. The above example considers a single day but it is easy to extend it to multiple days by creating opens and closes datetime objects that share the day of the first and last job respectively.
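
As a rough sketch of that multi-day extension, the opening window could be derived from the jobs themselves. The helper name opening_window and its defaults are hypothetical; it assumes jobs are (start, end) pairs of datetime objects.

```python
from datetime import datetime, time


def opening_window(jobs, opens=time(9), closes=time(18)):
    """Return (opens, closes) datetimes spanning the first to the last job day."""
    first_day = min(start for start, _ in jobs).date()
    last_day = max(end for _, end in jobs).date()
    return (datetime.combine(first_day, opens),
            datetime.combine(last_day, closes))


jobs = [(datetime(2020, 3, 14, 10), datetime(2020, 3, 14, 15)),
        (datetime(2020, 3, 15, 9, 30), datetime(2020, 3, 15, 11))]
window = opening_window(jobs)
print(window)
```

The resulting pair can be passed as opening_hours. Note that this alone treats the nights in between as open, so those hours still need to be blocked separately.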

Example

Here is the output for the OP's example schedule:

from pprint import pprint

jobs = [(time(10), time(15)),
        (time(9), time(11)),
        (time(12, 30), time(16)),
        (time(10), time(18))]

availability = list(compute_availability(
    jobs, opening_hours=(time(9), time(18)), capacity=3
))
pprint(availability)

which prints:

[[datetime.time(9, 0), 2],
 [datetime.time(10, 0), 0],
 [datetime.time(11, 0), 1],
 [datetime.time(12, 30), 0],
 [datetime.time(15, 0), 1],
 [datetime.time(16, 0), 2]]

The first element indicates when the availability changes and the second element denotes the availability that results from that change. For example at 9am one job is submitted causing the availability to drop from 3 to 2 and then at 10am two more jobs are submitted while the first one is still running (hence availability drops to 0).
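
Since the question asks for time ranges with remaining availability, the change list can be folded into such ranges. The following is a sketch only; free_ranges is a hypothetical helper, and the closing time has to be passed in because the list above does not contain a closing event.

```python
from datetime import time


def free_ranges(availability, closes):
    """Turn [[when, free_lanes], ...] into (start, end) ranges with free_lanes > 0."""
    ranges = []
    # Each entry is valid until the next entry's time (or until closing).
    boundaries = [when for when, _ in availability[1:]] + [closes]
    for (start, free), end in zip(availability, boundaries):
        if free <= 0:
            continue
        if ranges and ranges[-1][1] == start:
            # Adjacent free stretches are merged into one range.
            ranges[-1] = (ranges[-1][0], end)
        else:
            ranges.append((start, end))
    return ranges


availability = [[time(9), 2], [time(10), 0], [time(11), 1],
                [time(12, 30), 0], [time(15), 1], [time(16), 2]]
print(free_ranges(availability, closes=time(18)))
```

For the schedule above this yields 9:00-10:00, 11:00-12:30 and 15:00-18:00.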

Adding new jobs

Now that we have the initial availability computed, an important aspect is to update it as new jobs are added. Here it is desirable not to recompute the availability from the full job list since that might be costly if many jobs are being tracked. Because the availability is already sorted we can use the bisect module to determine the relevant update range in O(log(N)). Then a number of steps need to be performed. Let's say the job is scheduled as [x, y] where x, y are two datetime objects.

  1. Check if the availability in the [x, y] interval is greater than zero (including the event to the left of x, i.e. the previous event).
  2. Decrease the availability of all events in [x, y] by 1.
  3. If x is not in the list of events we need to add it, otherwise we need to check whether we can merge the x event with the one to its left.
  4. If y is not in the list of events we need to add it.

Here is the relevant code:

import bisect

def add_job(availability, job, *, weight=1):
    """weight: how many lanes the job requires"""
    job = list(job)
    start = bisect.bisect(availability, job[:1])
    # Emulate a `bisect_right` which doesn't work directly since
    # we're comparing lists of different length.
    if start < len(availability):
        start += (job[0] == availability[start][0])
    stop = bisect.bisect(availability, job[1:])

    if any(slot[1] < weight for slot in availability[start-1:stop]):
        raise ValueError('The requested time slot is not available')

    for slot in availability[start:stop]:
        slot[1] -= weight

    if job[0] > availability[start-1][0]:
        previous_availability = availability[start-1][1]
        availability.insert(start, [job[0], previous_availability - weight])
        stop += 1
    else:
        availability[start-1][1] -= weight
        if start >= 2 and availability[start-1][1] == availability[start-2][1]:
            del availability[start-1]
            stop -= 1

    if stop == len(availability) or job[1] < availability[stop][0]:
        previous_availability = availability[stop-1][1]
        availability.insert(stop, [job[1], previous_availability + weight])

Example schedule

We can test it by adding some jobs to the OP's example schedule:

for job in [[time(15), time(17)],
            [time(11, 30), time(12)],
            [time(13), time(14)]]:  # this one should raise since availability is zero
    print(f'\nAdding {job = }')
    add_job(availability, job)
    pprint(availability)

which outputs:

Adding job = [datetime.time(15, 0), datetime.time(17, 0)]
[[datetime.time(9, 0), 2],
 [datetime.time(10, 0), 0],
 [datetime.time(11, 0), 1],
 [datetime.time(12, 30), 0],
 [datetime.time(16, 0), 1],
 [datetime.time(17, 0), 2]]

Adding job = [datetime.time(11, 30), datetime.time(12, 0)]
[[datetime.time(9, 0), 2],
 [datetime.time(10, 0), 0],
 [datetime.time(11, 0), 1],
 [datetime.time(11, 30), 0],
 [datetime.time(12, 0), 1],
 [datetime.time(12, 30), 0],
 [datetime.time(16, 0), 1],
 [datetime.time(17, 0), 2]]

Adding job = [datetime.time(13, 0), datetime.time(14, 0)]
Traceback (most recent call last):
  [...]
ValueError: The requested time slot is not available

Blocking night hours

We can also use this interface to block all lanes during hours when the service is unavailable (e.g. from 6pm to 9am on the next day). Just submit a job with weight=capacity for that time span:

from datetime import datetime

# Assumes availability was built from datetime (not time) objects.
add_job(availability,
        [datetime(2020, 3, 14, 18), datetime(2020, 3, 15, 9)],
        weight=3)

Build full schedule from scratch

We can also use add_job to build the full schedule from scratch:

availability = list(compute_availability(
    [], opening_hours=(time(9), time(18)), capacity=3
))
print('Initial availability')
pprint(availability)
for job in jobs:
    print(f'\nAdding {job = }')
    add_job(availability, job)
    pprint(availability)

which outputs:

Initial availability
[[datetime.time(9, 0), 3]]

Adding job = (datetime.time(10, 0), datetime.time(15, 0))
[[datetime.time(9, 0), 3],
 [datetime.time(10, 0), 2],
 [datetime.time(15, 0), 3]]

Adding job = (datetime.time(9, 0), datetime.time(11, 0))
[[datetime.time(9, 0), 2],
 [datetime.time(10, 0), 1],
 [datetime.time(11, 0), 2],
 [datetime.time(15, 0), 3]]

Adding job = (datetime.time(12, 30), datetime.time(16, 0))
[[datetime.time(9, 0), 2],
 [datetime.time(10, 0), 1],
 [datetime.time(11, 0), 2],
 [datetime.time(12, 30), 1],
 [datetime.time(15, 0), 2],
 [datetime.time(16, 0), 3]]

Adding job = (datetime.time(10, 0), datetime.time(18, 0))
[[datetime.time(9, 0), 2],
 [datetime.time(10, 0), 0],
 [datetime.time(11, 0), 1],
 [datetime.time(12, 30), 0],
 [datetime.time(15, 0), 1],
 [datetime.time(16, 0), 2],
 [datetime.time(18, 0), 3]]

Unless your time resolution is finer than a minute, I would suggest using a map of minutes in the day with a set of jobIds assigned over the time span of each job.

For example:

# convert time to minute of the day (assumes 24H time, but you can make this your own way)
def toMinute(time): 
    return sum(p*t for p,t in zip(map(int,time.split(":")),(60,1)))

def toTime(minute):
    return f"{minute//60}:{minute%60:02d}"

# booking a job adds it to all minutes covered by its duration
def book(timeMap,jobId,start,duration):
    startMin = toMinute(start)
    for m in range(startMin,startMin+duration):
        timeMap[m].add(jobId)

# unbooking a job removes it from all minutes where it was present
def unbook(timeMap,jobId):
    for s in timeMap:
        s.discard(jobId)

# return time ranges for minutes meeting a given condition
def minuteSpans(timeMap,condition,start="09:00",end="18:00"):
    start,end  = toMinute(start),toMinute(end)
    timeRange  = timeMap[start:end]
    match      = [condition(s) for s in timeRange]
    breaks     = [True] + [a!=b for a,b in zip(match,match[1:])]
    starts     = [i for (i,a),b in zip(enumerate(match),breaks) if b]
    return [(start+s,start+e) for s,e in zip(starts,starts[1:]+[len(match)]) if match[s]]

def timeSpans(timeMap,condition,start="09:00",end="18:00"):
    return [(toTime(s),toTime(e)) for s,e in minuteSpans(timeMap,condition,start,end)]

# availability is ranges of minutes where the number of jobs is less than your capacity
def available(timeMap,start="09:00",end="18:00",maxJobs=5):
    return timeSpans(timeMap,lambda s:len(s)<maxJobs,start,end)

Sample usage:

timeMap = [set() for _ in range(1440)]

book(timeMap,"job1","9:45",25)
book(timeMap,"job2","9:30",45)
book(timeMap,"job3","9:00",90)

print(available(timeMap,maxJobs=3))
[('9:00', '9:45'), ('10:10', '18:00')]

print(timeSpans(timeMap,lambda s:"job3" in s))
[('9:00', '10:30')]

With a few adjustments you could even have discontinuous jobs that skip over some periods (e.g. lunch time). You can also block out some periods by placing fake jobs in them.

If you need to manage job queues individually, you can have separate time maps (one per queue) and combine them into one when you need to have a global picture:

 print(available(timeMap1,maxJobs=1))
 print(available(timeMap2,maxJobs=1))
 print(available(timeMap3,maxJobs=1))

 globalMap = list(set.union(*qs) for qs in zip(timeMap1,timeMap2,timeMap3))
 print(available(globalMap,maxJobs=3))

Put all this into a TimeMap class (instead of individual functions) and you should have a pretty good toolset to work with.
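
One possible shape for such a class is sketched below. It is not the answerer's code: the method names, the max_jobs attribute and the single-pass span detection in available() are assumptions chosen to keep the example short.

```python
class TimeMap:
    """Minute-resolution map of job ids for a single day (illustrative sketch)."""

    def __init__(self, max_jobs=5):
        self.max_jobs = max_jobs
        self.minutes = [set() for _ in range(24 * 60)]

    @staticmethod
    def to_minute(text):
        hours, mins = map(int, text.split(":"))
        return hours * 60 + mins

    @staticmethod
    def to_time(minute):
        return f"{minute // 60}:{minute % 60:02d}"

    def book(self, job_id, start, duration):
        """Add job_id to every minute covered by the job."""
        first = self.to_minute(start)
        for minute in range(first, first + duration):
            self.minutes[minute].add(job_id)

    def unbook(self, job_id):
        """Remove job_id from every minute where it was present."""
        for jobs in self.minutes:
            jobs.discard(job_id)

    def available(self, start="09:00", end="18:00"):
        """Return (start, end) strings where fewer than max_jobs are running."""
        spans, span_start = [], None
        first, last = self.to_minute(start), self.to_minute(end)
        for minute in range(first, last):
            free = len(self.minutes[minute]) < self.max_jobs
            if free and span_start is None:
                span_start = minute          # a free span opens here
            elif not free and span_start is not None:
                spans.append((self.to_time(span_start), self.to_time(minute)))
                span_start = None            # the free span closes here
        if span_start is not None:
            spans.append((self.to_time(span_start), self.to_time(last)))
        return spans


day = TimeMap(max_jobs=3)
day.book("job1", "9:45", 25)
day.book("job2", "9:30", 45)
day.book("job3", "9:00", 90)
before = day.available()
day.unbook("job1")
after = day.available()
print(before)  # [('9:00', '9:45'), ('10:10', '18:00')]
print(after)   # [('9:00', '18:00')]
```

Keeping max_jobs on the instance means each queue or site can carry its own capacity.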

You can use a dedicated class representing a lane that can run jobs. These objects can keep track of jobs and correspondingly of their availability:

import bisect
from datetime import time
from functools import total_ordering
import math


@total_ordering
class TimeSlot:
    def __init__(self, start, stop, lane):
        self.start = start
        self.stop = stop
        self.lane = lane

    def __contains__(self,  other):
        return self.start <= other.start and self.stop >= other.stop

    def __lt__(self, other):
        return (self.start, -self.stop.second) < (other.start, -other.stop.second)

    def __eq__(self, other):
        return (self.start, -self.stop.second) == (other.start, -other.stop.second)

    def __str__(self):
        return f'({self.lane}) {[self.start, self.stop]}'

    __repr__ = __str__


class Lane:
    @total_ordering
    class TimeHorizon:
        def __repr__(self):
            return '...'
        def __lt__(self, other):
            return False
        def __eq__(self, other):
            return False
        @property
        def second(self):
            return math.inf
        @property
        def timestamp(self):
            return math.inf

    time_horizon = TimeHorizon()
    del TimeHorizon

    def __init__(self, start, nr):
        self.nr = nr
        self.availability = [TimeSlot(start, self.time_horizon, self)]

    def add_job(self, job):
        if not isinstance(job, TimeSlot):
            job = TimeSlot(*job, self)
        # We want to bisect_right but only on the start time,
        # so we need to do it manually if they are equal.
        index = bisect.bisect_left(self.availability, job)
        if index < len(self.availability):
            index += (job.start == self.availability[index].start)
        index -= 1  # select the corresponding free slot
        slot = self.availability[index]
        if slot.start > job.start or slot.stop is not self.time_horizon and job.stop > slot.stop:
            raise ValueError('Requested time slot not available')
        if job == slot:
            del self.availability[index]
        elif job.start == slot.start:
            slot.start = job.stop
        elif job.stop == slot.stop:
            slot.stop = job.start
        else:
            slot_end = slot.stop
            slot.stop = job.start
            self.availability.insert(index+1, TimeSlot(job.stop, slot_end, self))

A Lane object can be used as follows:

lane = Lane(start=time(9), nr=1)
print(lane.availability)
lane.add_job([time(11), time(14)])
print(lane.availability)

which outputs:

[(1) [datetime.time(9, 0), ...]]
[(1) [datetime.time(9, 0), datetime.time(11, 0)],
 (1) [datetime.time(14, 0), ...]]

After adding the job, the availability gets updated as well.

Now we could use several of these lane objects together to represent a full schedule. Jobs can be added as required and the availability will be updated automatically:

class Schedule:
    def __init__(self, n_lanes, start):
        self.lanes = [Lane(start, nr=i) for i in range(n_lanes)]

    def add_job(self, job):
        for lane in self.lanes:
            try:
                lane.add_job(job)
            except ValueError:
                pass
            else:
                break

Testing on example schedule

from pprint import pprint

# Example jobs from OP.
jobs = [(time(10), time(15)),
        (time(9), time(11)),
        (time(12, 30), time(16)),
        (time(10), time(18))]

schedule = Schedule(3, start=time(9))
for job in jobs:
    schedule.add_job(job)

for lane in schedule.lanes:
    pprint(lane.availability)

which outputs:

[(0) [datetime.time(9, 0), datetime.time(10, 0)],
 (0) [datetime.time(15, 0), ...]]
[(1) [datetime.time(11, 0), datetime.time(12, 30)],
 (1) [datetime.time(16, 0), ...]]
[(2) [datetime.time(9, 0), datetime.time(10, 0)],
 (2) [datetime.time(18, 0), ...]]

Automatic load balancing for jobs

We can create a dedicated tree-like structure which keeps track of the time slots of all lanes for selecting the best suited slot when registering a new job. A node in the tree represents a single time slot and its children are all time slots that are contained within that slot. Then, when registering a new job, we can search the tree to find an optimal slot. The tree and the lanes share the same time slots so we only need to adjust slots manually when they are either deleted or new ones are inserted. Here is the relevant code; it is a bit lengthy (just a quick draft):

import itertools as it

class OneStepBuffered:
    """Can back up elements that are consumed by `it.takewhile`.
       From: https://stackoverflow.com/a/30615837/3767239
    """
    _sentinel = object()

    def __init__(self, it):
        self._it = iter(it)
        self._last = self._sentinel
        self._next = self._sentinel

    def __iter__(self):
        return self

    def __next__(self):
        sentinel = self._sentinel
        if self._next is not sentinel:
            next_val, self._next = self._next, sentinel
            return next_val
        try:
            self._last = next(self._it)
            return self._last
        except StopIteration:
            self._last = self._next = sentinel
            raise

    def step_back(self):
        if self._last is self._sentinel:
            raise ValueError("Can't back up a step")
        self._next, self._last = self._last, self._sentinel


class SlotTree:
    def __init__(self, slot, subslots, parent=None):
        self.parent = parent
        self.slot = slot
        self.subslots = []
        slots = OneStepBuffered(subslots)
        for slot in slots:
            subslots = it.takewhile(lambda x: x.stop <= slot.stop, slots)
            self.subslots.append(SlotTree(slot, subslots, self))
            try:
                slots.step_back()
            except ValueError:
                break

    def __str__(self):
        sub_repr = ['\n|   '.join(str(slot).split('\n'))
                    for slot in self.subslots]
        sub_repr = [f'|   {x}' for x in sub_repr]
        sub_repr = '\n'.join(sub_repr)
        sep = '\n' if sub_repr else ''
        return f'{self.slot}{sep}{sub_repr}'

    def find_minimal_containing_slot(self, slot):
        try:
            return min(self.find_containing_slots(slot),
                       key=lambda x: x.slot.stop.second - x.slot.start.second)
        except ValueError:
            raise ValueError('Requested time slot not available') from None

    def find_containing_slots(self, slot):
        for candidate in self.subslots:
            if slot in candidate.slot:
                yield from candidate.find_containing_slots(slot)
                yield candidate

    @classmethod
    def from_slots(cls, slots):
        # Ascending in start time, descending in stop time (secondary).
        return cls(cls.__name__, sorted(slots))


class Schedule:
    def __init__(self, n_lanes, start):
        self.lanes = [Lane(start, i+1) for i in range(n_lanes)]
        self.slots = SlotTree.from_slots(
            s for lane in self.lanes for s in lane.availability)

    def add_job(self, job):
        if not isinstance(job, TimeSlot):
            job = TimeSlot(*job, lane=None)
        # Minimal containing slot is one possible strategy,
        # others can be implemented as well.
        slot = self.slots.find_minimal_containing_slot(job)
        lane = slot.slot.lane
        if job == slot.slot:
            slot.parent.subslots.remove(slot)
        elif job.start > slot.slot.start and job.stop < slot.slot.stop:
            slot.parent.subslots.insert(
                slot.parent.subslots.index(slot) + 1,
                SlotTree(TimeSlot(job.stop, slot.slot.stop, lane), [], slot.parent))
        lane.add_job(job)

Now we can use the Schedule class to automatically assign jobs to lanes and update their availability:

if __name__ == '__main__':
    jobs = [(time(10), time(15)),  # example from OP
            (time(9), time(11)),
            (time(12, 30), time(16)),
            (time(10), time(18))]

    schedule = Schedule(3, start=time(9))
    print(schedule.slots, end='\n\n')
    for job in jobs:
        print(f'Adding {TimeSlot(*job, "new slot")}')
        schedule.add_job(job)
        print(schedule.slots, end='\n\n')

which outputs:

SlotTree
|   (1) [datetime.time(9, 0), ...]
|   (2) [datetime.time(9, 0), ...]
|   (3) [datetime.time(9, 0), ...]

Adding (new slot) [datetime.time(10, 0), datetime.time(15, 0)]
SlotTree
|   (1) [datetime.time(9, 0), datetime.time(10, 0)]
|   (1) [datetime.time(15, 0), ...]
|   (2) [datetime.time(9, 0), ...]
|   (3) [datetime.time(9, 0), ...]

Adding (new slot) [datetime.time(9, 0), datetime.time(11, 0)]
SlotTree
|   (1) [datetime.time(9, 0), datetime.time(10, 0)]
|   (1) [datetime.time(15, 0), ...]
|   (2) [datetime.time(11, 0), ...]
|   (3) [datetime.time(9, 0), ...]

Adding (new slot) [datetime.time(12, 30), datetime.time(16, 0)]
SlotTree
|   (1) [datetime.time(9, 0), datetime.time(10, 0)]
|   (1) [datetime.time(15, 0), ...]
|   (2) [datetime.time(11, 0), datetime.time(12, 30)]
|   (2) [datetime.time(16, 0), ...]
|   (3) [datetime.time(9, 0), ...]

Adding (new slot) [datetime.time(10, 0), datetime.time(18, 0)]
SlotTree
|   (1) [datetime.time(9, 0), datetime.time(10, 0)]
|   (1) [datetime.time(15, 0), ...]
|   (2) [datetime.time(11, 0), datetime.time(12, 30)]
|   (2) [datetime.time(16, 0), ...]
|   (3) [datetime.time(9, 0), datetime.time(10, 0)]
|   (3) [datetime.time(18, 0), ...]

The numbers (i) indicate the lane number and the [] indicate the available time slots on that lane. A ... indicates "open end" (time horizon). As we can see, the tree doesn't restructure itself when time slots are adjusted; this would be a possible improvement. Ideally for each new job, the corresponding best fitting time slot would be popped from the tree and then, depending on how the job fits in the slot, an adjusted version and possibly new slots are pushed back to the tree (or none at all if the job fits the slot exactly).
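
The "pop, split, push back" step can be illustrated in isolation. The sketch below is not part of the classes above: split_slot is a hypothetical helper working on plain (start, stop) tuples with a finite stop, so the open-ended time horizon is not handled.

```python
from datetime import time


def split_slot(slot, job):
    """Return the free pieces of `slot` that remain after placing `job` in it."""
    slot_start, slot_stop = slot
    job_start, job_stop = job
    if job_start < slot_start or job_stop > slot_stop:
        raise ValueError("job does not fit in slot")
    pieces = []
    if slot_start < job_start:
        pieces.append((slot_start, job_start))   # free time before the job
    if job_stop < slot_stop:
        pieces.append((job_stop, slot_stop))     # free time after the job
    return pieces


slot = (time(9), time(18))
middle = split_slot(slot, (time(11), time(14)))  # two pieces remain
front = split_slot(slot, (time(9), time(11)))    # one piece remains
exact = split_slot(slot, (time(9), time(18)))    # nothing remains
print(middle, front, exact, sep="\n")
```

The zero, one or two returned pieces are what would be pushed back into the tree in place of the popped slot.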

The above examples consider just a single day and time objects, but the code is easy to extend for usage with datetime objects as well.
