
How to match a list of dates to a pattern?

I have a Python list of tuples, each holding three objects: a string (e.g. a title), a date, and another string (e.g. a name).

Example:

import datetime

scientific_works = [
    ('SW 1', datetime.date(2000, 10, 15), 'auth 1'),
    ('SW 2', datetime.date(2000, 11, 3), 'auth 1'),
    ('SW 3', datetime.date(2000, 11, 4), 'auth 1'),
    ('SW 4', datetime.date(2000, 12, 1), 'auth 1'),
]

Then I have a pattern:

from date until date, (at least) int items from the list per int days/weeks/months/years

Example:

from  datetime.date(2000, 11, 1)
until datetime.date(2000, 11, 30)
1 item per day

What I would like the algorithm to do:

  • Given that list and that pattern, do the filtered items match the rules?

In the example above, the date range selects 2 items (SW 2 and SW 3). Each of them satisfies "1 item per day" for its own day, but since there isn't an item for every day block in the range, the algorithm would return false.

Another example:

  • Is there (at least) Int_1 amount of items (works) per Int_2 (day/week/month)?
    • 1 work per day would mean at least 1 item in each 1-day block of the given date range. 2 works per week would mean at least 2 works in each week (or 7-day block) of the date range.

I can iterate over the list and find out which items fall between the from and until dates, of course.

However, I am really confused about matching them against the rest of the rules to see whether it's a positive or a negative match.

My question:

  • How can I construct an algorithm, provide it with a list and a pattern of rules (x items per y days OR weeks OR months OR years), and see if it matches or not?

I am working on a small component for an application that decides, given certain data (the list) and rules (the pattern), whether an author unlocks a reward or not.

I have completed several of Udacity's Python classes, including most of the algorithms material, but I really cannot find my way around this.

So far I thought of this:

  1. Filter the list items by the given date range.
  2. Calculate the number of blocks within the range, e.g. 1-day blocks from d1 until d2 = 5 days; 1-week blocks from d1 until d2 = 3 weeks.
  3. Create a loop over the range of the int calculated above.
  4. Convert weeks, months, years to days in each step of the loop.
  5. Add that amount to the start date and see if any items fall within the resulting block.
  6. Use the end of that block as the next start date and repeat.

However, this doesn't work and I don't think converting blocks to days is efficient at all.

Thank you.

Can you post a better example of the rules that the match has to follow? Are you looking for a certain number of items per author per time period? Or are you looking for certain entries over a time period and then finding who they belong to? That will affect the sort.

I think you will end up having to use a sort algorithm on this data, which is not horrible if you go about it the right way.

From the bottom part of your question I think that if you are searching for x items per n time periods (day/week/month) and then determining the authors, it might get a bit messy. If you have a finite number of authors, it might be easier to flip that around: create an array for each author and store the items and dates in there. Then you just run a testing loop over each author that checks all their entries to see if they fit the requirements.
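A minimal sketch of that per-author idea, in case it helps. The names group_by_author and meets_rule are made up for illustration, blocks are assumed to be fixed runs of days counted from the start date, the end date is assumed to be inclusive, and a short final block is assumed to need the full minimum as well:

```python
from collections import defaultdict
from datetime import date, timedelta

def group_by_author(works):
    """Map each author name to the sorted list of that author's work dates."""
    by_author = defaultdict(list)
    for _title, when, author in works:
        by_author[author].append(when)
    return {author: sorted(dates) for author, dates in by_author.items()}

def meets_rule(dates, start, end, min_items, block_days):
    """True if every block of `block_days` days in [start, end] holds
    at least `min_items` of the given dates (assumption: `end` is inclusive)."""
    step = timedelta(days=block_days)
    last = end + timedelta(days=1)            # exclusive upper bound of the range
    block_start = start
    while block_start < last:
        block_end = min(block_start + step, last)   # exclusive end of this block
        count = sum(1 for d in dates if block_start <= d < block_end)
        if count < min_items:
            return False
        block_start = block_end
    return True

scientific_works = [
    ('SW 1', date(2000, 10, 15), 'auth 1'),
    ('SW 2', date(2000, 11, 3), 'auth 1'),
    ('SW 3', date(2000, 11, 4), 'auth 1'),
    ('SW 4', date(2000, 12, 1), 'auth 1'),
]

# "1 item per day" over November, checked separately for each author
results = {
    author: meets_rule(dates, date(2000, 11, 1), date(2000, 11, 30), 1, 1)
    for author, dates in group_by_author(scientific_works).items()
}
```

This rescans an author's dates for every block, which should be fine for short lists; with sorted dates a single pass would also work.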

For Python classes, MIT OpenCourseWare's 6.00 Introduction to Computer Science and Programming is very good. It can be found at http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-00-introduction-to-computer-science-and-programming-fall-2008/

I would use the following design: a main generator function which iterates over the sequence of works and yields the "good" ones, and a set of pluggable filters which implement particular rules, such as date range, N items per day, per week, per month, etc.

The following is a small example to illustrate the idea:

from datetime import date
from pprint import pprint

scientific_works = [
    ('SW 1', date(2000, 10, 15), 'auth 1'),
    ('SW 2', date(2000, 11, 3), 'auth 1'),
    ('SW 3', date(2000, 11, 4), 'auth 1'),
    ('SW 4', date(2000, 11, 5), 'auth 1'),
    ('SW 5', date(2000, 12, 1), 'auth 1'),
    ('SW 6', date(2000, 12, 15), 'auth 1'),
    ('SW 7', date(2000, 12, 18), 'auth 1'),
    ('SW 8', date(2000, 12, 22), 'auth 1'),
]

def filter_works(works, *filters):
    for work in works:
        good = True
        for fil in filters:
            good = good and fil(work)
        if good:
            yield work

class RangeFilter(object):
    def __init__(self, from_date, to_date):
        self.from_date = from_date
        self.to_date = to_date

    def __call__(self, work):
        return self.from_date <= work[1] <= self.to_date


class WorksPerMonthFilter(object):
    def __init__(self, limit):
        self.limit = limit
        self._current_month = date.min
        self._current_number = 0

    def __call__(self, work):
        month = date(work[1].year, work[1].month, 1)
        if month == self._current_month:
            self._current_number += 1
        else:
            self._current_month = month
            self._current_number = 1
        return self._current_number <= self.limit


if __name__ == '__main__':
    pprint(list(filter_works(scientific_works, RangeFilter(date(2000, 10, 1), date(2000, 11, 30)), WorksPerMonthFilter(2))))
    pprint(list(filter_works(scientific_works, RangeFilter(date(2000, 10, 1), date(2000, 12, 31)), WorksPerMonthFilter(2))))
    pprint(list(filter_works(scientific_works, RangeFilter(date(2000, 10, 1), date(2000, 12, 31)), WorksPerMonthFilter(3))))

If the pattern is:

from  start_date
until end_date
X items per period

then to find out whether scientific_works matches the pattern, an analog of the numpy.histogram() function could be used:

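import datetime
import numpy as np

ts = datetime.date.toordinal  # or any monotonic numeric `date` function
# bin edges: the start of every period, plus end_date as the right edge of the last bin
edges = [ts(d) for d in daterange(start_date, end_date, period)] + [ts(end_date)]
# np.histogram returns (counts, bin_edges); only the counts are needed
hist = np.histogram([ts(d) for title, d, name in scientific_works], bins=edges)[0]
does_it_match = all(x >= X for x in hist)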

where:

def daterange(start_date, end_date, period):
    d = start_date
    while d < end_date:
        yield d
        d += period

Example:

>>> from datetime import date, timedelta
>>> list(daterange(date(2000, 1, 1), date(2000, 2, 1), timedelta(days=7)))
[datetime.date(2000, 1, 1), datetime.date(2000, 1, 8),
 datetime.date(2000, 1, 15), datetime.date(2000, 1, 22),
 datetime.date(2000, 1, 29)]
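One thing to watch with daterange(): a timedelta can only express a fixed number of days, so "per month" or "per year" blocks cannot be stepped that way. A rough sketch for calendar months, counting items per (year, month) instead of building bin edges. The names months_between and matches_per_month are made up for illustration, and the until date is assumed to be inclusive:

```python
from collections import Counter
from datetime import date

def months_between(start, end):
    """Yield (year, month) for every calendar month touched by [start, end]."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1

def matches_per_month(works, start, end, min_items):
    """True if every calendar month in [start, end] has at least `min_items` works."""
    counts = Counter(
        (d.year, d.month) for _title, d, _name in works if start <= d <= end
    )
    # Counter returns 0 for months that have no works at all
    return all(counts[ym] >= min_items for ym in months_between(start, end))

scientific_works = [
    ('SW 1', date(2000, 10, 15), 'auth 1'),
    ('SW 2', date(2000, 11, 3), 'auth 1'),
    ('SW 3', date(2000, 11, 4), 'auth 1'),
    ('SW 4', date(2000, 12, 1), 'auth 1'),
]

one_per_month = matches_per_month(
    scientific_works, date(2000, 10, 1), date(2000, 12, 31), 1)
two_per_month = matches_per_month(
    scientific_works, date(2000, 10, 1), date(2000, 12, 31), 2)
```

The same shape should work for years by keying the Counter on d.year alone.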
