How to match a list of dates to a pattern?
I have a Python list of tuples with three objects: a string (e.g. a title), a date, and another string (e.g. a name).
Example:
import datetime

scientific_works = [
    ('SW 1', datetime.date(2000, 10, 15), 'auth 1'),
    ('SW 2', datetime.date(2000, 11, 3), 'auth 1'),
    ('SW 3', datetime.date(2000, 11, 4), 'auth 1'),
    ('SW 4', datetime.date(2000, 12, 1), 'auth 1'),
]
Then I have a pattern:
from date until date, (at least) int items from list per int days/weeks/months/years
Example:
from datetime.date(2000, 11, 1)
until datetime.date(2000, 11, 30)
1 item per day
What I would like the algorithm to do:
In the example above, the from/until range matches 2 items, and each of them satisfies the rule "1 item per day". However, since there isn't an item for every day block in the range, the algorithm would return false.
Another example:
I can iterate over the list and find out which items match the from and until parts of the pattern, of course.
However, I am really confused about matching them against the rest of the rules to see if it's a positive or a negative match.
My question:
I am working on a little component for an application that decides, given certain data (the list) and rules (the pattern), whether an author unlocks a reward or not.
I have completed several of Udacity's Python classes, including most of the algorithms material, but I really cannot find my way around this.
So far I thought of this:
Create a loop over the range of the int calculated above.
However, this doesn't work, and I don't think converting blocks to days is efficient at all.
Thank you.
Can you post a better example of the rules that the match has to follow? Are you looking for a certain number of items per author per time period? Or are you looking for certain entries over a time period and then finding who they belong to? That will affect the sort.
I think you will end up having to use a sort algorithm on this data, which is not horrible if you go about it the right way.
From the bottom part of your question, I think that if you are searching for x items per n time periods (day/week/month) and then determining the authors, it might get a bit messy. If you have a finite number of authors, it might be easier to flip that around: create an array for each author and store the item and date in there. Then you just run a testing loop over each author that checks all their entries to see if they fit the requirements.
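The "one array per author, then a testing loop" idea could be sketched roughly as follows. This is only an illustration under assumptions: `group_by_author`, `authors_passing`, and the `two_in_november` requirement are hypothetical names that do not appear in the thread, and the sample data is made up.

```python
from collections import defaultdict
from datetime import date


def group_by_author(works):
    """Map each author to a date-sorted list of (title, date) pairs."""
    by_author = defaultdict(list)
    for title, day, author in works:
        by_author[author].append((title, day))
    for entries in by_author.values():
        entries.sort(key=lambda entry: entry[1])
    return dict(by_author)


def authors_passing(works, requirement):
    """Return the sorted names of authors whose entries satisfy `requirement`."""
    grouped = group_by_author(works)
    return sorted(author for author, entries in grouped.items()
                  if requirement(entries))


# Made-up sample data with two authors.
works = [
    ('SW 1', date(2000, 10, 15), 'auth 1'),
    ('SW 2', date(2000, 11, 3), 'auth 1'),
    ('SW 3', date(2000, 11, 4), 'auth 1'),
    ('SW 4', date(2000, 11, 10), 'auth 2'),
]


def two_in_november(entries):
    """Hypothetical requirement: at least two works in November 2000."""
    hits = [d for _title, d in entries
            if date(2000, 11, 1) <= d <= date(2000, 11, 30)]
    return len(hits) >= 2


print(authors_passing(works, two_in_november))  # ['auth 1']
```

Keeping the requirement as a plain callable means the per-author loop stays the same whatever rule is plugged in.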
For Python classes, MIT OpenCourseWare's 6.00 Introduction to Computer Science and Programming is very good. It can be found at http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-00-introduction-to-computer-science-and-programming-fall-2008/
I would use the following design: a main generator function that iterates over the sequence of works and yields the "good" ones, plus a set of pluggable filters that implement particular rules, such as a date range, N items per day, per week, per month, etc.
Following is a small example to illustrate the idea:
from datetime import date
from pprint import pprint

scientific_works = [
    ('SW 1', date(2000, 10, 15), 'auth 1'),
    ('SW 2', date(2000, 11, 3), 'auth 1'),
    ('SW 3', date(2000, 11, 4), 'auth 1'),
    ('SW 4', date(2000, 11, 5), 'auth 1'),
    ('SW 5', date(2000, 12, 1), 'auth 1'),
    ('SW 6', date(2000, 12, 15), 'auth 1'),
    ('SW 7', date(2000, 12, 18), 'auth 1'),
    ('SW 8', date(2000, 12, 22), 'auth 1'),
]


def filter_works(works, *filters):
    # Yield only the works accepted by every filter.
    for work in works:
        good = True
        for fil in filters:
            # Short-circuits: later filters are skipped once one rejects.
            good = good and fil(work)
        if good:
            yield work


class RangeFilter(object):
    # Accepts works dated within [from_date, to_date], inclusive.
    def __init__(self, from_date, to_date):
        self.from_date = from_date
        self.to_date = to_date

    def __call__(self, work):
        return self.from_date <= work[1] <= self.to_date


class WorksPerMonthFilter(object):
    # Accepts at most `limit` works per calendar month.
    # Stateful: relies on the works arriving sorted by date.
    def __init__(self, limit):
        self.limit = limit
        self._current_month = date.min
        self._current_number = 0

    def __call__(self, work):
        month = date(work[1].year, work[1].month, 1)
        if month == self._current_month:
            self._current_number += 1
        else:
            self._current_month = month
            self._current_number = 1
        return self._current_number <= self.limit


if __name__ == '__main__':
    pprint(list(filter_works(scientific_works,
                             RangeFilter(date(2000, 10, 1), date(2000, 11, 30)),
                             WorksPerMonthFilter(2))))
    pprint(list(filter_works(scientific_works,
                             RangeFilter(date(2000, 10, 1), date(2000, 12, 31)),
                             WorksPerMonthFilter(2))))
    pprint(list(filter_works(scientific_works,
                             RangeFilter(date(2000, 10, 1), date(2000, 12, 31)),
                             WorksPerMonthFilter(3))))
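The same counting logic as WorksPerMonthFilter could plausibly be reused for the other periods (per day, per week) by passing in a function that labels the period a date falls into. The sketch below is an assumption-laden variation, not part of the original answer: `WorksPerPeriodFilter`, `by_day`, `by_week`, and `by_month` are hypothetical names, and like the original filter it assumes the works arrive sorted by date.

```python
from datetime import date


class WorksPerPeriodFilter:
    """Accept at most `limit` works per period (hypothetical generalization).

    `period_key` maps a date to a hashable label identifying its period.
    Stateful: assumes works are fed in date order.
    """

    def __init__(self, limit, period_key):
        self.limit = limit
        self.period_key = period_key
        self._current = None
        self._count = 0

    def __call__(self, work):
        key = self.period_key(work[1])
        if key == self._current:
            self._count += 1
        else:
            self._current = key
            self._count = 1
        return self._count <= self.limit


def by_day(d):
    return d


def by_week(d):
    # (ISO year, ISO week number); ISO weeks start on Monday.
    return tuple(d.isocalendar())[:2]


def by_month(d):
    return (d.year, d.month)


# Made-up sample: 3 and 4 Nov 2000 share an ISO week, 6 Nov starts the next.
works = [
    ('SW 2', date(2000, 11, 3), 'auth 1'),
    ('SW 3', date(2000, 11, 4), 'auth 1'),
    ('SW 4', date(2000, 11, 6), 'auth 1'),
]

per_week = WorksPerPeriodFilter(1, by_week)
print([work[0] for work in works if per_week(work)])  # ['SW 2', 'SW 4']
```

Each filter instance carries its own counter, so a fresh instance is needed for every pass over the data.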
If the pattern is:
from start_date until end_date X items per period
then to find out whether scientific_works matches the pattern, an analog of the numpy.histogram() function could be used:
import datetime
import numpy as np

ts = datetime.date.toordinal  # or any monotonic numeric `date` function
# Lists rather than bare map(): np.histogram needs sequences on Python 3.
values = [ts(d) for title, d, name in scientific_works]
edges = [ts(d) for d in daterange(start_date, end_date, period)]
hist = np.histogram(values, bins=edges)[0]  # [0] selects the per-bin counts
does_it_match = all(x >= X for x in hist)
where:
def daterange(start_date, end_date, period):
    d = start_date
    while d < end_date:
        yield d
        d += period
Example:
>>> from datetime import date, timedelta
>>> list(daterange(date(2000, 1, 1), date(2000, 2, 1), timedelta(days=7)))
[datetime.date(2000, 1, 1), datetime.date(2000, 1, 8),
datetime.date(2000, 1, 15), datetime.date(2000, 1, 22),
datetime.date(2000, 1, 29)]
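For periods that are a whole number of days, the same per-block counting can be sketched without NumPy. The function below is a hypothetical illustration (the name `matches_pattern` and its signature are assumptions, not from the answer): it treats the range as inclusive on both ends, counts a trailing partial block as a block of its own, and does not handle calendar months or years, which a `timedelta` cannot express.

```python
from datetime import date, timedelta


def matches_pattern(works, start_date, end_date, min_items, period):
    """True if every `period`-long block in [start_date, end_date] contains
    at least `min_items` works. Assumes `period` is a whole number of days."""
    total_days = (end_date - start_date).days + 1  # inclusive range
    n_blocks = (total_days + period.days - 1) // period.days  # ceiling division
    counts = [0] * n_blocks
    for _title, day, _name in works:
        if start_date <= day <= end_date:
            counts[(day - start_date).days // period.days] += 1
    return all(count >= min_items for count in counts)


# The data and pattern from the question.
scientific_works = [
    ('SW 1', date(2000, 10, 15), 'auth 1'),
    ('SW 2', date(2000, 11, 3), 'auth 1'),
    ('SW 3', date(2000, 11, 4), 'auth 1'),
    ('SW 4', date(2000, 12, 1), 'auth 1'),
]

one_day = timedelta(days=1)
# Only 2 of the 30 day blocks in November contain an item.
print(matches_pattern(scientific_works, date(2000, 11, 1),
                      date(2000, 11, 30), 1, one_day))  # False
# Both day blocks in this shorter range contain an item.
print(matches_pattern(scientific_works, date(2000, 11, 3),
                      date(2000, 11, 4), 1, one_day))   # True
```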