Filtering a relationship attribute in SQLAlchemy

I have some code with a Widget object that must undergo some processing periodically. Widgets have a relationship to a Process object that tracks individual processing attempts and holds data about those attempts, such as state information, start and end times, and the result. The relationship looks something like this:

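from sqlalchemy import Boolean, Column, DateTime, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship  # SQLAlchemy 1.4+

Base = declarative_base()

class Widget(Base):
    __tablename__ = 'widget'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    attempts = relationship('Process')  # every processing attempt for this widget

class Process(Base):
    __tablename__ = 'process'
    id = Column(Integer, primary_key=True)
    widget_id = Column(Integer, ForeignKey('widget.id'))
    start = Column(DateTime)
    end = Column(DateTime)
    success = Column(Boolean)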

I want a method on Widget that checks whether it is time to process that widget yet. It needs to look at all the attempts, find the most recent successful one, and see whether it is older than the threshold.

One option is to iterate over Widget.attempts using a list comprehension. Assuming now and delay are reasonable datetime and timedelta objects, something like this works when defined as a method on Widget:

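def ready(self):
    # now (datetime) and delay (timedelta) are assumed to be defined elsewhere
    recent_success = [
        attempt for attempt in self.attempts
        if attempt.success is True and attempt.end >= now - delay
    ]
    # Ready only when there is no recent successful attempt
    return not recent_success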

That seems like good idiomatic Python, but it doesn't make good use of the SQL database backing the data, and it's probably less efficient than running a similar SQL query, especially once there are a large number of Process objects in the attempts list. I'm having a hard time figuring out the best way to implement this as a query, though.

It's easy enough to run the query inside Widget, something like this:

def ready(self):
    # session, now and delay are module-level names here; and_ comes from sqlalchemy
    recent_success = session.query(Process).filter(
        and_(
            Process.widget_id == self.id,
            Process.success == True,
            Process.end >= now - delay
        )
    ).order_by(Process.end.desc()).first()
    if recent_success:
        return False
    return True

But I run into problems in unit tests getting session set up properly inside the module that defines Widget. That seems to me like a poor style choice, and probably not how SQLAlchemy objects are meant to be structured.

I could make the ready() function something external to the Widget class, which would fix the problems with setting session in unit tests, but that seems like poor OO structure.

I think the ideal would be to somehow filter Widget.attempts with SQL-like code that's more efficient than a list comprehension, but I haven't found anything that suggests that's possible.

What is actually the best approach for something like this?

You are thinking in the right direction. Any solution that lives on a Widget instance implies that you have to load and check every instance one by one. Moving the check into an external function gives better performance and is easier to test.

You can get all the Widget instances that need to be scheduled for processing with this query:

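# Widgets with no successful attempt that ended at or after now - delay.
# The any() condition is negated: a recent success means the widget is NOT due.
q = (
    session
    .query(Widget)
    .filter(~Widget.attempts.any(and_(
        Process.success == True,
        Process.end >= now - delay,
    )))
)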

widgets_to_process = q.all()
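
For anyone who wants to try this out, here is a minimal self-contained sketch of the same idea against an in-memory SQLite database. The helper name widgets_needing_processing, the sample widget names, and the dates are all made up for illustration. The any() filter is wrapped in `~` so that a widget is selected only when it has no recent successful attempt, which also picks up widgets that have never been attempted.

```python
from datetime import datetime, timedelta

from sqlalchemy import (Boolean, Column, DateTime, ForeignKey, Integer,
                        String, and_, create_engine)
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class Widget(Base):
    __tablename__ = 'widget'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    attempts = relationship('Process')


class Process(Base):
    __tablename__ = 'process'
    id = Column(Integer, primary_key=True)
    widget_id = Column(Integer, ForeignKey('widget.id'))
    start = Column(DateTime)
    end = Column(DateTime)
    success = Column(Boolean)


def widgets_needing_processing(session, now, delay):
    """Hypothetical helper: widgets with no success at or after now - delay."""
    recent_success = Widget.attempts.any(and_(
        Process.success == True,  # noqa: E712 (SQL expression, not a Python test)
        Process.end >= now - delay,
    ))
    # ~ turns the EXISTS subquery into NOT EXISTS
    return (
        session.query(Widget)
        .filter(~recent_success)
        .order_by(Widget.name)
        .all()
    )


# Made-up sample data in an in-memory SQLite database
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

now = datetime(2024, 1, 10, 12, 0)
delay = timedelta(days=1)
session.add_all([
    Widget(name='fresh', attempts=[Process(end=now - timedelta(hours=2), success=True)]),
    Widget(name='stale', attempts=[Process(end=now - timedelta(days=3), success=True)]),
    Widget(name='failed', attempts=[Process(end=now - timedelta(hours=1), success=False)]),
    Widget(name='never', attempts=[]),
])
session.commit()

due = [w.name for w in widgets_needing_processing(session, now, delay)]
print(due)  # ['failed', 'never', 'stale']
```

Because the session is passed in as an argument, a unit test can hand the function a throwaway session like the one above instead of patching a module-level name.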

If you really want to have a method on the model, I would not create a separate query, but just use the relationship:

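def ready(self, at_time):
    # at_time is the cutoff, i.e. now - delay
    recent_success = any(
        # skip attempts that have not finished yet (end is None)
        attempt.success and attempt.end is not None and attempt.end >= at_time
        for attempt in self.attempts
    )
    # A recent success means the widget does not need processing yet
    return not recent_success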
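
A possible middle ground, which is not part of the answer above: if the attempts list grows large, the method can still run a SQL query without a module-level session by asking SQLAlchemy which session the instance belongs to, using sqlalchemy.orm.object_session. The sketch below assumes the widget is attached to a session when ready() is called; the RuntimeError and the sample data are illustrative choices only.

```python
from datetime import datetime, timedelta

from sqlalchemy import (Boolean, Column, DateTime, ForeignKey, Integer,
                        String, create_engine)
from sqlalchemy.orm import (declarative_base, object_session, relationship,
                            sessionmaker)

Base = declarative_base()


class Process(Base):
    __tablename__ = 'process'
    id = Column(Integer, primary_key=True)
    widget_id = Column(Integer, ForeignKey('widget.id'))
    start = Column(DateTime)
    end = Column(DateTime)
    success = Column(Boolean)


class Widget(Base):
    __tablename__ = 'widget'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    attempts = relationship('Process')

    def ready(self, at_time):
        """True when no successful attempt ended at or after at_time."""
        # object_session() returns None if the instance is not in a session
        session = object_session(self)
        if session is None:
            raise RuntimeError('Widget must be attached to a session')
        recent_success = (
            session.query(Process.id)
            .filter(
                Process.widget_id == self.id,
                Process.success == True,  # noqa: E712 (SQL expression)
                Process.end >= at_time,
            )
            .first()
        )
        return recent_success is None


# Made-up sample data in an in-memory SQLite database
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

now = datetime(2024, 1, 10, 12, 0)
fresh = Widget(name='fresh', attempts=[Process(end=now - timedelta(hours=2), success=True)])
stale = Widget(name='stale', attempts=[Process(end=now - timedelta(days=3), success=True)])
session.add_all([fresh, stale])
session.commit()

cutoff = now - timedelta(days=1)  # at_time = now - delay
print(fresh.ready(cutoff), stale.ready(cutoff))  # False True
```

This issues one query per widget, so the set-based query is still the better fit when many widgets have to be checked at once.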
