Filtering a relationship attribute in SQLAlchemy

I have some code with a Widget object that must undergo some processing periodically. Widgets have a relationship to a Process object that tracks individual processing attempts and holds data about those attempts, such as state information, start and end times, and the result. The relationship looks something like this:

from sqlalchemy import Boolean, Column, DateTime, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Widget(Base):
   __tablename__ = 'widget'
   id = Column(Integer, primary_key=True)
   name = Column(String)
   attempts = relationship('Process')

class Process(Base):
   __tablename__ = 'process'
   id = Column(Integer, primary_key=True)
   widget_id = Column(Integer, ForeignKey('widget.id'))
   start = Column(DateTime)
   end = Column(DateTime)
   success = Column(Boolean)

I want to have a method on Widget to check whether it's time to process that widget yet, or not. It needs to look at all the attempts, find the most recent successful one, and see if it is older than the threshold.

One option is to iterate over Widget.attempts using a list comprehension. Assuming now and delay are reasonable datetime and timedelta objects, something like this works when defined as a method on Widget:

def ready(self):
   recent_success = [attempt for attempt in self.attempts if attempt.success is True and attempt.end >= now - delay]
   if recent_success:
      return False
   return True

That seems like good idiomatic Python, but it doesn't make good use of the SQL database backing the data, and it's probably less efficient than running a similar SQL query, especially once there are a large number of Process objects in the attempts list. I'm having a hard time figuring out the best way to implement this as a query, though.

It's easy enough to run the query inside Widget something like this:

def ready(self):
   recent_success = session.query(Process).filter(
      and_(
         Process.widget_id == self.id,
         Process.success == True,
         Process.end >= now - delay
      )
   ).order_by(Process.end.desc()).first()
   if recent_success:
      return False
   return True

But I run into problems in unit tests with getting session set properly inside the module that defines Widget. It seems to me that's a poor style choice, and probably not how SQLAlchemy objects are meant to be structured.

I could make the ready() function something external to the Widget class, which would fix the problems with setting session in unit tests, but that seems like poor OO structure.

I think the ideal would be if I could somehow filter Widget.attempts with SQL-like code that's more efficient than a list comprehension, but I haven't found anything that suggests that's possible.

What is actually the best approach for something like this?

You are thinking in the right direction. Any solution implemented as a method on a Widget instance means you have to load every instance and check them one by one. Doing the check in an external query performs better and is easier to test.

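You can get all the Widget instances that are due for processing, meaning those with no successful attempt newer than the threshold, using this query:

from sqlalchemy import and_

q = (
    session
    .query(Widget)
    # ~ negates any(): keep widgets WITHOUT a recent successful attempt
    .filter(~Widget.attempts.any(and_(
        Process.success == True,
        Process.end >= now - delay,
    )))
)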

widgets_to_process = q.all()
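
To make the behaviour of that query concrete, here is a minimal self-contained sketch against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or later; the helper name widgets_due, the sample widget names, and the fixed now and delay values are made up for illustration and are not part of the original code.

```python
from datetime import datetime, timedelta

from sqlalchemy import (Boolean, Column, DateTime, ForeignKey, Integer,
                        String, and_, create_engine)
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Widget(Base):
    __tablename__ = "widget"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    attempts = relationship("Process")


class Process(Base):
    __tablename__ = "process"
    id = Column(Integer, primary_key=True)
    widget_id = Column(Integer, ForeignKey("widget.id"))
    start = Column(DateTime)
    end = Column(DateTime)
    success = Column(Boolean)


def widgets_due(session, now, delay):
    """Hypothetical helper: widgets with no successful attempt newer than now - delay."""
    recent_success = and_(
        Process.success == True,  # noqa: E712 (SQL expression, not a Python comparison)
        Process.end >= now - delay,
    )
    return (
        session.query(Widget)
        .filter(~Widget.attempts.any(recent_success))  # NOT EXISTS (...)
        .order_by(Widget.name)
        .all()
    )


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

now = datetime(2024, 1, 10, 12, 0)
delay = timedelta(days=1)

with Session(engine) as session:
    session.add_all([
        # Succeeded an hour ago: not due yet.
        Widget(name="fresh", attempts=[Process(end=now - timedelta(hours=1), success=True)]),
        # Last success is three days old: due.
        Widget(name="stale", attempts=[Process(end=now - timedelta(days=3), success=True)]),
        # Recent attempt, but it failed: due.
        Widget(name="failed", attempts=[Process(end=now - timedelta(hours=1), success=False)]),
        # Never attempted: due.
        Widget(name="never"),
    ])
    session.commit()
    due_names = [w.name for w in widgets_due(session, now, delay)]
```

Because the filter is a single NOT EXISTS subquery, the database does the work and only the widgets that are actually due get loaded. Passing the session in as an argument also keeps the function easy to call from a unit test.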

If you really want to have a method on the model, I would not create a separate query, but just use the relationship:

def ready(self, at_time):
    # at_time = now - delay
    recent_successes = [
        attempt
        for attempt in self.attempts
        if attempt.success and attempt.end >= at_time
    ]
    # The widget is ready only when there is no recent successful attempt.
    return not recent_successes
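
If the in-Python filter turns out to be too slow, one possible middle ground is to run the query inside the method but take the session from the instance itself with sqlalchemy.orm.object_session, so the model module needs no module-level session. The sketch below is only an illustration under these assumptions: SQLAlchemy 1.4 or later, a widget that is attached to a session, and now and delay passed in as arguments (my choice, not the original signature).

```python
from datetime import datetime, timedelta

from sqlalchemy import (Boolean, Column, DateTime, ForeignKey, Integer,
                        String, create_engine)
from sqlalchemy.orm import (Session, declarative_base, object_session,
                            relationship)

Base = declarative_base()


class Widget(Base):
    __tablename__ = "widget"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    attempts = relationship("Process")

    def ready(self, now, delay):
        """True when no successful attempt ended at or after now - delay."""
        # object_session() returns the Session this instance belongs to,
        # or None when the instance is transient or detached.
        session = object_session(self)
        if session is None:
            raise RuntimeError("Widget must be attached to a session")
        recent_success = (
            session.query(Process.id)
            .filter(
                Process.widget_id == self.id,
                Process.success == True,  # noqa: E712 (SQL expression)
                Process.end >= now - delay,
            )
            .first()
        )
        return recent_success is None


class Process(Base):
    __tablename__ = "process"
    id = Column(Integer, primary_key=True)
    widget_id = Column(Integer, ForeignKey("widget.id"))
    start = Column(DateTime)
    end = Column(DateTime)
    success = Column(Boolean)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

now = datetime(2024, 1, 10, 12, 0)
delay = timedelta(days=1)

with Session(engine) as session:
    session.add_all([
        Widget(name="fresh", attempts=[Process(end=now - timedelta(hours=1), success=True)]),
        Widget(name="stale", attempts=[Process(end=now - timedelta(days=3), success=True)]),
    ])
    session.commit()
    results = {w.name: w.ready(now, delay) for w in session.query(Widget)}
```

A unit test can then attach the widget to any session it likes, such as the in-memory SQLite one above, without patching a global. The trade-off is one query per widget, so the set-based query is still the better fit when many widgets have to be checked at once.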
