How can I speed up hybrid property queries in SQLAlchemy?

Is there a good way to speed up queries on SQLAlchemy hybrid properties that involve relationships? I have the following two tables:

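# Imports assume SQLAlchemy 1.4 or newer.
from sqlalchemy import Boolean, Column, ForeignKey, Integer, func, select
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Child(Base):
    __tablename__ = 'Child'
    id = Column(Integer, primary_key=True)
    is_boy = Column(Boolean, default=False)
    parent_id = Column(Integer, ForeignKey('Parent.id'))


class Parent(Base):
    __tablename__ = 'Parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", backref="parent")

    @hybrid_property
    def children_count(self):
        # Instance level: count the loaded children in Python.
        return len(self.children)

    @children_count.expression
    def children_count(cls):
        # Class level: correlated subquery counting this parent's children.
        return (select(func.count(Child.id)).
                where(Child.parent_id == cls.id).
                label("children_count"))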

When I query Parent.children_count across 50,000 rows (each parent has roughly 2 children on average), it's pretty slow. Is there a good way, through indexes or something else, to speed these queries up?

By default, PostgreSQL doesn't create an index on the referencing column of a foreign key, so the lookup on Child.parent_id in your subquery has no index to use.

So the first thing I'd do is add an index, which SQLAlchemy makes really easy:

parent_id = Column(Integer, ForeignKey('Parent.id'), index=True)

This will probably give you fast enough retrieval given the size of your current dataset. Try it and see. Be sure to run the query a few times in a row to warm up the PostgreSQL cache.
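One caveat worth spelling out: index=True only takes effect when SQLAlchemy creates the table, because create_all() skips tables that already exist. If your Child table is already there, something like the following sketch may help. It builds the index explicitly and checks that it exists. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the index name ix_Child_parent_id is my own choice, not something from your schema.

```python
from sqlalchemy import Column, ForeignKey, Index, Integer, create_engine, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Parent(Base):
    __tablename__ = "Parent"
    id = Column(Integer, primary_key=True)


class Child(Base):
    __tablename__ = "Child"
    id = Column(Integer, primary_key=True)
    # Deliberately no index=True: this mimics a table created before the fix.
    parent_id = Column(Integer, ForeignKey("Parent.id"))


# Assumption: in-memory SQLite stands in for the real PostgreSQL database.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# Hypothetical index name; emits CREATE INDEX against the existing table.
child_parent_idx = Index("ix_Child_parent_id", Child.__table__.c.parent_id)
child_parent_idx.create(engine)

# Confirm the index is now present on the table.
index_names = [ix["name"] for ix in inspect(engine).get_indexes("Child")]
print(index_names)
```

In a real project you would more likely do this through a migration tool, but the effect on the database is the same CREATE INDEX statement.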

For a larger dataset, or if the queries still aren't fast enough, you could look into pre-calculating the counts and caching them. There are a number of ways to cache. The easiest hack is probably to add an extra column to your Parent table and write app logic that increments the count whenever a new child is added. It's a little hacky that way. Another option is caching the count in Redis or memcached, or even using a materialized view (a great solution if it's okay for the count to occasionally be out of date by a few minutes).
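Here is a rough sketch of the extra-column approach, under a few assumptions: the column name cached_children_count and the helper add_child are made up for illustration, and in-memory SQLite stands in for PostgreSQL. The increment is done in SQL rather than in Python so that two concurrent writers don't overwrite each other's count.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Parent(Base):
    __tablename__ = "Parent"
    id = Column(Integer, primary_key=True)
    # Hypothetical denormalized counter column (not in the original schema).
    cached_children_count = Column(Integer, nullable=False, default=0)


class Child(Base):
    __tablename__ = "Child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("Parent.id"), index=True)


def add_child(session, parent):
    """Hypothetical helper: insert a child and bump the parent's counter."""
    child = Child(parent_id=parent.id)
    session.add(child)
    # Assigning a SQL expression makes the flush emit
    # UPDATE ... SET cached_children_count = cached_children_count + 1.
    parent.cached_children_count = Parent.cached_children_count + 1
    # Flush now so a second call in the same transaction adds another +1
    # instead of replacing the pending expression.
    session.flush()
    return child


# Assumption: in-memory SQLite stands in for the real PostgreSQL database.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent()
    session.add(parent)
    session.flush()  # inserts the row so parent.id is available
    add_child(session, parent)
    add_child(session, parent)
    session.commit()
    count = parent.cached_children_count  # reloaded from the database
    print(count)
```

Reading the count is then a plain column lookup with no subquery. The cost is that every code path that adds a child has to go through the helper, and deleting a child needs a matching decrement, which is why I call it hacky.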
