
Query SQLAlchemy taking too long

Folks, I have a query in SQLAlchemy that is taking too long. I don't know if it's because the table is too big (6.5 million rows, which I wouldn't expect to be a problem) or if I'm doing something wrong. These are the tables:

from sqlalchemy import CHAR, DATE, Column, Integer
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class A(Base):
    __tablename__ = 'tbl_a'

    id = Column(Integer, primary_key=True, autoincrement=True)
    a = Column(CHAR(3))
    b = Column(DATE)
    c = Column(Integer)
    d = Column(Integer)
    e = Column(Integer)


class B(Base):
    __tablename__ = 'tbl_b'

    id = Column(Integer, primary_key=True, autoincrement=True)
    a = Column(Integer)
    b = Column(DATE)
    c = Column(Integer)
    d = Column(Integer)
    e = Column(Integer)  # the query below filters on B.e, so the model needs this column

And this is the query:

row = session.query(A).get(id)

value = session.query(B.d).filter((B.b == row.b) &
                                  (B.d == row.c) &
                                  (B.e == row.d)).first()

With just one condition (B.b == row.b) it takes 2 minutes; with more conditions I still hadn't seen it finish after 10 minutes. Fetching row itself takes 0.5 seconds. One thing I read is that indexing the columns you intend to use in the WHERE clause speeds things up. If so, can I do it after I have already loaded the records into my db?
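To answer the indexing part of the question: yes, an index can be created after the rows are already loaded; it is just a `CREATE INDEX` statement against the live table. A minimal sketch with SQLAlchemy, assuming the `tbl_b` model from the question (the index name `ix_tbl_b_b_d_e` and the in-memory SQLite URL are my own stand-ins; point the engine at your real database):

```python
from sqlalchemy import Column, Date, Index, Integer, create_engine, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class B(Base):
    __tablename__ = "tbl_b"

    id = Column(Integer, primary_key=True, autoincrement=True)
    a = Column(Integer)
    b = Column(Date)
    c = Column(Integer)
    d = Column(Integer)
    e = Column(Integer)


engine = create_engine("sqlite:///:memory:")  # stand-in; use your real DB URL
Base.metadata.create_all(engine)

# A composite index covering all three columns the query filters on.
# Creating it after the data is loaded is fine: this issues CREATE INDEX
# against the existing table without rewriting the rows.
ix = Index("ix_tbl_b_b_d_e", B.__table__.c.b, B.__table__.c.d, B.__table__.c.e)
ix.create(bind=engine)

# Confirm the index exists via the inspector:
index_names = [i["name"] for i in inspect(engine).get_indexes("tbl_b")]
print(index_names)
```

On 6.5 million rows, a composite index on `(b, d, e)` should turn the full table scan into an index lookup. Building the index once may take a while, but the query afterwards should be fast.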

The ORM is not related to this issue, so it's not about SQLAlchemy. So how can we solve it? Well, 6 million rows is a lot of data, so it's normal for queries over it to take time. If you want to fix the problem, you can use TimescaleDB, which is an extension for PostgreSQL.

TimescaleDB documentation

What does TimescaleDB do?

It partitions your database tables into separate, smaller chunk tables based on a creation-time column (or, in other words, an indexed id). When you query 6 million rows it doesn't need to process them all, because it has the indexes and the chunks, so the data comes back in the blink of an eye!

How can I use TimescaleDB?

Well, since the issue is not about Python or the ORM, the solution isn't related to them either: you need to configure TimescaleDB and decide how many chunks should be created and how many rows a chunk should contain. That is up to you and your data.
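A hedged sketch of what that configuration looks like in SQL, assuming a PostgreSQL database with the TimescaleDB extension available and that `tbl_b`'s date column `b` is what you would partition on (the one-month interval is just an example value to tune for your data):

```sql
-- Enable the extension (once per database):
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Convert tbl_b into a hypertable partitioned on its date column "b",
-- with one chunk per month; migrate_data moves the existing rows
-- into chunks:
SELECT create_hypertable('tbl_b', 'b',
                         chunk_time_interval => INTERVAL '1 month',
                         migrate_data => TRUE);
```

After this, queries that filter on `b` only touch the chunks covering the requested date range instead of the whole table.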

Good luck, pal.
