
Why is SQLAlchemy count() much slower than the raw query?

I'm using SQLAlchemy with a MySQL database and I'd like to count the rows in a table (roughly 300k). The SQLAlchemy count function takes about 50 times as long to run as writing the same query directly in MySQL. Am I doing something wrong?

# this takes over 3 seconds to return
session.query(Segment).count()

However:

SELECT COUNT(*) FROM segments;
+----------+
| COUNT(*) |
+----------+
|   281992 |
+----------+
1 row in set (0.07 sec)

The difference in speed increases with the size of the table (it is barely noticeable under 100k rows).

Update

Using session.query(Segment.id).count() instead of session.query(Segment).count() seems to do the trick and gets it up to speed. I'm still puzzled why the initial query is slower though.
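One way to see what differs between the two calls is to record the SQL that SQLAlchemy actually sends. The following is a minimal sketch, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database standing in for MySQL. The Segment columns other than id are hypothetical, since the real model is not shown in the question.

```python
from sqlalchemy import Column, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Segment(Base):
    # Hypothetical mapping; only the table name comes from the question.
    __tablename__ = "segments"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    payload = Column(String(200))


engine = create_engine("sqlite://")  # in-memory SQLite, an assumption for the demo
Base.metadata.create_all(engine)

emitted = []


@event.listens_for(engine, "before_cursor_execute")
def record_sql(conn, cursor, statement, parameters, context, executemany):
    # Keep only SELECT statements, with whitespace collapsed for readability.
    if statement.lstrip().upper().startswith("SELECT"):
        emitted.append(" ".join(statement.split()))


with Session(engine) as session:
    session.add_all([Segment(name=f"s{i}", payload="x") for i in range(5)])
    session.commit()

    full_count = session.query(Segment).count()
    id_count = session.query(Segment.id).count()

# The last two recorded SELECTs belong to the two count() calls above.
full_sql, id_sql = emitted[-2:]
print(full_sql)
print(id_sql)
```

Both statements wrap the original query in a subquery, but the first one selects every mapped column inside it while the second selects only the id. That would be consistent with the speedup reported above, although the actual cost depends on how the database executes the subquery.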

Unfortunately MySQL has terrible, terrible support for subqueries and this is affecting us in a very negative way. The SQLAlchemy docs point out that the "optimized" query can be achieved using query(func.count(Segment.id)):

Return a count of rows this Query would return.

This generates the SQL for this Query as follows:

 SELECT count(1) AS count_1 FROM ( SELECT <rest of query follows...> ) AS anon_1 

For fine grained control over specific columns to count, to skip the usage of a subquery or otherwise control of the FROM clause, or to use other aggregate functions, use func expressions in conjunction with query(), i.e.:

from sqlalchemy import func

# count User records, without
# using a subquery.
session.query(func.count(User.id))

# return count of user "id" grouped
# by "name"
session.query(func.count(User.id)).\
        group_by(User.name)

from sqlalchemy import distinct

# count distinct "name" values
session.query(func.count(distinct(User.name)))
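The documentation snippets build queries but do not run them. Here is a runnable sketch of the same three patterns, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database. The User model and its sample rows are hypothetical, and the grouped query also selects User.name so that each count can be matched to its group.

```python
from sqlalchemy import Column, Integer, String, create_engine, distinct, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    # Hypothetical model standing in for the docs' User class.
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


engine = create_engine("sqlite://")  # in-memory SQLite, an assumption for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([User(name=n) for n in ["ann", "ann", "bob"]])
    session.commit()

    # Count User rows without a subquery.
    total = session.query(func.count(User.id)).scalar()

    # Count of ids grouped by name; the name is selected too for readability.
    grouped = session.query(User.name, func.count(User.id)).group_by(User.name)
    per_name = {name: n for name, n in grouped.all()}

    # Count distinct name values.
    distinct_names = session.query(func.count(distinct(User.name))).scalar()

print(total, per_name, distinct_names)
```

Calling scalar() returns the single count value directly, which is why it is used for the ungrouped queries; the grouped query returns one row per name and is read with all().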

It took me a long time to find this as the solution to my problem. I was getting the following error:

sqlalchemy.exc.DatabaseError: (mysql.connector.errors.DatabaseError) 126 (HY000): Incorrect key file for table '/tmp/#sql_40ab_0.MYI'; try to repair it

The problem was resolved when I changed this:

query = session.query(rumorClass).filter(rumorClass.exchangeDataState == state)
return query.count()

to this:

query = session.query(func.count(rumorClass.id)).filter(rumorClass.exchangeDataState == state)
return query.scalar()
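To check that the rewritten form returns the same number as the original, the two can be run side by side. This is a sketch under assumptions: SQLAlchemy 1.4 or newer, in-memory SQLite rather than MySQL, and a hypothetical Rumor model standing in for rumorClass, whose real definition is not shown.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Rumor(Base):
    # Hypothetical stand-in for rumorClass; only the column name is from the post.
    __tablename__ = "rumors"
    id = Column(Integer, primary_key=True)
    exchangeDataState = Column(String(20))


def count_by_state(session, state):
    # Direct aggregate: SELECT count(rumors.id) ... WHERE ..., no subquery.
    query = session.query(func.count(Rumor.id)).filter(
        Rumor.exchangeDataState == state
    )
    return query.scalar()


engine = create_engine("sqlite://")  # in-memory SQLite, an assumption for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Rumor(exchangeDataState=s) for s in ["new", "new", "sent"]])
    session.commit()

    direct = count_by_state(session, "new")
    # Original form: wraps the filtered SELECT in a subquery and counts its rows.
    wrapped = session.query(Rumor).filter(Rumor.exchangeDataState == "new").count()
    none_found = count_by_state(session, "missing")

print(direct, wrapped, none_found)
```

Both forms agree on the result. The direct aggregate still returns 0 rather than None when no row matches, so callers do not need an extra check.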

The reason is that SQLAlchemy's count() counts the results of a subquery, and that subquery still does the full amount of work to retrieve the rows you are counting. This behavior is agnostic of the underlying database; it isn't a problem with MySQL.

The SQLAlchemy docs explain how to issue a count without a subquery by importing func from sqlalchemy.

session.query(func.count(User.id)).scalar()

SELECT count(users.id) AS count_1
FROM users
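The SQL above can be reproduced without a database by converting the Query to a string. A small sketch, assuming SQLAlchemy 1.4 or newer and a hypothetical User model mapped to a users table:

```python
from sqlalchemy import Column, Integer, String, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    # Hypothetical model; only the table name "users" comes from the answer.
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


# An unbound session: nothing is executed, the SQL is only rendered.
session = Session()

direct_sql = str(session.query(func.count(User.id)))
print(direct_sql)
```

The rendered statement has a single FROM clause and no nested SELECT, which is the difference from what Query.count() emits.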

