Any faster way to do a MySQL update query in R? In Python?
I tried to run this query:
update table1 A
set number = (select count(distinct(id)) from table2 B where B.col1 = A.col1 or B.col2 = A.col2);
but it takes forever because table1 has 1,100,000 rows and table2 has 350,000,000 rows.
Is there any faster way to do this query in R, or in Python?
I rewrote your query with three subqueries instead of one, using UNION DISTINCT and two INNER JOIN statements:
UPDATE table1 AS A
SET number = (SELECT COUNT(DISTINCT id)
              FROM (
                    (SELECT A.id AS id
                     FROM table1 AS A
                     INNER JOIN table2 AS B
                     ON A.col1 = B.col1)  -- condition for col1
                    UNION DISTINCT
                    (SELECT A.id AS id
                     FROM table1 AS A
                     INNER JOIN table2 AS B
                     ON A.col2 = B.col2)  -- condition for col2
                   ) AS matched);         -- MySQL requires an alias on a derived table
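As a sanity check on the idea of splitting the OR into two joins, here is a minimal Python sketch. It runs against an in-memory SQLite database rather than MySQL, purely so that it is self-contained. It is not the query above verbatim: the inner SELECTs above do not appear to be tied to the row being updated, so this variant keeps the match correlated to each table1 row by assuming table1 has a primary key, called pk here (a hypothetical name). It selects B.id so that the count is over table2 ids, as in the question. The sample rows and index names are made up for illustration.

```python
import sqlite3

# Per-row counts via two joins combined with UNION instead of one OR condition.
# `pk` is an ASSUMED primary-key column on table1; it is not in the original post.
PER_ROW_COUNTS = """
    SELECT pk, COUNT(DISTINCT id) AS n
    FROM (
        SELECT A.pk AS pk, B.id AS id
        FROM table1 AS A
        INNER JOIN table2 AS B ON A.col1 = B.col1   -- condition for col1
        UNION
        SELECT A.pk AS pk, B.id AS id
        FROM table1 AS A
        INNER JOIN table2 AS B ON A.col2 = B.col2   -- condition for col2
    ) AS matched
    GROUP BY pk
"""


def update_numbers(conn):
    """Compute the counts once, then write them back to table1.number."""
    cur = conn.cursor()
    cur.execute(PER_ROW_COUNTS)
    counts = cur.fetchall()                      # list of (pk, n) tuples
    cur.execute("UPDATE table1 SET number = 0")  # rows with no match stay at 0
    cur.executemany(
        "UPDATE table1 SET number = ? WHERE pk = ?",
        [(n, pk) for pk, n in counts],
    )
    conn.commit()


def demo():
    """Build a tiny made-up data set and return (pk, number) for every row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (pk INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, number INTEGER);
        CREATE TABLE table2 (id INTEGER, col1 TEXT, col2 TEXT);
        CREATE INDEX idx_t2_col1 ON table2 (col1);
        CREATE INDEX idx_t2_col2 ON table2 (col2);
        INSERT INTO table1 (pk, col1, col2) VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z');
        INSERT INTO table2 (id, col1, col2) VALUES
            (10, 'a', 'q'), (11, 'q', 'x'), (10, 'a', 'x'), (12, 'b', 'q'), (13, 'q', 'q');
    """)
    update_numbers(conn)
    return conn.execute("SELECT pk, number FROM table1 ORDER BY pk").fetchall()


if __name__ == "__main__":
    print(demo())
```

Against MySQL, the same SELECT could be sent through any DB-API driver. Whether it actually beats the correlated subquery on 350,000,000 rows probably depends on having indexes on table2.col1 and table2.col2, which is worth confirming with EXPLAIN before running it in full.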
My notes:
Updating all rows in table1 doesn't look like a good idea, because we have to touch 1.1M rows. Probably, another data structure for storing number would have better performance.
Try running part of the query without updating table1 (only the part of the query in parentheses).
Take a look at EXPLAIN if you need a more general approach to optimizing SQL queries: https://dev.mysql.com/doc/refman/5.7/en/using-explain.html
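The notes above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it uses the stdlib sqlite3 module with an in-memory database so that it runs anywhere, table1 is assumed to have a primary key named pk, and the side table name table1_number is made up. The SELECT is run on its own (no UPDATE of table1), its result is stored in a separate table instead of the number column, and the same statement is passed to EXPLAIN. SQLite's EXPLAIN output is different from MySQL's, so the plan rows here only show the mechanics.

```python
import sqlite3

# The SELECT part on its own; table1 is only read, never updated.
# `pk` is an ASSUMED primary-key column on table1 (not in the original post).
COUNTS_SELECT = """
    SELECT pk, COUNT(DISTINCT id) AS number
    FROM (
        SELECT A.pk AS pk, B.id AS id
        FROM table1 AS A INNER JOIN table2 AS B ON A.col1 = B.col1
        UNION
        SELECT A.pk AS pk, B.id AS id
        FROM table1 AS A INNER JOIN table2 AS B ON A.col2 = B.col2
    ) AS matched
    GROUP BY pk
"""


def explain(conn, select_sql):
    """Prefix the statement with EXPLAIN and return the plan rows unchanged."""
    cur = conn.cursor()
    cur.execute("EXPLAIN " + select_sql)
    return cur.fetchall()


def build_counts_table(conn):
    """Store the counts in a side table (hypothetical name) instead of table1.number.

    table1 rows without any match are simply absent, which means a count of 0.
    """
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS table1_number")
    cur.execute("CREATE TABLE table1_number AS " + COUNTS_SELECT)
    conn.commit()


def demo():
    """Run both steps on a tiny made-up data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (pk INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, number INTEGER);
        CREATE TABLE table2 (id INTEGER, col1 TEXT, col2 TEXT);
        INSERT INTO table1 (pk, col1, col2) VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z');
        INSERT INTO table2 (id, col1, col2) VALUES
            (10, 'a', 'q'), (11, 'q', 'x'), (10, 'a', 'x'), (12, 'b', 'q'), (13, 'q', 'q');
    """)
    plan = explain(conn, COUNTS_SELECT)
    build_counts_table(conn)
    counts = conn.execute(
        "SELECT pk, number FROM table1_number ORDER BY pk"
    ).fetchall()
    return plan, counts


if __name__ == "__main__":
    plan_rows, count_rows = demo()
    print(len(plan_rows), "plan rows")
    print(count_rows)
```

Timing the SELECT alone first shows whether the cost is in the joins or in writing 1.1M rows back, which in turn suggests whether a side table is worth it.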