Any faster way to do a MySQL update query in R? In Python?

I tried to run this query:

update table1 A 
set number = (select count(distinct(id)) from table2 B where B.col1 = A.col1 or B.col2 = A.col2);

but it takes forever because table1 has 1,100,000 rows and table2 has 350,000,000 rows.

Is there any faster way to run this query in R? Or in Python?

I rewrote your query with three subqueries instead of one, using UNION and two INNER JOIN statements:

UPDATE table1 AS A
SET number = (SELECT COUNT(DISTINCT id)
              FROM
                  (SELECT A.id AS id
                   FROM table1 AS A
                   INNER JOIN table2 AS B
                   ON A.col1 = B.col1 -- condition for col1

                   UNION DISTINCT

                   SELECT A.id AS id
                   FROM table1 AS A
                   INNER JOIN table2 AS B
                   ON A.col2 = B.col2 -- condition for col2
                  ) AS matched -- MySQL requires an alias on every derived table
              );

My notes:

  • Updating every row in table1 does not look like a good idea, because it touches 1.1M rows. Another data structure for storing number would probably perform better.
  • Try running only the part of the query in parentheses first, without the update of table1.
  • Take a look at EXPLAIN if you need a more general approach to optimizing SQL queries: https://dev.mysql.com/doc/refman/5.7/en/using-explain.html
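
Following the second note, a rewrite like this can be checked on a tiny sample before it is run against the full tables. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 with an in-memory database as a stand-in for MySQL, the sample rows are made up, and `pk` is a hypothetical primary key on table1 (the question does not say what the key is). It carries that key through both branches of the UNION so the count stays per table1 row, writes the counts to a separate `counts` table (also a made-up name), and compares the result with the original OR query.

```python
import sqlite3

# In-memory SQLite stands in for MySQL. Table and column names follow the
# question, except `pk`, a hypothetical primary key on table1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (pk INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, number INTEGER);
    CREATE TABLE table2 (id INTEGER, col1 TEXT, col2 TEXT);
    CREATE INDEX t2_col1 ON table2 (col1);
    CREATE INDEX t2_col2 ON table2 (col2);
    INSERT INTO table1 (pk, col1, col2) VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z');
    INSERT INTO table2 (id, col1, col2) VALUES
        (10, 'a', 'q'), (11, 'q', 'x'), (10, 'a', 'x'), (12, 'b', 'q'), (13, 'q', 'q');
""")

# Reference result: the original correlated subquery with OR, run as a SELECT only.
reference = dict(conn.execute("""
    SELECT A.pk,
           (SELECT COUNT(DISTINCT B.id) FROM table2 AS B
            WHERE B.col1 = A.col1 OR B.col2 = A.col2)
    FROM table1 AS A
"""))

# Set-based variant: one join per column, UNION drops duplicate (pk, id)
# pairs, and GROUP BY keeps one count per table1 row.
conn.execute("""
    CREATE TEMP TABLE counts AS
    SELECT pk, COUNT(id) AS cnt
    FROM (SELECT A.pk AS pk, B.id AS id
          FROM table1 AS A INNER JOIN table2 AS B ON A.col1 = B.col1
          UNION
          SELECT A.pk AS pk, B.id AS id
          FROM table1 AS A INNER JOIN table2 AS B ON A.col2 = B.col2) AS matched
    GROUP BY pk
""")

# Rows without any match are absent from counts, so default them to 0.
conn.execute("""
    UPDATE table1
    SET number = COALESCE((SELECT cnt FROM counts WHERE counts.pk = table1.pk), 0)
""")
conn.commit()

updated = dict(conn.execute("SELECT pk, number FROM table1"))
print(updated == reference, updated)
```

In MySQL the last step would more likely be written as a multi-table UPDATE that joins table1 to the counts table. Either way the work happens inside the database server, so issuing the statement from R or from Python should make little difference; indexes on table2.col1 and table2.col2 are likely to matter much more.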
