
MySQL: How to pull a large amount of data from MySQL without choking it?

My colleague runs a script that periodically pulls data from the DB. He is using this query:

'SELECT url, data FROM table LIMIT {} OFFSET {}'.format(OFFSET, PAGE * OFFSET)

We use Amazon Aurora and he has his own slave server, but every time the script runs it touches 98%+.

The table has millions of records.

Would it be better to use an SQL dump instead of SQL queries to fetch the data?

The options that come to mind are:

  • SQL dump of selected tables (not sure how it benchmarks)
  • Federate tables based on a certain reference (date, ID, etc.)

Thanks 谢谢

I'm making some fairly big assumptions here, but from

without choking it

I'm guessing you mean that when your colleague runs the SELECT to grab the large amount of data, database performance drops for all other operations (presumably your primary application) while the data is being prepared for export.

You mentioned an SQL dump, so I'm also assuming that this colleague will be satisfied with data that is roughly correct, i.e. it doesn't have to be up-to-the-instant, transactionally correct data. Just good enough for something like analytics work.

If those assumptions are close, your colleague and your database might benefit from

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

This line of code should be used carefully, and almost never in a line-of-business application, but it can help people who run big queries against the live database, as long as you fully understand the implications.

To use it, run this line before the transaction that contains your queries. MySQL does not let you change the isolation level of a transaction that is already in progress, so it has to come first.
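
As a rough sketch of how the colleague's script might apply this, assuming a DB-API 2.0 connection from one of the usual MySQL drivers (which use `%s` as the parameter marker): `fetch_pages` and `my_table` are made-up names for illustration. The `SESSION` form of the statement is used here so that it covers every later transaction on the connection rather than only the next one.

```python
from typing import Any, Iterator, Sequence

# Session-wide form: applies to every later transaction on this connection.
SET_ISOLATION = "SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED"

# `my_table` is a placeholder table name; `%s` is the driver's parameter marker.
PAGE_QUERY = "SELECT url, data FROM my_table LIMIT %s OFFSET %s"


def fetch_pages(conn: Any, page_size: int) -> Iterator[Sequence[Any]]:
    """Yield rows page by page using dirty reads (hypothetical helper)."""
    cursor = conn.cursor()
    try:
        # Issued before any SELECT, so the reads below run at the new level.
        cursor.execute(SET_ISOLATION)
        page = 0
        while True:
            # Pass values as parameters instead of formatting them into the SQL.
            cursor.execute(PAGE_QUERY, (page_size, page * page_size))
            rows = cursor.fetchall()
            if not rows:
                break  # an empty page means the table has been exhausted
            yield from rows
            page += 1
    finally:
        cursor.close()
```

With a real connection this would be consumed as `for url, data in fetch_pages(conn, 1000): ...`. The paging itself is unchanged from the original query; only the isolation level differs.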

The 'choking'

What you are most likely seeing when your colleague runs a large query is record locking. Your database engine is, quite correctly, set up to provide an accurate view of your data at any point. So when a large query comes along, the database engine first waits for all write locks (transactions) to clear, runs the large query, and holds all future write locks until the query has finished.

This actually happens for all transactions, but you only really notice it for the big ones.

What READ UNCOMMITTED does

By setting the transaction isolation level to READ UNCOMMITTED, you are telling the database engine that this transaction doesn't care about write locks and should go ahead and read anyway.

This is known as a 'dirty read': the long-running query may well read a table that has a write lock on it, and it will ignore the lock. The data actually read could be the data as it was before the write transaction completed, or a different transaction could start and modify records before this query gets to them.

The data returned by anything running under READ UNCOMMITTED is not guaranteed to be correct in the ACID sense of a database engine, but for some use cases it is good enough.
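
To make the dirty read concrete, here is a small in-memory toy model. It is not MySQL and says nothing about how InnoDB is implemented; `ToyStore` and the row key are invented purely to show a reader seeing a value that is later rolled back.

```python
class ToyStore:
    """In-memory toy model of committed vs. uncommitted row values."""

    def __init__(self, rows: dict) -> None:
        self.committed = dict(rows)  # values every reader may see
        self.pending = {}            # writes from an open transaction

    def write(self, key, value) -> None:
        # A writer changes a row but has not committed yet.
        self.pending[key] = value

    def commit(self) -> None:
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self) -> None:
        # The uncommitted change never becomes real.
        self.pending.clear()

    def read(self, key, read_uncommitted: bool = False):
        # A dirty read looks at in-flight writes first.
        if read_uncommitted and key in self.pending:
            return self.pending[key]
        return self.committed[key]


store = ToyStore({"url-1": "old"})
store.write("url-1", "new")                          # open, uncommitted write
dirty = store.read("url-1", read_uncommitted=True)   # sees the in-flight value
clean = store.read("url-1")                          # sees the committed value
store.rollback()                                     # the writer backs out
after = store.read("url-1", read_uncommitted=True)   # in-flight value is gone
```

Here `dirty` holds a value that never existed in any committed state of the data, which is exactly the risk being accepted for an export that only needs to be roughly right.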

What the effect is

Your large queries magically run faster and don't lock the database while they are running.

Use it with caution, though, and understand what it does before you use it.

MySQL Manual on transaction isolation levels
