How to diagnose a long-running transaction and save as much data as possible?

Question

I'm running into an issue with a long-running transaction that is blocking all writes to the database. What is the best way to diagnose this, and is it possible to force the transactions to commit and save the data that is currently hung up?

Background info: The app uses a custom DAO/ORM written in Python on top of pymysql. I don't have much confidence in this code, but it has worked for a while, and this issue started popping up 2 days ago.

My attempt: I've never seen anything like this before, so everything I tried is based mostly on Google-fu. I started by looking at the transactions reported by SHOW ENGINE INNODB STATUS; (I tried to omit irrelevant info).

------------
TRANSACTIONS
------------
Trx id counter 279345410
Purge done for trx's n:o < 277758364 undo n:o < 0 state: running but idle
History list length 271773
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 277760341, ACTIVE 153800 sec
28 lock struct(s), heap size 376, 49 row lock(s), undo log entries 90
MySQL thread id 2310, OS thread handle 0x2b3044788700, query id 3783662 172.30.1.223 backers cleaned up
---TRANSACTION 277758227, ACTIVE 153933 sec
82 lock struct(s), heap size 376, 1452 row lock(s), undo log entries 4233
MySQL thread id 1972, OS thread handle 0x2b3061a43700, query id 3783650 54.0.0.1 syncmachine cleaned up

Those 2 transactions are from 2 different apps on 2 different databases on the same server. They have obviously been running longer than expected.

Next I thought there might be a deadlock or something similar, so I tried select trx_id,trx_state from information_schema.innodb_trx;

mysql> select trx_id,trx_state from information_schema.innodb_trx;
+-----------+-----------+
| trx_id    | trx_state |
+-----------+-----------+
| 279387152 | RUNNING   |
| 279387149 | RUNNING   |
| 279387114 | RUNNING   |
| 279384295 | RUNNING   |
| 279381054 | RUNNING   |
| 279347599 | RUNNING   |
| 278841669 | RUNNING   |
| 277760341 | RUNNING   |*
| 277758227 | RUNNING   |*
| 277758147 | RUNNING   |
+-----------+-----------+
10 rows in set (0.02 sec)

So there are no LOCK WAIT states, which I thought could have been the issue.

Lastly I tried SELECT * FROM information_schema.innodb_trx\G

mysql> SELECT * FROM information_schema.innodb_trx\G
*************************** 6. row ***************************
                    trx_id: 277760341
                 trx_state: RUNNING
               trx_started: 2017-08-22 19:13:06
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 119
       trx_mysql_thread_id: 2310
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 0
          trx_lock_structs: 28
     trx_lock_memory_bytes: 376
           trx_rows_locked: 49
         trx_rows_modified: 91
   trx_concurrency_tickets: 0
       trx_isolation_level: READ COMMITTED
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
*************************** 7. row ***************************
                    trx_id: 277758227
                 trx_state: RUNNING
               trx_started: 2017-08-22 19:10:53
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 4315
       trx_mysql_thread_id: 1972
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 0
          trx_lock_structs: 82
     trx_lock_memory_bytes: 376
           trx_rows_locked: 1452
         trx_rows_modified: 4233
   trx_concurrency_tickets: 0
       trx_isolation_level: READ COMMITTED
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
8 rows in set (0.01 sec)

But trx_query is NULL for both. How can that be the case with so many locks involved?

Does anyone have any suggestions on diagnosing this? Also, is there some way to force the transactions through? Will FLUSH TABLES accomplish this?

Answer 1

You should be able to review the transaction history with the following: https://www.psce.com/en/blog/2015/01/22/tracking-mysql-query-history-in-long-running-transactions/ (You should save the resulting log, as it will be a way to recover data lost in the transaction.)
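As a rough illustration of that approach, the statements a session has run can be read from performance_schema. The sketch below is in Python since the app already uses pymysql. The function name fetch_statement_history is hypothetical. It assumes that the events_statements_history consumer is enabled on the server (it may not be on older versions) and that the cursor is a DB-API cursor using the %s placeholder style, as pymysql does.

```python
# Map a connection (processlist) id to its performance_schema thread, then
# list the statements recorded for that thread, oldest first.
STATEMENT_HISTORY_SQL = """
SELECT h.EVENT_ID, h.SQL_TEXT
FROM performance_schema.events_statements_history AS h
JOIN performance_schema.threads AS t ON t.THREAD_ID = h.THREAD_ID
WHERE t.PROCESSLIST_ID = %s
ORDER BY h.EVENT_ID
"""


def fetch_statement_history(cursor, processlist_id):
    """Return the recorded statements for one MySQL connection id.

    `cursor` is assumed to be a DB-API cursor (for example from pymysql).
    `processlist_id` is the "MySQL thread id" from SHOW ENGINE INNODB STATUS,
    which is the same value as trx_mysql_thread_id in
    information_schema.innodb_trx.
    """
    cursor.execute(STATEMENT_HISTORY_SQL, (processlist_id,))
    return cursor.fetchall()
```

For the two transactions in the question this would be called with 2310 and 1972. The history table keeps only a limited number of recent statements per thread, so older statements may already be gone.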

I don't know of any way to recover a session other than breakpointing into the code that holds the context for the current session and attempting to debug from there.

Your best bet is to unlock the tables and perform a backup: https://dev.mysql.com/doc/mysql-backup-excerpt/5.7/en/backup-methods.html

Review any recent modifications to the code; perhaps someone forgot a commit() somewhere and there's some reason the session isn't being disconnected. A background worker task could be the culprit, as it would be able to hold on to the same context and keep your tables locked. This can be fixed by setting timeouts; see "MySQL rollback on transaction with lost/disconnected connection".
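One way to make a forgotten commit() harder to write is to wrap every unit of work in a context manager that always ends the transaction. This is only a minimal sketch, not the DAO/ORM's actual code. The name transaction is hypothetical, and it assumes a DB-API connection object exposing commit() and rollback(), which pymysql connections do.

```python
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Commit when the block succeeds, roll back when it raises.

    `conn` is assumed to be a DB-API connection (for example from pymysql).
    """
    try:
        yield conn
    except BaseException:
        # Undo partial work so a failed unit of work does not keep holding
        # row locks, then let the original error propagate.
        conn.rollback()
        raise
    else:
        conn.commit()
```

Used as `with transaction(conn): ...`, the block either commits or rolls back before control leaves it. pymysql leaves autocommit off by default, so without something like this any path that skips commit() leaves the transaction open for as long as the connection lives.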

Answer 1: 2017-08-24 16:49:44
