
MySQL checksum for all tables in a database

I am evaluating a piece of PHP/MySQL-based software. I want to see which tables are affected when certain operations are triggered.

After some googling, I was told that CHECKSUM TABLE tbl_name can do the job. I just need to know how to use CHECKSUM for all the tables in the database.

Checksumming all the tables one by one manually is definitely not preferred, as the database contains hundreds of tables.

Checksumming all tables seems like a lot of expensive calculation work just to detect which tables changed.

I'd suggest getting this information from the sys.schema_table_statistics view.

mysql> select table_schema, table_name, rows_fetched, rows_inserted, rows_updated, rows_deleted 
  from sys.schema_table_statistics where table_schema='test';
+--------------+---------------------+--------------+---------------+--------------+--------------+
| table_schema | table_name          | rows_fetched | rows_inserted | rows_updated | rows_deleted |
+--------------+---------------------+--------------+---------------+--------------+--------------+
| test         | sysbench_results    |          870 |           144 |            0 |            0 |
+--------------+---------------------+--------------+---------------+--------------+--------------+

You probably want to reset the counters between your tests. Use sys.ps_truncate_all_tables():

mysql> call sys.ps_truncate_all_tables(FALSE);
+---------------------+
| summary             |
+---------------------+
| Truncated 31 tables |
+---------------------+

mysql> select table_schema, table_name, rows_fetched, rows_inserted, rows_updated, rows_deleted
  from sys.schema_table_statistics where table_schema='test';
+--------------+---------------------+--------------+---------------+--------------+--------------+
| table_schema | table_name          | rows_fetched | rows_inserted | rows_updated | rows_deleted |
+--------------+---------------------+--------------+---------------+--------------+--------------+
| test         | sysbench_results    |            0 |             0 |            0 |            0 |
+--------------+---------------------+--------------+---------------+--------------+--------------+

The sys schema comes pre-installed in MySQL 5.7.

If you use MySQL 5.6, you may need to install it yourself. It's just an SQL script that creates some views over the performance_schema, and it is very easy to install.

You can get the sys schema here: https://github.com/mysql/mysql-sys

You could generate the CHECKSUM statements for all tables:

SELECT CONCAT('CHECKSUM TABLE ', table_name, ';') AS statement
FROM information_schema.tables
WHERE table_schema = 'YourDBNameHere';

Then copy this output and paste it into Workbench or whatever tool you need to use. If you need to do this from within application code (e.g. PHP), then you would probably have to use dynamic SQL in MySQL.
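If a shell is available, the copy-and-paste step can be scripted instead. The following is only a minimal sketch under stated assumptions: build_checksum_statement is a hypothetical helper name, table names are assumed to arrive one per line on standard input, and neither the database name nor any table name is assumed to contain a backtick.

```shell
#!/bin/bash
# Hypothetical helper: read table names (one per line) on stdin and print a
# single CHECKSUM TABLE statement that covers all of them.
# $1 is the database name; every identifier is wrapped in backticks.
# Assumption: no identifier contains a backtick itself.
build_checksum_statement() {
  local db="$1" table list=""
  # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
  while IFS= read -r table || [ -n "$table" ]; do
    [ -z "$table" ] && continue              # skip blank lines
    list+="${list:+, }\`${db}\`.\`${table}\`"
  done
  if [ -n "$list" ]; then
    printf 'CHECKSUM TABLE %s;\n' "$list"
  fi
}
```

Usage might look like this, with connection credentials assumed to come from an option file: mysql -N -B -e "SELECT table_name FROM information_schema.tables WHERE table_schema = 'YourDBNameHere' AND table_type = 'BASE TABLE'" | build_checksum_statement YourDBNameHere | mysql -N -B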

"I want to see which tables are affected when certain operations are triggered."

What do you mean by this?

Do you know what operations have been triggered, and you're merely attempting to understand what effect they had on your database (e.g. to verify their correctness)? Or do you not know what operations have been triggered (e.g. during some interval) but you nevertheless want to understand how the database has changed, perhaps in an attempt to determine what those operations were?

There are very few situations where I would expect the best approach to be that which you are exploring (inspecting the database for changes). Instead, some form of logging, whether built into the RDBMS (such as MySQL's General Query Log, or perhaps through triggers as suggested by Sumesh) or, more likely, at some higher level (e.g. within the accessing application), would almost always be preferable. This leads me to think you have an XY Problem.

However, on the assumption that you really do want to identify the tables that have been modified since some last known good point in time, you can query the INFORMATION_SCHEMA.TABLES table, which contains not only a CHECKSUM column (populated only for tables that maintain a live checksum) but also other potentially useful information like UPDATE_TIME. So, for example, to identify all tables changed in the last five minutes one could do:

SELECT TABLE_SCHEMA, TABLE_NAME
FROM   INFORMATION_SCHEMA.TABLES
WHERE  UPDATE_TIME > NOW() - INTERVAL 5 MINUTE;

For those who came here looking for a way to get the checksum for all the tables in one query (as I did):

SET group_concat_max_len = CAST( 
    (  
        SELECT SUM(LENGTH(TABLE_NAME)) + COUNT(*) * LENGTH(', ') 
        FROM information_schema.tables WHERE `TABLE_SCHEMA` = 'your_database_name' 
    ) AS UNSIGNED
);

SET @sql_command:= ( 
    SELECT CONCAT( 
        'CHECKSUM TABLE ', 
        GROUP_CONCAT( TABLE_NAME ORDER BY `TABLE_NAME` SEPARATOR ', ' ) 
    )
    FROM information_schema.tables 
    WHERE `TABLE_SCHEMA` = 'your_database_name' 
);

PREPARE statement FROM @sql_command;
EXECUTE statement;
DEALLOCATE PREPARE statement;

The basic idea is to build a single CHECKSUM TABLE statement that includes all the table names. So yes, it is a slightly upgraded version of the answer given by Tim Biegeleisen.

First we set the maximum permitted result length for the GROUP_CONCAT() function (which is 1024 bytes by default). It is calculated as the number of bytes in all the table names, including the separator that will be put between these names:

SET group_concat_max_len = CAST( 
    (  
        SELECT SUM(LENGTH(TABLE_NAME)) + COUNT(*) * LENGTH(', ') 
        FROM information_schema.tables WHERE `TABLE_SCHEMA` = 'your_database_name' 
    ) AS UNSIGNED
);

Then we put all the table names together in one CHECKSUM TABLE statement and store it in a string variable:

SET @sql_command:= ( 
    SELECT CONCAT( 
        'CHECKSUM TABLE ', 
        GROUP_CONCAT( TABLE_NAME ORDER BY `TABLE_NAME` SEPARATOR ', ' ) 
    )
    FROM information_schema.tables 
    WHERE `TABLE_SCHEMA` = 'your_database_name' 
);

And finally we execute the statement to see the results:

PREPARE statement FROM @sql_command;
EXECUTE statement;
DEALLOCATE PREPARE statement;

Unfortunately, you can't manipulate the result set any further using MySQL statements alone (i.e. insert it into a table or join it with other result sets). So if you need to do some comparisons, you will eventually need additional code in your favorite programming language (or capable software) to accomplish the task.

The question does not say that using a shell script is off limits, so I'll post one such approach here (PHP is able to invoke shell scripts, see http://php.net/manual/en/function.shell-exec.php, if safe mode is not enabled):

If your script has shell access at its disposal and a checksum tool such as md5sum, one can also do something like this to collect checksums for each table:

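#!/bin/bash
# Directory holding the database's InnoDB files (one .ibd file per table).
DATABASEPATH="/var/lib/mysql/yourdatabase"
cd "$DATABASEPATH" || exit 1
# ls -t lists the most recently modified files first.
for TABLEFILE in $(ls -t -- *.ibd); do
  SUMANDTABLE=$(md5sum "$TABLEFILE")
  # Strip the .ibd extension so each line reads "<md5>  <table>".
  echo "${SUMANDTABLE//.ibd}"
done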

Optionally, if you don't want a checksum calculated for all tables, you could also check whether the modification date of "$TABLEFILE" is within range. If not, you just exit the script (ls -t orders by modification date, descending).

To access the modification date, use something like stat -c %Y "$TABLEFILE". This gives you the modification date in seconds since the Epoch.

To access the current date, also in seconds since the Epoch, use date +%s.

One can then subtract the modification date from the current date to establish how many seconds ago "$TABLEFILE" changed.
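The subtraction described above can be sketched as a small function. This is only an illustration: seconds_since_modified is a hypothetical name, and it relies on GNU stat (the -c %Y form shown above), so BSD or macOS stat would need different flags.

```shell
#!/bin/bash
# Hypothetical helper: print how many seconds ago a file was last modified.
# Assumption: GNU stat is available (stat -c %Y prints the mtime in seconds).
seconds_since_modified() {
  local file="$1" modified now
  modified=$(stat -c %Y "$file") || return 1   # mtime, seconds since the Epoch
  now=$(date +%s)                              # current time, seconds since the Epoch
  echo $(( now - modified ))
}
```

Inside the loop of the script above, a test such as [ "$(seconds_since_modified "$TABLEFILE")" -gt 300 ] && break would stop at the first file that is more than five minutes old, which works because ls -t lists the newest files first.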

Another related method, which could apply in some cases, would be to save the ls -t *.ibd listing (without even calculating checksums, just storing the filenames in order), then start an operation, and at the end of that operation check for differences in the file listing with another execution of ls -t *.ibd.
