
Using information_schema to find the size of InnoDB databases

We have about 60-70 databases on an RDS server, and a lot of them can be deleted.

I want to measure the total size before and after the cleanup. To my knowledge, they are all InnoDB tables.

So I'm querying information_schema as described in this post: https://www.percona.com/blog/2008/03/17/researching-your-mysql-table-sizes/. That works, except that the first query listed (and I presume the others) just runs and runs and eventually finishes after EIGHT MINUTES.

I can run this query instantly:

SELECT COUNT(*) FROM information_schema.TABLES;

It reports about 12,500 tables.

I also notice, ironically enough, that information_schema.TABLES has no indexes! My instinct is not to mess with that.

My best option at this point is to dump the TABLES table and run the query on a copy that I can actually index.

My questions are:

1. How dynamic is the information_schema.TABLES table, and in fact that entire database?
2. Why is it running so slowly?
3. Would it be advisable to index some key fields to optimize the queries I want to run?
4. If I do an SQL dump, will I get current table size information?

Thanks, I hope this question is instructive.

information_schema is currently a thin layer on top of some older machinery. That older code needs to "open" each table to discover its size and other statistics, which involves reading at least the .frm file. It does not need to open any table just to count the tables. Think of the difference between SHOW TABLES and SHOW TABLE STATUS.
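As a rough illustration of that difference, the sketch below contrasts the two statements and then sums sizes for a single database. The schema name `mydb` is a placeholder, not something from the question. Giving table_schema as a constant should let the server look at only that database's tables instead of all 12,500. The figures are estimates: for InnoDB, data_length is the approximate space allocated for the clustered index.

```sql
-- Cheap: only the list of table names is needed.
SHOW TABLES FROM mydb;

-- Costly on servers that keep metadata in .frm files:
-- every table in mydb is opened to collect its statistics.
SHOW TABLE STATUS FROM mydb;

-- Approximate size of one database (mydb is a hypothetical name).
SELECT table_schema,
       COUNT(*) AS table_count,
       ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) AS total_mb
FROM information_schema.TABLES
WHERE table_schema = 'mydb'
GROUP BY table_schema;
```

Running the last query once per database before and after the cleanup would give the comparison the question asks for, one database at a time.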

table_open_cache and table_definition_cache probably did not have all the tables in them when you ran the 8-minute query. The values of those VARIABLES may well have been less than 12,500, which implies that entries were being evicted and reloaded (churn) as the query walked through the tables.
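One way to check this on your own server, as a sketch: compare the configured cache sizes with the table count, and look at the open counters before and after running the slow query. If the counters jump by thousands, the caches were too small to hold everything.

```sql
-- Configured cache sizes; compare against the ~12,500 tables.
SHOW GLOBAL VARIABLES LIKE 'table_open_cache';
SHOW GLOBAL VARIABLES LIKE 'table_definition_cache';

-- Cumulative counters since server start. A large increase across
-- the slow query suggests tables were opened from disk, not cache.
SHOW GLOBAL STATUS LIKE 'Opened_tables';
SHOW GLOBAL STATUS LIKE 'Opened_table_definitions';
```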

In the future (probably 5.8), all that info will probably sit in a single InnoDB table instead of being scattered across the OS's file system. At that point, it will be quite fast. (Think of how fast a table scan of 12,500 rows can be done, especially if fully cached in RAM.)

Since information_schema does not have "real" tables, there is no way to add INDEXes.
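If an indexed copy is still wanted, the statistics can be copied into an ordinary table, as in the sketch below. The `scratch` schema, the `table_sizes` table and its columns are all hypothetical names chosen for illustration. Note that the INSERT still pays for the slow scan once, and the copy is a snapshot that goes stale as the data changes.

```sql
-- Hypothetical scratch schema and snapshot table.
CREATE DATABASE IF NOT EXISTS scratch;

CREATE TABLE scratch.table_sizes (
  captured_at   DATETIME        NOT NULL,
  table_schema  VARCHAR(64)     NOT NULL,
  table_name    VARCHAR(64)     NOT NULL,
  engine        VARCHAR(64)     NULL,
  data_length   BIGINT UNSIGNED NULL,
  index_length  BIGINT UNSIGNED NULL,
  KEY idx_schema (table_schema)
);

-- The slow scan happens once, here.
INSERT INTO scratch.table_sizes
SELECT NOW(), table_schema, table_name, engine, data_length, index_length
FROM information_schema.TABLES
WHERE table_type = 'BASE TABLE';

-- Later queries read the real, indexed copy.
SELECT captured_at, table_schema,
       ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) AS total_mb
FROM scratch.table_sizes
GROUP BY captured_at, table_schema
ORDER BY total_mb DESC;
```

Repeating the INSERT after the cleanup adds a second snapshot with a later captured_at, so the before and after totals can be compared from the same table.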

mysqldump does not provide the table size info. Even if it did, it would be no faster, since it would go through the same old mechanism.

60 is a questionably large number of databases; 12K is a large number of tables. This often implies a schema design that creates many similar tables instead of putting the data into a single table.
