MySQL performance gain by reducing index size?

I have a table with about 1.2 million rows. Six of its columns are indexed, including one varchar(255) field that contains URLs.

I need to be able to check whether a URL exists in the table, hence the index, but I'm wondering whether I would see a performance gain by reducing the indexed length to around 50 characters.

Of course, this would mean MySQL may have to scan more rows when searching for a URL. But I only run this query about once every 30 seconds, so I'm wondering whether the smaller index would be worth it. Thoughts?

Two reasons why lowering it may be better (assuming your index is useful):

1) Indexes get loaded into memory too, so there is a small possibility that your index grows to the point where it can no longer be cached completely in memory. That's when you will see a performance hit (with current hardware this is hardly likely with 1.2M rows, but still worth noting).

2) Many times, just the first 'n' characters are good enough to quickly identify each record. You may not need to index the whole 255 characters at all.
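One way to judge whether the first 'n' characters identify records well enough is to measure how many distinct prefixes a sample of URLs produces. The following is a minimal sketch, not a definitive method: the function name `prefix_selectivity` and the sample URLs are made up for illustration, and a real check would run against the actual table data.

```python
def prefix_selectivity(values, prefix_len):
    """Return the ratio of distinct prefixes to total values (1.0 = every prefix unique)."""
    if not values:
        return 0.0
    distinct_prefixes = {value[:prefix_len] for value in values}
    return len(distinct_prefixes) / len(values)


# Hypothetical sample data, invented for this example.
sample_urls = [
    "http://example.com/articles/2011/mysql-index-size",
    "http://example.com/articles/2011/mysql-index-depth",
    "http://example.com/articles/2012/url-lookup",
    "http://example.org/about",
]

# Compare a few candidate prefix lengths against the full column width.
for n in (10, 30, 50, 255):
    print(n, prefix_selectivity(sample_urls, n))
```

A ratio close to the one obtained for the full length suggests the shorter prefix loses little. In MySQL itself, the same ratio can be estimated with `COUNT(DISTINCT LEFT(url, n)) / COUNT(*)`, assuming the column is named `url`.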

Two reasons why you may not care:

1) As stated, you may never see your indexes outgrow your key buffer, so why worry?

2) You will need to determine a suitable 'n', and even then the performance will be less than or equal to that of a full index, never better. Do you really need to spend time on that? Is it worth the possible loss of accuracy?

From my SQL indexing tutorial (which covers MySQL as well):

Tip: Always aim to index the original data. That is often the most useful information you can put into an index.

This is a general rule I suggest following unless there is a very strong reason to do something different.

Space is not the issue in most cases.

Performance-wise, the index tree depth grows logarithmically with the number of index leaf nodes. That means cutting the index size in half probably does not reduce the tree depth at all. Hence, the performance gain might be limited to an improved cache hit rate. But you mentioned that you execute that query once every 30 seconds. On a moderately loaded machine, that means your index will not be cached at all (except, maybe, if you search for the same URL every 30 seconds).
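The logarithmic growth can be illustrated with a toy model. This is a rough sketch under stated assumptions, not a description of MySQL's actual storage format: it assumes 16 KB pages (the InnoDB default), a made-up 12 bytes of per-entry overhead, and fully packed pages. The function name and the average key sizes are hypothetical.

```python
def estimated_btree_depth(row_count, entries_per_page):
    """Estimate the number of tree levels needed to address row_count entries."""
    depth = 1
    capacity = entries_per_page  # entries reachable with a single level
    while capacity < row_count:
        capacity *= entries_per_page  # each extra level multiplies the reach
        depth += 1
    return depth


PAGE_SIZE = 16 * 1024      # assumed page size in bytes
ENTRY_OVERHEAD = 12        # assumed per-entry overhead, for illustration only
ROW_COUNT = 1_200_000      # row count from the question

# Hypothetical average key sizes in bytes.
for key_bytes in (255, 128, 64, 50, 32):
    fanout = PAGE_SIZE // (key_bytes + ENTRY_OVERHEAD)
    print(key_bytes, fanout, estimated_btree_depth(ROW_COUNT, fanout))
```

In this model the depth only changes when the fan-out crosses a threshold, so even large changes in key size move it by at most one level, and many changes do not move it at all.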

After all, I don't see any reason to act against the general advice mentioned above.

If you really want to save index space, try to find redundant indexes first (e.g., those starting with the same columns). These are typically the low-hanging fruit.
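As a rough illustration of what "starting with the same columns" means, the sketch below flags an index whose column list is a strict leading prefix of another index's column list. The index names and columns are hypothetical, and the check is deliberately simplistic: it ignores unique constraints, prefix lengths, and index types, all of which can make an apparently redundant index necessary.

```python
def find_redundant_indexes(indexes):
    """Return (redundant, covering) name pairs where one index's columns
    form a strict leading prefix of another index's columns."""
    redundant = []
    for name_a, cols_a in indexes.items():
        for name_b, cols_b in indexes.items():
            if name_a == name_b:
                continue
            # Strict prefix: shorter list that matches the start of the longer one.
            if len(cols_a) < len(cols_b) and cols_b[:len(cols_a)] == cols_a:
                redundant.append((name_a, name_b))
    return redundant


# Hypothetical index definitions, invented for this example.
example_indexes = {
    "idx_user": ["user_id"],
    "idx_user_created": ["user_id", "created_at"],
    "idx_url": ["url"],
}

print(find_redundant_indexes(example_indexes))
```

Any pair reported this way is only a candidate for review before dropping anything.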

Store an MD5 hash of each URL, which has a fixed length of 32 characters.
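A minimal sketch of the hashing idea, using Python's standard `hashlib` module. The helper name `url_md5` is made up, and encoding the URL as UTF-8 before hashing is an assumption.

```python
import hashlib


def url_md5(url):
    """Return the MD5 digest of a URL as a 32-character hexadecimal string."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# Hypothetical URLs of very different lengths produce digests of the same length.
short_digest = url_md5("http://example.com/")
long_digest = url_md5("http://example.com/" + "a" * 200)
print(len(short_digest), len(long_digest))
```

The digest could be stored in a fixed-width column (for example a hypothetical `url_hash CHAR(32)`) and indexed in place of the URL. Because different URLs can in principle share a digest, a lookup would still compare the full URL after matching on the hash.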

I doubt you would see any difference by changing the index to use only the first 50 characters.

Since it's a VARCHAR column, the indexed values will only be as long as each URL anyway, so given typical URLs you may already be indexing only around 50 characters per URL.

Even if the URLs are all significantly longer, reducing the index size may just increase the chance that the relevant part of the index is already in memory, but again I doubt you would notice any difference. This might only be useful if the query volume were very high and you needed to start micro-optimising for additional performance.

Index size mainly matters for disk space, so you won't have serious problems because of it.

Whether to have an index at all depends on your CRUD operations: do you have more selects, or more inserts/updates/deletes?
