Which index should I use on a binary datatype column in MySQL?

I am writing a simple tool to detect duplicate files (i.e. files with identical contents). The mechanism is to generate a hash of each file with the SHA-512 algorithm and store the hashes in a MySQL database. I store the hashes in a BINARY(64) UNIQUE NOT NULL column. Each row therefore holds a unique binary hash, which is used to check whether a file is a duplicate.
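As an illustration of the hashing step described above, here is a minimal Python sketch. The function name `sha512_of_file` and the chunk size are assumptions made for this example, not part of the original tool. It shows that a raw SHA-512 digest is exactly 64 bytes, which is why it fits a BINARY(64) column.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the raw 64-byte SHA-512 digest of the file at `path`.

    The file is read in chunks (1 MiB by default, an arbitrary choice)
    so that large files do not have to fit in memory.
    """
    h = hashlib.sha512()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    # digest() gives raw bytes; hexdigest() would give 128 hex characters.
    return h.digest()
```

Storing the raw digest rather than its hexadecimal form halves the size of each stored value.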

My questions are:

  1. Can I use indexes on a binary column? My table's default collation is the latin1 default collation.

  2. Which indexing mechanism should I use, B-Tree or Hash, to get high performance? I need to update or add 100 rows per second.

  3. What else should I take care of to get the best performance?

  1. Can I use indexes on a binary column? My table's default collation is the latin1 default collation.

    Yes, you can. Collation is only relevant for character datatypes, not binary datatypes (it defines how characters should be ordered). Also, be aware that latin1 is a character encoding, not a collation.

  2. Which indexing mechanism should I use, B-Tree or Hash, to get high performance? I need to update or add 100 rows per second.

    Note that hash indexes are only available with the MEMORY and NDB storage engines, so you may not even have a choice.

    In any event, either would typically be able to meet your performance criteria, although for this particular application I see no benefit from using B-Tree (which is ordered), whereas Hash would give better performance. Therefore, if you have the choice, you may as well use Hash.

    See Comparison of B-Tree and Hash Indexes for more information.

  3. What else should I take care of to get the best performance?

    Depends on your definition of "best performance" and your environment. In general, remember Knuth's maxim "premature optimisation is the root of all evil": that is, only optimise when you know that there will be a problem with the simplest approach.
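To make "the simplest approach" concrete, here is a rough Python sketch of a duplicate check that relies only on a unique index over the hash column. It uses SQLite (through the standard-library `sqlite3` module) as a stand-in for MySQL so that it runs without a server, which is a substitution made for this example. The table name `files`, the `path` column and both function names are hypothetical. In MySQL the column would be declared as `BINARY(64) NOT NULL UNIQUE`, as in the question.

```python
import hashlib
import sqlite3


def open_store():
    """Create an in-memory SQLite table with a unique index on the hash."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        " hash BLOB NOT NULL UNIQUE,"  # stand-in for BINARY(64) NOT NULL UNIQUE
        " path TEXT NOT NULL)"
    )
    return conn


def record_file(conn, digest, path):
    """Insert the digest; return True if it was new, False if a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO files (hash, path) VALUES (?, ?)",
                (digest, path),
            )
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected the row: this hash is already stored.
        return False


if __name__ == "__main__":
    conn = open_store()
    digest = hashlib.sha512(b"same bytes").digest()
    print(record_file(conn, digest, "a.txt"))
    print(record_file(conn, digest, "b.txt"))
```

Letting the unique index reject the second insert avoids a separate lookup before every write. Whether this is fast enough for a given workload is something to measure rather than assume.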
