
Is a BIT field faster than an int field in SQL Server?

I have a table with some fields whose values will only ever be 1 or 0. The table will grow extremely large over time. Is the bit data type a good choice, or would a different type perform better? All of these fields will, of course, be indexed.

I can't give you any performance statistics, but you should always use the type that best represents your data. If all you need is 1 or 0, then you should absolutely use a bit field.

The more information you give your database, the more likely it is to get its "guesses" right.

Officially, bit will be fastest, especially if you don't allow nulls. In practice it may not matter, even at large scale. But if the value will only ever be 0 or 1, why not use a bit? It sounds like the best way to ensure the column never holds an invalid value such as 2 or -1.
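
To make the "no invalid values" point concrete, here is a minimal sketch. The table and column names are made up for illustration. It relies on the documented SQL Server behaviour that converting any nonzero number to bit yields 1, so a bit column coerces a stray 2 rather than rejecting it, whereas a tinyint column needs an explicit CHECK constraint to enforce the same domain.

```sql
-- Hypothetical table; names are illustrative assumptions, not from the thread.
CREATE TABLE dbo.FlagDemo (
    Id         int IDENTITY(1,1) PRIMARY KEY,
    IsActive   bit     NOT NULL,
    IsActiveTi tinyint NOT NULL
        CONSTRAINT CK_FlagDemo_IsActiveTi CHECK (IsActiveTi IN (0, 1))
);

-- The bit column accepts this, but stores 1: nonzero values convert to 1.
INSERT INTO dbo.FlagDemo (IsActive, IsActiveTi) VALUES (2, 1);

-- The tinyint column rejects 2 only because of the CHECK constraint.
INSERT INTO dbo.FlagDemo (IsActive, IsActiveTi) VALUES (0, 2);

SELECT Id, IsActive, IsActiveTi FROM dbo.FlagDemo;
```

Either way the stored data stays within 0 and 1, but note that bit does this silently while the constraint raises an error.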

It depends.

If you would like to maximize the speed of selects, use int (or tinyint to save space), because a bit in a where clause is slower than an int (not drastically, but every millisecond counts). Also make the column not null, which speeds things up further. Below is a link to an actual performance test, which I would recommend running on your own database and extending with not-null columns, indexes, and multiple columns at once. At home I even compared multiple bit columns against multiple tinyint columns, and the tinyint columns were faster ( select count(*) where A=0 and B=0 and C=0 ). I thought that SQL Server (2014) would optimize this by doing a single comparison using a bitmask, so it should be three times faster, but that wasn't the case. If you use indexes, you would need more than 5000000 rows (the number used in the test) to notice any difference (which I didn't have the patience to try, since filling a table with several million rows would take ages on my machine).

https://www.mssqltips.com/sqlservertip/4137/sql-server-performance-test-for-bit-data-type-in-a-where-clause/
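
For anyone who wants to repeat the comparison locally, a rough sketch along these lines may be a starting point. It is not the script from the linked article: the table name, column names, and row count are assumptions, and the numbers you get will depend on your hardware, SQL Server version, and indexes.

```sql
-- Hypothetical test table; all names here are illustrative assumptions.
CREATE TABLE dbo.FlagTest (
    Id    int IDENTITY(1,1) PRIMARY KEY,
    BitA  bit     NOT NULL,
    BitB  bit     NOT NULL,
    BitC  bit     NOT NULL,
    TinyA tinyint NOT NULL,
    TinyB tinyint NOT NULL,
    TinyC tinyint NOT NULL
);

-- Fill the bit columns with pseudo-random 0/1 flags.
-- The cross join is only a convenient row source; TOP caps the row count.
INSERT INTO dbo.FlagTest (BitA, BitB, BitC, TinyA, TinyB, TinyC)
SELECT TOP (1000000)
    CHECKSUM(NEWID()) & 1,
    CHECKSUM(NEWID()) & 1,
    CHECKSUM(NEWID()) & 1,
    0, 0, 0
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- Copy the flags so both sets of columns hold identical data.
UPDATE dbo.FlagTest
SET TinyA = BitA, TinyB = BitB, TinyC = BitC;

-- Compare CPU and elapsed time reported for each query.
SET STATISTICS TIME ON;
SELECT COUNT(*) FROM dbo.FlagTest WHERE BitA = 0 AND BitB = 0 AND BitC = 0;
SELECT COUNT(*) FROM dbo.FlagTest WHERE TinyA = 0 AND TinyB = 0 AND TinyC = 0;
SET STATISTICS TIME OFF;
```

Run each query several times and compare warm-cache timings; a single run mostly measures how long it took to read the pages from disk.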

If you would like to save space, use bit: up to 8 bit columns can share one byte, whereas 8 tinyint columns occupy 8 bytes. That saves around 7 megabytes per million rows.
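
The arithmetic behind that figure can be sketched as follows. It assumes SQL Server packs bit columns eight to a byte, and it deliberately ignores row headers, the null bitmap, and compression, so treat the result as a back-of-the-envelope estimate. The variable names are illustrative.

```sql
-- Back-of-the-envelope estimate; ignores row overhead and compression.
DECLARE @FlagColumns int    = 8;        -- number of flag columns per row
DECLARE @RowTotal    bigint = 1000000;  -- number of rows

SELECT
    CEILING(@FlagColumns / 8.0) * @RowTotal                  AS BitBytes,
    @FlagColumns * @RowTotal                                 AS TinyintBytes,
    (@FlagColumns - CEILING(@FlagColumns / 8.0)) * @RowTotal AS BytesSaved;
```

With 8 flag columns and one million rows this comes to 7,000,000 bytes saved, which matches the "around 7 megabytes" above. With a single flag column the saving is zero, since one bit column still takes a whole byte.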

The differences between those two cases are basically negligible, and since bit has the upside of signalling that the column is merely a flag, I would recommend using bit.

As I understand it, you still need a byte to store a bit column (but you can store 8 bit columns in a single byte). So having a large number (how many?) of these bit columns could save you a little storage. As Yishai said, it probably won't make much of a difference in performance (though a bit will map to a boolean in application code more cleanly).

If you can state with 100% confidence that the two options for this column will NEVER change, then by all means use bit. But if you can see a third value popping up in the future, using a tinyint now could make life a little easier when that day comes.
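
For a sense of what "that day" might involve if you start with bit, here is a small sketch of widening the column later. The temp table and the meaning of each status value are hypothetical. On a real table, any index or constraint that references the column would have to be dropped and recreated around the change, and the change itself can be expensive on a very large table.

```sql
-- Hypothetical temp table; names and status meanings are illustrative.
CREATE TABLE #Tickets (
    Id     int IDENTITY(1,1) PRIMARY KEY,
    Status bit NOT NULL   -- 0 = open, 1 = closed
);
INSERT INTO #Tickets (Status) VALUES (0), (1);

-- A third state is needed: widen the column. Existing 0/1 values carry over.
ALTER TABLE #Tickets ALTER COLUMN Status tinyint NOT NULL;

INSERT INTO #Tickets (Status) VALUES (2);  -- 2 = on hold (hypothetical)

SELECT Id, Status FROM #Tickets;

DROP TABLE #Tickets;
```

Application code that treated the column as a boolean would also need updating, which is usually the larger part of the work.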

Just a thought, but I'm not sure how much good an index will do you on this column either, unless the vast majority of rows fall on one side or the other. With a roughly 50/50 distribution, you might take more of a hit keeping the index up to date than you gain when querying the table.
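
Before deciding whether to index a flag column, it may help to look at how the values are actually distributed. The following is a minimal sketch using a made-up table variable with a handful of rows; on a real table you would run the same SELECT against your own table and column.

```sql
-- Hypothetical sample data; names are illustrative assumptions.
DECLARE @Orders TABLE (
    Id         int IDENTITY(1,1) PRIMARY KEY,
    IsArchived bit NOT NULL
);
INSERT INTO @Orders (IsArchived) VALUES (0), (0), (0), (1);

-- Row count and share of the table for each flag value.
SELECT
    IsArchived,
    COUNT(*)                                  AS NumRows,
    100.0 * COUNT(*) / SUM(COUNT(*)) OVER ()  AS PctOfRows
FROM @Orders
GROUP BY IsArchived;
```

If one value accounts for only a small share of the rows and your queries mostly look for that value, an index has a better chance of paying for itself. If the split is close to even, the concern above about maintenance cost outweighing the benefit applies.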


 