How does MySQL/InnoDB represent NULL values internally?

In MySQL (or maybe I should say: with MySQL's InnoDB engine), how are NULL values represented? That is, how does the representation of a table (or of a single record, if it's on the record level) change if a column is allowed to hold NULLs?

If it's different for different column data types, either explain the variety of approaches to representing NULLs, or just pick one data type (e.g. INT).

Reference

https://dev.mysql.com/doc/refman/5.7/en/innodb-physical-record.html

Quotes and Interpretation

ROW_FORMAT=REDUNDANT:

An SQL NULL value reserves one or two bytes in the record directory. Besides that, an SQL NULL value reserves zero bytes in the data part of the record if stored in a variable length column. In a fixed-length column, it reserves the fixed length of the column in the data part of the record. Reserving the fixed space for NULL values enables an update of the column from NULL to a non-NULL value to be done in place without causing fragmentation of the index page.

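That is, the NULL flag is 1 bit per column, carried inside that column's 1- or 2-byte entry in the record directory. A NULL in a fixed-length column saves no data space; a NULL in a variable-length column takes zero data bytes.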

ROW_FORMAT=COMPACT:

The variable-length part of the record header contains a bit vector for indicating NULL columns. If the number of columns in the index that can be NULL is N, the bit vector occupies CEILING(N/8) bytes. (For example, if there are anywhere from 9 to 16 columns that can be NULL, the bit vector uses two bytes.) Columns that are NULL do not occupy space other than the bit in this vector. The variable-length part of the header also contains the lengths of variable-length columns. Each length takes one or two bytes, depending on the maximum length of the column. If all columns in the index are NOT NULL and have a fixed length, the record header has no variable-length part.

That is, 1 bit per nullable column, and zero data space for a NULL value.
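To make the two quotes concrete, here is a rough sketch of the arithmetic they describe. The function names and the `fixed_length` parameter are made up for illustration; only the CEILING(N/8) rule and the fixed/variable distinction come from the manual text above.

```python
import math
from typing import Optional


def null_bitmap_bytes(nullable_columns: int) -> int:
    """Bytes used by the COMPACT NULL bit vector: CEILING(N/8)."""
    if nullable_columns < 0:
        raise ValueError("column count cannot be negative")
    return math.ceil(nullable_columns / 8)


def null_data_bytes(row_format: str, fixed_length: Optional[int]) -> int:
    """Bytes a NULL value reserves in the data part of the record.

    fixed_length is the column's fixed size in bytes, or None for a
    variable-length column (hypothetical parameter, for illustration).
    """
    # REDUNDANT keeps the full width of a fixed-length column even when NULL.
    if row_format == "REDUNDANT" and fixed_length is not None:
        return fixed_length
    # Variable-length columns in REDUNDANT, and every column in COMPACT,
    # store nothing in the data part for a NULL.
    return 0


if __name__ == "__main__":
    for n in (0, 1, 8, 9, 16, 17):
        print(n, "nullable columns ->", null_bitmap_bytes(n), "bitmap bytes")
    print("NULL INT, REDUNDANT:", null_data_bytes("REDUNDANT", 4), "data bytes")
    print("NULL INT, COMPACT:  ", null_data_bytes("COMPACT", 4), "data bytes")
```

So a NULL INT still costs 4 data bytes in REDUNDANT but none in COMPACT, where the only cost is its share of the bit vector.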

I suspect, without evidence, that DYNAMIC and COMPRESSED are like COMPACT.

Column length

Each variable-length column has a 1- or 2-byte length stored for it in the record header. The choice of 1 or 2 is based on the max potential column width. (Note: Though LONGTEXT needs a 4-byte length, the 'length' here is really talking about the amount that is stored in the record, not in overflow.)
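A minimal sketch of that choice, assuming the cutoff is simply whether the maximum width in bytes fits in one unsigned byte (255). That threshold and the function name are my assumptions for illustration; the text above only says the choice depends on the maximum potential width, and the real rule may also look at the length actually stored.

```python
def length_prefix_bytes(max_column_bytes: int) -> int:
    """Size of the stored length for a variable-length column.

    Simplified: decided only by the maximum possible width in bytes,
    with an assumed cutoff of 255 (the largest value one byte can hold).
    """
    if max_column_bytes < 0:
        raise ValueError("width cannot be negative")
    return 1 if max_column_bytes <= 255 else 2


if __name__ == "__main__":
    # VARCHAR(100) in a 1-byte charset can hold at most 100 bytes.
    print(length_prefix_bytes(100))
    # VARCHAR(100) in a 4-byte charset can hold up to 400 bytes.
    print(length_prefix_bytes(400))
```

Note that the width is in bytes, not characters, so the character set matters.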

Overflow storage

While I am on the topic, here is some info on what happens with "long" strings/blobs, that is, whether each one is kept in the record or stored elsewhere:

  • <= 40 bytes (in a given column): Stored in the record.
  • If the whole record fits in about 8KB: Stored in the record.
  • Otherwise, and COMPACT: 768+20 for long columns
  • Otherwise, and DYNAMIC and COMPRESSED: 20 for long columns

"768" means that the first 768 bytes of the text/blob are stored in the record; "20" means a 20-byte 'pointer' to where the rest (or all) is stored.
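The list above can be read as a small decision procedure. Here is a sketch of it, assuming the caller already knows whether the whole record fits (the "about 8KB" test) and is asking about a column that InnoDB would treat as "long"; how InnoDB picks which columns to move is not modeled. The function and constant names are invented for illustration.

```python
SMALL_COLUMN_LIMIT = 40      # bytes; at or below this the column stays in the record
COMPACT_PREFIX_BYTES = 768   # prefix kept in the record by COMPACT
POINTER_BYTES = 20           # 'pointer' to the overflow storage


def in_record_bytes(column_bytes: int, record_fits: bool, row_format: str) -> int:
    """Bytes of one long column that stay in the record itself."""
    # Short values, or any value in a record that fits, are stored inline.
    if column_bytes <= SMALL_COLUMN_LIMIT or record_fits:
        return column_bytes
    if row_format == "COMPACT":
        return COMPACT_PREFIX_BYTES + POINTER_BYTES
    if row_format in ("DYNAMIC", "COMPRESSED"):
        return POINTER_BYTES
    # REDUNDANT is left out, as in the text.
    raise ValueError("unsupported row format: " + row_format)


if __name__ == "__main__":
    for fmt in ("COMPACT", "DYNAMIC", "COMPRESSED"):
        print(fmt, in_record_bytes(5000, record_fits=False, row_format=fmt))
```

For a 5000-byte blob in a record that does not fit, COMPACT keeps 788 bytes in the record while DYNAMIC and COMPRESSED keep only the 20-byte pointer.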

For COMPRESSED, KEY_BLOCK_SIZE controls how much column data is stored in the clustered index, and how much is placed on overflow pages.

(I am leaving REDUNDANT out of that, because I don't have the details.)

Rules of Thumb

There are 20-30 bytes of overhead for each InnoDB row.

A BTree (including the Data for InnoDB, plus each secondary index) gravitates to 69% full as blocks split, etc.
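Those two rules can be combined into a back-of-the-envelope size estimate. This is only a sketch: the 25-byte overhead is the midpoint of the 20-30 range, 16384 is the default InnoDB page size, and the function name and parameters are made up for illustration. It ignores non-leaf pages, overflow pages, and secondary indexes.

```python
import math


def estimate_btree_bytes(rows: int,
                         avg_row_data_bytes: float,
                         row_overhead_bytes: float = 25,
                         fill_factor: float = 0.69,
                         page_bytes: int = 16384) -> int:
    """Rough on-disk size of the leaf level of one InnoDB BTree."""
    payload = rows * (avg_row_data_bytes + row_overhead_bytes)
    # Each page is assumed to hold only fill_factor of its capacity.
    pages = math.ceil(payload / (page_bytes * fill_factor))
    return pages * page_bytes


if __name__ == "__main__":
    size = estimate_btree_bytes(rows=1_000_000, avg_row_data_bytes=75)
    print(f"about {size / 1024 / 1024:.0f} MiB")
```

Treat the result as an order-of-magnitude figure, in the same spirit as the rules themselves.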

"Data_free" is woefully incomplete; don't trust it.

MyISAM is very spartan on space; it is easy to compute the space for a MyISAM table. From there, multiply by 2-3 to get the space needed for InnoDB. (There are exceptions, often involving MyISAM fragmentation, PK clustering, etc.)

This is for COMPACT only (REDUNDANT is of only historical interest unless you access dictionary tables). For each NULL-able column there is one bit in the NULLs header.

If there are no NULL-able fields in a table, the NULLs header size is zero.

If a column value is NULL, the bit is set and there is no value in the record data.

If a column value is not NULL, the bit is unset and the column value is stored in the record's data.

(Figure: a record in COMPACT format)
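The set/unset rule above amounts to a bitmap lookup. A minimal sketch, assuming the NULLs header has already been read into an integer and that bit i belongs to the i-th NULL-able column; the actual byte and bit order on disk is not modeled here, and the names are invented for illustration.

```python
from typing import List


def null_columns(nulls_header: int, nullable_names: List[str]) -> List[str]:
    """Names of the NULL-able columns whose bit is set, i.e. that are NULL."""
    return [
        name
        for i, name in enumerate(nullable_names)
        if (nulls_header >> i) & 1  # bit set: NULL, nothing in the record data
    ]


if __name__ == "__main__":
    # Hypothetical table with three NULL-able columns.
    nullable = ["middle_name", "phone", "notes"]
    print(null_columns(0b101, nullable))
```

NOT NULL columns never appear in `nullable_names`, which is why a table without NULL-able fields has a zero-size header.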

