
Does field size affect query time?

My question is in regards to MySQL, but I also wonder how this affects other databases. I have several fields that are varchar(255), but my coworker insists that if they were varchar(30), or any smaller size, queries would run faster. I'm not so sure, but if it's so I'll admit to it.

It depends on the query and the data, but you're probably optimizing too soon to even be worried.

For SELECT queries, the statement itself will run just as fast within MySQL, and as long as the data doesn't get larger than it would be in the smaller field, it will transmit just as fast. If the smaller field forces you to store the information in less space (would you use the extra 225 chars?), then you will get faster transmission to other programs.

For INSERT queries the size of the field isn't an issue, but using variable-length fields will slow the process down. INSERTs with fixed-length rows are notably faster (at least in MySQL 5.0 and earlier).

Generally, use the size you need for the data. If you don't know whether you need 255 chars or 30 chars, you're probably optimizing too soon. Are large data fields causing a bottleneck? Is your program suffering from database performance problems at all? Find your bottlenecks first, solve the problem with them second. I'd guess the difference in time you're looking at here is unimportant to whatever problem you are trying to solve.

Most other answers here focus on the fact that VARCHAR is stored in a variable-length manner, so it stores the number of bytes of the string you enter on a given row, not the maximum length of the field.

But during queries, there are some circumstances where MySQL converts a VARCHAR into a CHAR, and hence the size goes up to the maximum length. This happens, for instance, when MySQL creates a temporary table during some JOIN, ORDER BY, or GROUP BY operations.

Listing all the cases where it would do this is complicated. It depends on how the optimizer treats the query, on the other table structures and indexes you define, and on the type of query. It even depends on the version of MySQL, because the optimizer is improved with each version.

The short answer is yes, it can make a difference whether you use VARCHAR(255) or VARCHAR(30). So define the column's maximum length according to what you need, not a "big" length like 255 for the sake of tradition.
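As a rough illustration of the padding effect described in this answer, here is a minimal Python sketch. It assumes a deliberately simplified model in which a temporary table reserves the declared maximum width for every row. The function names, the row count, and the one-byte-per-character default are hypothetical choices for illustration, not MySQL's actual accounting.

```python
# Simplified model (an assumption, not MySQL's real memory accounting):
# when a VARCHAR column is treated as fixed-width in a temporary table,
# every row reserves the declared maximum, whatever the stored data is.


def padded_row_bytes(declared_chars: int, bytes_per_char: int = 1) -> int:
    """Bytes reserved per row for one column padded to its declared maximum."""
    return declared_chars * bytes_per_char


def temp_table_bytes(rows: int, declared_chars: int, bytes_per_char: int = 1) -> int:
    """Total bytes reserved for that padded column across all rows."""
    return rows * padded_row_bytes(declared_chars, bytes_per_char)


if __name__ == "__main__":
    rows = 1_000_000  # hypothetical number of rows in the temporary table
    for width in (30, 255):
        total = temp_table_bytes(rows, width)
        print(f"VARCHAR({width}): {total / 1_000_000:.1f} MB reserved")
```

Under this model the same data costs 8.5 times more temporary space when the column is declared as VARCHAR(255) instead of VARCHAR(30), and a multi-byte character set multiplies both figures.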

Since you asked about other databases…

It ABSOLUTELY does affect query time.

In Oracle, when data is moved from server to client, it's done through a buffer. Nothing revolutionary there. The number of rows it puts in that buffer is based on the maximum row size. Say your query returns 4 varchar columns. If the size of each column is 100 and it should be 10, Oracle will fit 10x fewer rows in each fetch than it could with right-sized column definitions. This results in blocks being re-read unnecessarily. It forces more network traffic and more round trips.
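The 10x figure follows from simple division if you take the answer's model at face value: a fixed-size buffer, with every row sized at its declared maximum. The Python sketch below works through that arithmetic. The buffer size and the function name are hypothetical, and this is not a description of how Oracle's client actually allocates fetch buffers.

```python
from typing import Sequence


def rows_per_fetch(buffer_bytes: int, column_widths: Sequence[int]) -> int:
    """Rows that fit in one fetch if each row is sized at its declared maximum."""
    max_row_bytes = sum(column_widths)
    return buffer_bytes // max_row_bytes


BUFFER_BYTES = 64_000  # hypothetical buffer size, chosen for round numbers

oversized = rows_per_fetch(BUFFER_BYTES, [100] * 4)   # four columns declared as 100
right_sized = rows_per_fetch(BUFFER_BYTES, [10] * 4)  # four columns declared as 10

print(oversized)                 # 160
print(right_sized)               # 1600
print(right_sized // oversized)  # 10
```

Fewer rows per fetch means more fetches, and therefore more round trips, to return the same result set.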

In Oracle you can change the size of the buffer with SET ARRAYSIZE. Try it sometime: run a query with one size and then run it again with 10% of the space. You'll see reads go up, network trips go up, and performance go down. Making columns way too big is just like making that buffer way too small.

But the real reason for accurately sized columns is data integrity. You keep bad stuff out. That's just as important as performance.

Remember:

  • It's never too early to design for performance.
  • 99% of what you say you'll come back to, you won't.
  • It's far easier, better, and cheaper to get something right the first time.

If you're only ever using the first 30 characters, then there won't be a difference between a varchar(30) and a varchar(255) (although there would be a difference with varchar(1000), which would take an extra byte).

Of course, if you end up using more than 30 characters, it will be slower, since you have more data to pass to the client and your indexes will be larger.

Anything up to VARCHAR(255) (as long as its values never need more than 255 bytes) will use one byte to store its size, so VARCHAR(30) and VARCHAR(255) won't make a difference.
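The length-prefix rule can be sketched in a few lines of Python. This follows the rule in MySQL's storage requirements documentation: one prefix byte if the column's values can never need more than 255 bytes, two otherwise. The function names are hypothetical, and the single-byte `latin-1` default is an assumption made to keep the numbers simple.

```python
def varchar_prefix_bytes(max_chars: int, max_bytes_per_char: int = 1) -> int:
    """Length-prefix size: 1 byte if values can never exceed 255 bytes, else 2."""
    return 1 if max_chars * max_bytes_per_char <= 255 else 2


def varchar_storage_bytes(
    value: str,
    max_chars: int,
    encoding: str = "latin-1",    # assumption: a single-byte character set
    max_bytes_per_char: int = 1,  # must match the chosen encoding
) -> int:
    """Bytes needed to store `value`: the encoded data plus the length prefix."""
    if len(value) > max_chars:
        raise ValueError("value is longer than the declared column length")
    data_bytes = len(value.encode(encoding))
    return data_bytes + varchar_prefix_bytes(max_chars, max_bytes_per_char)


value = "x" * 30
print(varchar_storage_bytes(value, 30))    # 31
print(varchar_storage_bytes(value, 255))   # 31
print(varchar_storage_bytes(value, 1000))  # 32
```

With a multi-byte character set the threshold is reached sooner: `varchar_prefix_bytes(255, 4)` returns 2, because a 255-character column at up to four bytes per character can need more than 255 bytes.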

But check whether your data is consistent, that is, always the same size. In that case a CHAR would be more useful, because you won't spend time on size information and the search for the data will be simpler (leaving indexes out of account here).

Even if your data isn't consistent but varies by, let's say, one byte, a CHAR would still be better, because you would spend one byte on size information anyway.

Very rarely will column width affect query performance. Certainly if you're using larger objects (BLOBs, LONGBLOBs, TEXTs, LONGTEXTs), there is the potential for a lot of data to get pulled. That could possibly affect performance, but it won't necessarily. That really only affects storage. If you care about storage size by data type, you can reference http://dev.mysql.com/doc/refman/5.0/en/storage-requirements.html to see the details.

And to reiterate: storage size of data does not necessarily impact the speed of queries. There are many other design considerations that will impact query speed: design of the tables and relationships, key structure, indexes, query and join architecture, etc.

A few years ago many people suggested using tinytext instead of varchar in MySQL for performance, since row-by-row search was supposedly faster with constant row data size. Surely MySQL's query, storage, and index handling algorithms have evolved since then, and it may not have that much of an impact now.

But you're probably optimizing too soon and shouldn't be worried about performance at this level.
