
SqlServer and nvarchar(max)

We are currently looking at setting our string columns to nvarchar(max) rather than specifying a length, to prevent any problems where there is not enough room in the database to store the string. I'm just wondering whether this is a good thing or whether it could cause problems: if it is OK to do, then why would anyone specify a length such as nvarchar(10) rather than nvarchar(max)? We also use varbinary(max) a lot, since we don't know how much binary data we will need, and I'm not sure how much of an effect this has either, given that our inserts are not as fast as I think they should be. This is an example table:

CREATE TABLE [dbo].[SAMPLETABLE] (  
[ID] [uniqueidentifier] NOT NULL,  
[FIELD1] [int] NOT NULL,  
[FIELD2] [nvarchar] (2000) NULL,  
[FIELD3] [nvarchar] (max) NULL,  
[FIELD4] [uniqueidentifier] NULL,  
[FIELD5] [int] NULL,  
[FIELD6] [nvarchar] (2000) NULL,  
[FIELD7] [varbinary] (max) NULL,  
[FIELD8] [varbinary] (max) NULL,  
[FIELD9] [varbinary] (max) NULL,  
[FIELD10] [uniqueidentifier] NULL,  
[FIELD11] [nvarchar] (2000) NULL,  
[FIELD12] [varbinary] (max) NULL,  
[FIELD13] [varbinary] (max) NULL,  
[FIELD14] [bit] NULL,  
[FIELD15] [uniqueidentifier] NULL,  
[FIELD16] [varbinary] (max) NULL,  
[FIELD17] [bit] NULL,  
[FIELD18] [tinyint] NULL,  
[FIELD19] [datetime] NULL,  
[FIELD20] [nvarchar] (2000) NULL,  
PRIMARY KEY CLUSTERED   
(  
    [ID] ASC  
)
) ON [PRIMARY]  

GO

Given a table design like that, would changing the nvarchar(2000) columns to nvarchar(max) make things any worse (or better)? Does SQL Server frown upon designs like this?

If you're happy for J. Random Developer, six months down the line, to insert a work by Shakespeare into each column, then fine.

For me, a big part of data modelling is thinking seriously about what data I do want to allow in each column, and which data I wish to prohibit. I then apply appropriate CHECK constraints to achieve those restrictions (as best SQL Server allows). Having a sensible length check available "for free" has always seemed like a bonus.
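As a rough illustration of that approach, here is a minimal sketch. The table, column names, lengths, and rules are all hypothetical and chosen only for the example; they are not taken from the question.

```sql
-- Hypothetical table: every name and limit here is illustrative only.
CREATE TABLE dbo.Customer (
    CustomerID   uniqueidentifier NOT NULL PRIMARY KEY,
    -- The declared length acts as a free length check:
    -- an INSERT of a longer value fails instead of being stored.
    DisplayName  nvarchar(100)    NOT NULL,
    CountryCode  nchar(2)         NOT NULL,
    Email        nvarchar(320)    NULL,
    -- Rules a length cannot express go into CHECK constraints.
    CONSTRAINT CK_Customer_DisplayName_NotBlank
        CHECK (LEN(LTRIM(RTRIM(DisplayName))) > 0),
    -- Very loose shape check: at least one character on each side of '@'.
    CONSTRAINT CK_Customer_Email_HasAt
        CHECK (Email IS NULL OR Email LIKE N'%_@_%')
);
GO
```

With nvarchar(max) everywhere, the length part of this would have to be written by hand as extra CHECK constraints, or left unchecked.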


You're also not doing much "future proofing": changing the length of a (n)varchar column to a larger value at a later date is, I believe, purely a metadata operation. So I'd say size the columns appropriately for the data you're expecting to deal with today (and okay, for the next year or so). If you need to expand them later, it takes seconds to do.
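A sketch of what that later widening could look like, using a hypothetical dbo.Product table that is not part of the question:

```sql
-- Hypothetical table, sized for the data expected today.
CREATE TABLE dbo.Product (
    ProductID int          NOT NULL PRIMARY KEY,
    Title     nvarchar(50) NOT NULL
);
GO

-- Later, when 50 characters turns out to be too short, widen the column.
-- NOT NULL is restated explicitly so that the column's nullability
-- does not change as a side effect of the ALTER.
ALTER TABLE dbo.Product
    ALTER COLUMN Title nvarchar(200) NOT NULL;
GO
```

This only covers growing from one bounded length to a larger bounded length; it is worth timing on a copy of your own data before relying on it.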

Let's hope you don't use the column for searching or need unique values...

Index keys cannot be more than 900 bytes wide (1,700 bytes for nonclustered indexes from SQL Server 2016 onward), and a max column cannot be used as an index key column at all, so you can probably never create an index on it. This is one downside, because it gives you:

  • really bad searching performance
  • no unique constraints

It can be worked around with a computed column, but then why not just store what you need?
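For reference, one possible shape of the computed-column workaround is sketched below. It assumes the dbo.SAMPLETABLE definition from the question; the column name FIELD3_PREFIX, the index name, and the 200-character prefix length are made up for the example.

```sql
-- FIELD3 is nvarchar(max), so it cannot be an index key column.
-- A bounded prefix of it can be.
ALTER TABLE dbo.SAMPLETABLE
    ADD FIELD3_PREFIX AS CAST(LEFT(FIELD3, 200) AS nvarchar(200)) PERSISTED;
GO

-- 200 characters * 2 bytes = 400 bytes, inside the 900-byte key limit.
CREATE NONCLUSTERED INDEX IX_SAMPLETABLE_FIELD3_PREFIX
    ON dbo.SAMPLETABLE (FIELD3_PREFIX);
GO

-- Seek on the indexed prefix first, then confirm against the full value.
DECLARE @needle nvarchar(max) = N'some long value';

SELECT ID
FROM dbo.SAMPLETABLE
WHERE FIELD3_PREFIX = CAST(LEFT(@needle, 200) AS nvarchar(200))
  AND FIELD3 = @needle;
GO
```

Note that the prefix index speeds up equality searches but cannot enforce uniqueness of the full value, and indexes on computed columns require specific session SET options (such as QUOTED_IDENTIFIER ON) to be in effect. If 200 characters is enough to search on, that is also a hint about how long the real column needs to be.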

Switching from the in-row types to BLOB types is always a big decision. You have to internalize that the BLOB types (VARCHAR(MAX), NVARCHAR(MAX) and VARBINARY(MAX)) are a completely different type internally from the in-row types.

So switching all columns to BLOB types might bring in a lot of side effects you have not considered: the impossibility of indexing the BLOB columns, the lack of online operations, general performance degradation due to the inherently slower BLOB code paths, and so on. The most serious hurdle may be that you won't be able to index the columns after making them BLOBs. If this is not a show stopper, then you'll have to test and measure the performance impact.
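One starting point for the measuring part is to look at where the table's data actually lives. The query below is only a sketch against the dbo.SAMPLETABLE from the question; it reads the sys.dm_db_partition_stats dynamic management view, which requires VIEW DATABASE STATE permission.

```sql
-- Page counts per index/partition, split by storage category:
-- in-row data, row-overflow data, and LOB (BLOB) data.
SELECT
    OBJECT_NAME(ps.object_id)        AS table_name,
    ps.index_id,
    ps.row_count,
    ps.in_row_data_page_count,
    ps.row_overflow_used_page_count,
    ps.lob_used_page_count
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'dbo.SAMPLETABLE');
GO
```

Running it before and after a representative load, for both the bounded and the max variants of the table, gives a first indication of how much data is being pushed out of the row. Timing the inserts themselves still has to be done against your own workload.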

The data modeling concerns others have raised are in general valid, but I understand that often in the real world the theory works only in theory...

The answer is the same as the answer to "Why do I need to specify an int when I can store all numbers as strings?" Because it aids:

  • efficiency of speed.
  • efficiency of storage.
  • communicating the author's/architect's intention.
  • cutting down on data errors, since only a certain kind of data will fit.

But it won't cause any obvious "problems" immediately, because nvarchar(10) is a subset of nvarchar(max).

Here is the same answer I gave to another guy who wanted endless tables:

Database alternative to MySQL made for millions of TABLES

That bit above is a less than optimal design for any relational data storage. It is "Pop Goes the Weasel" for data: just pick one and you might get the data you want.

Perhaps a NoSQL solution would work better, so you can have dynamic data and not worry about column limits.

I also think that if we are going to answer questions, it behooves us to offer best (or at least better) practices when there are alternatives to a bad design.

Think of the guy coming after you, as KM said above.

In my experience, though, not many 2000-character fields end up indexed. I think it's much better to use nvarchar(max) than some arbitrary length that forces you to truncate data when it turns out not to be long enough.

An example I saw is an error log table where the table designers had not been prepared to store the call stack in an nvarchar(max) field, so they had stored only the first n thousand characters, resulting in truncated call stacks with the most interesting sections missing.
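A sketch of how such a log table might avoid the problem, keeping bounded lengths where the data has a natural limit and using nvarchar(max) only for the call stack. The table and column names are hypothetical, not the ones from the system I saw.

```sql
-- Hypothetical error log table: only the genuinely unbounded
-- column is declared as nvarchar(max).
CREATE TABLE dbo.ErrorLog (
    ErrorLogID  bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
    LoggedAtUtc datetime2(3)         NOT NULL
        CONSTRAINT DF_ErrorLog_LoggedAtUtc DEFAULT (SYSUTCDATETIME()),
    Source      nvarchar(200)        NOT NULL,  -- short, searchable
    Message     nvarchar(4000)       NOT NULL,  -- bounded summary text
    CallStack   nvarchar(max)        NULL       -- stored in full, never truncated
);
GO
```

The short columns stay indexable for searching, while the full stack is kept for whoever has to debug the failure.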


 