What if the column to be indexed is nvarchar data type in SQL Server?

I retrieve data by joining multiple tables, as shown in the image below. Because the FK column (EmployeeID) of the Event table contains no data, I have to join the two tables on their CardNo (nvarchar) fields. The CardNo values in the Event and Employee tables also have different numbers of digits, so I have to use SQL Server's RIGHT function as well, and this makes the query take approximately 10 times longer to execute. What should I do in this situation? Can I keep using the CardNo field without changing its data type to int, etc.? (Changing it might cause other problems, so it would be better to find a solution that leaves the data type alone.) The execution plan of the query is also shown below.

Query:

; WITH a AS (
    SELECT emp.EmployeeName, emp.Status, dep.DeptName, job.JobName, emp.CardNo
    FROM TEmployee emp
    LEFT JOIN TDeptA AS dep ON emp.DeptAID = dep.DeptID
    LEFT JOIN TJob AS job ON emp.JobID = job.JobID
),
b AS (
    SELECT eve.EventID, eve.EventTime, eve.CardNo, evt.EventCH, dor.DoorName
    FROM TEvent eve
    LEFT JOIN TEventType AS evt ON eve.EventType = evt.EventID
    LEFT JOIN TDoor AS dor ON eve.DoorID = dor.DoorID
)
SELECT *
FROM b
LEFT JOIN a ON RIGHT(a.CardNo, 8) = RIGHT(b.CardNo, 8)
ORDER BY b.EventID ASC

Join schema

Execution plan

You can add a computed column to your table like this:

ALTER TABLE TEmployee -- Don't start your table names with prefixes, you already know they're tables
ADD CardNoRight8 AS RIGHT(CardNo, 8) PERSISTED

ALTER TABLE TEvent
ADD CardNoRight8 AS RIGHT(CardNo, 8) PERSISTED

CREATE INDEX TEmployee_CardNoRight8_IDX ON TEmployee (CardNoRight8)
CREATE INDEX TEvent_CardNoRight8_IDX ON TEvent (CardNoRight8)

You don't need to persist the column, since it already meets the criteria for a computed column to be indexed, but adding the PERSISTED keyword shouldn't hurt and might help the performance of other queries. It will cause a minor performance hit on updates and inserts, but that's probably fine in your case unless you're importing a lot of data (millions of rows) at a time.
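One way to confirm that a computed column meets those criteria is to ask SQL Server directly. The following is a small sketch, assuming the CardNoRight8 column from the statements above has already been added to TEmployee and that the table lives in your default schema. COLUMNPROPERTY returns 1 for each property the column satisfies.

```sql
-- Assumption: TEmployee.CardNoRight8 exists (added by the ALTER TABLE above).
-- Each expression returns 1 if the property holds, 0 if it does not,
-- and NULL if the table or column name cannot be resolved.
SELECT
    COLUMNPROPERTY(OBJECT_ID(N'TEmployee'), N'CardNoRight8', 'IsDeterministic') AS IsDeterministic,
    COLUMNPROPERTY(OBJECT_ID(N'TEmployee'), N'CardNoRight8', 'IsPrecise')       AS IsPrecise,
    COLUMNPROPERTY(OBJECT_ID(N'TEmployee'), N'CardNoRight8', 'IsIndexable')     AS IsIndexable;
```

If IsIndexable comes back as 1, the CREATE INDEX statements should be accepted whether or not the column is PERSISTED.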

The better solution, though, is to make sure that the columns that are supposed to match actually match. If the right 8 characters of the card number are something meaningful, then they shouldn't be part of the card number; they should be another column. If this is a case where one table uses leading zeroes and the other doesn't, then you should fix the data to be consistent instead of putting together workarounds like this.
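Before deciding how to fix the data, it may help to see how the two tables actually differ. This is only a diagnostic sketch: it counts rows per CardNo length in each table and changes nothing. The column aliases are made up for the example.

```sql
-- Read-only check: how long are the CardNo values in each table?
-- Note that LEN ignores trailing spaces.
SELECT N'TEmployee' AS TableName, LEN(CardNo) AS CardNoLength, COUNT(*) AS NumRows
FROM TEmployee
GROUP BY LEN(CardNo)
UNION ALL
SELECT N'TEvent', LEN(CardNo), COUNT(*)
FROM TEvent
GROUP BY LEN(CardNo)
ORDER BY TableName, CardNoLength;
```

If one table turns out to hold, say, only 8-character values and the other only 10-character values, that points to a fixed prefix or leading zeroes that could be normalized once rather than stripped in every query.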

This line is what is costing you 86% of the query time:

LEFT JOIN a ON RIGHT(a.CardNo, 8) = RIGHT(b.CardNo, 8)

This is happening because it has to run RIGHT() on those fields for every row and then match the results against the other table, which is obviously going to be inefficient.

The most straightforward solution is probably either to remove the RIGHT() entirely or to re-implement it as an actual column on the table, so that it doesn't have to be calculated on the fly while the query is running.

When inserting a record, you would also have to store the rightmost eight digits of the card number in this field. My original thought was to use a computed column, but I wasn't sure those could be indexed, so I went with a regular column here. (SQL Server can in fact index a computed column as long as its expression is deterministic and precise, so a computed column works too.)

; WITH a AS (
    SELECT emp.EmployeeName, emp.Status, dep.DeptName, job.JobName, emp.CardNoRightEight 
    FROM TEmployee emp 
    LEFT JOIN TDeptA AS dep ON emp.DeptAID = dep.DeptID 
    LEFT JOIN TJob AS job ON emp.JobID = job.JobID
),                           
b AS (
    SELECT eve.EventID, eve.EventTime, eve.CardNoRightEight, evt.EventCH, dor.DoorName 
    FROM TEvent eve LEFT JOIN TEventType AS evt ON eve.EventType = evt.EventID
    LEFT JOIN TDoor AS dor ON eve.DoorID = dor.DoorID
) 
SELECT *
FROM b
LEFT JOIN a ON a.CardNoRightEight = b.CardNoRightEight
ORDER BY b.EventID ASC
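For the regular-column variant, the columns have to be created and filled before the query above can use them. A rough sketch of that setup follows; the nvarchar(8) type and the index names are assumptions, chosen only to mirror RIGHT(CardNo, 8).

```sql
-- Hypothetical setup for a regular (non-computed) CardNoRightEight column.
ALTER TABLE TEmployee ADD CardNoRightEight nvarchar(8) NULL;
ALTER TABLE TEvent    ADD CardNoRightEight nvarchar(8) NULL;
-- GO is a batch separator understood by SSMS and sqlcmd; it makes the new
-- columns visible to the statements that follow.
GO

-- Backfill the existing rows once.
UPDATE TEmployee SET CardNoRightEight = RIGHT(CardNo, 8);
UPDATE TEvent    SET CardNoRightEight = RIGHT(CardNo, 8);

-- Index both sides of the join.
CREATE INDEX TEmployee_CardNoRightEight_IDX ON TEmployee (CardNoRightEight);
CREATE INDEX TEvent_CardNoRightEight_IDX    ON TEvent (CardNoRightEight);
```

Unlike a computed column, this one is not maintained automatically, so whatever inserts or updates CardNo must also set CardNoRightEight.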

This will help you see how to add a calculated column to your database:

create table #temp (test varchar(30))
insert into #temp
values('000456')

alter table #temp
add test2 as right(test, 3) persisted

select * from #temp

The other alternative is to fix the data and the data entry so that both columns have the same data type and contain the same leading zeros (or have them removed).

Many thanks for all of your help. With the help of your answers, I first managed to reduce the query execution time from 2 minutes to 1 by using computed columns. After that, by creating an index on these columns, I managed to reduce the execution time to 3 seconds. Wow, it is really perfect :)

Here are the steps, posted for anyone who runs into a similar problem:

Step I: Add computed columns to the tables. (As the CardNo fields are of nvarchar data type, I cast them to int inside the computed expression. Note that RIGHT returns a character string, so the resulting columns hold character data, not int.)

ALTER TABLE TEvent ADD CardNoRightEight AS RIGHT(CAST(CardNo AS int), 8)  
ALTER TABLE TEmployee ADD CardNoRightEight AS RIGHT(CAST(CardNo AS int), 8) 
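The cast to int changes what ends up in the column compared with RIGHT(CardNo, 8) on its own, because converting to int drops leading zeroes. The query below is a self-contained illustration with made-up card numbers, not data from the tables in question.

```sql
-- Made-up sample values; compare the two expressions side by side.
SELECT v.CardNo,
       RIGHT(v.CardNo, 8)              AS RightOfText,
       RIGHT(CAST(v.CardNo AS int), 8) AS RightOfInt
FROM (VALUES (N'0012345678'), (N'0000000123'), (N'123')) AS v(CardNo);
```

For N'0012345678' both expressions give 12345678, but for N'0000000123' the text version gives 00000123 while the cast version gives 123, which then matches a plain N'123' in the other table. The cast also raises a conversion error for any CardNo that is not a valid int (non-numeric text, or a number above 2147483647), so this version only suits purely numeric card numbers.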


Step II: Create indexes on the computed columns so that the query executes faster:

CREATE INDEX TEmployee_CardNoRightEight_IDX ON TEmployee (CardNoRightEight)
CREATE INDEX TEvent_CardNoRightEight_IDX ON TEvent (CardNoRightEight)


Step III: Update the query to use the computed columns:

; WITH a AS (
    SELECT emp.EmployeeName, emp.Status, dep.DeptName, job.JobName, emp.CardNoRightEight --emp.CardNo 
    FROM TEmployee emp 
    LEFT JOIN TDeptA AS dep ON emp.DeptAID = dep.DeptID 
    LEFT JOIN TJob AS job ON emp.JobID = job.JobID
    ),                         
b AS (
    SELECT eve.EventID, eve.EventTime, evt.EventCH, dor.DoorName, eve.CardNoRightEight --eve.CardNo
    FROM TEvent eve 
    LEFT JOIN TEventType AS evt ON eve.EventType = evt.EventID 
    LEFT JOIN TDoor AS dor ON eve.DoorID = dor.DoorID) 

SELECT * FROM b LEFT JOIN a ON a.CardNoRightEight = b.CardNoRightEight --ON RIGHT(a.CardNo, 8) = RIGHT(b.CardNo, 8)
ORDER BY b.EventID ASC
