
Delete with WHERE - date, time and string comparison - very slow

I have a slow-performing query and was hoping someone with a bit more knowledge of SQL might be able to help me improve its performance:

I have two tables, a Source and a Common. I load in some data which contains a Date, a Time, and a String (which is a server name), plus some..

The Source table can contain 40k+ rows (it has 30-odd columns: a mix of ints, dates, times, and some varchar(255)/(max) columns).

I use the query below to remove any data from Common that is in Source:

DELETE FROM Common
WHERE CONVERT(varchar(max), [Date], 102) + CONVERT(varchar(max), [Time], 108) + [ServerName]
    IN (SELECT CONVERT(varchar(max), [date], 102) + CONVERT(varchar(max), [time], 108) + ServerName
        FROM Source
        WHERE sc_status < 300)

The Source Fields are in this format:

  • ServerName varchar(255), e.g. SN1234
  • Date varchar(255), e.g. 2012-05-22
  • Time varchar(255), e.g. 08:12:21

The Common Fields are in this format:

  • ServerName varchar(255), e.g. SN1234
  • Date date, e.g. 2011-08-10
  • Time time(7), e.g. 14:25:34.0000000
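
For reference, a minimal sketch of the two tables as described above (only the listed columns; the dbo schema and the sc_status column are assumptions, and the other ~30 Source columns are omitted):

CREATE TABLE dbo.[Source] ([ServerName] varchar(255), [Date] varchar(255), [Time] varchar(255), sc_status int);
CREATE TABLE dbo.Common   ([ServerName] varchar(255), [Date] date, [Time] time(7));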

Thanks

Converting both sides to strings, then concatenating them into one big string, then comparing those results is not very efficient. Only do conversions where you have to. Try this example and see how it compares:

DELETE c
  FROM dbo.Common AS c
  INNER JOIN dbo.Source AS s
  ON s.ServerName = c.ServerName
  AND CONVERT(DATE, s.[Date]) = c.[Date]
  AND CONVERT(TIME(7), s.[Time]) = c.[Time]
  WHERE s.sc_status < 300;

All those conversions to VARCHAR(MAX) are unnecessary and probably slowing you down. I would start with something like this instead:

DELETE c
FROM [Common] AS c
WHERE EXISTS (
    SELECT 1
    FROM Source
    WHERE CAST([Date] AS DATE)=c.[Date]
    AND CAST([Time] AS TIME(7))=c.[Time]
    AND [ServerName]=c.[ServerName]
    AND sc_status < 300
);

Something like this:

DELETE Common
FROM Common
INNER JOIN Source
    ON Common.ServerName = Source.ServerName
    AND Common.[Date] = CONVERT(date, Source.[Date])
    AND Common.[Time] = CONVERT(time(7), Source.[Time])
WHERE Source.sc_status < 300;

If it's still too slow after that, then you need some indexes, possibly on both tables.
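
For example, something like this might help (a sketch only: the index names are invented, and the column choices assume the join above):

CREATE NONCLUSTERED INDEX IX_Common_Server_Date_Time
    ON dbo.Common ([ServerName], [Date], [Time]);

-- Source.[Date] and Source.[Time] are varchar and sit inside CONVERT() in the
-- join, so they cannot be seeked on directly; indexing ServerName and the
-- sc_status filter column still narrows the rows read.
CREATE NONCLUSTERED INDEX IX_Source_Server_Status
    ON dbo.Source ([ServerName], sc_status) INCLUDE ([Date], [Time]);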

Removing the unnecessary conversions will help a lot, as detailed in Aaron's answer. You might also consider creating an indexed view over the top of the log table, since you probably don't have much flexibility in that schema or in the insert DML from the log parser.

Simple example:

create table dbo.[Source] (LogId int primary key, servername varchar(255), 
   [date] varchar(255), [time] varchar(255));
insert into dbo.[Source]
    values  (1, 'SN1234', '2012-05-22', '08:12:21'),
            (2, 'SN5678', '2012-05-23', '09:12:21')
go

create view dbo.vSource with schemabinding
as
    select  [LogId],
            [servername], 
            [date], 
            [time], 
            [actualDateTime] = convert(datetime, [date]+' '+[time], 120)
    from    dbo.[Source];
go

create unique clustered index UX_Source on dbo.vSource(LogId);
create nonclustered index IX_Source on dbo.vSource(actualDateTime);

This will give you an indexed datetime column on which to seek and vastly improve your execution plans at the cost of some insert performance.
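
For instance, a query that filters on the computed column can then seek on IX_Source (a sketch; the WITH (NOEXPAND) hint is required outside Enterprise Edition for the optimizer to use the view's indexes):

SELECT LogId, servername
FROM dbo.vSource WITH (NOEXPAND)
WHERE actualDateTime >= '2012-05-22T00:00:00'
  AND actualDateTime <  '2012-05-23T00:00:00';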
