Optimizing SQL query to return Record with tags

I am looking for help optimizing a query I am writing for SQL Server. The database schema is as follows.

The TradeLeads table. Each record in this table is a small article.

CREATE TABLE [dbo].[TradeLeads]
(
    [TradeLeadID] INT NOT NULL PRIMARY KEY IDENTITY(1,1),
    Title nvarchar(250),
    Body nvarchar(max),
    CreateDate datetime,
    EditDate datetime,
    CreateUser nvarchar(250),
    EditUser nvarchar(250), 
    [Views] INT NOT NULL DEFAULT(0)

)

Here's the cross reference table to link a TradeLead article to an Industry record.

CREATE TABLE [dbo].[TradeLeads_Industries]
(
    [ID] INT NOT NULL PRIMARY KEY IDENTITY(1,1), 
    [TradeLeadID] INT NOT NULL, 
    [IndustryID] INT NOT NULL
)

Finally, the schema for the Industry object. These are essentially just tags, but users cannot create them; the database will hold a fixed set.

CREATE TABLE [dbo].[Industries]
(
    IndustryID INT NOT NULL PRIMARY KEY identity(1,1),
    Name nvarchar(200)
)

The procedure I'm writing is used to search for specific TradeLead records. The user would be able to search for keywords in the title of the TradeLead object, search using a date range, and search for a TradeLead with specific Industry Tags.

The database will most likely be holding around 1,000,000 TradeLead articles and about 30 industry tags.

This is the query I have come up with:

DECLARE @Title nvarchar(50);
SET @Title = 'Testing';
-- User-defined table type containing a list of IndustryIDs. Would probably have around 5 selections at most.
DECLARE @SelectedIndustryIDs IndustryIdentifierTable_UDT;
DECLARE @Start DATETIME;
SET @Start = NULL;
DECLARE @End DATETIME;
SET @End = NULL;


SELECT *
FROM (
    -- Subquery to return all the TradeLeads that match the user's criteria.
    -- The criteria (@Title, @Start, @End) can be NULL.
    SELECT TradeLeadID,
           Title,
           Body,
           CreateDate,
           CreateUser,
           Views
    FROM TradeLeads
    WHERE (@Title IS NULL OR Title LIKE '%' + @Title + '%')
      AND (@Start IS NULL OR CreateDate >= @Start)
      AND (@End IS NULL OR CreateDate <= @End)
) AS FTL
INNER JOIN (
    -- Subquery to return the TradeLeadID for each TradeLead record with related IndustryIDs.
    SELECT TI.TradeLeadID
    FROM TradeLeads_Industries TI
    -- Left join the selected IndustryIDs to the cross-reference table to get the TradeLeadIDs that are associated with a specific industry.
    LEFT JOIN @SelectedIndustryIDs SIDS
        ON SIDS.IndustryID = TI.IndustryID
    -- It's possible the user has not selected any IndustryIDs to search for.
    WHERE (NOT EXISTS (SELECT 1 FROM @SelectedIndustryIDs) OR SIDS.IndustryID IS NOT NULL)
    -- Group by to reduce the number of records.
    GROUP BY TI.TradeLeadID
) AS SelectedIndustries
    ON SelectedIndustries.TradeLeadID = FTL.TradeLeadID



With about 600,000 TradeLead records and an average of 4 IndustryIDs attached to each one, the query takes around 8 seconds to finish on a local machine. I would like to get it as fast as possible. Any tips or insight would be appreciated.

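There are a few points here.

Catch-all predicates like (@Start IS NULL OR CreateDate >= @Start) can cause a problem related to parameter sniffing: the cached plan has to be valid for every combination of NULL and non-NULL parameters, so it is rarely a good plan for any particular combination. Two ways of working around it are:

  1. Add OPTION (RECOMPILE) to the end of the query.
  2. Use dynamic SQL to include only the criteria that the user has asked for.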
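
As a rough sketch of the first option, the catch-all filter on dbo.TradeLeads could keep its current shape and simply gain the hint. The parameter values below are placeholders, and the industry filter is left out for brevity. With the hint, the statement is compiled on every execution using the current variable values, so on reasonably recent SQL Server versions the optimizer can usually discard the branches that do not apply, at the cost of a compile per call.

```sql
-- Placeholder values; in the real procedure these would be its parameters.
DECLARE @Title nvarchar(50) = N'Testing';
DECLARE @Start datetime = NULL;
DECLARE @End   datetime = NULL;

SELECT TradeLeadID, Title, Body, CreateDate, CreateUser, [Views]
FROM dbo.TradeLeads
WHERE (@Title IS NULL OR Title LIKE N'%' + @Title + N'%')
  AND (@Start IS NULL OR CreateDate >= @Start)
  AND (@End   IS NULL OR CreateDate <= @End)
-- Compile a fresh plan for the values supplied on this execution.
OPTION (RECOMPILE);
```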

I would favour the second method for this data.
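
A minimal sketch of the dynamic SQL approach might look like the following. It assumes that IndustryIdentifierTable_UDT has a single IndustryID column (inferred from the question's joins, not shown there) and that the variables stand in for the procedure's parameters. Only the filters that were actually supplied are appended, and the values are still passed as parameters through sp_executesql rather than concatenated into the string.

```sql
-- Stand-ins for the procedure's parameters (assumption).
DECLARE @Title nvarchar(50) = N'Testing';
DECLARE @Start datetime = NULL;
DECLARE @End   datetime = NULL;
DECLARE @SelectedIndustryIDs IndustryIdentifierTable_UDT;

DECLARE @sql nvarchar(max) = N'
SELECT t.TradeLeadID, t.Title, t.Body, t.CreateDate, t.CreateUser, t.[Views]
FROM dbo.TradeLeads AS t
WHERE 1 = 1';

-- Append a predicate only when the matching criterion was supplied.
IF @Title IS NOT NULL
    SET @sql += N'
  AND t.Title LIKE N''%'' + @Title + N''%''';
IF @Start IS NOT NULL
    SET @sql += N'
  AND t.CreateDate >= @Start';
IF @End IS NOT NULL
    SET @sql += N'
  AND t.CreateDate <= @End';
IF EXISTS (SELECT 1 FROM @SelectedIndustryIDs)
    SET @sql += N'
  AND EXISTS (SELECT 1
              FROM dbo.TradeLeads_Industries AS ti
              INNER JOIN @SelectedIndustryIDs AS sids
                  ON sids.IndustryID = ti.IndustryID
              WHERE ti.TradeLeadID = t.TradeLeadID)';

-- Unused parameters are harmless; table-valued parameters must be READONLY.
EXEC sys.sp_executesql
    @sql,
    N'@Title nvarchar(50), @Start datetime, @End datetime,
      @SelectedIndustryIDs IndustryIdentifierTable_UDT READONLY',
    @Title = @Title, @Start = @Start, @End = @End,
    @SelectedIndustryIDs = @SelectedIndustryIDs;
```

Each distinct combination of filters produces its own statement text and therefore its own cached plan. One behavioural difference from the question's query is worth checking: when no industries are selected, this sketch also returns TradeLeads that have no rows in TradeLeads_Industries, which the original INNER JOIN would exclude.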

Next, the query can be rewritten to be more efficient by using EXISTS (assuming the user has supplied all of the criteria, including industry IDs):

select
    TradeLeadID, 
    Title, 
    Body, 
    CreateDate, 
    CreateUser, 
    [Views]
from
    dbo.TradeLeads t
where
    Title LIKE '%' + @Title + '%' and
    CreateDate >= @Start and
    CreateDate <= @End and
    exists (
        select
            'x'
        from
            dbo.TradeLeads_Industries ti
                inner join
            @Selectedindustryids sids
                on ti.IndustryID = sids.IndustryID
        where
            t.TradeLeadID = ti.TradeLeadID
    );

Finally, you will want at least one index on the dbo.TradeLeads_Industries table. The following key column orders are candidates:

(TradeLeadID, IndustryID)
(IndustryID, TradeLeadID)

Testing will tell you whether one or both are useful.
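
For illustration, the two candidates could be created as below. The index names are hypothetical, and whether they should be unique or replace the existing clustered primary key on ID is a separate decision not covered here.

```sql
-- Hypothetical index names; key column order matches the two candidates above.
CREATE NONCLUSTERED INDEX IX_TradeLeads_Industries_TradeLeadID_IndustryID
    ON dbo.TradeLeads_Industries (TradeLeadID, IndustryID);

CREATE NONCLUSTERED INDEX IX_TradeLeads_Industries_IndustryID_TradeLeadID
    ON dbo.TradeLeads_Industries (IndustryID, TradeLeadID);

-- Report logical reads and elapsed time so runs can be compared
-- with and without each index.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
```

Running the search query with these settings on, then checking the actual execution plan, should show which of the two indexes the optimizer picks for typical searches.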
