
SQL join on CASE statement taking too long

When I join two tables using a CASE statement, the results take too long to come back. The table structure looks like the one below. AdminId values in Table A may or may not contain an alphanumeric ID.

Table A
--------
AdminId
-------------
Kerry, John (A1234567)
K,John 
..

Table B
---------
Id      | Name
-----------------
1234567 | Kerry, John
1234568 | K,John
...

The join is on a CASE statement that looks like this:

CASE
    WHEN adminid LIKE '%[0-9]%' AND dbo.udf_GetNumeric(AdminId) = B.id THEN 1
    WHEN adminid NOT LIKE '%[0-9]%' AND adminid = B.Name THEN 1
    ELSE 0
END = 1

The function udf_GetNumeric just extracts the numeric sequence from an alphanumeric string. When I comment out the line

WHEN adminid  NOT LIKE '%[0-9]%'  AND adminid = B.Name THEN 1

the query only takes a few seconds. Is there a better way of writing the join so that I can improve performance?

I agree with @Turophile: the tables should be redesigned. Your options include:

1. Create an AdminId column (which doesn't have to be the PK and doesn't need a NOT NULL constraint). Populate it with the numeric data you're looking for using your udf_GetNumeric.
2. If the data is coming into the table as displayed, create a calculated column using udf_GetNumeric. Make it a persisted column and put an index on it.
3. If you can't modify the structure of the existing table, create a 2nd table (B_Supplement), put your AdminID in there, and join it to table B using table B's primary key.
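As a rough sketch of options 1 and 2, something along these lines might work. I'm guessing at names here: dbo.TableA stands in for the table that holds AdminId, AdminNumericId and the index name are made up for illustration, and I'm assuming B.Id is an INT. For option 2, SQL Server will only let you persist and index the column if udf_GetNumeric is deterministic and was created WITH SCHEMABINDING, so check that first.

```sql
-- Assumed names: dbo.TableA holds AdminId; AdminNumericId and the index
-- name are hypothetical. Assumes B.Id is an INT.
-- Use either option 1 or option 2, not both.

-- Option 1: plain nullable column, filled once from the UDF.
ALTER TABLE dbo.TableA ADD AdminNumericId INT NULL;
GO  -- new batch so the UPDATE below can see the new column

UPDATE dbo.TableA
SET    AdminNumericId = TRY_CAST(dbo.udf_GetNumeric(AdminId) AS INT)  -- TRY_CAST needs SQL Server 2012+
WHERE  AdminId LIKE '%[0-9]%';  -- rows without digits keep NULL
GO

-- Option 2: persisted computed column instead of the ALTER/UPDATE above.
-- Requires udf_GetNumeric to be deterministic and schema-bound.
-- ALTER TABLE dbo.TableA
--     ADD AdminNumericId AS dbo.udf_GetNumeric(AdminId) PERSISTED;
-- GO

-- Either way, index the new column so the join can seek on it.
CREATE INDEX IX_TableA_AdminNumericId ON dbo.TableA (AdminNumericId);
GO
```

With a column like this in place, the numeric half of the join becomes a plain column-to-column comparison against B.Id, with no function call per row.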

Your biggest improvement will come from the table redesign, but you can optimize your CASE statement as well. Switch your 1st and 2nd WHEN branches: if adminID matches B.Name directly, the first branch returns and the function is never executed for that pair of rows.

Also, it doesn't seem like your results will change if you take out the "NOT LIKE '%[0-9]%'" check.

This one's a bit riskier depending on how the UDF is written, but you might be able to take out the "LIKE '%[0-9]%'" check without any impact as well.

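And finally, I'd switch the comparison around in the 1st WHEN so that it reads "B.id = dbo.udf_GetNumeric(AdminId)", with the bare column on the left and the function call on the right. The aim is what's referred to as a SARGable predicate (see: https://www.simple-talk.com/sql/performance/the-seven-sins-against-tsql-performance/ for more details): B.id itself is not wrapped in a function, so an index on it can still be used. How much this helps inside a CASE will depend on the plan you get.

Thus, I would rewrite the CASE as: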

    CASE
        WHEN adminid = B.Name THEN 1
        WHEN adminid LIKE '%[0-9]%' AND B.id = dbo.udf_GetNumeric(AdminId) THEN 1
        ELSE 0
    END = 1

The riskier version of this would be:

    CASE
        WHEN adminid = B.Name THEN 1
        WHEN B.id = dbo.udf_GetNumeric(AdminId) THEN 1
        ELSE 0
    END = 1
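For context, here is roughly how the safer version would sit inside the full join. The question doesn't give the real table names, so dbo.TableA and dbo.TableB are placeholders, and the select list is only an example.

```sql
-- Placeholder names: dbo.TableA / dbo.TableB stand in for "Table A" / "Table B".
SELECT A.AdminId,
       B.Id,
       B.Name
FROM   dbo.TableA AS A
INNER JOIN dbo.TableB AS B
    ON CASE
           -- Cheap direct match first, so the UDF is skipped when names match.
           WHEN A.AdminId = B.Name THEN 1
           -- Only call the UDF when AdminId actually contains a digit.
           WHEN A.AdminId LIKE '%[0-9]%'
                AND B.Id = dbo.udf_GetNumeric(A.AdminId) THEN 1
           ELSE 0
       END = 1;  -- the CASE must be compared to 1 to act as a join condition
```

It's worth comparing the execution plan and timings against the original query on your own data before settling on either version.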
