SQL query working fine in SQL Server 2012, but failed to execute in SQL Server 2008 R2

I have a table called myTextsTable (myTextsTable_id INT, myTextsTable_text VARCHAR(MAX)). It has around 4 million records, and I am trying to remove every instance of the ASCII characters in the following ranges from the VARCHAR(MAX) column myTextsTable_text.

  • 00 - 08
  • 11 - 12
  • 14 - 31
  • 127

I wrote the following SQL query, which takes under 10 minutes on SQL Server 2012 but had not finished on SQL Server 2008 R2 even after two hours (so I stopped the execution). Note that I restored a backup of the SQL Server 2008 R2 database on SQL Server 2012, so the data is exactly the same.

BEGIN TRANSACTION [Tran1]

BEGIN TRY
    UPDATE myTextsTable
    SET myTextsTable_text = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(myTextsTable_text, CHAR(0), ''), CHAR(1), ''), CHAR(2), ''), CHAR(3), ''), CHAR(4), ''), CHAR(5), ''), CHAR(6), ''), CHAR(7), ''), CHAR(8), ''), CHAR(11), ''), CHAR(12), ''), CHAR(14), ''), CHAR(15), ''), CHAR(16), ''), CHAR(17), ''), CHAR(18), ''), CHAR(19), ''), CHAR(20), ''), CHAR(21), ''), CHAR(22), ''), CHAR(23), ''), CHAR(24), ''), CHAR(25), ''), CHAR(26), ''), CHAR(27), ''), CHAR(28), ''), CHAR(29), ''), CHAR(30), ''), CHAR(31), ''), CHAR(127), '')
    WHERE myTextsTable_text LIKE '%[' + CHAR(0) + CHAR(1) + CHAR(2) + CHAR(3) + CHAR(4) + CHAR(5) + CHAR(6) + CHAR(7) + CHAR(8) + CHAR(11) + CHAR(12) + CHAR(14) + CHAR(15) + CHAR(16) + CHAR(17) + CHAR(18) + CHAR(19) + CHAR(20) + CHAR(21) + CHAR(22) + CHAR(23) + CHAR(24) + CHAR(25) + CHAR(26) + CHAR(27) + CHAR(28) + CHAR(29) + CHAR(30) + CHAR(31) + CHAR(127) + ']%';
    COMMIT TRANSACTION [Tran1];
END TRY

BEGIN CATCH
    ROLLBACK TRANSACTION [Tran1];
    --PRINT ERROR_MESSAGE();
END CATCH;

Only 135 records are affected. As the single UPDATE query wasn't working in SQL Server 2008 R2, I tried the following approach with a temp table.

BEGIN TRANSACTION [Tran1]

BEGIN TRY
    IF OBJECT_ID('tempdb..#myTextsTable') IS NOT NULL DROP TABLE #myTextsTable;
    SELECT myTextsTable_id, myTextsTable_text
    INTO #myTextsTable
    FROM myTextsTable
    WHERE myTextsTable_text LIKE '%[' + CHAR(0) + CHAR(1) + CHAR(2) + CHAR(3) + CHAR(4) + CHAR(5) + CHAR(6) + CHAR(7) + CHAR(8) + CHAR(11) + CHAR(12) + CHAR(14) + CHAR(15) + CHAR(16) + CHAR(17) + CHAR(18) + CHAR(19) + CHAR(20) + CHAR(21) + CHAR(22) + CHAR(23) + CHAR(24) + CHAR(25) + CHAR(26) + CHAR(27) + CHAR(28) + CHAR(29) + CHAR(30) + CHAR(31) + CHAR(127) + ']%';

    UPDATE #myTextsTable
    SET myTextsTable_text = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(myTextsTable_text, CHAR(0), ''), CHAR(1), ''), CHAR(2), ''), CHAR(3), ''), CHAR(4), ''), CHAR(5), ''), CHAR(6), ''), CHAR(7), ''), CHAR(8), ''), CHAR(11), ''), CHAR(12), ''), CHAR(14), ''), CHAR(15), ''), CHAR(16), ''), CHAR(17), ''), CHAR(18), ''), CHAR(19), ''), CHAR(20), ''), CHAR(21), ''), CHAR(22), ''), CHAR(23), ''), CHAR(24), ''), CHAR(25), ''), CHAR(26), ''), CHAR(27), ''), CHAR(28), ''), CHAR(29), ''), CHAR(30), ''), CHAR(31), ''), CHAR(127), '')

    UPDATE myTextsTable
    SET myTextsTable_text = new.myTextsTable_text
    FROM myTextsTable
    INNER JOIN #myTextsTable new ON new.myTextsTable_id=myTextsTable.myTextsTable_id

    DROP TABLE #myTextsTable;

    COMMIT TRANSACTION [Tran1];
END TRY

BEGIN CATCH
    ROLLBACK TRANSACTION [Tran1];
    --PRINT ERROR_MESSAGE();
END CATCH;

However, the result is the same: it works perfectly fine in SQL Server 2012, but not in SQL Server 2008 R2. The UPDATE query was still executing even after two hours. The records were saved into the temp table (#myTextsTable) within a few minutes; I checked this afterwards to find out which part was taking longer.

As neither of these two approaches worked, I tried TABLE variables just to check whether it made any difference, but the result was the same (i.e. it works fine in SQL Server 2012 but not in SQL Server 2008 R2).

BEGIN TRANSACTION [Tran1]

BEGIN TRY
    DECLARE @myTextsTable TABLE (myTextsTable_id INT, myTextsTable_text VARCHAR(MAX))
    INSERT INTO @myTextsTable(myTextsTable_id, myTextsTable_text)
    SELECT myTextsTable_id, myTextsTable_text
    FROM myTextsTable
    WHERE myTextsTable_text LIKE '%[' + CHAR(0) + CHAR(1) + CHAR(2) + CHAR(3) + CHAR(4) + CHAR(5) + CHAR(6) + CHAR(7) + CHAR(8) + CHAR(11) + CHAR(12) + CHAR(14) + CHAR(15) + CHAR(16) + CHAR(17) + CHAR(18) + CHAR(19) + CHAR(20) + CHAR(21) + CHAR(22) + CHAR(23) + CHAR(24) + CHAR(25) + CHAR(26) + CHAR(27) + CHAR(28) + CHAR(29) + CHAR(30) + CHAR(31) + CHAR(127) + ']%';

    UPDATE @myTextsTable
    SET myTextsTable_text = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(myTextsTable_text, CHAR(0), ''), CHAR(1), ''), CHAR(2), ''), CHAR(3), ''), CHAR(4), ''), CHAR(5), ''), CHAR(6), ''), CHAR(7), ''), CHAR(8), ''), CHAR(11), ''), CHAR(12), ''), CHAR(14), ''), CHAR(15), ''), CHAR(16), ''), CHAR(17), ''), CHAR(18), ''), CHAR(19), ''), CHAR(20), ''), CHAR(21), ''), CHAR(22), ''), CHAR(23), ''), CHAR(24), ''), CHAR(25), ''), CHAR(26), ''), CHAR(27), ''), CHAR(28), ''), CHAR(29), ''), CHAR(30), ''), CHAR(31), ''), CHAR(127), '')

    UPDATE myTextsTable
    SET myTextsTable_updated = GETDATE()
        ,myTextsTable_updatedby = 'As per V87058'
        ,myTextsTable_text = new.myTextsTable_text
    FROM myTextsTable
    INNER JOIN @myTextsTable new ON new.myTextsTable_id=myTextsTable.myTextsTable_id

    COMMIT TRANSACTION [Tran1];
END TRY

BEGIN CATCH
    ROLLBACK TRANSACTION [Tran1];
    --PRINT ERROR_MESSAGE();
END CATCH;

Could anyone explain why this happens? How can I make this SQL query work in SQL Server 2008 R2?

Note: I know that string manipulation in the database layer is not ideal and that it is better done in the application layer before saving to the DB. But I am trying to understand why this is a problem in one version and not in the other.

SQL Server 2012
Microsoft SQL Server 2012 - 11.0.5058.0 (X64)
Standard Edition (64-bit) on Windows NT 6.3 (Build 9600: ) (Hypervisor)

SQL Server 2008 R2
Microsoft SQL Server 2012 - 11.0.5058.0 (X64)
Standard Edition (64-bit) on Windows NT 6.3 (Build 9600: ) (Hypervisor)

This is a known issue on SQL Server 2008 with LOB datatypes and certain collations.

It is easy to reproduce:

/*Hangs on 2008*/

DECLARE @VcMax varchar(max)= char(0) + 'a'

SELECT REPLACE(@VcMax COLLATE Latin1_General_CS_AS, char(0), '')

Whilst hung it is CPU bound and seems to be in an infinite loop through these functions.

[image: enter image description here]

And the fix is easy too. Either use a non MAX datatype...
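A minimal sketch of the non-MAX option, assuming the value fits in 8,000 bytes. The variable name @Vc is illustrative, and the "doesn't hang" behaviour is the claim made above rather than something shown here:

```sql
/* Expected not to hang on 2008: same collation as the repro,
   but the variable is varchar(8000) instead of varchar(max) */
DECLARE @Vc varchar(8000) = char(0) + 'a'

SELECT REPLACE(@Vc COLLATE Latin1_General_CS_AS, char(0), '')
```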

... or a binary collation

/*Doesn't Hang*/
DECLARE @VcMax varchar(max)= char(0) + 'a'

SELECT REPLACE(@VcMax COLLATE Latin1_General_100_BIN2, char(0), '')

For anyone reading this in future, the following two approaches worked fine.

Way 1: Change the collation of the VARCHAR(MAX) column in the UPDATE query to a binary collation, as Martin Smith suggested (please see the accepted answer).

REPLACE(myTextsTable_text COLLATE Latin1_General_100_BIN2, CHAR(0),...

The full solution is as follows:

GO
BEGIN TRANSACTION [Tran1]

BEGIN TRY
    IF OBJECT_ID('tempdb..#myTextsTable') IS NOT NULL DROP TABLE #myTextsTable;
    SELECT myTextsTable_id, myTextsTable_text
    INTO #myTextsTable
    FROM myTextsTable
    WHERE myTextsTable_text LIKE '%[' + CHAR(0) + CHAR(1) + CHAR(2) + CHAR(3) + CHAR(4) + CHAR(5) + CHAR(6) + CHAR(7) + CHAR(8) + CHAR(11) + CHAR(12) + CHAR(14) + CHAR(15) + CHAR(16) + CHAR(17) + CHAR(18) + CHAR(19) + CHAR(20) + CHAR(21) + CHAR(22) + CHAR(23) + CHAR(24) + CHAR(25) + CHAR(26) + CHAR(27) + CHAR(28) + CHAR(29) + CHAR(30) + CHAR(31) + CHAR(127) + ']%';

    UPDATE #myTextsTable
    SET myTextsTable_text = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(myTextsTable_text COLLATE Latin1_General_100_BIN2, CHAR(0), ''), CHAR(1), ''), CHAR(2), ''), CHAR(3), ''), CHAR(4), ''), CHAR(5), ''), CHAR(6), ''), CHAR(7), ''), CHAR(8), ''), CHAR(11), ''), CHAR(12), ''), CHAR(14), ''), CHAR(15), ''), CHAR(16), ''), CHAR(17), ''), CHAR(18), ''), CHAR(19), ''), CHAR(20), ''), CHAR(21), ''), CHAR(22), ''), CHAR(23), ''), CHAR(24), ''), CHAR(25), ''), CHAR(26), ''), CHAR(27), ''), CHAR(28), ''), CHAR(29), ''), CHAR(30), ''), CHAR(31), ''), CHAR(127), '')

    UPDATE myTextsTable
    SET myTextsTable_updated = GETDATE()
        ,myTextsTable_updatedby = 'As per V87058'
        ,myTextsTable_text = new.myTextsTable_text
    FROM myTextsTable
    INNER JOIN #myTextsTable new ON new.myTextsTable_id=myTextsTable.myTextsTable_id

    DROP TABLE #myTextsTable;

    COMMIT TRANSACTION [Tran1];
END TRY

BEGIN CATCH
    ROLLBACK TRANSACTION [Tran1];
    --PRINT ERROR_MESSAGE();
END CATCH;

Way 2: I created a SQL function that removes these characters with STUFF instead of the REPLACE function.

Note: The function is written for my specific requirement. As such, it only removes characters in the following ranges.

  • 00 - 08
  • 11 - 12
  • 14 - 31
  • 127

Go
CREATE FUNCTION [dbo].RemoveASCIICharactersInRange(@InputString VARCHAR(MAX))
    RETURNS VARCHAR(MAX)
    AS
    BEGIN
        IF @InputString IS NOT NULL
        BEGIN
          DECLARE @Counter INT, @TestString NVARCHAR(40)

          SET @TestString = '%[' + NCHAR(0) + NCHAR(1) + NCHAR(2) + NCHAR(3) + NCHAR(4) + NCHAR(5) + NCHAR(6) + NCHAR(7) + NCHAR(8) + NCHAR(11) + NCHAR(12) + NCHAR(14) + NCHAR(15) + NCHAR(16) + NCHAR(17) + NCHAR(18) + NCHAR(19) + NCHAR(20) + NCHAR(21) + NCHAR(22) + NCHAR(23) + NCHAR(24) + NCHAR(25) + NCHAR(26) + NCHAR(27) + NCHAR(28) + NCHAR(29) + NCHAR(30) + NCHAR(31) + NCHAR(127)+ ']%'

          SELECT @Counter = PATINDEX (@TestString, @InputString COLLATE Latin1_General_BIN)

          WHILE @Counter <> 0
          BEGIN
            SELECT @InputString = STUFF(@InputString, @Counter, 1, '')
            SELECT @Counter = PATINDEX (@TestString, @InputString COLLATE Latin1_General_BIN)
          END
        END
        RETURN(@InputString)
    END

    GO

Then the UPDATE query (in my temp table approach) becomes something like this:

UPDATE #myTextsTable
SET myTextsTable_text = [dbo].RemoveASCIICharactersInRange(myTextsTable_text)
Go
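Before running the function over the whole table, it can be spot-checked on a literal. This is only a sketch: the sample string and the column alias Cleaned are made up for illustration, and it assumes the function above has already been created in the dbo schema.

```sql
-- Sample input: three letters with CHAR(7) and CHAR(127) mixed in.
-- If the function behaves as intended, only the letters remain.
SELECT [dbo].RemoveASCIICharactersInRange('a' + CHAR(7) + 'b' + CHAR(127) + 'c') AS Cleaned;
```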

My personal preference is the first approach.
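To confirm that either approach left nothing behind, a check along these lines could be run once the transaction has committed. It is a sketch that reuses the table and column names from the question; the variable @ControlChars and the alias RowsStillAffected are illustrative, and the binary collation is applied only as a precaution.

```sql
-- Same character class as the WHERE clause in the question.
DECLARE @ControlChars VARCHAR(40);
SET @ControlChars = '%[' + CHAR(0) + CHAR(1) + CHAR(2) + CHAR(3) + CHAR(4) + CHAR(5) + CHAR(6) + CHAR(7) + CHAR(8) + CHAR(11) + CHAR(12) + CHAR(14) + CHAR(15) + CHAR(16) + CHAR(17) + CHAR(18) + CHAR(19) + CHAR(20) + CHAR(21) + CHAR(22) + CHAR(23) + CHAR(24) + CHAR(25) + CHAR(26) + CHAR(27) + CHAR(28) + CHAR(29) + CHAR(30) + CHAR(31) + CHAR(127) + ']%';

-- Counts rows that still contain any of the unwanted characters.
SELECT COUNT(*) AS RowsStillAffected
FROM myTextsTable
WHERE myTextsTable_text COLLATE Latin1_General_100_BIN2 LIKE @ControlChars;
```

A count of zero would indicate that no matching rows remain.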

The problem is probably the depth of nesting in the REPLACE calls, which is reported at execution time rather than at compilation. Check the @@NESTLEVEL function: https://technet.microsoft.com/en-us/library/ms190607(v=sql.105).aspx
