SQL Server - strip out special characters

I have some special Unicode characters, such as bullets and squares, in a table. When I run a SELECT, I would like to remove the special characters but still return the rest of the text.

Is there a way to do this?

Try this function. It returns only the characters whose Unicode code points fall within a given range, in this case from 0x0020 (space) to 0x007E (tilde), and omits everything else. You can add one or more further ranges to widen the set of characters the function keeps. A list of Unicode characters, including their codes, can be found e.g. here.

CREATE FUNCTION dbo.CharRangeOnly (@InputText NVARCHAR(4000))
  RETURNS NVARCHAR(4000)
AS
BEGIN
  DECLARE @Index INT
  DECLARE @BufferChar INT       -- code point of the current character
  DECLARE @OutputText NVARCHAR(4000)

  SET @Index = 1
  SET @OutputText = N''

  -- walk the input one character at a time
  -- (LEN ignores trailing spaces, so those are not copied)
  WHILE @Index <= LEN(@InputText)
    BEGIN
      SET @BufferChar = UNICODE(SUBSTRING(@InputText, @Index, 1))

      -- here you can specify the valid character range(s)
      IF @BufferChar BETWEEN 0x0020 AND 0x007E
        SET @OutputText = @OutputText + NCHAR(@BufferChar)

      SET @Index = @Index + 1
    END

  RETURN @OutputText
END

You can use it like this:

SELECT dbo.CharRangeOnly (N'•■乕ขTeᶕst໙ ཙof  β俼 ޠ➽Unᦗic៘ode✸ᾦ')

Called with this crazy mix of characters, the function returns:

'Test of Unicode'
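
The remark about defining more than one character range could look roughly like the sketch below. It is only an illustration: the function name dbo.CharRangesOnly, the extra Latin-1 Supplement range (0x00A0 to 0x00FF), and the table dbo.Notes with its Body column are all assumptions made up for this example, not part of the original answer.

```sql
-- Hypothetical variant of dbo.CharRangeOnly that keeps two ranges instead of one.
CREATE FUNCTION dbo.CharRangesOnly (@InputText NVARCHAR(4000))
  RETURNS NVARCHAR(4000)
AS
BEGIN
  DECLARE @Index INT
  DECLARE @BufferChar INT
  DECLARE @OutputText NVARCHAR(4000)

  SET @Index = 1
  SET @OutputText = N''

  WHILE @Index <= LEN(@InputText)
    BEGIN
      SET @BufferChar = UNICODE(SUBSTRING(@InputText, @Index, 1))

      IF @BufferChar BETWEEN 0x0020 AND 0x007E      -- printable ASCII, as in the original
         OR @BufferChar BETWEEN 0x00A0 AND 0x00FF   -- assumed extra range: Latin-1 Supplement
        SET @OutputText = @OutputText + NCHAR(@BufferChar)

      SET @Index = @Index + 1
    END

  RETURN @OutputText
END
GO  -- batch separator for SSMS/sqlcmd; CREATE FUNCTION must be alone in its batch

-- Applying the function in a SELECT over a table column.
-- dbo.Notes and Body are placeholder names for your own table and column.
SELECT n.Body,
       dbo.CharRangesOnly(n.Body) AS CleanBody
FROM dbo.Notes AS n;
```

Each additional range is just another OR'ed BETWEEN test, so the list can be adjusted to whatever characters you consider valid.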

This may be of some help. You can do it in SQL, or build a CLR method and import it into your SQL Server instance to do the filtering. Either way, you'll need some kind of scalar function that compares character by character.
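
As one possible pure-SQL take on such a scalar function, the sketch below uses PATINDEX with a binary collation to find each character outside the 0x0020 to 0x007E range and removes it with STUFF. The function name dbo.StripNonPrintableAscii and the choice of Latin1_General_BIN2 are assumptions for illustration; this is not the method behind the link.

```sql
-- Hypothetical helper; the name is an assumption made for this example.
CREATE FUNCTION dbo.StripNonPrintableAscii (@InputText NVARCHAR(4000))
  RETURNS NVARCHAR(4000)
AS
BEGIN
  DECLARE @Pos INT

  -- A binary collation makes the [ -~] range compare by code point,
  -- so the pattern finds the first character outside 0x0020-0x007E.
  SET @Pos = PATINDEX(N'%[^ -~]%' COLLATE Latin1_General_BIN2, @InputText)

  WHILE @Pos > 0
    BEGIN
      -- delete the offending character, then look for the next one
      SET @InputText = STUFF(@InputText, @Pos, 1, N'')
      SET @Pos = PATINDEX(N'%[^ -~]%' COLLATE Latin1_General_BIN2, @InputText)
    END

  RETURN @InputText
END
GO  -- batch separator for SSMS/sqlcmd

SELECT dbo.StripNonPrintableAscii(N'•■ Test');
```

This version still loops, but only once per removed character rather than once per input character, which may matter when most of the text is already clean. A NULL input comes back as NULL because PATINDEX returns NULL and the loop is skipped.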

