简体   繁体   中英

SQL remove unwanted special characters from a string

Hi i am new to SQL and am writing a case statement for a column of grade values. The values can be a length of 3 like A02, B04, A10, A09, D03. The first character is a letter while the next 2 are digits.

If a user enters in 'A02 I want to change it to do A02. Basically remove any special characters if there are present.

CASE
WHEN Grade like '[^0-9A-z]%' THEN ''
else Grade end as Grade

So far I have this but I am not sure how to use regex to remove the character only search for it.

Unless you really want to do a CASE for the fun of it, in oracle I'd do it like this which removes punctuation characters and spaces when you select it. Note this does not verify format so a grade of Z1234 would get returned.

WITH tbl(ID, grade) AS (
  SELECT 1, 'A01' FROM dual UNION ALL
  SELECT 1, '''B02' FROM dual UNION ALL
  SELECT 2, '$    C01&' FROM dual
)
SELECT ID, grade, REGEXP_REPLACE(grade, '([[:punct:]]| )') AS grade_scrubbed
from tbl;


        ID GRADE     GRADE_SCRUBBED
---------- --------- --------------
         1 A01       A01           
         1 'B02      B02           
         2 $    C01& C01           

3 rows selected.

HOWEVER, that said, since you seem to want to verify the format and use regex, you could do it this way although it's a little fugly. See comments.

WITH tbl(ID, grade) AS (
  -- Test data. Include every crazy combo you'd never expect to see,
  --   because you WILL see it, it's just a matter of time :-)
  SELECT 1, 'A01'               FROM dual UNION ALL
  SELECT 1, '''B02'             FROM dual UNION ALL
  SELECT 2, '$    C01&'         FROM dual UNION ALL
  SELECT 3, 'DDD'               FROM dual UNION ALL
  SELECT 4, 'A'||CHR(10)||'DEF' FROM dual UNION ALL
  SELECT 5, 'Z1234'             FROM dual UNION ALL
  SELECT 6, NULL                FROM dual
)
SELECT ID, grade, 
CASE 
  -- Correct format of A99.
  WHEN REGEXP_LIKE(grade, '^[A-Z]\d{2}$') 
    THEN grade
  -- if not A99, see if stripping out punctuation and spaces make it match A99.
  --   If so, return with punctuation and spaces stripped out.
  WHEN NOT REGEXP_LIKE(grade, '^[A-Z]\d{2}$') 
    AND REGEXP_LIKE(REGEXP_REPLACE(grade, '([[:punct:]]| )'), '^[A-Z]\d{2}$')
    THEN REGEXP_REPLACE(grade, '([[:punct:]]| )')
  -- if not A99, and stripping out punctuation and spaces didn't make it match A99,
  --   then the grade is in the wrong format.
  WHEN NOT REGEXP_LIKE(grade, '^[A-Z]\d{2}$') 
    AND NOT REGEXP_LIKE(REGEXP_REPLACE(grade, '([[:punct:]]| )'), '^[A-Z]\d{2}$')
    THEN 'Invalid grade format'
  -- Something fell through all cases we tested for.  Always expect the unexpected!
  ELSE 'No case matched!'
END AS grade_scrubbed
from tbl;


        ID GRADE                GRADE_SCRUBBED      
---------- -------------------- --------------------
         1 A01                  A01                 
         1 'B02                 B02                 
         2 $    C01&            C01                 
         3 DDD                  Invalid grade format
         4 A
DEF                            Invalid grade format
                                                    
                                                                                
         5 Z1234                Invalid grade format
         6                      No case matched!    

7 rows selected.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM