Is there a way to replace accents with Teradata SQL?

Question

I have a varchar column and i want to replace all diacritics with normal letters

For example:

In: São Paulo Out: Sao Paulo
In: eéíãç Out: eeiac

Answer 1

A diacritical character is a composite character, ie can be a base char plus a diacritic, eg

   'a' +  '´' =  'á'
  0061 + 00B4 = 00E1

Both 006100B4 and 00E1 result in the same character, Unicode allows to switch back and forth using normalization functions, which are supported by Teradata:

translate(string using UNICODE_TO_UNICODE_NFD)

decomposes a composite character into separate characters. Those Combining Diacritical Marks are in a Unicode block ranging from U+0300 to U+036F.

Now decompose the input and apply a Regular Expression to remove characters from this range:

select
   'Ťĥïŝ ẅåš á şťŕĭńġ ŵïţĥ ḿäṇȳ ḓìāčṝĩţïĉș' as str,
   REGEXP_REPLACE(translate(str using UNICODE_TO_UNICODE_NFD) -- decomposed input
                 ,'[\x{0300}-\x{036F}]'                        -- diacritics
                 ,'')

returns

Ťĥïŝ ẅåš á şťŕĭńġ ŵïţĥ ḿäṇȳ ḓìāčṝĩţïĉș
This was a string with many diacritics

If there are other decomposable characters you might need to compose them again to save space using another translate(... using UNICODE_TO_UNICODE_NFC)

If you input string has a LATIN charset it might be easier to find the limited list of diacritical characters and apply translate:

oTranslate(str, 'àáâãäåÀÁÂÃÄÅÇ...', 'aaaaaaAAAAAAC...')

Is there a way to replace accents with Teradata SQL?

Question

1 answers

solution1
2 ACCPTED 2020-10-09 13:27:27

Is there a way to replace accents with Teradata SQL?

Question

1 answers

solution1 2 ACCPTED 2020-10-09 13:27:27

solution1
2 ACCPTED 2020-10-09 13:27:27