
Postgres unaccent function for characters with combining marks

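I'm using unaccent in Postgres, but it does not strip the accent from ù when the character is written as the letter u followed by the combining grave accent (U+0300).
It works fine for ù written as the single precomposed code point U+00F9.
The two characters look the same and mean the same thing, but they have different code points.
How can I solve this problem? Thank you so much.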

Your problem is Unicode normalization, which PostgreSQL releases before version 13 do not do, unfortunately, and which is not so simple to implement on your own.
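To make the two representations concrete, here is a small sketch that assumes PostgreSQL 13 or later (where a built-in normalize() function exists), a UTF8 server encoding, and the default deterministic collation. The column aliases are made up for illustration.

```sql
-- U&'u\0300' is "u" followed by the combining grave accent (decomposed form).
-- U&'\00F9'  is the single precomposed code point for the same letter.
SELECT
  U&'u\0300' = U&'\00F9'                 AS raw_equal,  -- different code points
  normalize(U&'u\0300', NFC) = U&'\00F9' AS nfc_equal;  -- composed first, then compared
```

Under those assumptions raw_equal should come back false and nfc_equal true, which is why unaccent() sees the two inputs differently. Note that NFC only helps when a precomposed form exists for the base letter and mark, so it does not cover every combination.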

But because you only want to remove diacritical marks, it is enough to strip the code points that are Unicode combining characters, either before or after calling the unaccent() function:

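-- Remove every code point in the Unicode combining-mark blocks
-- (assumes a UTF8 database, where \u escapes denote Unicode code points).
select regexp_replace(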
  'ùù',
  '[\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF\u20D0-\u20FF\uFE20-\uFE2F]',
  '',
  'g'
);

should do the trick.
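The two steps can be combined so that callers do not have to repeat the pattern. The following is a minimal sketch, assuming the unaccent extension can be installed and the database uses UTF8. strip_accents is a hypothetical helper name, not something PostgreSQL or unaccent provides.

```sql
-- Assumption: the unaccent extension is available on this server.
CREATE EXTENSION IF NOT EXISTS unaccent;

-- strip_accents is a hypothetical helper, named here only for illustration.
CREATE OR REPLACE FUNCTION strip_accents(input text)
RETURNS text
LANGUAGE sql
STABLE
AS $$
  -- First drop the combining marks, then let unaccent() handle
  -- the precomposed accented letters.
  SELECT unaccent(
    regexp_replace(
      input,
      '[\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF\u20D0-\u20FF\uFE20-\uFE2F]',
      '',
      'g'
    )
  );
$$;

-- Both spellings of the letter go through the same helper.
SELECT strip_accents(U&'u\0300') AS decomposed,
       strip_accents(U&'\00F9')  AS precomposed;
```

With the default unaccent rules, both columns should come back as a plain u. The function is declared STABLE rather than IMMUTABLE because unaccent() itself is only STABLE, so using it in an index expression would need extra care.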
