I'm using unaccent in Postgres, but it cannot convert a special character like ù: ù
although it works fine for ù: ù
The two characters look the same and mean the same thing but have different code points; the first one is the letter u followed by the combining mark ̀ (U+0300).
How can I solve this problem? Thank you so much.
Your problem is Unicode normalization, which PostgreSQL versions before 13 unfortunately do not do (later versions provide a normalize() function). It is not simple to implement on your own.
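To make the difference between the two forms visible, here is a small sketch. It assumes a UTF8 database and uses PostgreSQL's U&'...' Unicode escape syntax; the column aliases are only illustrative. The last statement relies on normalize(), which exists only in PostgreSQL 13 and later.

```sql
-- The same visible character, written as two different code-point sequences.
select
    U&'\00F9'                as precomposed,         -- U+00F9, a single code point
    U&'u\0300'               as decomposed,          -- U+0075 followed by U+0300
    length(U&'\00F9')        as precomposed_length,  -- 1
    length(U&'u\0300')       as decomposed_length,   -- 2
    U&'\00F9' = U&'u\0300'   as same_string;         -- false with a deterministic collation

-- PostgreSQL 13 and later only: NFC composes the pair into U+00F9.
select normalize(U&'u\0300', NFC) = U&'\00F9' as equal_after_nfc;
```

Note that normalizing to NFC only helps for letters that have a precomposed form; a base letter with a combining mark that has no precomposed equivalent stays as two code points.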
But because you only want to remove diacritical marks, you only need to remove the code points that are Unicode combining characters, either before or after calling the unaccent() function:
select regexp_replace(
'ùù',
'[\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF\u20D0-\u20FF\uFE20-\uFE2F]',
'',
'g'
)
should do the trick.
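If this is needed in several queries, the two steps could be wrapped in one function. The following is only a sketch: the function name strip_accents and its parameter txt are made up for illustration, and it assumes the unaccent extension is installed and standard_conforming_strings is on (the default), so that the backslashes reach the regular expression engine unchanged.

```sql
-- Requires: create extension if not exists unaccent;
-- strip_accents is a hypothetical helper name, not part of PostgreSQL.
create or replace function strip_accents(txt text)
returns text
language sql
as $$
    -- First drop the combining marks, then let unaccent() handle
    -- the precomposed characters such as U+00F9.
    select unaccent(
        regexp_replace(
            txt,
            '[\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF\u20D0-\u20FF\uFE20-\uFE2F]',
            '',
            'g'
        )
    );
$$;

-- Both spellings of the character should now give a plain u.
select strip_accents(U&'\00F9')  as from_precomposed,
       strip_accents(U&'u\0300') as from_decomposed;
```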