
.NET string comparison with collation

I have two different strings ( XXÈ and XXE ). Is there any way to compare them using a collation (in this case UTF8 general CI, under which I need them to be equal)? I've seen a few examples involving MSSQL or SQLite, but that would add an unnecessary dependency to my project. So my question is: is there any way to do this in pure .NET (especially C#)?

Update:

Let's take any decent SQL engine as an example. You can create a table and select a collation for it. In our case, XXÈ and XXE will be stored in the table with different binary representations (depending on the encoding), but when you search for XXE , the search will also match XXÈ .

My case is very similar. I have a UTF-8 text file with some strings in it. I want to display the values on screen (sorted, where the collation again matters) and I want to let the user search for values. The collation used for the search will be a user option.

You could use String.Normalize and a little bit of LINQ to strip the diacritics before comparing:

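using System.Globalization;
using System.Linq;
using System.Text;

string initial = "XXÈ";
// FormD decomposes "È" into "E" plus a combining grave accent (U+0300).
string normal = initial.Normalize(NormalizationForm.FormD);

// Drop the combining marks, keeping only the base characters.
var withoutDiacritics = normal.Where(
    c => CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark);
string final = new string(withoutDiacritics.ToArray());
bool equals = "XXE".Equals(final); // true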

Reference: http://www.blackwasp.co.uk/RemoveDiacritics.aspx
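
The question also asks for sorting and searching, and for case-insensitivity, which the snippet above does not handle. As an alternative that is not part of the original answer, the framework's CompareInfo class accepts CompareOptions flags that ignore case and non-spacing marks. The sketch below assumes the invariant culture is an acceptable stand-in for a "general CI" collation and that the runtime is not running in invariant-globalization mode. The class name CollationSketch and its method names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CollationSketch
{
    // Ignore case and non-spacing marks (accents); an approximation of a
    // "general CI" collation, not an exact reimplementation of any SQL engine's rules.
    private const CompareOptions Options =
        CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace;

    // Assumption: the invariant culture is good enough; swap in a specific
    // CultureInfo if language-specific ordering is needed.
    private static readonly CompareInfo Invariant =
        CultureInfo.InvariantCulture.CompareInfo;

    // Equality under the chosen options.
    public static bool AreEqual(string a, string b) =>
        Invariant.Compare(a, b, Options) == 0;

    // Substring search under the same options.
    public static bool Contains(string text, string term) =>
        Invariant.IndexOf(text, term, Options) >= 0;

    // Returns a sorted copy of the input, ordered with the same comparison.
    public static List<string> Sort(IEnumerable<string> values)
    {
        var sorted = new List<string>(values);
        sorted.Sort((x, y) => Invariant.Compare(x, y, Options));
        return sorted;
    }
}
```

With this, CollationSketch.AreEqual("XXÈ", "XXE") should report the two strings as equal without modifying either one, and making the collation a user option could be as simple as passing a different CompareOptions value.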
