
.NET string comparison with collation

I have 2 different strings (XXÈ and XXE). Is there any way to compare them using a collation (in this case it would be UTF8 general CI, since I need them to be equal)? I've seen a few examples involving MSSQL or SQLite, but this would add an unnecessary dependency to my project. So my question is: is there any way to do this in pure .NET (especially C#)?

Update:

Let's take any decent SQL engine as an example. You can create a table and select a collation for it. In our case, XXÈ and XXE will both be stored in the table with different binary representations (depending on the encoding), but when you search for XXE, it will also match XXÈ.

My case is much the same. I have a UTF-8 text file with some strings in it. I want to display the values on screen (sorted, so the collation matters here too) and let the user search for values. The collation used for the search will be an option.

You could use String.Normalize and a little LINQ:

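using System;
using System.Globalization;
using System.Linq;
using System.Text;

string initial = "XXÈ";
// Decompose so that "È" becomes "E" followed by a combining grave accent.
string normal = initial.Normalize(NormalizationForm.FormD);
// Drop the combining marks, leaving only the base characters.
var withoutDiacritics = normal.Where(
    c => CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark);
string final = new string(withoutDiacritics.ToArray());
bool equals = "XXE".Equals(final); // true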

Reference: http://www.blackwasp.co.uk/RemoveDiacritics.aspx
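
Since the question also mentions sorting and searching, a possible alternative to stripping diacritics by hand is to let the framework's CompareInfo do a collation-style comparison with CompareOptions.IgnoreNonSpace and CompareOptions.IgnoreCase. The following is only a sketch: the class name CollationSketch and its method names are made up for illustration, and the invariant culture is an assumption (a specific CultureInfo could be substituted as the user-selectable "collation"). It is not an exact equivalent of any particular SQL collation.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of the original answer.
public static class CollationSketch
{
    // Assumption: the invariant culture is good enough; swap in a specific
    // CultureInfo if language-specific ordering is needed.
    private static readonly CompareInfo Comparer =
        CultureInfo.InvariantCulture.CompareInfo;

    // Ignore combining marks (accents) and letter case.
    private const CompareOptions AccentAndCaseInsensitive =
        CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase;

    // Equality under the accent- and case-insensitive rules.
    public static bool AreEqual(string a, string b) =>
        Comparer.Compare(a, b, AccentAndCaseInsensitive) == 0;

    // Substring search under the same rules.
    public static bool ContainsValue(string source, string value) =>
        Comparer.IndexOf(source, value, AccentAndCaseInsensitive) >= 0;

    // Returns a sorted copy; strings that compare as equal may end up
    // in any relative order because List<T>.Sort is not stable.
    public static List<string> Sorted(IEnumerable<string> values)
    {
        var result = new List<string>(values);
        result.Sort((x, y) => Comparer.Compare(x, y, AccentAndCaseInsensitive));
        return result;
    }
}
```

With a default globalization configuration, CollationSketch.AreEqual("XXÈ", "XXE") should report the two strings as equal. If the application runs in invariant globalization mode, culture-aware comparison is reduced to ordinal behavior, so the result may differ there.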
