Make HashSet<string> case-insensitive

I have a method with a HashSet<string> parameter, and I need to do a case-insensitive Contains within it:

public void DoSomething(HashSet<string> set, string item)
{
    var x = set.Contains(item);
    ... 
}

Is there any way to make an existing HashSet case-insensitive (without creating a new one)?

I'm looking for the solution with the best performance.

Edit

Contains can be called multiple times, so IEnumerable extensions are not acceptable for me: they perform worse than the native HashSet Contains method.

Solution

Since the answer to my question is no (it is impossible), I've created and used the following method:

public HashSet<string> EnsureCaseInsensitive(HashSet<string> set)
{
    return set.Comparer == StringComparer.OrdinalIgnoreCase
           ? set
           : new HashSet<string>(set, StringComparer.OrdinalIgnoreCase);
}

The HashSet<T> constructor has an overload that lets you pass in a custom IEqualityComparer<string>. A few of these are already defined for you in the static StringComparer class, and some of them ignore case. For example:

var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
set.Add("john");
Debug.Assert(set.Contains("JohN"));

You'll have to make this change at the time of constructing the HashSet<T>. Once one exists, you can't change the IEqualityComparer<T> it's using.


Just so you know, by default (if you don't pass any IEqualityComparer<T> to the HashSet<T> constructor), it uses EqualityComparer<T>.Default instead.
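
To make the point about the comparer concrete, here is a minimal sketch contrasting a default-constructed set with one built on StringComparer.OrdinalIgnoreCase. The class and method names (ComparerDemo and its two methods) are made up for this illustration. Note that HashSet<T>.Comparer is a read-only property, which is why the choice has to be made in the constructor.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of the original answer.
public static class ComparerDemo
{
    // Default constructor: EqualityComparer<string>.Default, which is case-sensitive.
    public static bool DefaultSetFinds(string stored, string probe)
    {
        var set = new HashSet<string>();
        set.Add(stored);
        return set.Contains(probe);
    }

    // Comparer supplied at construction time: lookups ignore case.
    public static bool IgnoreCaseSetFinds(string stored, string probe)
    {
        var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        set.Add(stored);
        return set.Contains(probe);
    }
}
```

With "john" stored and "JohN" as the probe, the first method should report a miss and the second a hit.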


Edit

The question appears to have changed after I posted my answer. If you have to do a case-insensitive search in an existing case-sensitive HashSet<string>, you will have to do a linear search:

set.Any(s => string.Equals(s, item, StringComparison.OrdinalIgnoreCase));

There's no way around this.

You cannot magically make a case-sensitive HashSet (or Dictionary) behave in a case-insensitive way.

You have to recreate one inside your function if you cannot rely on the incoming HashSet being case-insensitive.

Most compact code: use the constructor that copies from an existing set:

var insensitive = new HashSet<string>(
   set, StringComparer.InvariantCultureIgnoreCase);

Note that copying a HashSet is as expensive as walking through all its items, so if your function does just one search it would be cheaper (O(n)) to iterate through all items. If your function is called multiple times, each time for a single case-insensitive search, you should try to pass a proper HashSet to it instead.
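
The trade-off can be sketched roughly as follows. This is only an illustration under the assumptions stated in the comments; LookupSketch, ContainsOnce and CountMatches are hypothetical names, not anything from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class illustrating the two strategies.
public static class LookupSketch
{
    // Single lookup: a linear scan avoids allocating a second set.
    public static bool ContainsOnce(HashSet<string> set, string item)
    {
        foreach (var s in set)
        {
            if (string.Equals(s, item, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }

    // Many lookups: pay the O(n) copy once, then each Contains is a hash lookup.
    public static int CountMatches(HashSet<string> set, IEnumerable<string> items)
    {
        var insensitive = new HashSet<string>(set, StringComparer.OrdinalIgnoreCase);
        int count = 0;
        foreach (var item in items)
        {
            if (insensitive.Contains(item))
                count++;
        }
        return count;
    }
}
```

If every call receives a different set and checks only one item, the first form is the cheaper of the two; the copy only pays off when it is reused for several lookups.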

The HashSet is designed to quickly find elements according to its hashing function and equality comparator. What you are really asking for is to find an element matching "some other" condition. Imagine that you have a Set<Person> that uses only Person.Name for comparison, and you need to find an element with some given value of Person.Age.

The point is that you need to iterate over the contents of the set to find the matching elements. If you are going to be doing this often, you might create a different Set, in your case one using a case-insensitive comparator, but then you would have to make sure that this shadow set stays in sync with the original.

The answers so far are essentially variations of the above; I thought I'd add this to clarify the fundamental issue.
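
As a rough sketch of what keeping such a shadow structure in sync involves, the wrapper below routes every Add and Remove through one place. The name DualCaseSet and its members are invented for this example. It also swaps the shadow set for a case-insensitive Dictionary that counts case variants, an assumption made here so that removing "john" does not hide a remaining "John".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: an exact (case-sensitive) set plus a case-insensitive index.
public sealed class DualCaseSet
{
    private readonly HashSet<string> exact = new HashSet<string>();

    // Number of stored strings per case-insensitive key.
    private readonly Dictionary<string, int> variants =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

    public bool Add(string item)
    {
        if (!exact.Add(item))
            return false;                   // exact duplicate, nothing to update
        variants.TryGetValue(item, out int n);
        variants[item] = n + 1;             // n is 0 when no variant existed yet
        return true;
    }

    public bool Remove(string item)
    {
        if (!exact.Remove(item))
            return false;
        int n = variants[item] - 1;
        if (n == 0)
            variants.Remove(item);          // last case variant is gone
        else
            variants[item] = n;
        return true;
    }

    public bool Contains(string item) => exact.Contains(item);

    public bool ContainsIgnoreCase(string item) => variants.ContainsKey(item);
}
```

The extra bookkeeping is the cost of the shadow approach: any code that mutates the original set directly, bypassing the wrapper, would leave the index stale.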

Assuming you've got this extension method:

public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source)
{
    return new HashSet<T>(source);
}

You can just use this:

set = set.Select(n => n.ToLowerInvariant()).ToHashSet();
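
One caveat with the lower-casing approach: the resulting set still uses the default, case-sensitive comparer, so the value being looked up has to be normalized the same way. A minimal sketch, with LowerCaseSketch and its methods as hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class for the normalize-then-lookup pattern.
public static class LowerCaseSketch
{
    // Build a set whose elements are all lower-cased (original casing is lost).
    public static HashSet<string> Normalize(IEnumerable<string> source)
    {
        return new HashSet<string>(source.Select(s => s.ToLowerInvariant()));
    }

    // The probe must be lower-cased too, or mixed-case input will miss.
    public static bool ContainsNormalized(HashSet<string> normalized, string item)
    {
        return normalized.Contains(item.ToLowerInvariant());
    }
}
```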

Or, you could just do this:

set = new HashSet<string>(set, StringComparer.OrdinalIgnoreCase);
// or InvariantCultureIgnoreCase or CurrentCultureIgnoreCase

The constructor of HashSet can take an alternative IEqualityComparer that overrides how equality is determined. See the list of constructors here.

The class StringComparer contains a bunch of static instances of IEqualityComparers for strings. In particular, you're probably interested in StringComparer.OrdinalIgnoreCase. Here is the documentation of StringComparer.

Note that another constructor takes in an IEnumerable, so you can construct a new HashSet from your old one, but with the IEqualityComparer.

So, all together, you want to convert your HashSet as follows:

var myNewHashSet = new HashSet<string>(myOldHashSet, StringComparer.OrdinalIgnoreCase);

If you want to keep the original, case-sensitive version, you can query it case-insensitively with LINQ:

var contains = set.Any(a => a.Equals(item, StringComparison.InvariantCultureIgnoreCase));

You can now use

set.Contains(item, StringComparer.OrdinalIgnoreCase);

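without needing to re-create your HashSet. Note that this overload is the LINQ Enumerable.Contains extension method (it needs using System.Linq), so it walks the set element by element rather than doing a hash lookup.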
