
Linq extended set operations

Current situation:

HashSet<string> MasterSet => {100, 3}

HashSet<string> SubSet => {100, 3} or {100} or {100, 3, 1}

From a large list of subsets, I select a particular set based on MasterSet like this:

if(MasterSet.SetEquals(subSet) || MasterSet.IsSupersetOf(subSet) || MasterSet.IsSubsetOf(subSet))

If the condition is true, I process that subSet; otherwise I loop through the other available sets.
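As a rough sketch of that selection loop (the class `SetSelection`, the method `SelectFirstRelated`, and any sample data are hypothetical names, not taken from the original code): it returns the first candidate that is a subset or a superset of the master set. `SetEquals` is left out here because `IsSupersetOf` and `IsSubsetOf` both already return true for equal sets.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SetSelection
{
    // Hypothetical helper: returns the first candidate that is a subset or a
    // superset of master, or null when no candidate qualifies.
    public static HashSet<string> SelectFirstRelated(
        HashSet<string> master, IEnumerable<HashSet<string>> candidates)
    {
        return candidates.FirstOrDefault(
            c => master.IsSupersetOf(c) || master.IsSubsetOf(c));
    }
}
```

With a master of {100, 3}, a candidate such as {100, 3, 1} qualifies as a superset, while an unrelated candidate such as {7} is skipped.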

Problem with duplicates: if the business logic needs to allow duplicates in the master set and the subsets, for example:

MasterSet => {100, 3, 3}

SubSet => {100, 3, 3} or {100, 3} or {100, 3, 3, 1}

then HashSet can no longer be used, because it does not store duplicates.

How do I select the subset if I change MasterSet and SubSet to List<string>?

EDIT: The solution provided by "BigYellowCactus" works. However, if I want to match by headers instead of by the order of the elements, would it be even easier to filter the set?

MasterSet => {100, 3, 4}
MasterHeaders => {"T","F","V"} //Headers element corresponds to the MasterSet element

Case 1:

SubSet => {3, 100}
SubSetHeaders => {"F", "T"} //Headers element corresponds to the SubSet element

Case 2:

SubSet => {4, 3}
SubSetHeaders => {"V", "F"} //Headers element corresponds to the SubSet element

Is it possible to match first by headers, comparing MasterHeaders and SubSetHeaders, and then match by values?
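One way such header-first matching could be sketched, assuming headers are unique within each set and each header list lines up index-by-index with its value list (the class `HeaderMatch` and the method `MatchesByHeader` are hypothetical names): build a header-to-value lookup for the master, then require every subset header to exist in the master with the same value.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class HeaderMatch
{
    // Hypothetical helper. Assumes headers are unique within each set and
    // that headers[i] labels values[i].
    public static bool MatchesByHeader(
        IList<string> masterValues, IList<string> masterHeaders,
        IList<string> subValues, IList<string> subHeaders)
    {
        // Header -> value lookup for the master set.
        var master = masterHeaders
            .Zip(masterValues, (h, v) => new { Header = h, Value = v })
            .ToDictionary(p => p.Header, p => p.Value);

        // Match by header first, then compare the value stored under it.
        return subHeaders
            .Zip(subValues, (h, v) => new { Header = h, Value = v })
            .All(p => master.TryGetValue(p.Header, out var value) && value == p.Value);
    }
}
```

This only checks that the subset is contained in the master. Swapping the master and subset arguments would test the opposite direction. Both Case 1 and Case 2 above match under this check, regardless of element order.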

You can use the All extension method.

Description

Determines whether all elements of a sequence satisfy a condition.

Example:

if (MasterSet.All(e => SubSet.Contains(e)) || SubSet.All(e => MasterSet.Contains(e)))
{
    //do stuff
}

Alternative:

if (!MasterSet.Except(SubSet).Any() || !SubSet.Except(MasterSet).Any())
{
    //do stuff
}

Edit:

If you want SubSet { 100, 3, 3 } not to match MasterSet = { 100, 100, 3 }, as Iridium pointed out in his comment, you can simply count the occurrences of each element:

if (MasterSet.All(e => MasterSet.Count(r => r==e) <= SubSet.Count(r => r==e))
    || SubSet.All(e => SubSet.Count(r => r==e) <= MasterSet.Count(r => r==e)))
{
    //do stuff
}

(Note that this is probably not the most efficient way, since it rescans both lists for every element.)
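A sketch of a cheaper version of the same check, assuming the lists contain no null elements (`MultisetCheck`, `CountOccurrences` and `IsContainedIn` are hypothetical names). It counts the occurrences in each list once instead of calling `Count` on both lists for every element.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MultisetCheck
{
    // Element -> number of occurrences. Assumes no null elements.
    private static Dictionary<T, int> CountOccurrences<T>(IEnumerable<T> items)
    {
        return items.GroupBy(x => x).ToDictionary(g => g.Key, g => g.Count());
    }

    // True when every element of inner occurs in outer at least as many times.
    public static bool IsContainedIn<T>(IEnumerable<T> inner, IEnumerable<T> outer)
    {
        var outerCounts = CountOccurrences(outer);
        return CountOccurrences(inner).All(kv =>
            outerCounts.TryGetValue(kv.Key, out var n) && kv.Value <= n);
    }
}
```

The condition above would then read `MultisetCheck.IsContainedIn(MasterSet, SubSet) || MultisetCheck.IsContainedIn(SubSet, MasterSet)`.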


Edit2:

Given that you are basically searching for a sequence inside a sequence, you can use the following method:

void Main()
{
    var MasterSet = new List<string>() {"100", "3","4"};

    var SubSets = new[] 
    {
        new List<string>() {"100", "100", "3"},
        new List<string>() {"100", "3", "4"},
        new List<string>() {"32", "3423", "4234", "100", "3", "4", "34234"},
        new List<string>() {"100", "32", "3423", "4234", "100", "3", "4", "34234"},
        new List<string>() {"100", "32", "3", "4234", "100", "4", "34234"},
        new List<string>() {"100", "4", "3"},
        new List<string>() {"100", "3", "3"},
        new List<string>() {"100", "3"},
        new List<string>() {"100", "3", "3", "1"}
    };

    foreach (var SubSet in SubSets)
    {
        if (IsMatch(MasterSet, SubSet))
            Console.WriteLine(String.Join(", ", SubSet) + " is a \"subset\"");
        else if (IsMatch(SubSet, MasterSet))
            Console.WriteLine(String.Join(", ", SubSet) + " is a \"superset\"");
    }
}

// Returns true if to_test occurs in source as a contiguous run of elements.
bool IsMatch<T>(IEnumerable<T> source, IEnumerable<T> to_test)
{
    var haystack = source.ToList();
    var needle = to_test.ToList();
    // Try every start position in turn, so the element that caused a
    // mismatch is still considered as the start of the next attempt.
    for (int start = 0; start + needle.Count <= haystack.Count; start++)
    {
        if (haystack.Skip(start).Take(needle.Count).SequenceEqual(needle))
            return true;
    }
    return false;
}

Output:

100, 3, 4 is a "subset"
32, 3423, 4234, 100, 3, 4, 34234 is a "superset"
100, 32, 3423, 4234, 100, 3, 4, 34234 is a "superset"
100, 3 is a "subset"

The current framework implementations of ISet<T> are HashSet<T> and SortedSet<T>. Both of these classes enforce member uniqueness and do not allow duplicates.

Whilst this may at first seem like an omission in the framework, it actually follows from the properties and definition of a mathematical set. As explained in this post, a mathematical set does not have duplicate members, so logically {100, 3} is equivalent to {100, 3, 3}.
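A small sketch of what that means in practice (the class `DuplicateDemo` and the method `ToSet` are hypothetical names): building a `HashSet<string>` from items that contain a duplicate silently drops it, so the two collections compare as equal sets.

```csharp
using System.Collections.Generic;

public static class DuplicateDemo
{
    // Builds a HashSet from the given items; duplicate items are dropped.
    public static HashSet<string> ToSet(params string[] items)
    {
        return new HashSet<string>(items);
    }
}
```

Here `DuplicateDemo.ToSet("100", "3", "3")` holds two elements, and `SetEquals` against `DuplicateDemo.ToSet("100", "3")` returns true.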

It may be possible to extend List<T> to implement ISet<T>, perhaps calling the new class Sack<T>. However, a non-unique implementation of ISet<T> would be significantly more challenging than the implementations existing in the framework. Without having put much thought into it, it seems reminiscent of general Knapsack problems.
