简体   繁体   中英

How do I do an integer list intersection while keeping duplicates?

I'm working on a Greatest Common Factor and Least Common Multiple assignment and I have to list the common factors. Intersection() won't work because that removes duplicates. Contains() won't work because if it sees the int in the second list it returns all matching ints from the first list. Is there a way to do an Intersection that is not Distinct?

edit: sorry for not providing an example, here is what I meant:

if I have the sets:

{1, 2, 2, 2, 3, 3, 4, 5}
{1, 1, 2, 2, 3, 3, 3, 4, 4}

I would want the output

{1, 2, 2, 3, 3, 4}
ILookup<int, int> lookup1 = list1.ToLookup(i => i);
ILookup<int, int> lookup2 = list2.ToLookup(i => i);

int[] result =
(
  from group1 in lookup1
  let group2 = lookup2[group1.Key]
  where group2.Any()
  let smallerGroup = group1.Count() < group2.Count() ? group1 : group2
  from i in smallerGroup
  select i
).ToArray();

The where expression is technically optional, I feel it makes the intent clearer.


If you want more terse code:

ILookup<int, int> lookup2 = list2.ToLookup(i => i);

int[] result =
(
  from group1 in list1.GroupBy(i => i)
  let group2 = lookup2[group1.Key]
  from i in (group1.Count() < group2.Count() ? group1 : group2)
  select i
).ToArray();

I wrote this extension to solve the problem:

public static IEnumerable<T> Supersect<T>(this IEnumerable<T> a, ICollection<T> b)
              => a.Where(b.Remove);

example:

var a = new List<int> { 1, 2, 2, 2, 3, 3, 4, 5 };
var b = new List<int> { 1, 1, 2, 2, 3, 3, 3, 4, 4};

var result = a.Supersect(b);

result:

{ 1, 2, 2, 3, 3, 4 }

You could use this generic extension I wrote for another answer , it is essentially a single Linq statement. Note that it uses Zip to avoid the needless full enumeration of matched groups.

public static IEnumerable<T> Commom<T>(
        this IEnumerable<T> source,
        IEnumerable<T> sequence,
        IEqualityComparer<T> comparer = null)
{
    if (sequence == null)
    {
        return Enumerable.Empty<T>();
    }

    if (comparer == null)
    {
        comparer = EqualityComparer<T>.Default;
    }

    return source.GroupBy(t => t, comparer)
        .Join(
            sequence.GroupBy(t => t, comparer),
            g => g.Key,
            g => g.Key,
            (lg, rg) => lg.Zip(rg, (l, r) => l),
            comparer)
        .SelectMany(g => g);
}

this enables,

new[] {1, 2, 2, 2, 3, 3, 4, 5}.Common(
    new[] {1, 1, 2, 2, 3, 3, 3, 4, 4}).ToArray()

maintaining the order of the source sequence, as desired.

Are you looking for something like this? It should be pretty-much O(n+m) , where n is the number of items in first and m is the number of items in second .

public static IEnumerable<T> Overlap<T>(this IEnumerable<T> first,
    IEnumerable<T> second, IEqualityComparer<T> comparer = null)
{
    // argument checking, optimisations etc removed for brevity

    var dict = new Dictionary<T, int>(comparer);

    foreach (T item in second)
    {
        int hits;
        dict.TryGetValue(item, out hits);
        dict[item] = hits + 1;
    }

    foreach (T item in first)
    {
        int hits;
        dict.TryGetValue(item, out hits);
        if (hits > 0)
        {
            yield return item;
            dict[item] = hits - 1;
        }
    }
}
  • Find the intersection of the two lists.
  • Group the lists by the intersecting items
  • Join the groups, and select the Min(Count) for each item
  • Flatten into a new list.

See below:

var intersect = list1.Intersect(list2).ToList();
var groups1 = list1.Where(e => intersect.Contains(e)).GroupBy(e => e);
var groups2 = list2.Where(e => intersect.Contains(e)).GroupBy(e => e);

var allGroups = groups1.Concat(groups2);

return allGroups.GroupBy(e => e.Key)
    .SelectMany(group => group
        .First(g => g.Count() == group.Min(g1 => g1.Count())))
    .ToList();

Here's one way to do it. To be fair, it is very similar to David B's answer except that it uses a join to do the association.

IEnumerable<Foo> seqA = ...
IEnumerable<Foo> seqB = ...

var result = from aGroup in seqA.GroupBy(x => x)
             join bGroup in seqB.GroupBy(x => x) 
                         on aGroup.Key equals bGroup.Key
             let smallerGroup = aGroup.Count() < bGroup.Count() 
                                ? aGroup : bGroup
             from item in smallerGroup
             select item;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM