简体繁体中英

Intersection of two sets in most optimized way

原文 2013-01-25 17:55:46 2 2 c#/ performance/ set

Given two sets of values, I have to find whether there is any common element among them or not ie whether their intersection is null or not.

Which of the standard C# collection will suit best (in terms of performance) for this purpose? I know that linq has a Intersect extension method to find out the intersection of two list/arrays but my focus is on performance in terms of Big-O notation .

And what if I have to find out the intersection of two sets as well?

2 answers

Well, if you use LINQ's Intersect method it will build up a HashSet of the second sequence, and then check each element of the first sequence against it. So it's O(M+N)... and you can use foo.Intersect(bar).Any() to get an early-out.

Of course, if you store one (either) set in a HashSet<T> to start with, you can just iterate over the other one checking for containment on each step. You'd still need to build the set to start with though.

Fundamentally you've got an O(M+N) problem whatever you do - you're not going to get cheaper than that (there's always the possibility that you'll have to look at every element) and if your hash codes are reasonable, you should be able to achieve that complexity easily. Of course, some solutions may give better constant factors than others... but that's performance rather than complexity ;)

EDIT: As noted in the comments, there's also ISet<T>.Overlaps - if you've already got either set with a static type of ISet<T> or a concrete implementation, calling Overlaps makes it clearer what you're doing. If both of your sets are statically typed as ISet<T> , use larger.Overlaps(smaller) (where larger and smaller are in terms of the size of the set) as I'd expect an implementation of Overlaps to iterate over the argument and check each element against contents of the set you call it on.

As mentioned , Applying Any() will give you some performance.

I tested it on pretty big dataset and it gave 25% improvements.

Also applying larger.Intersect(smaller) rather than the opposite is very important, in my case, it gave 35% improvements.

Also ordering the list before applying intersect gave another 7-8%.

Another thing to keep in mind that depending on the use case you can completely avoid applying intersect.

For example, for an integer list, if the maximum and minimum are not within the same bounders you don't need to apply intersect since they will never do.

The same goes for a string list with the same idea applied to first letter.

Again depending on your case, try as much as possible to find a rule where intersection is impossible to avoid calling it.

intersection of two sets of data

Intersection of two sets (Lists) of data

LINQ: how to get an intersection of two sets of ints?

What is the best way to compute intersection and difference of 2 sets?

Optimized way to compare two lists

What's the most optimized way to add a wordlist in my Windows Store application?

Is there a way to find the volume of an intersection between two box colliders in unity?

Simple way to calculate point of intersection between two polygons in C#

Concise way to get the intersection of two lists using complex equality comparison?

Which is the most performant way to compare two digits?

暂无

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

Related Question intersection of two sets of data Intersection of two sets (Lists) of data LINQ: how to get an intersection of two sets of ints? What is the best way to compute intersection and difference of 2 sets? Optimized way to compare two lists What's the most optimized way to add a wordlist in my Windows Store application? Is there a way to find the volume of an intersection between two box colliders in unity? Simple way to calculate point of intersection between two polygons in C# Concise way to get the intersection of two lists using complex equality comparison? Which is the most performant way to compare two digits?

Related Tags

Intersection of two sets in most optimized way

Question

2 answers

solution1
48 ACCPTED 2013-01-25 17:57:59

solution2
12 2017-03-14 11:51:29

Intersection of two sets in most optimized way

Question

2 answers

solution1 48 ACCPTED 2013-01-25 17:57:59

solution2 12 2017-03-14 11:51:29

solution1
48 ACCPTED 2013-01-25 17:57:59

solution2
12 2017-03-14 11:51:29