
Compare two arrays or ArrayLists, find similar and different values

I have two arrays (or ArrayLists, if that is easier) of strings. I need to compare them and find which strings exist only in the first array, which exist in both, and which exist only in the second array. The arrays have different lengths and may be in different orders. If necessary, I suppose I could sort them...

I know I could hack this together, but I suspect there is a fairly standard and efficient ("best") solution, and I am curious more than anything.

I am using C# for this, but if you want to write your solution in another language, any help is welcome.

Thanks for the help!

If the arrays are large, you'll want to use a data structure that is efficient for these operations; arrays are not.

The naive solution, which scans the other array once for every item, is O(n^2) in time if the arrays are of size n.

If you sort the arrays in place, you can binary search them for the items; sorting will likely be O(n lg n), and searching n times at a cost of O(lg n) per search will also be O(n lg n) in time.
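As a rough sketch of the sort-then-binary-search idea, assuming ordinal string comparison is acceptable: the class name SortedCompare and the method Compare are made up for illustration, and the code sorts copies instead of sorting in place so the caller's arrays keep their order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class SortedCompare
{
    public static (List<string> OnlyFirst, List<string> Both, List<string> OnlySecond)
        Compare(string[] first, string[] second)
    {
        // Sort copies so the inputs are left untouched: O(n lg n).
        var a = (string[])first.Clone();
        var b = (string[])second.Clone();
        Array.Sort(a, StringComparer.Ordinal);
        Array.Sort(b, StringComparer.Ordinal);

        var onlyFirst = new List<string>();
        var both = new List<string>();
        var onlySecond = new List<string>();

        // Array.BinarySearch returns a non-negative index when the item is found.
        // The comparer must match the one used for sorting.
        foreach (var s in a)
        {
            if (Array.BinarySearch(b, s, StringComparer.Ordinal) >= 0) both.Add(s);
            else onlyFirst.Add(s);
        }
        foreach (var s in b)
        {
            if (Array.BinarySearch(a, s, StringComparer.Ordinal) < 0) onlySecond.Add(s);
        }
        return (onlyFirst, both, onlySecond);
    }
}
```

Note that this sketch reports a string once per occurrence, so duplicates in an input show up as duplicates in the result, and the results come back in sorted order.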

If you turn each array into a HashSet<T> first, you can do it in O(n) time and O(n) extra space.
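The HashSet<T> approach might look something like the following. This is only a sketch: HashSetCompare and Compare are hypothetical names, and ordinal comparison is an assumption (pass a different StringComparer if you need case-insensitive matching). ExceptWith and IntersectWith are the standard HashSet<T> methods.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class HashSetCompare
{
    public static (HashSet<string> OnlyFirst, HashSet<string> Both, HashSet<string> OnlySecond)
        Compare(IEnumerable<string> first, IEnumerable<string> second)
    {
        // Building each set costs O(n) expected time and O(n) extra space.
        var setA = new HashSet<string>(first, StringComparer.Ordinal);
        var setB = new HashSet<string>(second, StringComparer.Ordinal);

        // Work on copies so setA and setB stay intact for the later steps.
        var onlyFirst = new HashSet<string>(setA, StringComparer.Ordinal);
        onlyFirst.ExceptWith(setB);      // items in A that are not in B

        var both = new HashSet<string>(setA, StringComparer.Ordinal);
        both.IntersectWith(setB);        // items present in A and in B

        var onlySecond = new HashSet<string>(setB, StringComparer.Ordinal);
        onlySecond.ExceptWith(setA);     // items in B that are not in A

        return (onlyFirst, both, onlySecond);
    }
}
```

Because the results are sets, duplicates in the inputs collapse to a single entry and the original ordering is not preserved. If ordering matters, you could instead loop over the original array and test membership against the other array's set.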

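using System.Linq; // needed for the query syntax, and for Contains on arrays
// list1 and list2 are the two string collections being compared.
// Each Contains call is a linear scan on an array or List<string>, so this is the O(n^2) approach.
var onlyinfirst = from s in list1 where !list2.Contains(s) select s;
var onlyinsecond = from s in list2 where !list1.Contains(s) select s;
var onboth = from s in list1 where list2.Contains(s) select s;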


 