
Fast algorithm to find the n1-th smallest through the n2-th smallest numbers in an array

I have an array, and I want to find the n1-th smallest through the n2-th smallest numbers in it and store them in another array of size n2-n1+1. Here n2 > n1, and both are relatively small compared to the size of the array (for example, the array size is 10000, n1=5, n2=20).

I could sort the array first and then read the n1-th, (n1+1)-th, ..., n2-th numbers from the sorted array. But since n1 and n2 are usually small compared to the size of the array, it should not be necessary to sort the array completely. I think the algorithm should be able to stop partway, once it reaches n2.

I wonder if there is an algorithm, maybe a modified version of a certain sorting algorithm, that is specifically good (by good I mean fast) at this problem. You can use either Python code or pseudocode as an illustration, thanks!

Since N1 and N2 are really small compared to the size of the array (call it N), we can use a min-heap. Building the heap costs O(N) and extracting the N2 smallest elements costs O(N2 * logN), so the whole thing is O(N + N2 * logN).

Steps

  1. Construct a min heap. The complexity of this operation is O(N).
  2. Loop N2 times: take the root element and call heapify to restore the heap. Ignore the first N1-1 elements and return the rest, which are the N1-th through N2-th smallest. Each iteration costs O(1) + O(logN), so the loop is O(N2 * logN) overall.
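The steps above can be sketched in Python with the standard heapq module. This is a minimal illustration, not the answerer's own code; the function name rank_range_heap and the 1-based rank convention are assumptions made for the example.

```python
import heapq

def rank_range_heap(values, n1, n2):
    """Return the n1-th through n2-th smallest values (1-based ranks)."""
    if not 1 <= n1 <= n2 <= len(values):
        raise ValueError("need 1 <= n1 <= n2 <= len(values)")
    heap = list(values)          # copy so the caller's list is untouched
    heapq.heapify(heap)          # step 1: build the min-heap in O(N)
    result = []
    for rank in range(1, n2 + 1):        # step 2: pop the n2 smallest
        smallest = heapq.heappop(heap)   # O(log N) per pop
        if rank >= n1:                   # skip the first n1-1 elements
            result.append(smallest)
    return result

print(rank_range_heap([7, 2, 9, 4, 1, 8, 3], 2, 4))
```

If you do not need to control the loop yourself, heapq.nsmallest(n2, values)[n1 - 1:] returns the same slice, because nsmallest returns its result in ascending order.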

Instead of sorting, if the values in your array are non-negative integers from a range that is not very big, you can use a simple lookup table (a kind of counting sort). First iterate over the array and store lookup[array[i]]=true; then iterate over lookup and do something like the following. Note that a boolean table records each distinct value only once, so duplicates in the array collapse into a single entry.

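for(...){                       // j walks the lookup indices in ascending order
    if(lookup[j]){
        ith++;                  // j is the ith smallest distinct value (ith starts at 0)
        if(ith>=n1 && ith<=n2)
            ADD(j);             // append j to the result array
    }
}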

That would be O(n), plus the cost of scanning the lookup table. For a window n1 <= n2, nothing faster than O(n) exists, since every element of the array has to be looked at at least once.
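A rough Python version of the lookup-table idea might look like this. It assumes the values are non-negative integers with a modest maximum, and it deviates from the boolean table above by storing counts, so that duplicates keep their own ranks; the function name rank_range_lookup is made up for the example.

```python
def rank_range_lookup(values, n1, n2):
    """Return the n1-th through n2-th smallest values (1-based ranks).

    Assumes every value is a non-negative integer; the table has
    max(values) + 1 slots, so this only pays off for a small value range.
    """
    if not 1 <= n1 <= n2 <= len(values):
        raise ValueError("need 1 <= n1 <= n2 <= len(values)")
    counts = [0] * (max(values) + 1)
    for v in values:                 # one pass over the array: O(n)
        counts[v] += 1
    result = []
    rank = 0
    for value, count in enumerate(counts):   # ascending scan of the table
        for _ in range(count):
            rank += 1
            if rank >= n1:
                result.append(value)
            if rank == n2:           # stop as soon as the window is full
                return result
    return result

print(rank_range_lookup([7, 2, 9, 4, 1, 8, 3, 2], 2, 4))
```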

Use selection sort. It's O(n²) if you sort the entire array, but O(mn) if you only sort the smallest m items in the array (here m = n2).
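As a sketch of that idea, the outer loop of a selection sort can simply stop after n2 passes. The function name rank_range_selection and the 1-based ranks are assumptions for the example.

```python
def rank_range_selection(values, n1, n2):
    """Return the n1-th through n2-th smallest values (1-based ranks)
    by running only the first n2 passes of selection sort."""
    if not 1 <= n1 <= n2 <= len(values):
        raise ValueError("need 1 <= n1 <= n2 <= len(values)")
    a = list(values)                 # work on a copy
    for i in range(n2):              # only n2 passes instead of len(a)
        smallest = i
        for j in range(i + 1, len(a)):   # find the minimum of a[i:]
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a[n1 - 1:n2]              # a[:n2] is now sorted

print(rank_range_selection([7, 2, 9, 4, 1, 8, 3], 2, 4))
```

Each pass scans the rest of the array once, which is where the O(mn) bound comes from.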

If n2 (and therefore n1) is small, then you can find the n2 smallest elements and ignore the first n1-1 of them. These approaches are described by Arun Kumar and user448810 and will be efficient as long as n2 remains small.

However, you may be describing a situation in which n1 (and therefore n2) can grow (perhaps even linearly with the overall list length) and it is only their difference n2-n1 which remains small. In this case you need a selection algorithm such as quickselect, which stays O(N) on average no matter how large n1 and n2 are.
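One possible way to apply quickselect here, sketched under my own assumptions: select rank n2 over the whole array, then select rank n1 within the first n2 elements, and finally sort the small window between them. The helper names select_in_place and rank_range_quickselect are invented for the example, and the random pivot gives O(N) only in expectation.

```python
import random

def select_in_place(a, lo, hi, k):
    """Rearrange a[lo..hi] so that a[k] is the element a full sort would
    put there, with nothing larger before it and nothing smaller after."""
    while lo < hi:
        pivot = a[random.randint(lo, hi)]
        i, j = lo, hi
        while i <= j:                # Hoare-style partition around pivot
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if k <= j:                   # k lies in the left part
            hi = j
        elif k >= i:                 # k lies in the right part
            lo = i
        else:                        # a[k] equals the pivot: done
            return

def rank_range_quickselect(values, n1, n2):
    """Return the n1-th through n2-th smallest values (1-based ranks)."""
    if not 1 <= n1 <= n2 <= len(values):
        raise ValueError("need 1 <= n1 <= n2 <= len(values)")
    a = list(values)
    select_in_place(a, 0, len(a) - 1, n2 - 1)   # a[:n2] = n2 smallest
    select_in_place(a, 0, n2 - 1, n1 - 1)       # a[n1-1:n2] = the window
    return sorted(a[n1 - 1:n2])                 # sort only n2-n1+1 items

data = list(range(100, 0, -1))
print(rank_range_quickselect(data, 40, 45))
```

The final sort touches only n2-n1+1 elements, so the expected cost is O(N + (n2-n1) * log(n2-n1)).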


 