
Why do these loop and hashing operations take O(N) time complexity?

Given the array:

int arr[] = {1, 2, 3, 2, 3, 1, 3};

You are asked to find a number in the array that occurs an odd number of times. Here it is 3 (occurring 3 times). The time complexity should be no worse than O(n). The solution is to use a HashMap. Elements become the keys and their counts become the values of the hashmap.

    // Code from geeksforgeeks.org
    // Function to find the element occurring an odd number of times
    static int getOddOccurrence(int arr[], int n)
    {
        HashMap<Integer, Integer> hmap = new HashMap<>();

        // Put all elements into the HashMap
        for (int i = 0; i < n; i++)
        {
            if (hmap.containsKey(arr[i]))
            {
                // If the array element is already present,
                // increase the count of that element.
                int val = hmap.get(arr[i]);
                hmap.put(arr[i], val + 1);
            }
            else
            {
                // If the array element is not present, put it
                // into the HashMap and initialize the count to one.
                hmap.put(arr[i], 1);
            }
        }

        // Check each element present in the HashMap
        // for an odd number of occurrences
        for (Integer a : hmap.keySet())
        {
            if (hmap.get(a) % 2 != 0)
                return a;
        }
        return -1;
    }

I don't get why this overall operation takes O(N) time complexity. As I see it, the loop alone takes O(N) time. I assumed hmap.put (an insert operation) and hmap.get (a find operation) each take O(N), and they are nested within the loop. So normally I would think this function takes O(N^2) time. Why does it take O(N) instead?

The algorithm first iterates over the array of numbers, of size n, to build the map of occurrence counts. This is an O(n) operation, because each element is visited once and each map update takes O(1) time on average. Then, after the hashmap has been built, it iterates over that map looking for an entry whose count is odd. The size of this map is somewhere between 1 (when all input numbers are the same) and n (when all inputs are different). So this second operation is also bounded by O(n), which makes the entire algorithm O(n).
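The two passes described above can be sketched more compactly. This is only an illustration, not the original author's code: the class name OddOccurrenceSketch is hypothetical, and it uses Map.merge in place of the containsKey/get/put sequence from the question.

```java
import java.util.HashMap;
import java.util.Map;

class OddOccurrenceSketch {
    static int getOddOccurrence(int[] arr) {
        Map<Integer, Integer> counts = new HashMap<>();

        // Pass 1: one map update per array element, so n updates in total.
        for (int value : arr) {
            counts.merge(value, 1, Integer::sum);
        }

        // Pass 2: one check per distinct value, so at most n checks.
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            if (entry.getValue() % 2 != 0) {
                return entry.getKey();
            }
        }
        return -1; // no value occurs an odd number of times
    }

    public static void main(String[] args) {
        int[] arr = {1, 2, 3, 2, 3, 1, 3};
        System.out.println(getOddOccurrence(arr)); // prints 3
    }
}
```

The two loops run one after the other rather than nested, so their costs add up to roughly 2n steps instead of multiplying.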

I don't get why this overall operation takes O(N) time complexity.

You must examine all elements of the array - O(N)

For each element of the array you call containsKey, get and put on the map. These are O(1) operations. Or more precisely, they are O(1) on average, amortized over the lifetime of the HashMap. This is because a HashMap grows its hash array when the ratio of the number of entries to the array size exceeds the load factor.
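To see why occasional growth still averages out to O(1) per insert, here is a simplified model of a doubling array. It is an assumption-laden sketch, not java.util.HashMap's actual resize code: the class GrowthCostSketch and the method copiesForAppends are hypothetical, and the model starts at capacity 1 and ignores the load factor.

```java
class GrowthCostSketch {
    // Counts how many element copies a doubling array performs while
    // n elements are appended one at a time.
    static long copiesForAppends(int n) {
        int capacity = 1;
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size; // move every existing element to the new array
                capacity *= 2;
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 1_000_000}) {
            System.out.println(n + " appends, " + copiesForAppends(n) + " copies");
        }
    }
}
```

In this model the total number of copies stays below 2n, so the cost of growing, spread over all n appends, is a constant per append.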

O(N) repetitions of 2 or 3 O(1) operations is O(N). QED
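The "2 or 3 operations per element" count can be checked directly. The sketch below instruments the first loop from the question with a counter. The class OperationCountSketch and the method countMapCalls are hypothetical names introduced only for this illustration.

```java
import java.util.HashMap;

class OperationCountSketch {
    // Returns how many HashMap calls the counting loop makes for an array.
    static long countMapCalls(int[] arr) {
        HashMap<Integer, Integer> hmap = new HashMap<>();
        long calls = 0;
        for (int value : arr) {
            calls++; // containsKey
            if (hmap.containsKey(value)) {
                calls += 2; // get and put
                hmap.put(value, hmap.get(value) + 1);
            } else {
                calls++; // put
                hmap.put(value, 1);
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        int[] arr = {1, 2, 3, 2, 3, 1, 3};
        // 3 first occurrences * 2 calls + 4 repeats * 3 calls = 18 calls
        System.out.println(countMapCalls(arr)); // prints 18
    }
}
```

The count never exceeds 3 calls per element, so the number of map operations grows linearly with the array length.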

Strictly speaking, there are a couple of scenarios where a HashMap is not O(1).

  • If the hash function is poor (or the key distribution is pathological), the hash chains will be unbalanced. With early HashMap implementations, this could lead to (worst case) O(N) operations, because operations like get had to search a long hash chain. With recent implementations (Java 8 and later), HashMap will construct a balanced binary tree for any hash chain that is too long. That leads to worst case O(log N) operations.

  • HashMap is unable to grow the hash array beyond 2^30 hash buckets. So at that point HashMap complexity starts transitioning to O(log N). However, if you have a map of that size, other secondary effects will probably have affected the real performance anyway.
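The first scenario can be reproduced with a deliberately poor hash function. In the sketch below, BadKey is a hypothetical key type whose hashCode returns a constant, so every key lands in the same bucket. The map still gives correct answers, and only the per-operation cost degrades. No timings are shown, since they depend on the JVM and the machine.

```java
import java.util.HashMap;
import java.util.Map;

class BadHashSketch {
    // Hypothetical key type: every instance reports the same hash code.
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42; // deliberately poor: all keys collide
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        // Being Comparable gives HashMap (Java 8 and later) an ordering
        // it can use for colliding keys once a bucket is turned into a tree.
        @Override
        public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Inserts n colliding keys, then counts how many can be found again.
    static int fillAndCount(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        int found = 0;
        for (int i = 0; i < n; i++) {
            if (map.containsKey(new BadKey(i))) {
                found++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(fillAndCount(10_000)); // prints 10000
    }
}
```

With well-distributed keys such as Integer, as in the question, this degradation does not arise in practice.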
