Time complexity of an iterative algorithm

I am trying to find the time complexity of this algorithm.

The iterative algorithm produces all the bit-strings within a given Hamming distance of the input bit-string. It generates all increasing sequences 0 <= a[0] < ... < a[dist-1] < strlen(num), and inverts the bits at the corresponding indices.

The vector a is supposed to keep the indices of the bits that have to be inverted. So if a contains the current index i, we print 1 instead of 0 and vice versa. Otherwise we print the bit as is (see the else-part), as shown below:

// e.g. hamming("0000", 2);
void hamming(const char* num, size_t dist) {
    assert(dist > 0);
    vector<int> a(dist);
    size_t k = 0, n = strlen(num);
    a[k] = -1;
    while (true)
        if (++a[k] >= n)
            if (k == 0)
                return;
            else {
                --k;
                continue;
            }
        else
            if (k == dist - 1) {
                // this is an O(n) operation and will be called
                // (n choose dist) times, in total.
                print(num, a);
            }
            else {
                a[k+1] = a[k];
                ++k;
            }
}
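
The print helper is not shown in the question. A minimal sketch of what it might look like, assuming a holds strictly increasing indices into num and that num contains only '0' and '1'. The name flipped, and the split into a string-building helper plus a thin print wrapper, are illustrative choices and not taken from the original code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: returns num with the bits at the indices in a inverted.
// Assumes a is strictly increasing and num consists only of '0' and '1'.
std::string flipped(const char* num, const std::vector<int>& a) {
    std::string out(num);
    assert(a.empty() ||
           (a.front() >= 0 && static_cast<std::size_t>(a.back()) < out.size()));
    std::size_t next = 0;  // position in a of the next index to invert
    for (std::size_t i = 0; i < out.size(); ++i) {
        if (next < a.size() && static_cast<std::size_t>(a[next]) == i) {
            out[i] = (out[i] == '0') ? '1' : '0';  // invert this bit
            ++next;
        }
        // otherwise the bit is kept as is
    }
    return out;
}

// Hypothetical print: one O(n) pass to build the string, then one output line.
void print(const char* num, const std::vector<int>& a) {
    std::puts(flipped(num, a).c_str());
}
```

With this sketch, flipped("0000", {0, 2}) yields "1010", and each call costs O(n) because the whole string is walked once.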

What is the time complexity of this algorithm?

My attempt says:

dist * n + (n choose dist) * n + 2

but this does not seem to be right. Consider the following examples, all with dist = 2:

len = 3, (3 choose 2) = 3 * O(n), 10 while iterations
len = 4, (4 choose 2) = 6 * O(n), 15 while iterations
len = 5, (5 choose 2) = 10 * O(n), 21 while iterations
len = 6, (6 choose 2) = 15 * O(n), 28 while iterations

Here are two representative runs (with the print happening at the start of the loop):

000, len = 3
k = 0, total_iter = 1
vector a = -1 0 
k = 1, total_iter = 2
vector a = 0 0 
Paid O(n)
k = 1, total_iter = 3
vector a = 0 1 
Paid O(n)
k = 1, total_iter = 4
vector a = 0 2 
k = 0, total_iter = 5
vector a = 0 3 
k = 1, total_iter = 6
vector a = 1 1 
Paid O(n)
k = 1, total_iter = 7
vector a = 1 2 
k = 0, total_iter = 8
vector a = 1 3 
k = 1, total_iter = 9
vector a = 2 2 
k = 0, total_iter = 10
vector a = 2 3 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gsamaras@pythagoras:~/Desktop/generate_bitStrings_HammDistanceT$ ./iter
0000, len = 4
k = 0, total_iter = 1
vector a = -1 0 
k = 1, total_iter = 2
vector a = 0 0 
Paid O(n)
k = 1, total_iter = 3
vector a = 0 1 
Paid O(n)
k = 1, total_iter = 4
vector a = 0 2 
Paid O(n)
k = 1, total_iter = 5
vector a = 0 3 
k = 0, total_iter = 6
vector a = 0 4 
k = 1, total_iter = 7
vector a = 1 1 
Paid O(n)
k = 1, total_iter = 8
vector a = 1 2 
Paid O(n)
k = 1, total_iter = 9
vector a = 1 3 
k = 0, total_iter = 10
vector a = 1 4 
k = 1, total_iter = 11
vector a = 2 2 
Paid O(n)
k = 1, total_iter = 12
vector a = 2 3 
k = 0, total_iter = 13
vector a = 2 4 
k = 1, total_iter = 14
vector a = 3 3 
k = 0, total_iter = 15
vector a = 3 4 

The while loop is somewhat clever and subtle, and it's arguable that it's doing two different things (or even three if you count the initialisation of a). That's what's making your complexity calculations challenging, and it's also less efficient than it could be.

In the abstract, to incrementally compute the next set of indices from the current one, the idea is to find the last position i for which a[i] is less than n-dist+i, increment a[i], and set the following entries to a[i]+1, a[i]+2, and so on.

For example, if dist=5, n=11 and your indexes are:

0, 3, 5, 9, 10

Then 5 is the last value less than n-dist+i (because n-dist is 6, and 10=6+4, 9=6+3, but 5<6+2).

So we increment 5, and set the subsequent integers to get the set of indexes:

0, 3, 6, 7, 8

Now consider how your code runs, assuming k=4:

0, 3, 5, 9, 10
  • ++a[k] is 11, so k becomes 3.
  • ++a[k] is 10, so a[k+1] becomes 10, and k becomes 4.
  • ++a[k] is 11, so k becomes 3.
  • ++a[k] is 11, so k becomes 2.
  • ++a[k] is 6, so a[k+1] becomes 6, and k becomes 3.
  • ++a[k] is 7, so a[k+1] becomes 7, and k becomes 4.
  • ++a[k] is 8, and we continue to call the print function.
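
The seven steps listed above can be replayed mechanically. Here is a small sketch that runs the question's loop body from a given state until the next print would happen and reports how many while iterations that took. The names StepResult and advance are hypothetical, and num is replaced by its length n since only the length matters here:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical result type: the index vector at the next print, plus the
// number of while iterations spent getting there (-1 if the loop ends first).
struct StepResult {
    std::vector<int> a;
    int iterations;
};

// Replays the question's loop from state (a, k) for a string of length n.
StepResult advance(std::vector<int> a, int k, int n) {
    const int dist = static_cast<int>(a.size());
    assert(dist > 0 && k >= 0 && k < dist);
    int iterations = 0;
    while (true) {
        ++iterations;
        if (++a[k] >= n) {
            if (k == 0) return {std::move(a), -1};  // loop ends, no more prints
            --k;                                    // back up one position
        } else if (k == dist - 1) {
            return {std::move(a), iterations};      // original code prints here
        } else {
            a[k + 1] = a[k];                        // seed the next position
            ++k;
        }
    }
}
```

Calling advance({0, 3, 5, 9, 10}, 4, 11) ends with the indexes 0, 3, 6, 7, 8 after 7 iterations, matching the hand trace.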

This code is correct, but it's not efficient because k scuttles backwards and forwards as it searches for the highest index that can be incremented without causing an overflow in the higher indices. In fact, if the highest such index is j from the end, the code uses a non-linear number of iterations of the while loop. You can easily demonstrate this yourself if you trace how many iterations of the while loop occur when n==dist for different values of n. There is exactly one line of output, but you'll see O(2^n) growth in the number of iterations (in fact, you'll see 2^(n+1)-2 iterations).
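
One way to run that experiment is to instrument the loop with a counter. The sketch below copies the question's control flow but leaves the print call out and takes the string length n directly; the name count_iterations is an illustrative choice:

```cpp
#include <cassert>
#include <vector>

// Counts the while iterations of the question's loop for a string of length n.
// The call to print is omitted, so only the loop overhead is measured.
long long count_iterations(int n, int dist) {
    assert(dist > 0);
    std::vector<int> a(dist);
    int k = 0;
    a[k] = -1;
    long long iterations = 0;
    while (true) {
        ++iterations;
        if (++a[k] >= n) {
            if (k == 0) return iterations;  // the original code returns here
            --k;
        } else if (k == dist - 1) {
            // print(num, a) would be called here
        } else {
            a[k + 1] = a[k];
            ++k;
        }
    }
}
```

For n == dist the counts for n = 1, 2, 3, 4 come out as 2, 6, 14, 30, which is 2^(n+1)-2. For dist = 2 and n = 3, 4 the function returns 10 and 15, the totals seen in the question's runs.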

This scuttling makes your code needlessly inefficient, and also hard to analyse.

Instead, you can write the code in a more direct way:

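void hamming2(const char* num, size_t dist) {
    // Work in signed ints so the comparisons below cannot wrap around.
    const int n = static_cast<int>(strlen(num));
    const int d = static_cast<int>(dist);
    if (d > n) return;  // no set of d distinct indices exists
    vector<int> a(d);   // vector instead of a non-standard variable-length array
    for (int i = 0; i < d; i++) {
        a[i] = i;       // first combination: 0, 1, ..., d-1
    }
    while (true) {
        print(num, a);
        // Find the last position whose value can still be incremented.
        int i;
        for (i = d - 1; i >= 0; i--) {
            if (a[i] < n - d + i) break;
        }
        if (i < 0) return;  // a is n-d, ..., n-1: that was the last combination
        a[i]++;
        // Reset everything after position i to consecutive values.
        for (int j = i + 1; j < d; j++) a[j] = a[i] + j - i;
    }
}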

Now, each pass through the while loop produces a new set of indexes. The exact cost per iteration is not straightforward, but since print is O(n), and the remaining code in the while loop is at worst O(dist), the overall cost is O(N_INCR_SEQ(n, dist) * n), where N_INCR_SEQ(n, dist) is the number of increasing sequences of natural numbers < n of length dist. Someone in the comments provides a link that gives a formula for this.

Notice that, given n which represents the length and t which represents the required distance, the number of increasing, non-negative series of t integers between 1 and n (or, in index form, between 0 and n-1) is indeed n choose t, since we pick t distinct indices.

The problem occurs with your generation of those series:

- First, notice that, for example, in the case of length 4 you actually go over 5 different indices, 0 to 4.

- Second, notice that you are taking into account series with identical indices (in the case of t=2, it's 0 0, 1 1, 2 2 and so on); in general, you go through every non-decreasing series instead of every increasing series.

So when calculating the time complexity of your program, make sure you take that into account.

Hint: try to make a one-to-one correspondence between the universe of those series and the universe of integer solutions to some equation.

If you need the direct solution, take a look here: https://math.stackexchange.com/questions/432496/number-of-non-decreasing-sequences-of-length-m


The final solution is (n+t-1) choose t, but given the first bullet, in your program it is actually ((n+1)+t-1) choose t, since you loop with one extra index. Denote

((n+1)+t-1) choose t =: A, n choose t =: B

Overall we get O(1) + B*O(n) + (A-B)*O(1).
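
As a quick check of A and B against the question's numbers, here is a small sketch. The function names choose, loop_iterations_A and print_calls_B are illustrative and not part of the original code:

```cpp
#include <cassert>

// Binomial coefficient via the multiplicative formula; exact for small inputs.
unsigned long long choose(unsigned n, unsigned k) {
    if (k > n) return 0;
    if (k > n - k) k = n - k;  // use the symmetry C(n, k) == C(n, n-k)
    unsigned long long result = 1;
    for (unsigned i = 1; i <= k; ++i) {
        result = result * (n - k + i) / i;  // stays integral at every step
    }
    return result;
}

// A from the answer: ((n+1)+t-1) choose t, the estimated while iterations.
unsigned long long loop_iterations_A(unsigned n, unsigned t) {
    return choose(n + t, t);
}

// B from the answer: n choose t, the number of calls to print.
unsigned long long print_calls_B(unsigned n, unsigned t) {
    return choose(n, t);
}
```

For t = 2 and n = 3, 4, 5, 6 this gives A = 10, 15, 21, 28 and B = 3, 6, 10, 15, which matches the while-iteration and print counts observed in the question. For t = n = 3 it gives A = 20, whereas the 2^(n+1)-2 count quoted earlier on this page gives 14, so for t > 2 A appears to be an upper bound on the number of iterations, not an exact figure.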
