Algorithm, Big O notation: Is this function O(n^2) or O(n)?

This is code from an algorithm book, "Data Structures and Algorithms in Java, 6th Edition," by Michael T. Goodrich, Roberto Tamassia, and Michael H. Goldwasser.

public static String repeat1(char c, int n)
{
    String answer = "";
    for(int j=0; j < n; j++)
    {
         answer += c;
    }  
    return answer;
}

According to the authors, the Big O notation of this algorithm is O(n^2), with this reasoning: "The command, answer += c, is shorthand for answer = (answer + c). This command does not cause a new character to be added to the existing String instance; instead it produces a new String with the desired sequence of characters, and then it reassigns the variable, answer, to refer to that new string. In terms of efficiency, the problem with this interpretation is that the creation of a new string as a result of a concatenation, requires time that is proportional to the length of the resulting string. The first time through this loop, the result has length 1, the second time through the loop the result has length 2, and so on, until we reach the final string of length n."

However, I do not understand how this code can be O(n^2), as its number of primitive operations just doubles each iteration regardless of the value of n (excluding j < n and j++). The statement answer += c requires two primitive operations each iteration regardless of the value of n, therefore I think the equation for this function is supposed to be 4n + 3. (Each loop operates j [...]

Or is the sentence "In terms of efficiency, the problem with this interpretation is that the creation of a new string as a result of a concatenation, requires time that is proportional to the length of the resulting string" simply saying that creating a new string as a result of a concatenation takes time proportional to its length, regardless of the number of primitive operations used in the function? In that case the number of primitive operations would not have a big effect on the running time of the function, because the built-in code behind the String concatenation assignment operator makes the loop run in O(n^2).

How can this function be O(n^2)?

Thank you for your support.

During every iteration of the loop, the statement answer += c; must copy each and every character already in the string answer to a new string.

E.g. n = 5, c = '5'

  • First loop: answer is an empty string, but a new string must still be created. There is one operation to append the first '5', and answer is now "5".
  • Second loop: answer will now point to a new string, with the first '5' copied into it and another '5' appended, to make "55". Not only is a new String created, one character '5' is copied from the previous string and another '5' is appended. Two characters are written.
  • nth loop: answer will now point to a new string, with n - 1 '5' characters copied into it and an additional '5' character appended, to make a string of n '5' characters.

The number of characters copied is 1 + 2 + ... + n = n(n + 1)/2. This is O(n^2).
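The count above can be checked with a small simulation. This is a minimal sketch, assuming each answer += c writes every character of the resulting string; the class name CopyCount and the method charsWritten are hypothetical and do not come from the book.

```java
// Hypothetical helper: counts the characters written by repeat1's loop,
// assuming each concatenation builds a fresh String of length j + 1.
final class CopyCount {
    static long charsWritten(int n) {
        long total = 0;
        String answer = "";
        for (int j = 0; j < n; j++) {
            answer += 'x';             // new String of length j + 1
            total += answer.length();  // characters written in this iteration
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 5;
        System.out.println(charsWritten(n));        // 1 + 2 + 3 + 4 + 5 = 15
        System.out.println((long) n * (n + 1) / 2); // closed form, also 15
    }
}
```

Doubling n roughly quadruples the total, which is the quadratic growth the formula predicts.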

The efficient way to construct strings like this in a loop in Java is to use a StringBuilder: one mutable object that doesn't need to copy all the characters each time a character is appended in the loop. Using a StringBuilder has a cost of O(n).
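A minimal sketch of the StringBuilder variant, keeping the signature of repeat1. The names RepeatWithBuilder and repeat2 are chosen here for illustration, and presizing the builder is an optional extra rather than something the answer requires.

```java
final class RepeatWithBuilder {
    // Linear-time variant of repeat1: appends into one mutable buffer
    // and creates a single String at the end.
    static String repeat2(char c, int n) {
        // Presizing avoids intermediate resizes; Math.max guards against n < 0.
        StringBuilder sb = new StringBuilder(Math.max(n, 0));
        for (int j = 0; j < n; j++) {
            sb.append(c);
        }
        return sb.toString();
    }
}
```

For example, RepeatWithBuilder.repeat2('5', 5) returns "55555", the same result as repeat1('5', 5).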

Strings are immutable in Java. I believe this terrible code is O(n^2) for that reason and only that reason. It has to construct a new String on each iteration. I'm unsure if String concatenation is truly linearly proportional to the number of characters (it seems like it should be a constant time operation since Strings have a known length). However, if you take the author's word for it, then iterating n times with each iteration taking time proportional to n gives you n^2. StringBuilder would give you O(n).

I mostly agree with it being O(n^2) in practice, but consider:

Java is SMART. In many cases it uses StringBuilder instead of String for concatenation under the covers. You can't just assume it's going to copy the underlying array every time (although it almost certainly will in this case).

Java gets SMARTER all the time. There is no reason it couldn't optimize that entire loop based on StringBuilder, since it can analyze all your code and figure out that you don't use it as a string inside that loop.
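To make "under the covers" concrete, here is a sketch of what the concatenation can expand to inside the loop. Compilers before Java 9 translated + on strings into roughly this hand-written form; newer versions use a different mechanism (invokedynamic), so treat this as an illustration, not the exact generated code. The class and method names are hypothetical.

```java
final class DesugaredRepeat {
    // Roughly what `answer += c` expands to on each iteration (illustrative).
    static String repeat1Expanded(char c, int n) {
        String answer = "";
        for (int j = 0; j < n; j++) {
            // A new builder per iteration: it copies all of `answer`,
            // appends c, and toString() copies the characters once more
            // into a brand-new String.
            answer = new StringBuilder().append(answer).append(c).toString();
        }
        return answer;
    }
}
```

Because the builder is recreated on every pass, the work per iteration is still proportional to the current length, which is why the loop as a whole stays quadratic unless the whole loop is rewritten around a single builder.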

Further optimizations can happen. Strings currently use an array AND a length AND a shared flag (and maybe a start location so that splits wouldn't require copying; I forget, but they changed that split implementation anyway), so appending into an oversized array and then returning a new string that references the same underlying array with a higher end, without mutating the original string, is altogether possible (by design, they already do things like this to a degree).

So I think the real question is: is it a good idea to calculate O() based on a particular implementation of a language-level construct?

And although I can't say for sure what the answer to that is, I can say it would be a REALLY BAD idea to optimize on the assumption that it was O(n^2) unless you absolutely needed to. By hand-optimizing today you could take away Java's ability to speed up your code later.

P.S. This is from experience. I had to optimize some Java code that was the UI for a spectrum analyzer. I saw all sorts of String + operations and figured I'd clean them all up with .append(). It saved NO time, because Java already optimizes String + operations that are not in a loop.

The complexity becomes O(n^2) because the string's length grows by one on each iteration, and creating the new string costs time proportional to its length, up to n. The outer loop also runs n times. So the exact cost will be (n * (n+1))/2, which is O(n^2).

For example,

For abcdefg

a // a String object of length 1 is created, so the cost is 1
ab // similarly, the cost is 2
abc // cost 3 here
abcd // 4 now
abcde // and so on
abcdef
abcdefg

Now you can see that the total cost is 1 + 2 + 3 + 4 + ... + n = (n * (n+1))/2. In big O notation it's O(n^2).

That is because:

answer += c;

is a String concatenation. In Java, Strings are immutable.

It means the concatenated string is created by making a copy of the original string and appending c to it. So a simple concatenation operation is O(n) for a String of size n.

In the first iteration the length of answer is 0, in the second iteration it's 1, in the third it's 2, and so on.

So you're doing these operations every time, i.e.

1 + 2 + 3 + ... + n = O(n^2)
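Spelled out with the constants made explicit, as a short derivation that uses only the sum above:

```latex
\[
\sum_{k=1}^{n} k \;=\; \frac{n(n+1)}{2} \;=\; \frac{n^2}{2} + \frac{n}{2},
\qquad
\frac{n^2}{2} \;\le\; \frac{n(n+1)}{2} \;\le\; n^2 \quad (n \ge 1).
\]
```

The sum is therefore bounded above and below by constant multiples of n^2, so the total is not just O(n^2) but grows exactly quadratically.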

For string manipulation, StringBuilder is the preferred way, i.e. it appends a character in amortized O(1) time.

Consider the length of the string as "n". Each time we add an element at the end, going through the string costs "n", and we also have the outer for loop, which is another "n", so as a result we get O(n^2).
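Why appending can be cheap on average is easiest to see with a toy buffer. The sketch below is a hypothetical GrowableChars class, not the real StringBuilder implementation: it assumes a simple doubling strategy and counts how many characters are moved when the array is resized.

```java
// Hypothetical growable buffer (NOT the real StringBuilder): doubles its
// capacity when full and counts the characters copied during resizes.
final class GrowableChars {
    private char[] data = new char[1];
    private int size = 0;
    long copied = 0; // characters moved during resizes so far

    void append(char c) {
        if (size == data.length) {
            // Full: allocate twice the space and move the existing characters.
            char[] bigger = new char[data.length * 2];
            System.arraycopy(data, 0, bigger, 0, size);
            copied += size;
            data = bigger;
        }
        data[size++] = c;
    }

    @Override
    public String toString() {
        return new String(data, 0, size);
    }
}
```

With doubling, resizes happen at sizes 1, 2, 4, 8, and so on, so after n appends the total number of copied characters stays below 2n. That is linear overall, in contrast to the n(n + 1)/2 characters written by repeated String concatenation.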
