
String concatenation: concat() vs "+" operator

Assuming String a and b:

a += b
a = a.concat(b)

Under the hood, are they the same thing?

Here is concat decompiled as reference. I'd like to be able to decompile the + operator as well to see what that does.

public String concat(String s) {

    int i = s.length();
    if (i == 0) {
        return this;
    }
    else {
        char ac[] = new char[count + i];
        getChars(0, count, ac, 0);
        s.getChars(0, i, ac, count);
        return new String(0, count + i, ac);
    }
}

No, not quite.

Firstly, there's a slight difference in semantics. If a is null, then a.concat(b) throws a NullPointerException, but a += b will treat the original value of a as if it were the string "null". Furthermore, the concat() method only accepts String values, while the + operator will silently convert the argument to a String (using the toString() method for objects). So the concat() method is stricter in what it accepts.
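The semantic differences described above can be checked with a small sketch. The class and method names (NullConcatDemo, plus, concat) are made up for illustration and are not part of the original answer.

```java
public class NullConcatDemo {
    // Uses the + operator: a null operand is converted to the string "null".
    static String plus(String a, String b) {
        return a + b;
    }

    // Uses String.concat: throws NullPointerException when a is null.
    static String concat(String a, String b) {
        return a.concat(b);
    }

    public static void main(String[] args) {
        System.out.println(plus(null, "b")); // prints "nullb"

        try {
            concat(null, "b");
        } catch (NullPointerException e) {
            System.out.println("concat threw NullPointerException");
        }

        // + accepts any operand type and converts it to a String;
        // "x".concat(o) would not compile because o is not a String.
        Object o = 42;
        System.out.println("x" + o); // prints "x42"
    }
}
```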

To look under the hood, write a simple class with a += b;

public class Concat {
    String cat(String a, String b) {
        a += b;
        return a;
    }
}

Now disassemble with javap -c (included in the Sun JDK). You should see a listing including:

java.lang.String cat(java.lang.String, java.lang.String);
  Code:
   0:   new     #2; //class java/lang/StringBuilder
   3:   dup
   4:   invokespecial   #3; //Method java/lang/StringBuilder."<init>":()V
   7:   aload_1
   8:   invokevirtual   #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   11:  aload_2
   12:  invokevirtual   #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   15:  invokevirtual   #5; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   18:  astore_1
   19:  aload_1
   20:  areturn

So, a += b is the equivalent of

a = new StringBuilder()
    .append(a)
    .append(b)
    .toString();

The concat method should be faster. However, with more strings the StringBuilder method wins, at least in terms of performance.

The source code of String and StringBuilder (and its package-private base class) is available in src.zip of the Sun JDK. You can see that you are building up a char array (resizing as necessary) and then throwing it away when you create the final String. In practice memory allocation is surprisingly fast.

Update: As Pawel Adamski notes, performance has changed in more recent HotSpot. javac still produces exactly the same code, but the bytecode compiler cheats. Simple testing entirely fails because the entire body of code is thrown away. Summing System.identityHashCode (not String.hashCode) shows the StringBuffer code has a slight advantage. Subject to change when the next update is released, or if you use a different JVM. From @lukaseder, a list of HotSpot JVM intrinsics.

Niyaz is correct, but it's also worth noting that the special + operator can be converted into something more efficient by the Java compiler. Java has a StringBuilder class which represents a non-thread-safe, mutable String. When performing a bunch of String concatenations, the Java compiler silently converts

String a = b + c + d;

into

String a = new StringBuilder(b).append(c).append(d).toString();

which for large strings is significantly more efficient. As far as I know, this does not happen when you use the concat method.

However, the concat method is more efficient when concatenating an empty String onto an existing String. In this case, the JVM does not need to create a new String object and can simply return the existing one. See the concat documentation to confirm this.

So if you're super-concerned about efficiency then you should use the concat method when concatenating possibly-empty Strings, and use + otherwise. However, the performance difference should be negligible and you probably shouldn't ever worry about this.
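The empty-argument shortcut is documented for String.concat ("If the length of the argument string is 0, then this String object is returned") and is easy to observe with a reference comparison. A minimal sketch, with a hypothetical class name:

```java
public class EmptyConcatDemo {
    // True when concat hands back the very same object it was called on.
    static boolean concatReturnsSameInstance(String a, String suffix) {
        return a.concat(suffix) == a;
    }

    public static void main(String[] args) {
        // new String(...) gives a distinct, non-literal instance for the identity check.
        String a = new String("abc");

        // Empty argument: the receiver itself is returned, no new String is created.
        System.out.println(concatReturnsSameInstance(a, ""));  // prints "true"

        // Non-empty argument: a different String object is returned.
        System.out.println(concatReturnsSameInstance(a, "x")); // prints "false"
    }
}
```

Note that the shortcut only applies when the argument is empty; an empty receiver with a non-empty argument is not covered by that sentence of the documentation.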

I ran a similar test to @marcio's, but with the following loop instead:

String c = a;
for (long i = 0; i < 100000L; i++) {
    c = c.concat(b); // make sure javac cannot skip the loop
    // using c += b for the alternative
}

Just for good measure, I threw in StringBuilder.append() as well. Each test was run 10 times, with 100k reps for each run. Here are the results:

  • StringBuilder wins hands down. The clock time result was 0 for most of the runs, and the longest took 16ms.
  • a += b takes about 40000ms (40s) for each run.
  • concat only requires 10000ms (10s) per run.

I haven't decompiled the class to see the internals or run it through a profiler yet, but I suspect a += b spends much of the time creating new StringBuilder objects and then converting them back to String.
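For reference, the two loop shapes being compared look roughly like the sketch below. This is not the original test harness (which was not posted in full); the class and method names are hypothetical, and no timings are implied.

```java
public class LoopConcatDemo {
    // Repeated +=: every iteration produces a new String that copies
    // all characters accumulated so far, plus b.
    static String repeatWithPlus(String b, int times) {
        String c = "";
        for (int i = 0; i < times; i++) {
            c += b;
        }
        return c;
    }

    // A single StringBuilder reused across iterations;
    // it is converted to a String only once, at the end.
    static String repeatWithBuilder(String b, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(b);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String viaPlus = repeatWithPlus("ab", 1000);
        String viaBuilder = repeatWithBuilder("ab", 1000);
        // Both approaches build the same content.
        System.out.println(viaPlus.equals(viaBuilder)); // prints "true"
        System.out.println(viaBuilder.length());        // prints "2000"
    }
}
```

The results are identical; the difference is only in how many intermediate objects and character copies each loop performs.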

Most answers here are from 2008. It looks like things have changed over time. My latest benchmarks made with JMH show that on Java 8, + is around two times faster than concat.

My benchmark:

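import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@Warmup(iterations = 5, time = 200, timeUnit = TimeUnit.MILLISECONDS)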
@Measurement(iterations = 5, time = 200, timeUnit = TimeUnit.MILLISECONDS)
public class StringConcatenation {

    @org.openjdk.jmh.annotations.State(Scope.Thread)
    public static class State2 {
        public String a = "abc";
        public String b = "xyz";
    }

    @org.openjdk.jmh.annotations.State(Scope.Thread)
    public static class State3 {
        public String a = "abc";
        public String b = "xyz";
        public String c = "123";
    }


    @org.openjdk.jmh.annotations.State(Scope.Thread)
    public static class State4 {
        public String a = "abc";
        public String b = "xyz";
        public String c = "123";
        public String d = "!@#";
    }

    @Benchmark
    public void plus_2(State2 state, Blackhole blackhole) {
        blackhole.consume(state.a+state.b);
    }

    @Benchmark
    public void plus_3(State3 state, Blackhole blackhole) {
        blackhole.consume(state.a+state.b+state.c);
    }

    @Benchmark
    public void plus_4(State4 state, Blackhole blackhole) {
        blackhole.consume(state.a+state.b+state.c+state.d);
    }

    @Benchmark
    public void stringbuilder_2(State2 state, Blackhole blackhole) {
        blackhole.consume(new StringBuilder().append(state.a).append(state.b).toString());
    }

    @Benchmark
    public void stringbuilder_3(State3 state, Blackhole blackhole) {
        blackhole.consume(new StringBuilder().append(state.a).append(state.b).append(state.c).toString());
    }

    @Benchmark
    public void stringbuilder_4(State4 state, Blackhole blackhole) {
        blackhole.consume(new StringBuilder().append(state.a).append(state.b).append(state.c).append(state.d).toString());
    }

    @Benchmark
    public void concat_2(State2 state, Blackhole blackhole) {
        blackhole.consume(state.a.concat(state.b));
    }

    @Benchmark
    public void concat_3(State3 state, Blackhole blackhole) {
        blackhole.consume(state.a.concat(state.b.concat(state.c)));
    }


    @Benchmark
    public void concat_4(State4 state, Blackhole blackhole) {
        blackhole.consume(state.a.concat(state.b.concat(state.c.concat(state.d))));
    }
}

Results:

Benchmark                             Mode  Cnt         Score         Error  Units
StringConcatenation.concat_2         thrpt   50  24908871.258 ± 1011269.986  ops/s
StringConcatenation.concat_3         thrpt   50  14228193.918 ±  466892.616  ops/s
StringConcatenation.concat_4         thrpt   50   9845069.776 ±  350532.591  ops/s
StringConcatenation.plus_2           thrpt   50  38999662.292 ± 8107397.316  ops/s
StringConcatenation.plus_3           thrpt   50  34985722.222 ± 5442660.250  ops/s
StringConcatenation.plus_4           thrpt   50  31910376.337 ± 2861001.162  ops/s
StringConcatenation.stringbuilder_2  thrpt   50  40472888.230 ± 9011210.632  ops/s
StringConcatenation.stringbuilder_3  thrpt   50  33902151.616 ± 5449026.680  ops/s
StringConcatenation.stringbuilder_4  thrpt   50  29220479.267 ± 3435315.681  ops/s

Tom is correct in describing exactly what the + operator does. It creates a temporary StringBuilder, appends the parts, and finishes with toString().

However, all of the answers so far are ignoring the effects of HotSpot runtime optimizations. Specifically, these temporary operations are recognized as a common pattern and are replaced with more efficient machine code at run-time.

@marcio: You've created a micro-benchmark; with modern JVMs this is not a valid way to profile code.

The reason run-time optimization matters is that many of these differences in code, even including object creation, are completely different once HotSpot gets going. The only way to know for sure is profiling your code in situ.

Finally, all of these methods are in fact incredibly fast. This might be a case of premature optimization. If you have code that concatenates strings a lot, the way to get maximum speed probably has nothing to do with which operators you choose and instead with the algorithm you're using!

How about some simple testing? I used the code below:

long start = System.currentTimeMillis();

String a = "a";
String b = "b";

for (int i = 0; i < 10000000; i++) { // ten million times
    String c = a.concat(b);
}

long end = System.currentTimeMillis();

System.out.println(end - start);

  • The "a + b" version executed in 2500ms.
  • The a.concat(b) version executed in 1200ms.

Tested several times. The concat() version took half the time on average.

This result surprised me because the concat() method always creates a new string (it returns a "new String(result)"). It's well known that:

String a = new String("a"); // more than 20 times slower than String a = "a"

Why wasn't the compiler capable of optimizing the string creation in the "a + b" code, knowing that it always resulted in the same string? It could avoid creating a new string. If you don't believe the statement above, test it for yourself.

Basically, there are two important differences between + and the concat method.

  1. If you are using the concat method, you can only concatenate strings, while with the + operator you can also concatenate a string with a value of any data type.

    For example:

     String s = 10 + "Hello";

    In this case, the output should be 10Hello.

     String s = "I"; String s1 = s.concat("am").concat("good").concat("boy"); System.out.println(s1);

    In the above case, both the receiver and the argument of each concat call must be strings.

  2. The second and main difference between + and concat is this:

    Case 1: Suppose I concatenate the strings with the concat method in this way:

    String s="I"; String s1=s.concat("am").concat("good").concat("boy"); System.out.println(s1);

    In this case the total number of objects created in the pool is 7, like this:

     I am good boy Iam Iamgood Iamgoodboy

    Case 2:

    Now I am going to concatenate the same strings via the + operator:

    String s="I"+"am"+"good"+"boy"; System.out.println(s);

    In the above case the total number of objects created is only 5.

    Actually, when we concatenate the strings via the + operator, it maintains a StringBuffer to perform the same task, as follows:

     StringBuffer sb = new StringBuffer("I"); sb.append("am"); sb.append("good"); sb.append("boy"); System.out.println(sb);

    In this way it will create only five objects.

So guys, these are the basic differences between + and the concat method. Enjoy :)

For the sake of completeness, I wanted to add that the definition of the '+' operator can be found in the JLS SE8 15.18.1:

If only one operand expression is of type String, then string conversion (§5.1.11) is performed on the other operand to produce a string at run time.

The result of string concatenation is a reference to a String object that is the concatenation of the two operand strings. The characters of the left-hand operand precede the characters of the right-hand operand in the newly created string.

The String object is newly created (§12.5) unless the expression is a constant expression (§15.28).

About the implementation, the JLS says the following:

An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.

For primitive types, an implementation may also optimize away the creation of a wrapper object by converting directly from a primitive type to a string.

So judging from the 'a Java compiler may use the StringBuffer class or a similar technique to reduce', different compilers could produce different byte-code.
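The "unless the expression is a constant expression" clause quoted from the JLS has a visible consequence: a + chain made only of literals is folded at compile time into one interned constant, so no concatenation happens at run time, while a chain involving a non-constant operand yields a newly created String. A small sketch, with hypothetical class and method names:

```java
public class ConstantConcatDemo {
    // All operands are literals, so the whole expression is a compile-time
    // constant and refers to the same interned String as the literal below.
    static boolean literalsAreFolded() {
        String s = "I" + "am" + "good" + "boy";
        return s == "Iamgoodboy";
    }

    // left is a method parameter, not a constant, so the concatenation
    // happens at run time and the result is a newly created String.
    static boolean runtimeResultIsSameObject(String left) {
        String s = left + "goodboy";
        return s == "Iamgoodboy";
    }

    public static void main(String[] args) {
        System.out.println(literalsAreFolded());               // prints "true"
        System.out.println(runtimeResultIsSameObject("Iam"));  // prints "false"
        // The content is still equal in the run-time case.
        System.out.println(("Iam" + args.length).equals("Iam0"));
    }
}
```

The reference comparisons with == are used here only to expose object identity; equals remains the right way to compare string contents.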

I don't think so.

a.concat(b) is implemented in String, and I think the implementation hasn't changed much since early Java machines. The implementation of the + operation depends on the Java version and compiler. Currently + is implemented using StringBuilder (StringBuffer before Java 5) to make the operation as fast as possible. Maybe in the future this will change. In earlier versions of Java, the + operation on Strings was much slower, as it produced intermediate results.

I guess that += is implemented using + and is similarly optimized.

The + operator can work between a string and a string, char, integer, double or float data type value. It just converts the value to its string representation before concatenation.

The concat method can only be called on and with strings. The argument type is checked, and a mismatch is a compile-time error.

Apart from this, the code you provided does the same thing.

When using +, the speed decreases as the string's length increases, while with concat the speed is more stable. The best option is the StringBuilder class, which has stable speed.

I guess you can understand why. But by far the best way to create long strings is to use StringBuilder() and append(); with either of the other two, the speed will be unacceptable.

Note that s.concat("hello"); would result in a NullPointerException when s is null. In Java, the behavior of the + operator is usually determined by the left operand:

System.out.println(3 + 'a'); //100

However, Strings are an exception. If either operand is a String, the result is expected to be a String. This is the reason null is converted into "null", even though you might expect a RuntimeException.
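A few cases side by side make the rule easier to see. This is only an illustrative sketch; the class and method names are invented here.

```java
public class PlusOperandDemo {
    // Neither operand is a String: numeric addition, 'a' is promoted to int 97.
    static int numericPlus() {
        return 3 + 'a';
    }

    // The leftmost operand is a String, so every following + is concatenation.
    static String stringPlus() {
        return "" + 3 + 'a';
    }

    // + is evaluated left to right: 1 + 2 is numeric, then 3 + "x" is concatenation.
    static String leftToRight() {
        return 1 + 2 + "x";
    }

    // A null String operand is converted to the text "null" instead of throwing.
    static String nullPlus() {
        String s = null;
        return s + "x";
    }

    public static void main(String[] args) {
        System.out.println(numericPlus()); // prints "100"
        System.out.println(stringPlus());  // prints "3a"
        System.out.println(leftToRight()); // prints "3x"
        System.out.println(nullPlus());    // prints "nullx"
    }
}
```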
