
About String a = "hello"; String b = "hello"; a == b in Java

Check the following program. Run on the Sun HotSpot JVM, every comparison prints "true".

Update: after the answers from Stephen and Danie, I changed the program to add a call to the String intern method.

What happens if B is compiled separately instead of together with the main class? For example, B is compiled, packaged in a jar, and put on the class path when TestStringEqual runs.

Also, is this a Java compile-time optimization, a Java run-time optimization, or behaviour defined by the Java Language Specification?

Also, does this program give the same result on different VMs, or is this a feature of just one VM?

Thanks.

public class TestStringEqual {
    public static String HELLO = "hello";

    private String m_hello;

    public TestStringEqual() {
        m_hello = "hello";
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = "hello";

        System.out.println("string a == string b:" + (a == b));

        System.out.println("static member == a:" + (HELLO == a));

        System.out.println("instance field == a:"
                + (new TestStringEqual().getHello() == a));

        System.out.println("hello in B == a:" + (B.B_HELLO == a));

        System.out.println("interned new string object in heap == a:"
                + (new String("hello").intern() == a));
    }

    public String getHello() {
        return this.m_hello;
    }
}

class B {
    // "he" + "llo" is a constant expression, folded to "hello" at compile time
    public static final String B_HELLO = "he" + "llo";
}

On the JVM level, the LDC (load constant) instruction is used to push a string literal onto the stack. For performance reasons, the string literal isn't stored in the code itself; it's stored in the constant pool of the class. The constant pool is a table near the beginning of a class file that contains string literals, numeric literals, field and method descriptors, and a few other things. LDC is followed by a byte specifying the string's index in the constant pool. (If one byte is not large enough, the compiler will use LDC_W, which is followed by a 16-bit index. Hence the limit of roughly 65,535 constant pool entries per class.)

If the same string literal occurs twice in the same class, javac is smart enough to create only one entry in the constant pool. When the class is loaded and the constant is resolved, the JVM creates an actual String object from the data in the constant pool and interns it. LDC instructions that carry the same constant pool index will thus cause the same String to be pushed onto the stack, and because the strings are interned, an equal literal in another class resolves to that same object as well. Instructions like IF_ACMPEQ (which checks for reference equality, as == does) will then recognize the strings as identical.

See the JVMS for more info.
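As a rough illustration of the question about B being compiled separately, the sketch below uses two extra classes to stand in for separately compiled units. The class and member names (Greeting, OtherGreeting, ConstantPoolDemo) are made up for this example and are not from the original program. The non-final field is deliberate: a final constant would be inlined by javac into the calling class, while a non-final field is read at run time from the other class.

```java
// Hypothetical classes, standing in for code compiled into separate class files.
class Greeting {
    // Compile-time constant: javac copies "hello" into the classes that use it.
    static final String TEXT = "hello";
}

class OtherGreeting {
    // Not final, so callers read the field at run time; the value comes from
    // OtherGreeting's own constant pool.
    static String text = "hello";
}

class ConstantPoolDemo {
    static boolean sameAcrossClasses() {
        return Greeting.TEXT == OtherGreeting.text;
    }

    static boolean sameAsLocalLiteral() {
        String a = "hello";
        return a == OtherGreeting.text;
    }

    public static void main(String[] args) {
        System.out.println("across classes: " + sameAcrossClasses());   // true
        System.out.println("local literal:  " + sameAsLocalLiteral());  // true
    }
}
```

Both comparisons print true because every "hello" literal resolves to the one interned String, whichever class file it came from.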

There is really no mystery about this at all. You just need to know three basic facts about Java:

  • The '==' operator for object references tests if two object references are the same; i.e. if they point to the same object. Reference: JLS 15.21.3

  • All String literals with the same sequence of characters in a Java program will be represented by the same String object. Reference: JLS 3.10.5. So (for example) "hello" == "hello" is comparing the same object.

  • Constant expressions are evaluated at compile time. Reference: JLS 15.28. So (for example) "hell" + "o" is evaluated at compile time, and is therefore equivalent to the literal "hello".

These three facts are stated in the Java Language Specification. They are sufficient to explain the "puzzling" aspects of your program's behaviour, without relying on anything else.

The more detailed explanations involving the string pool, string literals being interned by the class loader, the bytecodes emitted by the compiler, and so on are just implementation details. You don't need to understand these details if you understand what the JLS is saying, and they don't really help to make the JLS clearer (IMO).


Notes:

  1. The definition of what is and what isn't a constant expression is a little involved. Some things that you might imagine to be constant valued are in fact not. For instance, "hello".length() is not a constant expression. However, a concatenation of two string literals is a constant expression.

  2. The explanation of equality of string literals in the JLS does in fact mention interning as the mechanism by which this property of literals is implemented.
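The boundary between constant and non-constant expressions can be seen in a small sketch like the one below. The class and method names are invented for illustration. It assumes only what the JLS states: a concatenation that is a constant expression yields the interned literal, while any other concatenation creates a new String at run time.

```java
class ConstantExpressionDemo {
    // Two literals: a constant expression, folded to "hello" by the compiler.
    static boolean literalConcat() {
        return ("hell" + "o") == "hello";
    }

    // A final local initialised with a literal is a constant variable,
    // so the concatenation is still a constant expression.
    static boolean finalLocalConcat() {
        final String prefix = "hell";
        return (prefix + "o") == "hello";
    }

    // A non-final local is not a constant variable, so the concatenation
    // happens at run time and produces a new String object.
    static boolean nonFinalLocalConcat() {
        String prefix = "hell";
        return (prefix + "o") == "hello";
    }

    // intern() maps the run-time result back to the pooled literal.
    static boolean internedConcat() {
        String prefix = "hell";
        return (prefix + "o").intern() == "hello";
    }

    public static void main(String[] args) {
        System.out.println(literalConcat());       // true
        System.out.println(finalLocalConcat());    // true
        System.out.println(nonFinalLocalConcat()); // false
        System.out.println(internedConcat());      // true
    }
}
```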

It's an immutable string (unable to be mutated or changed), not an immune one, though I suppose you could argue that it's immune from change :-)

That means you cannot change the underlying string itself; you can only assign a different string to the variable. So:

String a = "Hello";
a = "Goodbye";

doesn't change the memory where "Hello" is stored; it changes a to point to a different memory location where "Goodbye" is stored.
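A minimal sketch of that point, with made-up names (ReassignDemo, alias): a second variable that still refers to the original object keeps seeing "Hello" after a is reassigned, and methods such as concat return a new String rather than modifying the one they are called on.

```java
class ReassignDemo {
    // Reassigning 'a' only changes which object 'a' refers to.
    static String valueSeenByAlias() {
        String a = "Hello";
        String alias = a;   // second reference to the same object
        a = "Goodbye";      // 'a' now points elsewhere; "Hello" is untouched
        return alias;
    }

    // String methods never modify the receiver; they return a new String.
    static boolean concatLeavesOriginalIntact() {
        String a = "Hello";
        String longer = a.concat(", world");
        return a.equals("Hello") && longer.equals("Hello, world");
    }

    public static void main(String[] args) {
        System.out.println(valueSeenByAlias());           // Hello
        System.out.println(concatLeavesOriginalIntact()); // true
    }
}
```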

This allows Java to share strings for efficiency. Depending on the implementation, you can even get cases where strings like "deoxyribonucleic acid" and "acid" share space, with the latter pointing to a specific location within the former. Again, this is made possible by the immutable nature of such strings.

In any case, == checks whether the two references point to the same underlying object, which is not often what you want. If you want to see whether the strings have equal contents, you should be using String.equals() or one of its variations.
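The difference between the two checks can be sketched as follows. The class name is hypothetical; the String methods used (equals, equalsIgnoreCase) are standard.

```java
class EqualsDemo {
    // new String(...) always creates a distinct object, so == is false
    // even though the characters are identical.
    static boolean sameReference() {
        return new String("hello") == "hello";
    }

    // equals() compares the character contents.
    static boolean sameContent() {
        return new String("hello").equals("hello");
    }

    // One of the variations: comparison that ignores case.
    static boolean sameIgnoringCase() {
        return "Hello".equalsIgnoreCase("hello");
    }

    public static void main(String[] args) {
        System.out.println(sameReference());    // false
        System.out.println(sameContent());      // true
        System.out.println(sameIgnoringCase()); // true
    }
}
```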

This link may help: Questions about Java's String pool

It is fairly simple: the compiler will generate a (bytecode) constant for the string "hello" the first time it encounters it. In normal assembler it would be in the .TEXT section.

The subsequent "hello" strings will then point to that same constant, since there is no need to allocate new space or create a new constant. This is safe because strings are immutable: if a variable is assigned a new value, new memory is needed for that value anyway.

It will probably not work on input, i.e. if you let a user input "hello" and compare that with == to the compile-time "hello" strings, you'll likely get false.
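The input case can be tried without a keyboard by feeding a Scanner from a String, as in this sketch. The class and method names are invented, and the simulated input stands in for whatever a user would type.

```java
import java.util.Scanner;

class InputComparisonDemo {
    // Reads one line from the given text, as if a user had typed it.
    static String readLine(String simulatedInput) {
        try (Scanner in = new Scanner(simulatedInput)) {
            return in.nextLine();
        }
    }

    public static void main(String[] args) {
        String typed = readLine("hello\n");

        // The string read at run time is a different object from the literal.
        System.out.println(typed == "hello");          // false
        // The contents are equal, so equals() succeeds.
        System.out.println(typed.equals("hello"));     // true
        // intern() returns the pooled object shared with the literal.
        System.out.println(typed.intern() == "hello"); // true
    }
}
```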

As far as a==b goes, it seems the compiler is taking a shortcut and sharing the same string object. When I declare my variables as follows, a==b is false.

String a = "hello";
String b = "hell";
String temp = "o";
if (new java.util.Random().nextDouble() < 0.5) b += temp;
else b += "o";

If I do String b = "hell"+"o"; I still get a==b as true.
