
When should we use the intern method of String on String literals?

According to String#intern(), the intern method is supposed to return the String from the String pool if an equal String is found there; otherwise the String object is added to the pool and a reference to it is returned.

So I tried this:

String s1 = "Rakesh";
String s2 = "Rakesh";
String s3 = "Rakesh".intern();

if ( s1 == s2 ){
    System.out.println("s1 and s2 are same");  // 1.
}

if ( s1 == s3 ){
    System.out.println("s1 and s3 are same" );  // 2.
}

I was expecting "s1 and s3 are same" to be printed, since s3 is interned, and "s1 and s2 are same" not to be printed. But the result is that both lines are printed. So that means String constants are interned by default. But if that is so, why do we need the intern method? In other words, when should we use this method?

Java automatically interns String literals. This means that in many cases, the == operator appears to work for Strings in the same way that it does for ints or other primitive values.

Since interning is automatic for String literals, the intern() method is meant to be used on Strings constructed with new String().

Using your example:

String s1 = "Rakesh";
String s2 = "Rakesh";
String s3 = "Rakesh".intern();
String s4 = new String("Rakesh");
String s5 = new String("Rakesh").intern();

if ( s1 == s2 ){
    System.out.println("s1 and s2 are same");  // 1.
}

if ( s1 == s3 ){
    System.out.println("s1 and s3 are same" );  // 2.
}

if ( s1 == s4 ){
    System.out.println("s1 and s4 are same" );  // 3.
}

if ( s1 == s5 ){
    System.out.println("s1 and s5 are same" );  // 4.
}

will print:

s1 and s2 are same
s1 and s3 are same
s1 and s5 are same

In every case except s4, whose value was explicitly created with the new operator and never passed through intern, the reference points to the single immutable instance held in the JVM's string constant pool.

Refer to JavaTechniques "String Equality and Interning" for more information.

On a recent project, some huge data structures were set up with data that was read in from a database (and hence not String constants/literals) but with a huge amount of duplication. It was a banking application, and things like the names of a modest set (maybe 100 or 200) of corporations appeared all over the place. The data structures were already large, and if all those corporation names had been unique objects they would have overflowed memory. Instead, all the data structures held references to the same 100 or 200 String objects, thus saving lots of space.
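A rough sketch of that idea, assuming the rows arrive as freshly allocated String objects (simulated here with new String, since no real database is involved). The class InternDedupDemo, its methods, and the corporation names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class InternDedupDemo {

    // Counts how many distinct String *objects* (not values) the list refers to.
    static int countDistinctObjects(List<String> names) {
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        identities.addAll(names);
        return identities.size();
    }

    // Simulates rows read from a database: every row is a fresh String object.
    static List<String> loadRows(int rows, boolean intern) {
        String[] corporations = {"Acme", "Globex", "Initech"}; // hypothetical names
        List<String> names = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            String fresh = new String(corporations[i % corporations.length]);
            names.add(intern ? fresh.intern() : fresh);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(countDistinctObjects(loadRows(9, false))); // 9
        System.out.println(countDistinctObjects(loadRows(9, true)));  // 3
    }
}
```

Without interning, each row keeps its own copy of the name; with interning, all rows share one object per distinct name, and the temporary copies become garbage.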

Another small advantage of interned Strings is that == can be used (successfully!) to compare Strings if all involved strings are guaranteed to be interned. Apart from the leaner syntax, this is also a performance enhancement. But as others have pointed out, doing this carries a great risk of introducing programming errors, so it should be done only as a desperate measure of last resort.

The downside is that interning a String takes more time than simply throwing it on the heap, and that the space for interned Strings may be limited, depending on the Java implementation. It's best done when you're dealing with a known, reasonable number of Strings with many duplications.

I want to add my 2 cents on using == with interned strings.

The first thing String.equals does is this == object.

So although there is some minuscule performance gain (you are not calling a method), from the maintainer's point of view using == is a nightmare, because some interned strings have a tendency to become non-interned.

So I suggest not relying on the special case of == for interned strings, but always using equals, as Gosling intended.
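To illustrate the point, here is a small sketch (the class and method names are hypothetical): equals gives the right answer whether or not the operands are interned, while == only does so when both references happen to point to the same pooled object.

```java
class EqualsVsIdentityDemo {

    // Identity comparison: true only if both references point to the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "status";
        // Built at run time, so it is a separate heap object and not interned.
        String built = new StringBuilder("sta").append("tus").toString();

        System.out.println(sameObject(literal, built));          // false
        System.out.println(literal.equals(built));               // true
        System.out.println(sameObject(literal, built.intern())); // true
    }
}
```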

EDIT: interned becoming non-interned:

V1.0
public class MyClass
{
  private String reference_val;

  ...

  private boolean hasReferenceVal ( final String[] strings )
  {
    for ( String s : strings )
    {
      if ( s == reference_val )
      {
        return true;
      }
    }

    return false;
  }

  private void makeCall ( )
  {
     final String[] interned_strings =  { ... init with interned values ... };

     if ( hasReferenceVal( interned_strings ) )
     {
        ...
     }
  }
}

In version 2.0 the maintainer decided to make hasReferenceVal public, without documenting that it expects an array of interned strings.

V2.0
public class MyClass
{
  private String reference_val;

  ...

  public boolean hasReferenceVal ( final String[] strings )
  {
    for ( String s : strings )
    {
      if ( s == reference_val )
      {
        return true;
      }
    }

    return false;
  }

  private void makeCall ( )
  {
     final String[] interned_strings =  { ... init with interned values ... };

     if ( hasReferenceVal( interned_strings ) )
     {
        ...
     }
  }
}

Now you have a bug that may be very hard to find, because in the majority of cases the array contains literal values, and only sometimes is a non-literal string used. If equals had been used instead of ==, hasReferenceVal would have continued to work. Once again, the performance gain is minuscule, but the maintenance cost is high.
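A runnable reduction of that scenario might look like the sketch below. The class ReferenceValDemo, the two method variants, and the currency codes are invented for illustration; only the identity-based loop mirrors the listing above.

```java
class ReferenceValDemo {
    private final String referenceVal = "EUR"; // a literal, hence interned

    // Identity-based check, as in the listing above: silently assumes interned input.
    boolean hasReferenceValIdentity(String[] strings) {
        for (String s : strings) {
            if (s == referenceVal) {
                return true;
            }
        }
        return false;
    }

    // equals-based check: works for any input, interned or not.
    boolean hasReferenceValEquals(String[] strings) {
        for (String s : strings) {
            if (referenceVal.equals(s)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ReferenceValDemo demo = new ReferenceValDemo();
        String[] literals = {"USD", "EUR"};
        // Stand-in for strings that reach the public method from outside, e.g. parsed input.
        String[] fromOutside = {new String("USD"), new String("EUR")};

        System.out.println(demo.hasReferenceValIdentity(literals));    // true
        System.out.println(demo.hasReferenceValIdentity(fromOutside)); // false: the bug
        System.out.println(demo.hasReferenceValEquals(fromOutside));   // true
    }
}
```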

String literals and constants are interned by default. That is, "foo" == "foo" (both declared as String literals), but new String("foo") != new String("foo").

Learn Java String Intern - once and for all

Strings in Java are immutable objects by design. By default, two String objects are different objects even if they have the same value. However, if we wish to save memory, we can ask for the same object to be shared through a concept called string interning.

The rules below should help you understand the concept in clear terms:

  1. The String class maintains an intern pool, which is initially empty. The pool is guaranteed to contain String objects with unique values only.
  2. All string literals with the same value must be treated as the same object, because nothing else distinguishes them. Therefore, all literals with the same value share a single entry in the intern pool and refer to the same memory location.
  3. A concatenation of two or more literals is a compile-time constant, so it is treated like a literal. (Rule #2 therefore applies to it.)
  4. Each string created as an object (i.e. by any means other than a literal) is a distinct object and does not make an entry in the intern pool.
  5. Concatenating literals with non-literals produces a non-literal. The resulting object is therefore a new object and does NOT make an entry in the intern pool.
  6. Invoking the intern method on a String object either returns the existing pooled object with the same value or, if there is none, adds an entry to the pool and returns a reference to it. If an equal string is already in the pool, calling intern on a different object does NOT move that object into the pool; the caller simply gets the pooled object back.

Example:

String s1 = new String("abc");
String s2 = new String("abc");
System.out.println(s1 == s2);             // false by rule #4
System.out.println("abc" == "a" + "bc");  // true by rules #2 and #3
System.out.println("abc" == s1);          // false by rules #1, #2 and #4
System.out.println("abc" == s1.intern()); // true by rules #1, #2, #4 and #6
System.out.println(s1 == s2.intern());    // false by rules #1, #4 and #6
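Rules #3 and #5 can be checked with a short sketch like the one below. The class and method names are hypothetical. The final-variable case is not covered by the rules above: it relies on the Java Language Specification treating a final variable initialized with a constant expression as a compile-time constant.

```java
class ConstantConcatDemo {

    static boolean literalConcatIsPooled() {
        return "abc" == "a" + "bc"; // folded by the compiler into the literal "abc"
    }

    static boolean finalConstantConcatIsPooled() {
        final String prefix = "a"; // constant variable, so prefix + "bc" is a constant expression
        return "abc" == prefix + "bc";
    }

    static boolean runtimeConcatIsPooled() {
        String prefix = "a"; // not final, so the concatenation happens at run time
        return "abc" == prefix + "bc";
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsPooled());       // true  (rule #3)
        System.out.println(finalConstantConcatIsPooled()); // true  (compile-time constant)
        System.out.println(runtimeConcatIsPooled());       // false (rule #5)
    }
}
```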

Note: The motivating cases for string interning are not discussed here. However, saving memory is definitely one of the primary objectives.

You should distinguish two phases: compile time and run time. For example:

//example 1 
"test" == "test" // --> true 
"test" == "te" + "st" // --> true

//example 2 
"test" == "!test".substring(1) // --> false
"test" == "!test".substring(1).intern() // --> true

In example 1, every comparison returns true. The compiler folds "te" + "st" into the constant "test", and the JVM places each distinct literal in the string pool only once; if "test" is already there, the existing object is reused. All the "test" strings therefore refer to the same object.

In example 2, substring() executes at run time. In "test" == "!test".substring(1), the pool holds the two literals "test" and "!test", while substring(1) builds a new String object outside the pool, so the references differ and the comparison returns false. In "test" == "!test".substring(1).intern(), intern() looks the value "test" up in the pool and returns the pooled object, so both sides are the same reference and the comparison returns true.

http://en.wikipedia.org/wiki/String_interning

String interning is a method of storing only one copy of each distinct string value, which must be immutable. Interning strings makes some string processing tasks more time- or space-efficient at the cost of requiring more time when the string is created or interned. The distinct values are stored in a string intern pool.

Interned Strings avoid duplicate Strings. Interning saves RAM at the expense of more CPU time to detect and replace duplicate Strings. There is only one copy of each String that has been interned, no matter how many references point to it. Since Strings are immutable, if two different methods incidentally use the same String, they can share a copy of the same String. The process of converting duplicated Strings to shared ones is called interning. String.intern() gives you the address of the canonical master String. You can compare interned Strings with a simple == (which compares pointers) instead of equals, which compares the characters of the String one by one. Because Strings are immutable, the intern process is free to further save space, for example, by not creating a separate String literal for "pot" when it exists as a substring of some other literal such as "hippopotamus".

To see more: http://mindprod.com/jgloss/interned.html

String s1 = "Anish";
String s2 = "Anish";

String s3 = new String("Anish");

/*
 * When the intern method is invoked, if the pool already contains a
 * string equal to this String object as determined by the
 * equals(Object) method, then the string from the pool is
 * returned. Otherwise, this String object is added to the
 * pool and a reference to this String object is returned.
 */
String s4 = new String("Anish").intern();
if (s1 == s2) {
    System.out.println("s1 and s2 are same");
}

if (s1 == s3) {
    System.out.println("s1 and s3 are same");
}

if (s1 == s4) {
    System.out.println("s1 and s4 are same");
}

Output:

s1 and s2 are same
s1 and s4 are same

String p1 = "example";
String p2 = "example";
String p3 = "example".intern();
String p4 = p2.intern();
String p5 = new String(p3);
String p6 = new String("example");
String p7 = p6.intern();

if (p1 == p2)
    System.out.println("p1 and p2 are the same");
if (p1 == p3)
    System.out.println("p1 and p3 are the same");
if (p1 == p4)
    System.out.println("p1 and p4 are the same");
if (p1 == p5)
    System.out.println("p1 and p5 are the same");
if (p1 == p6)
    System.out.println("p1 and p6 are the same");
if (p1 == p6.intern())
    System.out.println("p1 and p6 are the same when intern is used");
if (p1 == p7)
    System.out.println("p1 and p7 are the same");

When two strings are created independently, intern() lets you compare them by reference, and it also creates an entry in the string pool if one did not exist before.

When you use String s = new String("hi"), Java creates a new instance of the string, but when you use String s = "hi", Java checks whether the pool already holds an instance of "hi" and, if so, just returns a reference to it.

Since == compares references, intern() gives you the canonical reference, so that comparing references reflects the contents of the strings.

When you use intern() in the code, no second copy is kept for a string that equals one already in the pool; the call just returns the reference to the existing object in memory.

But in the case of p5, when you write:

String p5 = new String(p3);

Only the contents of p3 are copied, and p5 is a newly created object. So it is not interned.

So the output will be:

p1 and p2 are the same
p1 and p3 are the same
p1 and p4 are the same
p1 and p6 are the same when intern is used
p1 and p7 are the same

public static void main(String[] args) {
    String s1 = "test";
    String s2 = new String("test");
    System.out.println(s1 == s2);          // false
    System.out.println(s1 == s2.intern()); // true, because intern() returns the instance from the string constant pool
}

The String intern() method is used to obtain, for a heap String object, the equal String held in the string constant pool. String literals are interned automatically, but String objects created on the heap are not. The main uses of interning are to save memory and to compare String objects faster.

Source: What is string intern in java?

As you said, the String intern() method first looks in the String pool; if it finds an equal String there, it returns that object, otherwise it adds the String to the pool.

    String s1 = "Hello";
    String s2 = "Hello";
    String s3 = "Hello".intern();
    String s4 = new String("Hello");

    System.out.println(s1 == s2);//true
    System.out.println(s1 == s3);//true
    System.out.println(s1 == s4.intern());//true

s1 and s2 are two references to the same "Hello" object in the String pool, and "Hello".intern() finds that same object. So s1 == s3 returns true, and so does s1 == s4.intern().

If we have a heap object reference and want the corresponding string constant pool object reference, we should use intern().

String s1 = new String("Rakesh");
String s2 = s1.intern();
String s3 = "Rakesh";

System.out.println(s1 == s2); // false
System.out.println(s2 == s3); // true

[Figure: pictorial view of the heap and the string constant pool]

Step 1: An object with the data 'Rakesh' gets created in the heap and in the string constant pool. s1 always points to the heap object.

Step 2: Using the heap object reference s1, we get the corresponding string constant pool object reference s2 by calling intern().

Step 3: The literal "Rakesh" gives a reference to the object with the data 'Rakesh' in the string constant pool, referenced by the name s3.

The "==" operator is meant for reference comparison, so:

s1 == s2 gives false

s2 == s3 gives true

Hope this helps!
