
Garbage collection behaviour for String.intern()

If I use String.intern() to improve performance, since I can then use "==" to compare interned strings, will I run into garbage collection issues? How does the garbage collection of interned strings differ from that of normal strings?

String.intern() manages an internal, natively implemented pool, which has some special GC-related features. This is old code, but if it were implemented anew, it would use a java.util.WeakHashMap. Weak references are a way to keep a pointer to an object without preventing it from being collected, which is just the right thing for a unifying pool such as the one for interned strings.
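To make the WeakHashMap idea concrete, here is a minimal sketch of a weak canonicalizing pool. The class name WeakInternPool and its intern method are hypothetical and exist only for illustration; this is not how the JVM implements String.intern(), which uses a native table.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical illustration only: NOT the JVM's implementation of String.intern().
final class WeakInternPool {
    // Keys are held weakly by WeakHashMap. Values are wrapped in WeakReference
    // so that a value does not keep its own key strongly reachable.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    // Returns the canonical instance for s, registering s if none is alive.
    synchronized String intern(String s) {
        WeakReference<String> ref = pool.get(s);
        String canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            canonical = s;
            pool.put(s, new WeakReference<>(s));
        }
        return canonical;
    }

    public static void main(String[] args) {
        WeakInternPool pool = new WeakInternPool();
        String a = pool.intern(new String("hello"));
        String b = pool.intern(new String("hello"));
        // a is still strongly reachable here, so the pool hands it back for b.
        System.out.println(a == b); // true
    }
}
```

Once nothing outside the pool refers to a pooled string, the garbage collector is free to reclaim it and the map entry eventually disappears, which is the behaviour the answer describes for interned strings.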

That interned strings are garbage collected can be demonstrated with the following Java code:

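public class InternedStringsAreCollected {

    public static void main(String[] args)
    {
        for (int i = 0; i < 30; i ++) {
            foo();
            // Request a collection; by now nothing references the interned string.
            System.gc();
        }
    }

    private static void foo()
    {
        // Build the string at run time so that it is not a compile-time literal.
        char[] tc = new char[10];
        for (int i = 0; i < tc.length; i ++)
            tc[i] = (char)(i * 136757);
        String s = new String(tc).intern();
        // The identity hash code identifies the instance, not its contents.
        System.out.println(System.identityHashCode(s));
    }
}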

This code creates the same string 30 times, interning it each time. It uses System.identityHashCode() to show the hash code that Object.hashCode() would have returned for that interned string. When run, this code prints distinct integer values, which means that you do not get the same instance each time: the previously interned instance was collected and a new one took its place. (System.gc() is only a hint to the JVM, so the exact output can vary.)

Anyway, use of String.intern() is somewhat discouraged. It is a shared static pool, which means that it easily turns into a bottleneck on multi-core systems. Use String.equals() to compare strings, and you will live longer and happier.

In fact, this is not a garbage collection optimisation, but rather a string pool optimisation. When you call String.intern(), you replace the reference to your initial String with its base reference (the reference from the first time this string was encountered, or this reference itself if the string is not yet known).
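The reference replacement described above can be seen in a small, self-contained sketch. The class name InternReferenceDemo and the helper sameAfterIntern are made up for this illustration.

```java
// Illustration of how intern() swaps a reference for the pooled one.
final class InternReferenceDemo {
    // Hypothetical helper: true when both strings map to the same pooled instance.
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String literal = "hello";              // literals are interned automatically
        String copy = new String("hello");     // a distinct object with equal contents

        System.out.println(literal == copy);          // false: two different objects
        System.out.println(literal == copy.intern()); // true: intern() returns the pooled one
        System.out.println(literal.equals(copy));     // true: contents are equal
        System.out.println(sameAfterIntern(literal, copy)); // true
    }
}
```

Note that copy itself is unchanged by the call: only the value returned by intern() is the pooled reference, so the result has to be kept for "==" to be meaningful.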

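However, it can become a garbage collector issue once your string is no longer used by the application: if the interned string pool holds strong references to its entries, as it would if it were a plain static member of the String class, those entries are never garbage collected. (The other answers here report that the JVM does collect manually interned strings, so treat this as a worst case rather than the usual behaviour.)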

As a rule of thumb, I consider it preferable never to call this intern method yourself and to let the compiler use it only for constant Strings, those declared like this:

String myString = "a constant that will be interned";

This is better, in the sense that it won't lead you into the false assumption that == could work when it won't.

Besides, String.equals itself checks == first as an optimisation, so the benefit of interned strings is already used under the hood. This is one more reason why == should never be used on Strings.
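The shape of that identity check can be sketched as follows. The class EqualsFastPath and the method sameContents are hypothetical and only mirror the idea; this is not the JDK source of String.equals.

```java
// Hypothetical sketch of an equality check with an identity fast path.
final class EqualsFastPath {
    static boolean sameContents(String a, String b) {
        if (a == b) {
            return true;                 // same instance: no character comparison needed
        }
        if (a == null || b == null) {
            return false;
        }
        if (a.length() != b.length()) {
            return false;                // different lengths can never be equal
        }
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String s = "pooled";
        System.out.println(sameContents(s, s));                    // true, via the fast path
        System.out.println(sameContents(s, new String("pooled"))); // true, via the loop
        System.out.println(sameContents(s, "other"));              // false
    }
}
```

When both arguments happen to be the same pooled instance, the comparison returns at the first line, which is why calling equals() on interned strings costs little more than "==".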

This article provides the full answer.

In Java 6 the string pool resides in PermGen; since Java 7 it resides in heap memory.

Manually interned strings will be garbage collected.
String literals will be garbage collected only if the class that defines them is unloaded.

The string pool is a hash table of fixed size, which was small in Java 6 and early versions of Java 7 but was increased to 60013 as of Java 7u40.
The size can be changed with -XX:StringTableSize=<new size> and viewed with the -XX:+PrintFlagsFinal java option.
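As an alternative to scanning the -XX:+PrintFlagsFinal output, the current value can usually be read from inside a running program. The sketch below assumes a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name StringTableSizeProbe and the "unavailable" fallback are choices made for this illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch under the assumption of a HotSpot JVM; other JVMs may not expose this bean.
final class StringTableSizeProbe {
    // Returns the StringTableSize flag as text, or "unavailable" if it cannot be read.
    static String stringTableSize() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("StringTableSize").getValue();
        } catch (RuntimeException | LinkageError e) {
            // Bean missing, flag unknown, or management classes not present.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("StringTableSize = " + stringTableSize());
    }
}
```

Running it once with default settings and once with -XX:StringTableSize set to another value is a quick way to confirm that the option took effect.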

Please read: http://satukubik.com/2009/01/06/java-tips-memory-optimization-for-string/

The conclusion I can draw from your information is: you interned too many Strings. If you really need to intern that many Strings for performance, increase the PermGen memory, but if I were you, I would first check whether I really need so many interned Strings.
