
Why intern strings in the symbol table of the Tiger Book?

In Section 5.1.4 of Modern Compiler Implementation in Java:

public class Symbol {
  private String name;
  private Symbol (String n) { name = n; }
  private static java.util.Dictionary dict = new java.util.Hashtable();

  public String toString() { return name; }

  public static Symbol symbol(String n) {
    String u = n.intern();
    Symbol s = (Symbol)dict.get(u);
    if (s == null) { s = new Symbol(u); dict.put(u, s); }
    return s;
  }
}

I can't see why String.intern() is used here, since Hashtable uses key.equals(...) to compare keys.

Could you please tell me the reason? Thanks!

In programming there is a lot of "wisdom", "rumour", "magic" and "superstition" going around.

As @RealSkepic points out, before Java 7u4, String.substring would share a portion of the original string's backing array rather than copy it. While this improved performance in many cases, it could lead to memory leaks, because a small substring kept the whole original character array alive. Using intern() was one way to avoid this, though it could create its own memory clean-up problems, which is not ideal. Using new String(oldString) was another approach, but you shouldn't need to do that now.

People often try things for "performance reasons" but don't know how to test them, or just don't check that they actually help. I do this from time to time, even though I know to avoid it, because it is far too often incorrect or just makes the code confusing.

Most likely the author found a situation, or heard of one, where someone saved a lot of memory by using String.intern(). In specific cases it can do this, but it's not fairy dust: you can't sprinkle on a bit of performance magic and expect everything to get better. Most of these obscure optimisation tricks only work in very specific use cases.

A similar example is when people use locks or thread-safe collections in multi-threaded code. Sprinkle enough of them around and the program can appear to stop failing, but you haven't really fixed the problem; you have just made it harder to find when something incidental changes and the bug shows itself again.

I hope you know what String#intern does. Simply put, if the given string is not already in the pool of strings maintained by the String class, it is added and returned; if an equal string is already in the pool, that pooled object is returned. So there will be only one copy of any particular String value in the string pool.
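As a small illustration of the pooling behaviour described above, the sketch below compares a string literal with a separately constructed String that has the same characters. The class name InternDemo and the helper sameObject are made up for this example.

```java
class InternDemo {
    // True only when both references point to the very same String object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "tiger";            // string literals are interned automatically
        String built = new String("tiger");  // a distinct object with equal contents

        System.out.println(built.equals(literal));               // true: same characters
        System.out.println(sameObject(built, literal));          // false: two different objects
        System.out.println(sameObject(built.intern(), literal)); // true: intern() returns the pooled object
    }
}
```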

What this means is that when aString.intern() is always what gets put in the map, the next time anotherString.intern() is looked up in the map, equals will return true at its == check. This avoids iterating over the whole string to verify equality. It could be a significant performance improvement if the strings stored in the map are large and the map is searched (get or contains operations) frequently.
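To make the lookup argument concrete, here is a minimal sketch of the same idea using a generic HashMap. It is not the book's code: the names Sym and of are hypothetical, chosen to avoid confusion with the Symbol class in the question. Note that the table would return the same results without the intern() call, since keys are still matched by equals; interning only lets a lookup that finds its key succeed on the identity check instead of comparing characters.

```java
import java.util.HashMap;
import java.util.Map;

final class Sym {
    private static final Map<String, Sym> table = new HashMap<>();
    private final String name;

    private Sym(String name) { this.name = name; }

    @Override public String toString() { return name; }

    // Returns the unique Sym for this name, creating it on first use.
    static Sym of(String n) {
        // Canonical String: if this name is already a key in the table,
        // it is the same object, so the key comparison succeeds on ==.
        String key = n.intern();
        return table.computeIfAbsent(key, Sym::new);
    }

    public static void main(String[] args) {
        Sym a = Sym.of("x");
        Sym b = Sym.of(new String("x"));       // different String object, same characters
        System.out.println(a == b);            // true: one Sym per name
        System.out.println(a == Sym.of("y"));  // false: different names
    }
}
```

Because every name maps to exactly one Sym object, code that holds Sym values can compare them with == rather than comparing strings.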
