Why does char[] perform better than String? - Java

With reference to the link File IO Tuning: in the last section, titled "Further Tuning", the author suggests using char[] to avoid generating String objects for the n lines in the file. I need to understand how does

char[] arr = new char[]{'a', 'u', 't', 'h', 'o', 'r'};

differ from

String s = "author";

in terms of memory consumption or any other performance factor? Isn't a String object internally stored as a character array? I feel silly since I never thought of this before. :-)

In Oracle's JDK, a String has four instance-level fields:

  • A character array
  • An integral offset
  • An integral character count
  • An integral hash value

That means that each String introduces an extra object reference (the String itself) and three integers in addition to the character array itself. (The offset and character count are there to allow sharing of the character array among String instances produced through the String#substring() methods, a design choice that some other Java library implementers have eschewed.) Beyond the extra storage cost, there's also one more level of access indirection, not to mention the bounds checking with which the String guards its character array.
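To make the offset and count fields concrete, here is a minimal sketch of the layout described above. SharedString is a hypothetical class written for illustration, not the JDK's source; it only shows how a substring can share its parent's character array. (Later JDK releases removed the offset and count fields, so treat this as a model of the older layout.)

```java
// Hypothetical model of the four-field layout; not the JDK's actual code.
final class SharedString {
    private final char[] value;  // backing array, possibly shared with other instances
    private final int offset;    // index of the first character used by this instance
    private final int count;     // number of characters used by this instance
    private int hash;            // cached hash code, 0 until first computed (unused in this sketch)

    SharedString(char[] chars) {
        this(chars.clone(), 0, chars.length); // defensive copy keeps the instance immutable
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Shares the backing array instead of copying it.
    SharedString substring(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedString(value, offset + begin, end - begin);
    }

    char charAt(int index) {
        if (index < 0 || index >= count) {   // the bounds check that guards the array
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[offset + index];        // one more indirection than a bare char[]
    }

    boolean sharesArrayWith(SharedString other) {
        return value == other.value;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharedString s = new SharedString("author".toCharArray());
        SharedString sub = s.substring(1, 4);
        System.out.println(sub + " shares array: " + sub.sharesArrayWith(s));
    }
}
```

Each SharedString costs one object header plus three ints on top of the array, which is the overhead a bare char[] avoids.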

If you can get away with allocating and consuming just the basic character array, there's space to be saved there. It's certainly not idiomatic to do so in Java, though; judicious comments would be warranted to justify the choice, preferably with mention of evidence from having profiled the difference.

In the example you've referred to, it's because there's only a single character array being allocated for the whole loop. It's repeatedly reading into that same array, and processing it in place.

Compare that with using readLine, which needs to create a new String instance on each iteration. Each String instance will contain a few int fields and a reference to a char[] containing the actual data, so it would need two new instances per iteration.
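The two styles can be sketched side by side. This is only an illustration under assumptions: the class and method names are made up, the buffer size of 8192 is arbitrary, and the buffer version assumes every line ends with '\n'.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class LineCounting {
    // Allocates a new String (and its backing array) for every line read.
    static int countWithReadLine(Reader source) {
        try {
            BufferedReader in = new BufferedReader(source);
            int lines = 0;
            while (in.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reuses a single char[] for the whole loop and scans it in place.
    static int countWithBuffer(Reader source) {
        try {
            char[] buf = new char[8192];
            int lines = 0;
            int n;
            while ((n = source.read(buf, 0, buf.length)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buf[i] == '\n') {
                        lines++;
                    }
                }
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "first\nsecond\nthird\n";
        System.out.println(countWithReadLine(new StringReader(text)));
        System.out.println(countWithBuffer(new StringReader(text)));
    }
}
```

Both methods report the same count for this input; the second simply never creates a per-line object.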

I'd usually expect the differences to be insignificant (with a decent GC throwing away unused "young" objects very efficiently) compared with the IO involved in reading the data, assuming it's from disk, but I believe that's the point the author was trying to make.

Here are a few reasons why it makes sense to believe that a character array is a better choice than a String in Java:

Say, for storing a password:

1) Since Strings are immutable in Java, if you store a password as plain text it will be available in memory until the garbage collector clears it. And since Strings are kept in the String pool for reusability, there is a pretty high chance that it will remain in memory for a long time, which poses a security threat.

Anyone who has access to a memory dump can find the password in clear text, which is another reason you should always use an encrypted password rather than plain text.

Since Strings are immutable, there is no way to change their contents, because any change produces a new String. With a char[], you can still set all of its elements to blank or zero. So storing a password in a character array mitigates the security risk of a stolen password.
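A minimal sketch of wiping a char[] once it is no longer needed. The authenticate method is a hypothetical placeholder; a real application would compare against a stored hash.

```java
import java.util.Arrays;

class PasswordWipe {
    // Hypothetical check, used only so the example has something to call.
    static boolean authenticate(char[] password) {
        return password.length > 0;
    }

    static boolean loginAndWipe(char[] password) {
        try {
            return authenticate(password);
        } finally {
            // Overwrite the secret even if authenticate throws.
            Arrays.fill(password, '\0');
        }
    }

    public static void main(String[] args) {
        char[] password = {'s', 'e', 'c', 'r', 'e', 't'};
        boolean ok = loginAndWipe(password);
        System.out.println(ok + ", first char code after wipe: " + (int) password[0]);
    }
}
```

The same cannot be done with a String, because its contents cannot be overwritten through the public API.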

2) Java itself recommends using the getPassword() method of JPasswordField, which returns a char[], and has deprecated the getText() method, which returns the password in clear text, citing security reasons. It's good to follow the advice of the Java team and adhere to the standard rather than go against it.

3) With a String there is always a risk of printing plain text to a log file or the console, but if you use an array and concatenate it into a message, its contents are not printed; its type tag and hash code are printed instead. This is not a real reason, but it still makes sense.

For this simple program:

String strPassword="Unknown";
char[] charPassword= new char[]{'U','n','k','n','o','w','n'};
System.out.println("String password: " + strPassword);
System.out.println("Character password: " + charPassword);

Output:

String password: Unknown
Character password: [C@110b053
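Note that the [C@... text comes from string concatenation, which calls Object.toString() on the array. The contents are not hidden: passing the array to String.valueOf (or directly to println) still prints the characters. A small sketch, with made-up class and method names:

```java
class CharArrayPrinting {
    // Concatenation uses Object.toString(), giving "[C@" plus a hex hash code.
    static String viaConcatenation(char[] chars) {
        return "Character password: " + chars;
    }

    // String.valueOf(char[]) copies the characters into a new String.
    static String viaValueOf(char[] chars) {
        return "Character password: " + String.valueOf(chars);
    }

    public static void main(String[] args) {
        char[] charPassword = {'U', 'n', 'k', 'n', 'o', 'w', 'n'};
        System.out.println(viaConcatenation(charPassword));
        System.out.println(viaValueOf(charPassword));
        System.out.println(charPassword); // the println(char[]) overload prints the characters
    }
}
```

So the array only protects against accidental logging through concatenation, not against deliberate or careless printing.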

That's all on why a character array is a better choice than a String for storing passwords in Java. Using char[] alone is not enough, though; you also need to erase its contents to be more secure.

Hope this helps.

The author didn't get the reason right. The real overhead in in.readLine() is the copying of a char[] buffer when making a String out of it. The additional copying is the most damning cost when dealing with large data.
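The copy is easy to observe: a String built from a char[] keeps its own copy of the characters, so later writes to the buffer do not show up in it. A minimal sketch (the class and method names are made up for illustration):

```java
class CopyOnConstruct {
    // new String(char[], int, int) copies the requested range out of the buffer.
    static String fromBuffer(char[] buf, int length) {
        return new String(buf, 0, length);
    }

    public static void main(String[] args) {
        char[] buf = "first line".toCharArray();
        String line = fromBuffer(buf, 5);
        buf[0] = 'X';              // modify the buffer after the String was created
        System.out.println(line);  // the String is unaffected by the write
        System.out.println(buf[0]);
    }
}
```

That copy is what keeps the String immutable, and it is paid once per line when reading with readLine.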

It is possible to optimize this within the JDK so that the additional copying is not needed.

My answer is going to focus on other Stack questions along similar lines; others have already posted more direct answers.

There have been other questions similar to this, and the advice seems to go along the lines of using StringBuilder.

If you're concerned with string concatenation, have a look at the performance comparison described here between three different implementations. Another Stack post can give you some additional pointers and examples that you could try yourself to see the performance.
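As a rough sketch of that advice (not taken from the linked posts, and with made-up names), here are two ways of joining the same parts. Repeated += builds a new String on every pass through the loop, while StringBuilder appends into one growing internal buffer.

```java
class Concatenation {
    // Each += creates a new String containing everything joined so far.
    static String withPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // StringBuilder appends into one internal buffer and builds the String once.
    static String withBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"au", "th", "or"};
        System.out.println(withPlus(parts));
        System.out.println(withBuilder(parts));
    }
}
```

Both produce the same result; any speed difference depends on the input size and the JVM, so measure before relying on it.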
