
Using Hadoop Text Object toString() Method

I understand the difference between String and Text (see "Difference between Text and String in Hadoop").

My question: suppose we say that the maximum storage size of a String is 32767 bytes, and we have:

Text t = new Text("Hadoo... 2GB of content");
...
String c = t.toString();

How will "c" hold 2GB of data if String has a size limitation?

What am I missing here?

The maximum size of a Java String is not 32k bytes. A String can hold up to Integer.MAX_VALUE characters, which is 2^31 - 1 (about 2 billion); at two bytes per UTF-16 char, that is around 4GB (see this post).
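As a quick check of the point above, here is a minimal sketch in plain Java. It does not use Hadoop: an ordinary String stands in for the result of Text.toString(), and the class name StringSizeDemo and the helper repeat are made up for illustration. It only shows that a String well past 32767 characters is created without trouble, and prints the real upper bound on String length.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StringSizeDemo {
    // Hypothetical helper: builds a String of n copies of the character c.
    public static String repeat(char c, int n) {
        char[] buf = new char[n];
        Arrays.fill(buf, c);
        return new String(buf);
    }

    public static void main(String[] args) {
        // 100,000 characters, about three times the supposed 32767 limit.
        String s = repeat('a', 100_000);

        System.out.println(s.length());                                 // 100000
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);  // 100000 (ASCII: one byte per char)
        System.out.println(Integer.MAX_VALUE);                          // 2147483647, the cap on String.length()
    }
}
```

In practice the available heap, not the String class, is what limits how large a value such as "c" can get.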

The post that you mention refers to the size limit of Hadoop's deprecated UTF8 class, not Java's String class.

Anyway, if you need that much space for a single Text instance, I would advise you to reconsider your algorithm. As Peter Lawrey says in the aforementioned post, "I suspect all the works of JK Rowling would fit into one string."
