How can I estimate a table size in Hive without a query?
I want to calculate a table's size in Hive without querying it.
How can I do this in Hive? (I have no permissions in the database other than SELECT, so I can't use show properties, etc.)
(For example)
dataRows: 100
columnName(Type): userName(string), userNumber(int), userCode(bigint), userAge(int)
I calculated the table size like this:
I assumed that a string is 8 bytes, an int is 4 bytes, and a bigint is 8 bytes. (I did not account for record header size or column header size.)
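As a rough sketch of that arithmetic (not a description of how Hive actually stores data), the estimate is the row count multiplied by the sum of the assumed column widths. The function name estimate_bytes is hypothetical, and the widths are only the assumptions stated above; in particular, a string is variable-length in practice, so 8 bytes is a placeholder.

```shell
#!/bin/sh
# Naive fixed-width estimate: ROWS * (sum of assumed column widths in bytes).
# Ignores record/column headers, file format overhead, and compression.
# Usage: estimate_bytes ROWS WIDTH [WIDTH...]
estimate_bytes() {
  rows=$1
  shift
  row_width=0
  for w in "$@"; do
    row_width=$((row_width + w))
  done
  echo $((rows * row_width))
}

# Assumed widths from the question:
# userName(string)=8, userNumber(int)=4, userCode(bigint)=8, userAge(int)=4
estimate_bytes 100 8 4 8 4   # prints 2400
```

With the example figures this gives 100 * 24 = 2400 bytes, which is only a lower-level guess at the uncompressed payload.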
Could you give me some advice?
hdfs dfs -du -s {table location}
(optional -h)
E.g.
hdfs dfs -du -s /user/hive/warehouse/mytable
110265307244 /user/hive/warehouse/mytable
hdfs dfs -du -s -h /user/hive/warehouse/mytable
102.7 G /user/hive/warehouse/mytable
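The two outputs above are the same size in different units: -h divides the byte count by powers of 1024. A minimal sketch of that conversion follows, in case the raw byte count needs to be post-processed in a script. The function name du_line_to_gib is hypothetical, and the sample line is copied from the output above rather than read from a live cluster.

```shell
#!/bin/sh
# Take one line of `hdfs dfs -du -s` output and print its first field
# (the size in bytes) converted to GiB, rounded to one decimal place.
du_line_to_gib() {
  echo "$1" | awk '{ printf "%.1f\n", $1 / (1024 * 1024 * 1024) }'
}

# Sample line from the output above. On a cluster one might instead pass
# "$(hdfs dfs -du -s /user/hive/warehouse/mytable)".
du_line_to_gib "110265307244 /user/hive/warehouse/mytable"   # prints 102.7
```

Only the first field is read, so the sketch should also tolerate Hadoop versions that print an additional size column before the path.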
This is not really possible if you have no access to Hive or HDFS.
Hive could be using different compression mechanisms, and that could affect the size of the raw data on HDFS as well. If it's stored in plain text, you could potentially use this kind of estimate, but I wouldn't say that's the best way to do this.