Advantages of non-partitioned tables on Hive?
Are there any advantages to non-partitioned tables in Hive, or any special use cases for them, compared to partitioned tables?
It would be great if anyone could help. :)
Let's put it this way: in the database world, partitioning can be used to solve different kinds of problems. As long as you have no explicit problem, don't bother with partitions (i.e. "if it ain't broke, don't fix it"). Whenever you hit a problem, ask a DB architect to find a solution; it may involve partitioning, or it may not.
But Hive is not a typical database. Partitions are everywhere, simply because they are a crude workaround for the lack of indexes...
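To make the "workaround for missing indexes" point concrete, here is a toy Python model of partition pruning. It is not Hive code. It assumes a hypothetical table partitioned by a `dt` column and keeps each partition in its own list, roughly the way Hive keeps each partition in its own directory. All row and column names are invented for illustration.

```python
from collections import defaultdict

# Toy rows: (dt, user, amount). Column names are made up for illustration.
ROWS = [
    ("2020-01-01", "alice", 10),
    ("2020-01-01", "bob", 20),
    ("2020-01-02", "alice", 30),
    ("2020-01-03", "carol", 40),
]


def scan_unpartitioned(rows, dt):
    """Filter on dt with no partitions: every row has to be read."""
    rows_read = 0
    matches = []
    for row in rows:
        rows_read += 1
        if row[0] == dt:
            matches.append(row)
    return matches, rows_read


def build_partitions(rows):
    """Group rows by dt, similar to one directory per partition value."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)
    return partitions


def scan_partitioned(partitions, dt):
    """Filter on the partition key: only the matching partition is read."""
    matches = list(partitions.get(dt, []))
    return matches, len(matches)


if __name__ == "__main__":
    print(scan_unpartitioned(ROWS, "2020-01-02"))
    print(scan_partitioned(build_partitions(ROWS), "2020-01-02"))
```

Both scans return the same rows; the difference is how much data is touched. The benefit only appears when the filter is on the partition key. A filter on `user` would still have to read every partition, which is why a non-partitioned table loses nothing when no such filter exists.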
(Well, actually the ORC format has its own workaround [it stores min/max values per column per stripe, which allows skipping useless stripes], so partitioning is less critical with that format.)
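The min/max trick can be sketched in a few lines of Python. This is a simplified model, not the ORC reader: the `Stripe` class and `find_equal` function are hypothetical, each stripe holds a single integer column, and the statistics are recomputed on the fly, whereas a real file format would store them in metadata.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stripe:
    """Toy stand-in for a stripe: one column's values plus min/max stats."""
    values: List[int]

    @property
    def min_value(self) -> int:
        # A real format would read this from stored metadata.
        return min(self.values)

    @property
    def max_value(self) -> int:
        return max(self.values)


def find_equal(stripes: List[Stripe], target: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of stripes actually read."""
    stripes_read = 0
    matches: List[int] = []
    for stripe in stripes:
        # Skip the stripe if target cannot fall inside its [min, max] range.
        if target < stripe.min_value or target > stripe.max_value:
            continue
        stripes_read += 1
        matches.extend(v for v in stripe.values if v == target)
    return matches, stripes_read


STRIPES = [Stripe([1, 5, 9]), Stripe([10, 14, 19]), Stripe([20, 25, 29])]

if __name__ == "__main__":
    print(find_equal(STRIPES, 14))
```

Skipping works well here because the value ranges of the stripes do not overlap. If the column were randomly distributed across stripes, most ranges would contain the target and few stripes could be skipped, so the data usually has to be sorted or clustered on the filtered column for this to pay off.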