
Update a value in a Hive table

I have gone through some of the Stack Overflow questions on this, but I was not able to do the same.

Hive does not support update, but one workaround is to partition the table.

How do I update a record in Hive?

The update feature in Hive is planned for the next release.

As a workaround, you could try the following:

  1. Add a flag column that defaults to I, and a timestamp column.
  2. Treat the partition columns as your primary key fields (in combination).
  3. Whenever a new (updated) record arrives for this primary key combination, set its flag to U.
  4. Write a custom SerDe class that shows only the U records.

NOTE: There will be duplicated data, but the SerDe should show only the latest U data, that is, the row with the latest value in the timestamp column.
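The append-and-read-latest idea behind these steps can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for Hive, and a plain query in place of the custom SerDe. The table name `records`, the columns `id`, `value`, `flag`, `ts`, and the helper functions are all hypothetical. It also assumes that keys which were never updated should still appear with their original I row.

```python
import sqlite3

def create_table(conn):
    # Hypothetical table: "id" plays the role of the primary key combination.
    conn.execute(
        "CREATE TABLE records ("
        " id INTEGER NOT NULL,"
        " value TEXT,"
        " flag TEXT NOT NULL DEFAULT 'I',"  # step 1: flag defaults to I
        " ts INTEGER NOT NULL)"             # step 1: timestamp column
    )

def insert_record(conn, key, value, ts):
    # First version of a record: the flag keeps its default, I.
    conn.execute(
        "INSERT INTO records (id, value, ts) VALUES (?, ?, ?)",
        (key, value, ts),
    )

def update_record(conn, key, value, ts):
    # Step 3: an "update" is a new appended row flagged U; old rows stay.
    conn.execute(
        "INSERT INTO records (id, value, flag, ts) VALUES (?, ?, 'U', ?)",
        (key, value, ts),
    )

def current_rows(conn):
    # Stand-in for step 4: return only the newest row per key.
    sql = """
        SELECT r.id, r.value, r.flag, r.ts
        FROM records r
        JOIN (SELECT id, MAX(ts) AS max_ts FROM records GROUP BY id) m
          ON r.id = m.id AND r.ts = m.max_ts
        ORDER BY r.id
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
create_table(conn)
insert_record(conn, 1, "alice", 100)
insert_record(conn, 2, "bob", 100)
update_record(conn, 1, "alice-v2", 200)
update_record(conn, 1, "alice-v3", 300)

rows = current_rows(conn)
print(rows)  # [(1, 'alice-v3', 'U', 300), (2, 'bob', 'I', 100)]
```

The underlying table still holds all four rows, which is the duplicated data the note mentions. To show strictly U rows, as step 4 is worded, the query would need an extra filter on `flag`.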
