
Generate unique customer ID / insert unique rows in Hive

I need to insert unique rows into a Hive table based on customer name and address.

Is there any way to generate a unique value from the customer name and address? I would like to generate a unique_value column like the one below and then select the rows with distinct unique_value:

customer_name   address   unique_value
omar            street1   111
ryan            stree2    222
omar            street1   111

Any other approaches are also appreciated!

There are two options. The first is to generate a UUID, but note that this produces a different ID for every row, so duplicate rows will not share a value. Something like this would do:

select reflect("java.util.UUID", "randomUUID") as unique_value, customer_name, address from table_name

However, if you want a key derived from the name and address, you can concatenate both fields and hash the resulting string (see the details of the hash function here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF ). This ensures that the same name and address always get the same key. Keep in mind that hash returns a 32-bit integer, so two different name and address pairs can occasionally collide. For most cases this query should be sufficient:

select customer_name, address, hash(concat(customer_name, address)) as unique_value from table_name
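
The hashing approach can also be combined with deduplication in a single statement. The following is a minimal sketch rather than a definitive implementation: the table names customers_raw and customers_unique are hypothetical, and it assumes a Hive version that provides md5 (1.3.0 or later). It swaps hash for md5 to make collisions far less likely, and it uses concat_ws with a '|' separator so that pairs such as ('ab', 'c') and ('a', 'bc') do not collapse into the same input string.

```sql
-- Hypothetical tables: customers_raw (source), customers_unique (target).
-- Assumes md5() is available (Hive 1.3.0+).
INSERT INTO TABLE customers_unique
SELECT DISTINCT
  customer_name,
  address,
  -- The '|' separator keeps ('ab', 'c') and ('a', 'bc') from hashing the same string.
  md5(concat_ws('|', customer_name, address)) AS unique_value
FROM customers_raw;
```

SELECT DISTINCT collapses repeated (customer_name, address) pairs, so the sample rows in the question would yield one row for omar and one for ryan. The sketch assumes the separator character does not occur in the data; if it can, pick a different separator.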
