I need to insert unique rows into a Hive table based on customer name and address.
Is there any way to generate a unique value from the customer name and address? I want to generate a unique_value column and then select the rows with distinct unique_value.
For example, I want the unique_value column to look like this:
{customer_name} {address} {unique_value}
omar street1 111
ryan street2 222
omar street1 111
Any other approaches are also appreciated!
You can try two things. One option is a UUID, but note that it generates a different id for every row, so two rows with the same name and address will not share a value. Something like this would do:
select reflect("java.util.UUID", "randomUUID") as unique_value, customer_name, address from table_name
However, if you want a key based on the name and address, you can concat both fields and take a hash of the resulting string (see the details of the hash function here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF ). That ensures the same name and address always get the same key. This query should be sufficient:
select customer_name, address, hash(concat(customer_name, address)) as unique_value from table_name
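To go from the hash key to actually inserting only the unique rows, something along these lines may work. It is a minimal sketch, not a tested solution: target_table is a hypothetical destination table with columns (customer_name, address, unique_value), and the '|' delimiter is an assumption added so that pairs such as ('ab', 'c') and ('a', 'bc') do not concatenate to the same string. Since hash returns an int, two different name/address pairs could in principle collide, so the sketch deduplicates on the columns themselves rather than on the hash.

```sql
-- Hypothetical: target_table(customer_name STRING, address STRING, unique_value INT)
INSERT INTO TABLE target_table
SELECT
  customer_name,
  address,
  -- '|' is an assumed delimiter that should not occur in the data
  hash(concat_ws('|', customer_name, address)) AS unique_value
FROM table_name
-- one output row per distinct (customer_name, address) pair
GROUP BY customer_name, address;
```

With the sample data above, the two omar/street1 rows would collapse into a single row, and ryan would get a different key.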