There are two tables, Customer1 and Customer2:
Customer1: lists the details of each customer
Customer2: lists the updated details of each customer
https://docs.google.com/spreadsheets/d/1GuQaHhZ70D0NHGXuW51B5nNZXrSkthmEduHOhwoZmRg/edit#gid=0
CustomerName has to be fetched from both tables. If the customer name has been updated, it has to be fetched from the Customer2 table; otherwise it has to be fetched from the Customer1 table. So all customer names should be listed.
Expected result set:
How can this be achieved in Spark Scala?
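You can perform a left join from the customer1 table to the customer2 table, then use coalesce with the customer2 column first to get the first non-null value for the customername column. Customers with no match in customer2 get a null from that side, so coalesce falls back to the customer1 name.
Example: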
scala> val customer1=Seq((1,"shiva","9994323565"),(2,"Mani","9994323567"),(3,"Sneha","9994323568")).toDF("customerid","customername","contact")
scala> val customer2=Seq((1,"shivamoorthy","9994323565"),(2,"Manikandan","9994323567")).toDF("customerid","customername","contact")
scala> customer1.as("c1")
  .join(customer2.as("c2"), $"c1.customerid" === $"c2.customerid", "left")
  .selectExpr("c1.customerid",
    "coalesce(c2.customername, c1.customername) as customername")
  .show()
Result:
+----------+------------+
|customerid|customername|
+----------+------------+
|         1|shivamoorthy|
|         2|  Manikandan|
|         3|       Sneha|
+----------+------------+
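To make the left join plus coalesce logic easier to follow, here is a minimal sketch of the same idea using plain Scala collections, with no Spark involved. The Customer case class, the CoalesceNames object and the mergeNames helper are hypothetical names made up for this illustration, and the sketch assumes customerid is unique in the updated table (a real Spark left join would instead produce one row per match).

```scala
// Plain-Scala sketch of "left join + coalesce"; not Spark code.
// Customer, CoalesceNames and mergeNames are illustrative names only.
final case class Customer(customerid: Int, customername: String, contact: String)

object CoalesceNames {
  def mergeNames(original: Seq[Customer], updated: Seq[Customer]): Seq[(Int, String)] = {
    // Index the updated rows by key; this plays the role of the right side of the join.
    // Assumption: customerid is unique in `updated`.
    val updatedById: Map[Int, Customer] = updated.map(c => c.customerid -> c).toMap

    // Walk every original row, so customers without an update are kept (left join).
    original.map { c1 =>
      val name = updatedById.get(c1.customerid)
        .flatMap(c2 => Option(c2.customername)) // Option(null) is None, like a null column
        .getOrElse(c1.customername)             // fall back to the original name (coalesce)
      (c1.customerid, name)
    }
  }

  def main(args: Array[String]): Unit = {
    val customer1 = Seq(
      Customer(1, "shiva", "9994323565"),
      Customer(2, "Mani", "9994323567"),
      Customer(3, "Sneha", "9994323568"))
    val customer2 = Seq(
      Customer(1, "shivamoorthy", "9994323565"),
      Customer(2, "Manikandan", "9994323567"))

    mergeNames(customer1, customer2).foreach(println)
  }
}
```

The order of the fallback matters: the updated name is tried first and the original name is used only when no updated row exists or its name is null, which mirrors coalesce(c2.customername, c1.customername) in the Spark example.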