
Delta table merge operation log shows an incorrect number of updated records?

I am performing a merge operation on my Delta table in Spark. The existing Delta table already has some records. I then created another DataFrame from a CSV file, in which I added one new record and updated one existing record. Please check the snippets below.

(df_source) is the temp view of the updated table.

After performing the merge operation, the generated log does not look correct: it reports 3 records updated, although I changed only one record. The inserted count is correct. My issue is with the update count: why is it updating all the records?

Can you please help me understand what is happening behind the scenes?

[Screenshots: Delta table, updated source file, merge statement]

As per your merge statement, you update a record whenever the IDs in both tables are the same. The output you are getting is correct: every time the merge statement finds an ID in the target table that also exists in the source table, it updates that record, whether or not any of its values actually changed. Because of this, you get 3 records updated.
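To make the counting concrete, here is a minimal pure-Python sketch of the matched/not-matched logic. It is only a model of the behaviour described above, not Delta Lake's implementation; the `merge` helper, the `only_if_changed` flag, and the sample rows are all hypothetical, since the original data is only visible in the screenshots.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def merge(target: Dict[int, Row], source: List[Row],
          only_if_changed: bool = False) -> Tuple[int, int]:
    """Apply source rows to target keyed by "id"; return (updated, inserted)."""
    updated = inserted = 0
    for row in source:
        key = row["id"]
        if key in target:
            # Matched: without an extra condition, every matched row is
            # rewritten, even if its values equal what is already stored.
            if only_if_changed and target[key] == row:
                continue
            target[key] = dict(row)
            updated += 1
        else:
            # Not matched: insert the new row.
            target[key] = dict(row)
            inserted += 1
    return updated, inserted


def sample_target() -> Dict[int, Row]:
    # Hypothetical data: three existing rows in the target table.
    return {
        1: {"id": 1, "name": "a"},
        2: {"id": 2, "name": "b"},
        3: {"id": 3, "name": "c"},
    }


# Hypothetical source: rows 1 and 3 unchanged, row 2 edited, row 4 new.
SOURCE: List[Row] = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b-edited"},
    {"id": 3, "name": "c"},
    {"id": 4, "name": "d"},
]

print(merge(sample_target(), SOURCE))                        # (3, 1)
print(merge(sample_target(), SOURCE, only_if_changed=True))  # (1, 1)
```

The first call mirrors the reported log: three updates and one insert. In Delta itself, the second behaviour would correspond to adding a condition to the matched clause (`WHEN MATCHED AND <condition> THEN UPDATE ...`), where the condition compares the columns you care about; which columns to compare depends on your table.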

Note also that, as per the official documentation, when multiple source rows match the same target row, the update is considered ambiguous under the SQL semantics of merge, since it is not apparent which source row should be used to update the matched target row.

For your reference, see the documentation link below:
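The ambiguity the documentation describes arises when more than one source row matches the same target row. A quick way to check the source for that situation before merging is to look for repeated key values. The sketch below is plain Python with hypothetical rows and a hypothetical `ambiguous_keys` helper; it assumes the merge key is a single column named `id`.

```python
from collections import Counter
from typing import Any, Dict, List


def ambiguous_keys(source: List[Dict[str, Any]], key: str = "id") -> List[Any]:
    """Return key values that occur more than once in the source rows."""
    counts = Counter(row[key] for row in source)
    return sorted(k for k, n in counts.items() if n > 1)


# Hypothetical source rows: id 2 appears twice, so it is unclear
# which of the two rows should update the matching target row.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 2, "name": "b2"},
]

print(ambiguous_keys(rows))  # [2]
```

If the list comes back non-empty, the source would need to be deduplicated on the merge key before running the merge.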

https://docs.microsoft.com/en-us/azure/databricks/spark/latest/spark-sql/language-manual/delta-merge-into
