I want to fill the NULL values of a column in a SQL DB table, which I read into Databricks using a cursor, with the values of the same column from a data lake table by joining the two.
I read a table from the SQL DB into Databricks using a cursor (I will call it table1):
cursor = access_token.cursor()
cursor.execute('SELECT label_name,unit FROM [dbo].[table1]')
label_name | unit |
---|---|
A | NULL |
B | NULL |
C | NULL |
D | NULL |
A | NULL |
D | NULL |
I also have another table, which comes from the data lake (I will call it table2):
df = spark.sql("select distinct label_name,unit from aod.table2 where unit is not null")
label_name | unit |
---|---|
A | a_1 |
B | b_1 |
C | c_1 |
D | d_1 |
The unit values in table1 are all NULL, and the required unit values are all available in table2.
I need to join table1 and table2 on label_name to fill the NULL values in the unit column of table1 with the unit values from table2.
The result I want:
label_name | unit |
---|---|
A | a_1 |
B | b_1 |
C | c_1 |
D | d_1 |
A | a_1 |
D | d_1 |
Could anyone suggest a way to accomplish this?
Any help would be much appreciated!
In my repro, I created two DataFrames, df1 for table1 and df2 for table2, as per your requirement.
Read both tables into DataFrames and apply an inner join as shown below:
df = df1.join(df2, df2.label_name == df1.label_name, "inner").select(df2.label_name,df2.unit)
display(df)
You can then write the DataFrame back to a table in the SQL DB and verify the result from the SQL endpoint.
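For the write-back step, another option is to update table1 in place through the same cursor instead of writing a new table. The sketch below is illustrative only: `sqlite3` stands in for the real SQL DB connection so that it runs anywhere, and the `pairs` list is hard-coded where the real code would collect it from the lookup DataFrame. It assumes the actual driver follows the Python DB-API and accepts `?` placeholders (pyodbc does, for example).

```python
import sqlite3

# Stand-in connection; in the question this would be the existing SQL DB
# connection object and the table would be [dbo].[table1].
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE table1 (label_name TEXT, unit TEXT)")
cursor.executemany(
    "INSERT INTO table1 (label_name, unit) VALUES (?, ?)",
    [("A", None), ("B", None), ("C", None), ("D", None), ("A", None), ("D", None)],
)

# (unit, label_name) pairs. In Databricks these could come from the small
# lookup DataFrame, e.g.:
#   pairs = [(r["unit"], r["label_name"]) for r in df.collect()]
pairs = [("a_1", "A"), ("b_1", "B"), ("c_1", "C"), ("d_1", "D")]

# Only rows whose unit is still NULL are touched.
cursor.executemany(
    "UPDATE table1 SET unit = ? WHERE label_name = ? AND unit IS NULL",
    pairs,
)
conn.commit()

# Verify the result.
rows = cursor.execute(
    "SELECT label_name, unit FROM table1 ORDER BY rowid"
).fetchall()
print(rows)
```

Collecting the lookup to the driver is reasonable here because table2 is reduced to one row per distinct label. For a large lookup, a bulk write to a staging table followed by a single SQL `UPDATE` with a join would likely scale better.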