
How do I transform a dataset into a dictionary inside a repository? I am using PySpark within Foundry

I created a Fusion sheet whose data is synced to a dataset. Now I want to use that dataset to create a dictionary in the repository, where I am using PySpark. Later I want to pass that dictionary along so that it populates the column descriptions, as described in "Is there a tool available within Foundry that can automatically populate column descriptions? If so, what is it called?".

It would be great if anyone could help me create the dictionary from the dataset using PySpark in the repository.

The following code would convert your pyspark dataframe into a list of dictionaries:

# In Python 3, map() returns a lazy iterator, so build the list explicitly
fusion_rows = [row.asDict() for row in fusion_df.collect()]

However, in your particular case, you can use the following snippet:

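# fusion_df is the dataset synced from the Fusion sheet, loaded as a DataFrame.
# collect() pulls every row to the driver, which is fine for a small sheet.
col_descriptions = {row["column_name"]: row["description"] for row in fusion_df.collect()}
# Write the input data unchanged and attach the descriptions to the output columns
my_output.write_dataframe(
    my_input.dataframe(),
    column_descriptions=col_descriptions
)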

This assumes your Fusion sheet looks like this:

+------------+------------------+
| column_name|       description|
+------------+------------------+
|       col_A| description for A|
|       col_B| description for B|
+------------+------------------+
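If the sheet can contain blank rows, or names of columns that do not exist in the output dataset, it may be worth cleaning the dictionary before passing it on. The following is only a minimal sketch of that step. The helper name build_column_descriptions and its valid_columns parameter are hypothetical, and the filtering is an assumption about what you might want, not a documented Foundry requirement. Plain dicts stand in for the Row objects returned by fusion_df.collect(), since both support row["column_name"] lookups.

```python
def build_column_descriptions(rows, valid_columns=None):
    """Build a {column_name: description} dict from row-like mappings.

    Hypothetical helper: `rows` can be pyspark Row objects or plain dicts,
    as long as they support row["column_name"] and row["description"].
    """
    descriptions = {}
    for row in rows:
        name = row["column_name"]
        text = row["description"]
        # Skip rows where either cell is empty in the sheet
        if name is None or text is None:
            continue
        # Optionally keep only columns that exist in the output dataset
        if valid_columns is not None and name not in valid_columns:
            continue
        descriptions[name] = text
    return descriptions


# Plain dicts standing in for fusion_df.collect()
sample_rows = [
    {"column_name": "col_A", "description": "description for A"},
    {"column_name": "col_B", "description": "description for B"},
    {"column_name": None, "description": None},
    {"column_name": "col_Z", "description": "not in the output dataset"},
]

col_descriptions = build_column_descriptions(
    sample_rows, valid_columns=["col_A", "col_B"]
)
print(col_descriptions)
```

In the transform itself you could call it as build_column_descriptions(fusion_df.collect(), my_input.dataframe().columns) and pass the result as column_descriptions.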
