How can I show a DataFrame in an AWS Glue ETL job?
I tried the code below, but it doesn't display anything.
df.show()
Code:
datasource0 = glueContext.create_dynamic_frame.from_catalog(database = "flux-test", table_name = "tab1", transformation_ctx = "datasource0")
sourcedf = ApplyMapping.apply(frame = datasource0, mappings = [("id", "long", "id", "long"),("Rd.Id_Releve", "string", "Rd.Id_R", "string")])
sourcedf = sourcedf.toDF()
data = []
schema = StructType(
    [
        StructField('PM',
            StructType([
                StructField('Pf', StringType(), True),
                StructField('Rd', StringType(), True)
            ])
        ),
    ])
cibledf = sqlCtx.createDataFrame(data, schema)
cibledf = sqlCtx.createDataFrame(sourcedf.rdd.map(lambda x: Row(PM=Row(Pf=str(x.id_prm), Rd=None ))), schema)
print(cibledf.show())
job.commit()
In the AWS Glue console, after you run your Glue job, the job listing has a column for Logs / Error logs.
Click Logs to open the CloudWatch logs associated with your job, then browse through them for the output of the print statement.
Also, please check here: Convert dynamic frame to a dataframe and do show()
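A note on print(cibledf.show()) from the question: show() writes the table to standard output itself and returns None, so wrapping it in print() only adds a line reading "None" after the table. If you want the table as a string (for example, to pass to a logger so it is easier to find in CloudWatch), a minimal sketch is below. It assumes show() prints through Python's sys.stdout, as PySpark DataFrames do; show_to_string and FakeFrame are hypothetical names, and FakeFrame is only a stand-in so the sketch runs without Spark.

```python
import contextlib
import io


def show_to_string(df, n=20):
    """Return the text that df.show(n) would print.

    Assumes df.show() writes to Python's sys.stdout and returns None.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        df.show(n)  # output is captured in buffer instead of the job log
    return buffer.getvalue()


class FakeFrame:
    """Hypothetical stand-in for a Spark DataFrame, used only for this demo."""

    def __init__(self, rows):
        self.rows = rows

    def show(self, n=20):
        # Mimics show(): prints up to n rows and returns None.
        for row in self.rows[:n]:
            print(row)


if __name__ == "__main__":
    text = show_to_string(FakeFrame(["row1", "row2", "row3"]), 2)
    print(text, end="")
```

In a Glue job you would pass the real DataFrame instead, for example print(show_to_string(cibledf, 10)), and then look for that output in the job's CloudWatch log stream.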
Added a working, tested code sample.
Code sample:
zipcode_dynamicframe = glueContext.create_dynamic_frame.from_catalog(
    database = "customer_db",
    table_name = "zipcode_master")
zipcode_dynamicframe.printSchema()
zipcode_dynamicframe.toDF().show(10)
Screenshot of the zipcode_dynamicframe.show() output in the CloudWatch log: