AWS Glue: job runs with no errors, but no output is displayed

I have data in an AWS MySQL RDS instance, and the requirement is to export data from a table to a CSV file and place it in S3. To achieve that I am using AWS Glue with the code shown below. The job runs with no errors, but no output appears in the S3 bucket. Please help.

    import sys
    from awsglue.transforms import *
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
    import boto3

    ## @params: [JOB_NAME]
    args = getResolvedOptions(sys.argv, ['JOB_NAME'])

    aws_region = "your-aws-region-code"
    s3_path = "s3-prefix"
    glue_database = "glue-database-name"
    table="glue-table name"
    target_format = "csv"

    sc = SparkContext()
    glueContext = GlueContext(sc)
    spark = glueContext.spark_session
    job = Job(glueContext)
    job.init(args['JOB_NAME'], args)

    client = boto3.client(service_name='glue', region_name=aws_region)
    responseGetTables = client.get_tables(DatabaseName=glue_database)

    tableList = responseGetTables['TableList']
    tables = []
    for tableDict in tableList:
      tables.append(tableDict['Name'])

    for table in tables:
      datasource = glueContext.create_dynamic_frame.from_catalog(database = glue_database, table_name = table)
      datasink = glueContext.write_dynamic_frame.from_options(frame = datasource, connection_type = "s3", connection_options = {"path": s3Path + table}, format = target_format)

    job.commit()

Replace the second-to-last line with the following. The variable should be s3_path, not s3Path:

    datasink = glueContext.write_dynamic_frame.from_options(frame = datasource, connection_type = "s3", connection_options = {"path": s3_path + table}, format = target_format)
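
One more thing that may be worth checking: s3_path + table is plain string concatenation, so a prefix without a trailing slash runs straight into the table name. Below is a minimal sketch of a helper that joins the two with exactly one slash. The name build_output_path, the bucket, and the table names are hypothetical and used only for illustration; the sketch assumes the prefix is a full s3:// URI.

```python
def build_output_path(prefix: str, table: str) -> str:
    """Join an S3 prefix and a table name with exactly one slash between them."""
    # Drop any trailing slash on the prefix and stray slashes on the table
    # name, then add the separators back explicitly.
    return prefix.rstrip("/") + "/" + table.strip("/") + "/"


# Hypothetical values, not taken from the original job.
s3_path = "s3://example-bucket/exports"
tables = ["customers", "orders"]

for table in tables:
    # Each table gets its own folder-like prefix under s3_path.
    print(build_output_path(s3_path, table))
```

In the Glue job, the result could be passed as the "path" value in connection_options in place of s3_path + table.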

