
AWS Glue Job - Load parquet file from S3 to RDS jsonb column

I have a Parquet file in S3 that has several columns, one of which contains JSON. The target table in my RDS (PostgreSQL) database has the same layout, with that column typed as jsonb.

I would like to copy the Parquet file to RDS, but how do I cast that column to the jsonb data type, given that Glue doesn't support a json column type? When I try to insert the column as a string, I get the following error. Any ideas on how I can load a JSON column into an RDS jsonb column?

An error occurred while calling o145.pyWriteDynamicFrame. ERROR: column "json_column" is of type jsonb but expression is of type character varying

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# Read the Parquet data from S3 into a DynamicFrame
DataSource0 = glueContext.create_dynamic_frame.from_options(connection_type = "s3", format = "parquet", connection_options = {"paths": ["s3://folder"], "recurse":True}, transformation_ctx = "DataSource0")

# Map source fields to target fields; the JSON column can only be declared as a string here
Transform0 = ApplyMapping.apply(frame = DataSource0, mappings = [("id", "long", "id", "long"), ("name", "string", "name", "string"), ("json_column", "string", "json_column", "string")], transformation_ctx = "Transform0")

# Write to the PostgreSQL table registered in the Glue Data Catalog -- this is the call that fails
DataSink0 = glueContext.write_dynamic_frame.from_catalog(frame = Transform0, database = "postgres", table_name = "table", transformation_ctx = "DataSink0")
job.commit()

One path would be to connect to your RDS instance directly with psycopg2, iterate over your dataset, and load the rows yourself, as in the sketch below the linked question.

How to insert JSONB into Postgresql with Python?
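
A minimal sketch of that approach, assuming the job above has already produced Transform0; the table name my_table, the connection details, and the batching choice are placeholders/assumptions, not details from the question:

import json
import psycopg2
from psycopg2.extras import Json, execute_values

# Convert the DynamicFrame to a Spark DataFrame and pull the rows to the driver.
# (For a large dataset you would batch or stream instead of collect() -- this is only a sketch.)
rows = Transform0.toDF().collect()

# Connection details are placeholders -- in a real job read them from a Glue connection
# or AWS Secrets Manager rather than hard-coding them.
conn = psycopg2.connect(
    host="my-rds-endpoint.amazonaws.com",
    dbname="postgres",
    user="my_user",
    password="my_password",
)

with conn, conn.cursor() as cur:
    # Json() wraps the parsed value so it is rendered as a JSON literal that PostgreSQL
    # coerces to jsonb, rather than the character varying parameter that caused the error.
    values = [
        (row["id"], row["name"], Json(json.loads(row["json_column"])))
        for row in rows
    ]
    execute_values(
        cur,
        "INSERT INTO my_table (id, name, json_column) VALUES %s",
        values,
    )

conn.close()

If the column already contains valid JSON text, an explicit cast in the SQL (%s::jsonb) would also work instead of parsing it in Python. Note that psycopg2 is not available in the Glue environment out of the box, so it has to be supplied to the job (for example via the --additional-python-modules job parameter on Glue 2.0+).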
