Writing data to timestreamDb from AWS Glue
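I am trying to use Glue streaming to write data to AWS TimestreamDB, but I am having a hard time configuring the JDBC connection.
I am following the steps in this documentation: https://docs.aws.amazon.com/timestream/latest/developerguide/JDBC.configuring.html
Here is my code: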
url = "jdbc:timestream://AccessKeyId=<myAccessKeyId>;SecretAccessKey=<mySecretAccessKey>;SessionToken=<mySessionToken>;Region=us-east-1"
source_df = sparkSession.read.format("jdbc").option("url", url).option("dbtable", "IoT").option("driver", "software.amazon.timestream.jdbc.TimestreamDriver").load()
datasink1 = glueContext.write_dynamic_frame.from_options(frame = applymapping0, connection_type = "jdbc", connection_options = {"url": url, "driver": "software.amazon.timestream.jdbc.TimestreamDriver", "database": "CovidTestDb", "dbtable": "CovidTestTable"}, transformation_ctx = "datasink1")
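As of now (April 2022), the Timestream JDBC driver does not support write operations (I looked through the code and saw a bunch of exceptions for unsupported writes). It is possible to read data from Timestream using Glue, though. The following steps worked for me: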
A URL of the form
jdbc:timestream://Region=<timestream-db-region>
should be enough.
Specify the driver and fetchsize options:
option("driver","software.amazon.timestream.jdbc.TimestreamDriver")
option("fetchsize", "100")
(adjust fetchsize to your needs)
Here is a complete example of reading a DataFrame from Timestream:
val df = sparkSession.read.format("jdbc")
  .option("url", "jdbc:timestream://Region=us-east-1")
  .option("driver", "software.amazon.timestream.jdbc.TimestreamDriver")
  // optionally add a query to narrow the data to fetch
  .option("query", "select * from db.tbl where time between ago(15m) and now()")
  .option("fetchsize", "100")
  .load()
df.write.format("console").save()
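Since the question uses Python, here is a rough PySpark-flavoured sketch of the same read setup. The helper name timestream_read_options and its parameters are my own, not part of Glue or the Timestream driver; it only collects the options shown above into a dict. It assumes the Timestream JDBC driver jar is already available to the job.

```python
def timestream_read_options(region, query=None, fetchsize=100):
    """Build the option dict for a Spark JDBC read from Timestream.

    Hypothetical helper: it only assembles the url, driver, fetchsize and
    (optional) query options used in the Scala example above.
    """
    if not region:
        raise ValueError("region is required")
    options = {
        "url": f"jdbc:timestream://Region={region}",
        "driver": "software.amazon.timestream.jdbc.TimestreamDriver",
        # Option values are passed as strings, as in the Scala example.
        "fetchsize": str(fetchsize),
    }
    if query is not None:
        # Optional: narrow the data to fetch instead of reading everything.
        options["query"] = query
    return options


if __name__ == "__main__":
    opts = timestream_read_options(
        "us-east-1",
        query="select * from db.tbl where time between ago(15m) and now()",
    )
    print(opts["url"])
    # In a Glue job, assuming `spark` is the job's SparkSession:
    # df = spark.read.format("jdbc").options(**opts).load()
    # df.show()
```

The Spark calls are left as comments so the helper can be tried without a cluster; in a real job you would uncomment them and tune fetchsize as described above.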
Hope this helps.