
What changes are required when moving a simple synapsesql implementation from Spark 2.4.8 to Spark 3.1.2?

I have a simple implementation of the .write.synapsesql() method (code shown below) that works in Spark 2.4.8 but not in Spark 3.1.2 (documentation/example here). The data in use is a simple notebook-created foobar-type table. Searching online for key phrases from and about the error did not turn up any new information. What is the cause of the error in 3.1.2?

Spark 2.4.8 version (behaves as desired):

val df = spark.sql("SELECT * FROM TEST_TABLE")
df.write.synapsesql("my_local_db_name.schema_name.test_table", Constants.INTERNAL, None)

Spark 3.1.2 version (the extra callback argument is the same as in the documentation; it can also be left out with a similar result):

val df = spark.sql("SELECT * FROM TEST_TABLE")
df.write.synapsesql("my_local_db_name.schema_name.test_table", Constants.INTERNAL, None, 
                     Some(callBackFunctionToReceivePostWriteMetrics))
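
For reference, a minimal sketch of what that callback might look like, assuming the (Map[String, Any], Option[Throwable]) => Unit shape described in the connector documentation; the body here is illustrative only:

var errorDuringWrite: Option[Throwable] = None
val callBackFunctionToReceivePostWriteMetrics: (Map[String, Any], Option[Throwable]) => Unit =
  (feedback: Map[String, Any], errorState: Option[Throwable]) => {
    // Print the post-write metrics map and keep any error raised during the write for later inspection.
    feedback.foreach { case (key, value) => println(s"$key -> $value") }
    errorDuringWrite = errorState
  }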

The resulting error (only in 3.1.2) is:

WriteFailureCause -> java.lang.IllegalArgumentException: Failed to derive `https` scheme based staging location URL for SQL COPY-INTO}

As the documentation linked in the question states, the Spark 3 connector needs the dedicated SQL pool's server name and a staging folder in order to derive the COPY INTO staging location, so ensure that you are setting the options correctly with

val writeOptionsWithAADAuth:Map[String, String] = Map(Constants.SERVER -> "<dedicated-pool-sql-server-name>.sql.azuresynapse.net",
                                            Constants.TEMP_FOLDER -> "abfss://<storage_container_name>@<storage_account_name>.dfs.core.windows.net/<some_temp_folder>")

and including the options in your .write statement like so:

df.write.options(writeOptionsWithAADAuth).synapsesql(...)
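
Putting it together, a minimal end-to-end sketch using AAD-based authentication, assuming the Spark 3 connector's documented import paths (com.microsoft.spark.sqlanalytics.utils.Constants and org.apache.spark.sql.SqlAnalyticsConnector._) and placeholder server/storage names; adjust the imports to match the connector version in your Synapse workspace:

import com.microsoft.spark.sqlanalytics.utils.Constants
import org.apache.spark.sql.SqlAnalyticsConnector._

// Tell the connector which dedicated SQL pool to write to and where it may stage data for COPY INTO.
val writeOptionsWithAADAuth: Map[String, String] = Map(
  Constants.SERVER      -> "<dedicated-pool-sql-server-name>.sql.azuresynapse.net",
  Constants.TEMP_FOLDER -> "abfss://<storage_container_name>@<storage_account_name>.dfs.core.windows.net/<some_temp_folder>")

val df = spark.sql("SELECT * FROM TEST_TABLE")
df.write
  .options(writeOptionsWithAADAuth)
  .synapsesql("my_local_db_name.schema_name.test_table", Constants.INTERNAL, None,
              Some(callBackFunctionToReceivePostWriteMetrics)) // callback as sketched above; may also be left out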
