Azure Data Factory DataFlow Error: Key partitioning does not allow computed columns

[Screenshots: Parameters, Source settings, Optimize]

We have a generic data flow that works for many tables; the schema is detected at runtime. We are trying to add a partition column for the Ingestion (source) or Sink portion of the delta.

We are getting this error: "Azure Data Factory DataFlow Error: Key partitioning does not allow computed columns. Job failed due to reason: at Source 'Ingestion'(Line 7/Col 0): Key partitioning does not allow computed columns"

Can we pass the partition column as a parameter to a generic data flow?

I tried your scenario and got a similar error.

A limitation of the key partitioning method is that we cannot apply any calculation to the partition column while declaring it. Instead, the column must be created in advance, either with a derived column or by reading it in from the source.
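To make the "create it in advance" idea concrete, here is a small plain-Python analogy. It is not ADF or Spark code, and the helper names (`add_derived_column`, `key_partition`) and the sample columns are made up for illustration. The point is only the order of operations: the computed value is materialized as a real column first, and the partitioning step then refers to that column by name.

```python
import zlib
from collections import defaultdict


def add_derived_column(rows, name, compute):
    """Materialize a computed value as a real column (analogous to a derived column step)."""
    return [{**row, name: compute(row)} for row in rows]


def key_partition(rows, column, num_partitions):
    """Group rows by a hash of an existing column; `column` is a plain name, not an expression."""
    if any(column not in row for row in rows):
        raise KeyError(f"partition column {column!r} must already exist in every row")
    partitions = defaultdict(list)
    for row in rows:
        # crc32 gives a hash that is stable across runs, unlike Python's built-in hash() for str.
        bucket = zlib.crc32(str(row[column]).encode("utf-8")) % num_partitions
        partitions[bucket].append(row)
    return dict(partitions)


# Hypothetical sample data.
rows = [
    {"id": 1, "order_date": "2023-01-15"},
    {"id": 2, "order_date": "2023-02-03"},
    {"id": 3, "order_date": "2023-01-28"},
]

# Step 1: compute the value up front so it exists as a column.
rows_with_month = add_derived_column(rows, "order_month", lambda r: r["order_date"][:7])

# Step 2: partition by naming that column; no calculation happens at this point.
partitions = key_partition(rows_with_month, "order_month", num_partitions=4)
```

Calling `key_partition(rows, "order_month", 4)` on the original rows raises a `KeyError`, which loosely mirrors the data flow refusing a partition key that is not already a column.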

To resolve this, you can try the following steps:

  • First, I created a pipeline parameter of data type string and gave it the column name as its value.

  • Click on the Data flow activity, go to Parameters, select Pipeline expression for the parameter's value, and pass the pipeline parameter created above.
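The two steps above could look roughly like this in the pipeline definition. This is a sketch that builds the JSON in Python so the shape is easy to see; the field layout is my understanding of what the pipeline's Code view shows for an Execute Data Flow activity, so compare it with your own pipeline before relying on it. All names here (`GenericDeltaLoad`, `GenericDataFlow`, `partitionColumn`, `CustomerId`) are hypothetical.

```python
import json


def build_dataflow_activity(activity_name, dataflow_name, dataflow_param, pipeline_param):
    """Return an Execute Data Flow activity whose data flow parameter is fed by a pipeline parameter."""
    return {
        "name": activity_name,
        "type": "ExecuteDataFlow",
        "typeProperties": {
            "dataflow": {
                "referenceName": dataflow_name,
                "type": "DataFlowReference",
                "parameters": {
                    # "Pipeline expression" in the UI: the value is resolved when the pipeline runs.
                    dataflow_param: {
                        "value": f"@pipeline().parameters.{pipeline_param}",
                        "type": "Expression",
                    }
                },
            }
        },
    }


# Step 1: a string pipeline parameter holding the column name (hypothetical default value).
pipeline_parameters = {
    "partitionColumn": {"type": "string", "defaultValue": "CustomerId"}
}

# Step 2: pass it to the data flow parameter as a pipeline expression.
activity = build_dataflow_activity(
    activity_name="GenericDeltaLoad",
    dataflow_name="GenericDataFlow",
    dataflow_param="partitionColumn",
    pipeline_param="partitionColumn",
)

print(json.dumps({"parameters": pipeline_parameters, "activities": [activity]}, indent=2))
```

Because the column name arrives as a parameter value, the same generic data flow can be reused for different tables by changing only the pipeline parameter.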

OUTPUT:

The data flow takes it as the partition key column and partitions the data accordingly.

Reference: How To Use Data Flow Partitions To Optimize Spark Performance In Data Factory
