
Azure Data Pipeline Copy Activity loses column names when copying from SAP Hana to Data Lake Store

2017-06-29 13:53:18 | Tags: azure, hana, azure-data-factory, azure-data-lake

I am trying to copy data from SAP Hana to Azure Data Lake Store (DLS) using a Copy Activity in a Data Pipeline via Azure Data Factory.

Our copy activity runs fine and we can see that rows made it from Hana to the DLS, but they don't appear to have column names (instead they are just given 0-indexed numbers).

This link says “For structured data sources, specify the structure section only if you want map source columns to sink columns, and their names are not the same.”

We are fine using the original column names from the SAP Hana table, so it seems like we shouldn't need to specify the structure section in our dataset. However, even when we do specify it, we still just see numbers for column names.

We have also seen the translator property at this link, but are not sure if that is the route we need to go.

Can anyone tell me why we aren't seeing the original column names copied into DLS and how we can change that? Thank you!

UPDATE

Setting the firstRowAsHeader property of the format section on our dataset to true basically solved the problem. The console still shows the numerical indices, but now includes the headers we are after as the first row. Upon downloading and opening the file, we can see the numbers are not there (the console just shows them for whatever reason), and it is a standard comma-delimited file with a header row and one row entry per line.

Example:

    COLUMNA,COLUMNB
    aVal1,bVal1
    aVal2,bVal2

We can now tell our sinks to write this format and our sources to expect it when reading.
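As a rough sketch of the setting described in the update, an output dataset with the header option might look like the JSON below. This assumes the version 1 Data Factory dataset schema with a `TextFormat` format section. Only `firstRowAsHeader` comes from this post; the dataset name, linked service name, folder path, file name, and availability schedule are hypothetical placeholders to be replaced with your own values.

```json
{
    "name": "<your-dls-output-dataset>",
    "properties": {
        "type": "AzureDataLakeStore",
        "linkedServiceName": "<your-dls-linked-service>",
        "typeProperties": {
            "folderPath": "<your/output/folder>",
            "fileName": "<your-file>.csv",
            "format": {
                "type": "TextFormat",
                "columnDelimiter": ",",
                "firstRowAsHeader": true
            }
        },
        "availability": {
            "frequency": "Day",
            "interval": 1
        }
    }
}
```

On an output dataset this setting should cause the column names to be written as the first line of the file. The same `format` section on an input dataset should tell Data Factory to treat the first line as a header rather than as data.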

BONUS UPDATE:

To get rid of the numerical indices and see the proper column headers in the console, click Format in the top-left corner, and then check the "First row is a header" box toward the bottom of the resulting blade.