ADF Copy Activity Fails CSV to Parquet when CSV has space in header column

When using a copy activity in Azure Data Factory to copy a typical CSV file with a header row into a Parquet sink, the sink fails with the following error because the column names in the CSV header contain spaces.

The column name is invalid. Column name cannot contain these character:[,;{}()\n\t=]

The CSV is pipe-delimited and displays just fine in the dataset preview with the first row marked as the header. I see no option to handle this use case on the Parquet (sink) side of the copy activity. I realize this can probably be addressed by using a data flow to transform the column names and remove the spaces, but does that mean the native copy activity is incapable of handling a header row that contains a space?

EDIT: I should have added that the dataset uses default mappings so that we can use the same dataset for any CSV-to-Parquet copy. The answer provided will work for explicit mappings, but we don't see any resolution for those who use default/dynamic mappings, since we do not have access to the column names to remove the spaces.

As noted in the official documentation:

Error code: ParquetInvalidColumnName

  • Message : The column name is invalid. Column name cannot contain these character:[,;{}()\n\t=]

  • Cause : The column name contains invalid characters.

  • Resolution : Add or modify the column mapping to make the sink column name valid.
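To see which headers would trip this error before running the copy, the check can be sketched in a few lines of Python. This is only an illustration, not how Data Factory validates names internally: the character set is taken from the error message, with the space added because the question reports that spaces also trigger the failure, and `invalid_columns` is a hypothetical helper name. It splits the header naively, so quoted fields that contain the delimiter are not handled.

```python
# Characters listed in the error message, plus the space reported in the question
# (assumption: the sink rejects a space the same way it rejects the listed characters).
INVALID_CHARS = set(" ,;{}()\n\t=")


def invalid_columns(header_line, delimiter="|"):
    """Return the header names that contain at least one rejected character."""
    # Drop the line terminator so it is not mistaken for part of the last name.
    names = header_line.rstrip("\r\n").split(delimiter)
    return [name for name in names if any(ch in INVALID_CHARS for ch in name)]


if __name__ == "__main__":
    print(invalid_columns("Order Id|Amount|Ship Date\n"))
```

Any name this returns would need a valid sink name in the column mapping.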

If you would like to continue using the copy activity, there are a few workarounds:

1. Make sure you have selected Pipe (|) as the Column delimiter.

2. If feasible, go to the mapping settings, import the schema, and rename each destination column so that its name contains no spaces.
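For anyone who cannot maintain the mapping by hand, the rename in step 2 could in principle be generated from the header row. The sketch below is a rough illustration under several assumptions: the header line is read outside the copy activity (for example by a prior step), `build_translator` is a hypothetical helper, and replacing each rejected character with an underscore is just one possible convention. The output follows the `TabularTranslator` shape that the copy activity uses for explicit mappings.

```python
import csv
import io
import json
import re

# Characters from the error message, plus the space reported in the question.
_INVALID = re.compile(r"[ ,;{}()\n\t=]")


def build_translator(csv_text, delimiter="|"):
    """Build an explicit column mapping whose sink names contain no rejected characters."""
    # Only the first row (the header) is read.
    header = next(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    mappings = [
        {"source": {"name": name}, "sink": {"name": _INVALID.sub("_", name)}}
        for name in header
    ]
    return {"type": "TabularTranslator", "mappings": mappings}


if __name__ == "__main__":
    sample = "Order Id|Customer Name|Amount\n1|Ada|9.50\n"
    print(json.dumps(build_translator(sample), indent=2))
```

The resulting object would then be supplied as the copy activity's mapping, for instance through a pipeline parameter. Note that two different source names can collapse to the same sink name (such as "a b" and "a_b"), which this sketch does not detect.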

This is still an open issue or feature request; follow here for more.
