
Problem loading ISO-8859-1 into BigQuery using DataFlow (Apache Beam)

Original 2019-07-23 06:46:33 6 1 java/ google-cloud-dataflow/ apache-beam

I'm trying to load an ISO-8859-1 file into BigQuery using Dataflow. I've built a template with the Apache Beam Java SDK. Everything works well, but when I check the content of the BigQuery table I see that some characters, such as 'ñ' or accented letters like 'á' and 'é', haven't been stored properly: they have been stored as the replacement character (�).

I've tried several charset conversions before writing to BigQuery. I've also created a custom ISOCoder and passed it to the pipeline using the setCoder() method, but nothing works.

Does anyone know whether it is possible to load this kind of file into BigQuery using Apache Beam, or is only UTF-8 supported?

Thanks in advance for your help.

1 answer

This feature is currently not available in the Java SDK of Beam. In Python this seems to be possible by passing additional_bq_parameters to WriteToBigQuery, see: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py#L177
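The replacement characters described in the question are typically what appears when ISO-8859-1 bytes are decoded as if they were UTF-8. The following is a minimal plain-Java sketch of that effect and of decoding the bytes with the correct charset instead. The class name `Latin1Decoding` and its helper methods are hypothetical, chosen only for illustration. How the raw bytes would reach such a helper inside a Beam pipeline is not shown and would depend on how the file is read.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not part of Beam).
public class Latin1Decoding {

    // Decodes bytes that are known to be ISO-8859-1 into a Java String.
    public static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Decodes the same bytes as if they were UTF-8 (the mistaken interpretation).
    public static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "año" encoded in ISO-8859-1: 'a' = 0x61, 'ñ' = 0xF1, 'o' = 0x6F.
        byte[] raw = {0x61, (byte) 0xF1, 0x6F};

        // 0xF1 followed by 0x6F is not valid UTF-8, so the decoder
        // substitutes U+FFFD (the replacement character).
        String wrong = decodeAsUtf8(raw);
        System.out.println("Decoded as UTF-8 contains U+FFFD: " + wrong.contains("\uFFFD"));

        // Decoding with the right charset recovers the original text.
        String right = decodeLatin1(raw);
        System.out.println("Decoded as ISO-8859-1 equals \"a\\u00F1o\": " + right.equals("a\u00F1o"));

        // Once the text is a correct Java String, it can be re-encoded as UTF-8.
        byte[] utf8 = right.getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 byte length: " + utf8.length);
    }
}
```

The sketch uses `\u00F1` escapes rather than a literal 'ñ' so that its behavior does not depend on the encoding of the source file itself.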

Apache Beam Dataflow BigQuery

Apache Beam on Dataflow Not Accepting ValueProvider for BigQuery Query

JDBC iso-8859-1 encoding

Javamail ISO-8859-1 formatting

Regex and ISO-8859-1 charset in java

Jsoup parse iso-8859-1 file

ISO-8859-1 character encoding not working in Linux

ISO-8859-1 to UTF-8 in Java

Java fast stream copy with ISO-8859-1

Converting UTF-8 to ISO-8859-1 in Java






solution1 1 2019-07-23 13:10:04
