
Problem with creating a data pipeline from SQL Server to BigQuery using Cloud Data Fusion

I am trying to create a data pipeline from SQL Server (running on a GCP VM) to BigQuery using Cloud Data Fusion. I have completed all of the setup steps below:

  1. Created a new instance in Cloud Data Fusion.
  2. Added it as a service account in IAM & Admin.
  3. Installed the JDBC driver for the SQL Server plugin.
  4. Created a Wrangler and read the data from SQL Server using this SQL Server plugin (at this step I can authenticate to SQL Server successfully and see my SQL table data).
  5. Completed the pipeline configuration by adding BigQuery as a sink.

When I run the pipeline, it ends with a few errors. I have tried a few Google searches but did not find an answer.

I was able to create a Data Fusion pipeline from GCS to BigQuery, and it worked fine, but this SQL Server to BigQuery pipeline shows errors.

Could anyone please help me with this?

Here are the error details:

2020-01-10 13:00:47,528 - WARN [Thread-95:oahmLocalJobRunner@589] - job_local976595976_0001
java.lang.Exception: java.lang.NullPointerException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:491) ~[hadoop-mapreduce-client-common-2.9.2.jar:na]
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:551) ~[hadoop-mapreduce-client-common-2.9.2.jar:na]
java.lang.NullPointerException: null
    at org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat.createDBRecordReader(DataDrivenDBInputFormat.java:281) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]
    at io.cdap.plugin.db.batch.source.DataDrivenETLDBInputFormat.createDBRecordReader(DataDrivenETLDBInputFormat.java:124) ~[1578661227434-0/:na]
    at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.createRecordReader(DBInputFormat.java:245) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]
    at io.cdap.cdap.etl.batch.preview.LimitingInputFormat.createRecordReader(LimitingInputFormat.java:51) ~[cdap-etl-core-6.1.0.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.dataset.input.MultiInputFormat.createRecordReader(MultiInputFormat.java:92) ~[na:na]
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.(MapTask.java:521) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:270) ~[hadoop-mapreduce-client-common-2.9.2.jar:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_232]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_232]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_232]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_232]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_232]
2020-01-10 13:00:50,841 - ERROR [MapReduceRunner-phase-1:icciarProgramControllerServiceAdapter@97] - MapReduce Program 'phase-1' failed.
java.lang.IllegalStateException: MapReduce JobId job_local976595976_0001 failed
    at com.google.common.base.Preconditions.checkState(Preconditions.java:176) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.run(MapReduceRuntimeService.java:416) ~[na:na]
    at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2$1.run(MapReduceRuntimeService.java:450) [na:na]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_232]
2020-01-10 13:00:50,842 - ERROR [MapReduceRunner-phase-1:icciarProgramControllerServiceAdapter@98] - MapReduce program 'phase-1' failed with error: MapReduce JobId job_local976595976_0001 failed. Please check the system logs for more details.
java.lang.IllegalStateException: MapReduce JobId job_local976595976_0001 failed
    at com.google.common.base.Preconditions.checkState(Preconditions.java:176) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.run(MapReduceRuntimeService.java:416) ~[na:na]
    at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2$1.run(MapReduceRuntimeService.java:450) [na:na]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_232]
2020-01-10 13:00:50,916 - ERROR [WorkflowDriver:iccdSmartWorkflow@552] - Pipeline '0f084034-33a9-11ea-95f6-8e2648ebe039' failed.
2020-01-10 13:00:51,225 - ERROR [WorkflowDriver:icciarwWorkflowProgramController@89] - Workflow service 'workflow.default.0f084034-33a9-11ea-95f6-8e2648ebe039.DataPipelineWorkflow.20288f05-33a9-11ea-a505-8e2648ebe039' failed.
java.lang.IllegalStateException: MapReduce JobId job_local976595976_0001 failed
    at com.google.common.base.Preconditions.checkState(Preconditions.java:176) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.run(MapReduceRuntimeService.java:416) ~[na:na]
    at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2$1.run(MapReduceRuntimeService.java:450) ~[na:na]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_232]

According to the logs you posted, the pipeline fails with a java.lang.NullPointerException, which indicates that a null reference was used where an object was required somewhere in the application's run path.

Assuming you have configured the JDBC driver successfully, I would recommend checking the source Database properties in your pipeline to find the undefined field. A likely candidate is the Import Query property, which imports data from the specified table through a SELECT query. The query must include $CONDITIONS if the number of splits to generate is more than 1:

SELECT * FROM <table> WHERE $CONDITIONS
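
To make the role of $CONDITIONS more concrete, here is a minimal Python sketch of the general idea: a data-driven reader replaces the token with a different range predicate for each split, so a multi-split query without the token cannot be partitioned. This is only an illustration under assumptions, not the Hadoop or CDAP implementation. The function names, the `orders` table, and the `id` split column are all hypothetical, and the predicates the real plugin generates may look different.

```python
CONDITIONS_TOKEN = "$CONDITIONS"


def validate_import_query(import_query: str, num_splits: int) -> None:
    """Raise ValueError when a multi-split query lacks the $CONDITIONS token."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    if num_splits > 1 and CONDITIONS_TOKEN not in import_query:
        raise ValueError(
            "import query must contain $CONDITIONS when num_splits > 1"
        )


def expand_splits(import_query, split_by, lower, upper, num_splits):
    """Return one query per split, each covering a slice of [lower, upper].

    Hypothetical helper: lower and upper are integer bounds of the
    split-by column, which a real reader would look up in the database.
    """
    validate_import_query(import_query, num_splits)
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if CONDITIONS_TOKEN not in import_query:
        # Single split and no token: run the query unchanged.
        return [import_query]

    span = upper - lower + 1  # number of integer key values to cover
    queries = []
    for i in range(num_splits):
        lo = lower + span * i // num_splits
        hi = lower + span * (i + 1) // num_splits
        # Half-open range [lo, hi) so adjacent splits never overlap.
        condition = f"{split_by} >= {lo} AND {split_by} < {hi}"
        queries.append(import_query.replace(CONDITIONS_TOKEN, condition))
    return queries


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    for q in expand_splits(
        "SELECT * FROM orders WHERE $CONDITIONS", "id", 1, 100, 4
    ):
        print(q)
```

A check like `validate_import_query` could be run against the Import Query value before starting the pipeline, to rule out a missing token as the cause.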

UPDATE: https://issues.cask.co/browse/CDAP-16453 This is a known issue, fixed in 6.1.2.

"Same error on MySQL 5.x. Strangely enough, if you deploy the pipeline and run it, it works... I'm thinking about decoupling the pipelines to have a small SQL-to-storage pipeline and the big pipeline in the outgoing flow."

Regards, Virgilio


 