How to transfer data from Oracle 11g Release 2 to BigQuery using Dataflow's JDBC to BigQuery template?

I have been looking for a data transfer service to move data out of an Oracle 11g Release 2 database. I came across a third-party solution that worked perfectly when I tested it.

But I wanted a purely first-party solution, so I looked into it and came across Dataflow.

Dataflow already has some templates that I can use to transfer the data.

I chose the 'JDBC to BigQuery' template.

It has some required parameters and some optional parameters:

Required: JDBC connection URL, driver class name, source SQL query, BigQuery output table, GCS path for JDBC drivers, temporary directory for BigQuery loading process, temporary location

Optional: JDBC username, JDBC password

I set up a VM in Compute Engine to host an Oracle 11g Release 2 DB. I created a DB 'database', added a table 'table1', and inserted some rows into it.

I created a job in Dataflow with the following parameters:

JDBC connection URL: jdbc:oracle:thin:@VMinternalIP:1521:database

Driver class name: oracle.jdbc.driver.OracleDriver

Source SQL query: select * from database.table1

BigQuery output table: <testproject-0122>:<testdataset>.<table1>

GCS path for JDBC drivers: gs://testbucket/jdbc/ojdbc6.jar,gs://testbucket/jdbc/ojdbc5.jar,gs://testbucket/jdbc/ons.jar,gs://testbucket/jdbc/orai18n.jar,gs://testbucket/jdbc/simplefan.jar,gs://testbucket/jdbc/ucp.jar,gs://testbucket/jdbc/xdb6.jar

Temporary directory for BigQuery loading process: gs://testbucket/temporary

Temporary location: gs://testbucket/temporary/amkortmp

JDBC username: username for the Oracle DB

JDBC password: password for the Oracle DB
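As a side note on the connection URL above: the Oracle thin driver accepts more than one URL shape, and the last component names a SID or a service, not a schema. The sketch below is a hypothetical helper (not part of the template or the driver) that only recognizes the two most common forms, `@host:port:SID` and `@//host:port/service_name`; other valid forms, such as TNS descriptors, are deliberately not handled.

```python
import re

# Assumption: only these two common thin-driver URL shapes are recognized.
SID_FORM = re.compile(
    r"^jdbc:oracle:thin:@(?P<host>[^:/@]+):(?P<port>\d+):(?P<name>[^:/]+)$"
)
SERVICE_FORM = re.compile(
    r"^jdbc:oracle:thin:@//(?P<host>[^:/@]+):(?P<port>\d+)/(?P<name>[^:/]+)$"
)


def describe_oracle_url(url: str) -> dict:
    """Return the style ('sid' or 'service'), host, port and name of a thin URL."""
    for style, pattern in (("sid", SID_FORM), ("service", SERVICE_FORM)):
        match = pattern.match(url)
        if match:
            return {
                "style": style,
                "host": match.group("host"),
                "port": int(match.group("port")),
                "name": match.group("name"),
            }
    raise ValueError(f"unrecognized Oracle thin URL: {url!r}")


if __name__ == "__main__":
    # The URL from the question uses the SID form, so 'database' is read as a SID.
    print(describe_oracle_url("jdbc:oracle:thin:@VMinternalIP:1521:database"))
```

Read this way, 'database' in the URL identifies the instance to connect to; it is not a prefix that can be reused in front of a table name in the query.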

However, this job failed and threw five errors.

Job status: Read from JdbcIO failed, Write to BigQuery failed, drop inputs succeeded

Four of the five errors displayed the following error:

Edit: I updated my URL and the connection error is gone, but I am now getting the following error:

java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist

I can see that the table exists in Oracle. So is it a schema error?

What am I doing wrong with this Dataflow template?

I eventually found the answer.

First, the query referenced the table incorrectly.

Correct format: select * from table1

Second, the BigQuery table reference was wrong.

Correct format: project_ID:dataset_ID.table_ID
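The table reference format above can be checked before a job is submitted. The following is a minimal sketch, assuming a simplified pattern for project, dataset, and table IDs (real BigQuery naming rules are broader, for example domain-scoped project IDs); `parse_table_spec` is a hypothetical helper, not part of any Google library.

```python
import re

# Simplified assumption: project IDs are 6-30 chars of lowercase letters,
# digits and hyphens; dataset and table IDs are letters, digits, underscores.
TABLE_SPEC = re.compile(
    r"^(?P<project>[a-z][a-z0-9-]{4,28}[a-z0-9]):"
    r"(?P<dataset>[A-Za-z0-9_]+)\."
    r"(?P<table>[A-Za-z0-9_]+)$"
)


def parse_table_spec(spec: str) -> dict:
    """Split 'project_ID:dataset_ID.table_ID' into its three parts."""
    match = TABLE_SPEC.match(spec)
    if match is None:
        raise ValueError(
            f"expected project_ID:dataset_ID.table_ID, got {spec!r}"
        )
    return match.groupdict()


if __name__ == "__main__":
    print(parse_table_spec("testproject-0122:testdataset.table1"))
    try:
        # The value from the question, with literal angle brackets, is rejected.
        parse_table_spec("<testproject-0122>:<testdataset>.<table1>")
    except ValueError as err:
        print(err)
```

With this check, the value used in the question is rejected because the angle brackets are part of the string, while the same IDs without brackets parse cleanly.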

Correcting both made it work.
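For reference, here is a rough sketch of the corrected parameters assembled as a request body for the Dataflow `templates.launch` REST method. The parameter keys (`connectionURL`, `driverClassName`, `query`, `outputTable`, `driverJars`, `bigQueryLoadingTemporaryDirectory`, `username`, `password`) are my understanding of the template's names and should be checked against the template documentation; `build_launch_body` is a hypothetical helper, and the connection URL is the placeholder from the question, since the edited URL was never posted.

```python
import json

# Jar names as listed in the question; all live under gs://testbucket/jdbc/.
DRIVER_JAR_NAMES = [
    "ojdbc6.jar", "ojdbc5.jar", "ons.jar", "orai18n.jar",
    "simplefan.jar", "ucp.jar", "xdb6.jar",
]


def build_launch_body(job_name: str, username: str, password: str) -> dict:
    """Assemble a launch body; parameter keys are assumed, not verified."""
    driver_jars = ",".join(
        f"gs://testbucket/jdbc/{name}" for name in DRIVER_JAR_NAMES
    )
    return {
        "jobName": job_name,
        "parameters": {
            # Placeholder URL from the question (the updated one is unknown).
            "connectionURL": "jdbc:oracle:thin:@VMinternalIP:1521:database",
            "driverClassName": "oracle.jdbc.driver.OracleDriver",
            # Corrected: no 'database.' prefix in front of the table name.
            "query": "select * from table1",
            # Corrected: project_ID:dataset_ID.table_ID, no angle brackets.
            "outputTable": "testproject-0122:testdataset.table1",
            "driverJars": driver_jars,
            "bigQueryLoadingTemporaryDirectory": "gs://testbucket/temporary",
            "username": username,
            "password": password,
        },
        "environment": {"tempLocation": "gs://testbucket/temporary/amkortmp"},
    }


if __name__ == "__main__":
    body = build_launch_body("oracle-to-bq-test", "ORACLE_USER", "ORACLE_PASSWORD")
    print(json.dumps(body, indent=2))
```

The sketch only builds and prints the JSON; it makes no network call, and the job name and credentials shown are placeholders.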
