How to transfer data from Oracle 11g Release 2 to BigQuery using Dataflow's JDBC to BigQuery template?
I have been looking for a data transfer service to move data out of an Oracle 11g Release 2 database. I came across a third-party solution that worked perfectly when I tested it.
But I wanted to go with a purely first-party solution, so I looked into it and came across Dataflow.
Dataflow already provides some templates that I can use to transfer the data.
I chose the 'JDBC to BigQuery' template.
It has some required parameters and some optional ones:
Required: JDBC connection URL, driver class name, source SQL query, BigQuery output table, GCS path for JDBC drivers, temporary directory for the BigQuery loading process, temporary location
Optional: JDBC username, JDBC password
I set up a VM in Compute Engine to host an Oracle 11g Release 2 database. I created a database named 'database', added a table 'table1', and inserted some rows into it.
I created a job in Dataflow with the following parameters:
JDBC connection URL: jdbc:oracle:thin:@VMinternalIP:1521:database
Driver class name: oracle.jdbc.driver.OracleDriver
Source SQL query: select * from database.table1
BigQuery output table: <testproject-0122>:<testdataset>.<table1>
GCS path for JDBC drivers: gs://testbucket/jdbc/ojdbc6.jar,gs://testbucket/jdbc/ojdbc5.jar,gs://testbucket/jdbc/ons.jar,gs://testbucket/jdbc/orai18n.jar,gs://testbucket/jdbc/simplefan.jar,gs://testbucket/jdbc/ucp.jar,gs://testbucket/jdbc/xdb6.jar
Temporary directory for BigQuery loading process: gs://testbucket/temporary
Temporary location: gs://testbucket/temporary/amkortmp
JDBC username: username for Oracle DB
JDBC password: password for Oracle DB
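For anyone launching the same job from a script rather than the console, the parameters above could be assembled roughly as follows. This is only a sketch: the parameter keys (`connectionURL`, `driverJars`, and so on), the template path, and the mapping of "temporary location" to `--staging-location` are assumptions to verify against the template documentation, and the job name and region in the usage note are hypothetical. The query and output table are passed in as arguments so the sketch does not commit to any particular value. The code only builds the argument list and does not run anything.

```python
from typing import Dict, List

# Assumed template location; check the documentation for the current path.
TEMPLATE_PATH = "gs://dataflow-templates/latest/Jdbc_to_BigQuery"
BUCKET = "gs://testbucket"
JAR_NAMES = ["ojdbc6.jar", "ojdbc5.jar", "ons.jar", "orai18n.jar",
             "simplefan.jar", "ucp.jar", "xdb6.jar"]


def build_parameters(query: str, output_table: str,
                     username: str, password: str) -> Dict[str, str]:
    """Map the values from the post onto the (assumed) template parameter keys."""
    return {
        "connectionURL": "jdbc:oracle:thin:@VMinternalIP:1521:database",
        "driverClassName": "oracle.jdbc.driver.OracleDriver",
        "query": query,
        "outputTable": output_table,
        # The driver list is itself comma-separated.
        "driverJars": ",".join(f"{BUCKET}/jdbc/{name}" for name in JAR_NAMES),
        "bigQueryLoadingTemporaryDirectory": f"{BUCKET}/temporary",
        "username": username,
        "password": password,
    }


def build_gcloud_args(job_name: str, region: str,
                      params: Dict[str, str]) -> List[str]:
    """Build a `gcloud dataflow jobs run` argument list without executing it."""
    # Because driverJars contains commas, switch the --parameters separator
    # to "~" using gcloud's ^DELIM^ escaping syntax.
    delimiter = "~"
    for key, value in params.items():
        if delimiter in value:
            raise ValueError(f"{key} contains the delimiter {delimiter!r}")
    joined = delimiter.join(f"{key}={value}" for key, value in params.items())
    return [
        "gcloud", "dataflow", "jobs", "run", job_name,
        "--gcs-location", TEMPLATE_PATH,
        "--region", region,
        # Assumed mapping of the "temporary location" field.
        "--staging-location", f"{BUCKET}/temporary/amkortmp",
        "--parameters", f"^{delimiter}^{joined}",
    ]
```

A call such as `build_gcloud_args("oracle-to-bq", "us-central1", build_parameters(query, table, user, password))` returns a list that could be handed to `subprocess.run` once the assumptions have been checked.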
However, this job failed and threw five errors.
Job status: Read from JdbcIO failed, Write to BigQuery failed, Drop inputs succeeded
Four of the five errors displayed the following error:
Edit: I updated my URL and the connection error is gone, but I am now getting the following error:
java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
I can see that the table exists in Oracle, so is it a schema error?
What am I doing wrong in my use of this Dataflow template?
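One way to check whether ORA-00942 comes from the schema prefix is to ask Oracle which schema owns the table. In Oracle, the name before the dot in `database.table1` is read as a schema (user), not as the database, and the same error is also raised when the connecting user lacks privileges on the table. The sketch below only builds the diagnostic SQL against the `ALL_TABLES` dictionary view; the helper name `owner_lookup_sql` is hypothetical, and running the statement is left to whatever SQL client is at hand.

```python
import re

# Simple unquoted identifiers only: a letter followed by letters, digits, _, $ or #.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def owner_lookup_sql(table_name: str) -> str:
    """Return a query listing every schema that owns a visible table of this name."""
    if not _IDENTIFIER.fullmatch(table_name):
        raise ValueError(f"not a simple unquoted identifier: {table_name!r}")
    # Oracle stores unquoted identifiers in upper case in the data dictionary.
    return ("SELECT owner, table_name FROM all_tables "
            f"WHERE table_name = '{table_name.upper()}'")


print(owner_lookup_sql("table1"))
```

If the query returns no rows for the JDBC user, the table is not visible to that user at all. If it returns an owner other than the JDBC user, the source query would need that owner as its prefix.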
I eventually found the answer.
First, the query referenced the table incorrectly.
Correct format: select * from table1
Second, the BigQuery table reference was wrong.
Correct format: project_ID:dataset_ID.table_ID
Correcting both of these made it work.
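The table-reference fix can be checked before submitting a job. The sketch below is a minimal shape check for the `project_ID:dataset_ID.table_ID` format that rejects the angle brackets used in the original job. The pattern is deliberately loose and does not enforce BigQuery's real naming rules, and the function name `parse_table_spec` is hypothetical.

```python
import re
from typing import Tuple

# Three non-empty parts separated by ":" and ".", with no angle brackets,
# whitespace, or extra separators inside any part.
_TABLE_SPEC = re.compile(
    r"(?P<project>[^<>:.\s]+):(?P<dataset>[^<>:.\s]+)\.(?P<table>[^<>:.\s]+)"
)


def parse_table_spec(spec: str) -> Tuple[str, str, str]:
    """Split 'project:dataset.table' into its parts, or raise ValueError."""
    match = _TABLE_SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"expected project_ID:dataset_ID.table_ID, got {spec!r}")
    return match.group("project"), match.group("dataset"), match.group("table")


print(parse_table_spec("testproject-0122:testdataset.table1"))
```

Passing the original value, `<testproject-0122>:<testdataset>.<table1>`, raises `ValueError` because of the literal angle brackets.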