Talend Open Studio data migration with MySQL on parent-child relation
I am using Talend Open Studio for data migration because I am upgrading my existing application to a new architecture. Both applications use MySQL, but with different schemas.
I have migrated data between single tables successfully, but when I transfer data from a single table into parent-child tables linked by a foreign key constraint, the transfer is extremely slow.
For example, I am migrating my Cities table to Cities and Citiesi18n; the schemas are below:
My old schema:
CITIES (
id
city_name
status
created_at
)
The newly created schema where I need to migrate the data:
CITIES (
id
status
created_at
)
CITIESI18N (
id
lang_code
name
fk_city_id (// foreign key of cities table)
)
Below are the snapshots from my Talend jobs:
And here is the tMap configuration:
Now, when I transfer the data without the foreign key, the results are super fast. See below:
But when I transfer the same data with a foreign key, the transfer is super slow:
(Note: I have used the province table as the example, as it is similar to the cities table.)
I think that with the foreign key constraint it must be indexing the columns while the data is transferred, which makes it slower, but I am not sure. Is there any way I can fix this? I have a lot of similar tables that need to be migrated. I am just curious to know the reason.
I don't know why you see this behaviour. You can try redirecting 'provience_i18n' to a tHashOutput (cache component), then link to a subjob with tHashInput (referring to your tHashOutput) --> tMySQLOutput. You'll have 2 subjobs, one for each insertion.
You are loading data into the parent and the child at the same time, using one tMap. When you insert the foreign key into the second table, an insertion is also being made into the parent table. What you could do instead is load the data into the main CITIES table first and then, on OnSubjobOk, load the child CITIESI18N table. It would be faster. Let me know if it works.
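Both answers come down to the same idea: finish the parent insert first, then insert the child rows in a separate step. Outside Talend, that two-pass load can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual Talend job: it uses Python's built-in sqlite3 as a stand-in for MySQL, the table name `old_cities` stands in for the old CITIES table, the function names `build_demo_db` and `migrate_cities` are hypothetical, and the `lang_code` value `"en"` and the sample rows are made up.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with the old and new schemas (demo data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the FK constraint
    conn.executescript(
        """
        CREATE TABLE old_cities (id INTEGER PRIMARY KEY, city_name TEXT,
                                 status INTEGER, created_at TEXT);
        CREATE TABLE cities (id INTEGER PRIMARY KEY, status INTEGER, created_at TEXT);
        CREATE TABLE citiesi18n (id INTEGER PRIMARY KEY, lang_code TEXT, name TEXT,
                                 fk_city_id INTEGER REFERENCES cities(id));
        INSERT INTO old_cities VALUES (1, 'Paris', 1, '2020-01-01');
        INSERT INTO old_cities VALUES (2, 'Lyon', 1, '2020-01-02');
        """
    )
    return conn


def migrate_cities(conn, lang_code="en"):
    """Load parents first, then children, each in its own transaction."""
    old_rows = conn.execute(
        "SELECT id, city_name, status, created_at FROM old_cities"
    ).fetchall()

    # Pass 1 (first subjob): insert the parent rows and buffer the child rows
    # in memory, roughly the role tHashOutput plays in the first answer.
    child_buffer = []
    with conn:  # commits when the block succeeds
        for city_id, city_name, status, created_at in old_rows:
            conn.execute(
                "INSERT INTO cities (id, status, created_at) VALUES (?, ?, ?)",
                (city_id, status, created_at),
            )
            child_buffer.append((lang_code, city_name, city_id))

    # Pass 2 (second subjob, run only after pass 1 succeeded): insert the
    # child rows, whose parents are now all committed.
    with conn:
        conn.executemany(
            "INSERT INTO citiesi18n (lang_code, name, fk_city_id) VALUES (?, ?, ?)",
            child_buffer,
        )
    return len(child_buffer)
```

Calling `migrate_cities(build_demo_db())` copies the two sample rows into `cities` and then into `citiesi18n`. Whether this ordering removes the slowdown in the original job is not confirmed in the thread; the sketch only shows the order of operations the answers suggest.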