
Talend Open Studio data migration with MySQL on parent-child relation

I am using Talend Open Studio for data migration because I am upgrading my existing application to a new architecture. Both applications use MySQL, but with different schemas. I have migrated data successfully between single tables, but when I transfer data from a single table into parent-child tables with a foreign key constraint, the transfer is extremely slow. For example, I am migrating my Cities table to Cities and Citiesi18n; the schemas are below:

My old schema:

CITIES (
  id   
  city_name
  status
  created_at
)

The newly created schema into which I need to migrate the data:

CITIES (
  id   
  status
  created_at
)

CITIESI18N (
  id           
  lang_code
  name
  fk_city_id      (// foreign key of cities table)
)
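
The mapping between the two schemas can be sketched roughly as below. This is only an illustration of what the tMap has to do per row, not the Talend job itself. The function name `split_city_row`, the default `lang_code` of "en", and the caller-supplied `child_id` are assumptions made for the sketch; they do not come from the schemas above.

```python
from typing import Dict, Tuple

Row = Dict[str, object]


def split_city_row(old_row: Row, child_id: int, lang_code: str = "en") -> Tuple[Row, Row]:
    """Split one row of the old CITIES table into a new CITIES row
    and a CITIESI18N row that points back at it."""
    # The parent keeps the same id, so existing references stay valid.
    parent = {
        "id": old_row["id"],
        "status": old_row["status"],
        "created_at": old_row["created_at"],
    }
    # The city name moves to the child table; fk_city_id references the parent.
    child = {
        "id": child_id,          # assumed: assigned by the caller
        "lang_code": lang_code,  # assumed default language
        "name": old_row["city_name"],
        "fk_city_id": old_row["id"],
    }
    return parent, child
```

Every source row produces exactly one parent row and one child row, so the child insert can only succeed once the matching parent row exists.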

Below are the snapshots from my Talend jobs:

[image: Talend job screenshot]

And here is the tMap configuration:

[image: tMap configuration screenshot]

Now when I transfer the data without the foreign key, the results are super fast. See below:

[image: job run without the foreign key]

But when I transfer the same data with a foreign key, the transfer is super slow:

(Note: I have used the province table as the example, as it is similar to the cities table.)

[image: job run with the foreign key]

I think that with the foreign key constraint it must be indexing the columns while transferring the data, which makes it slower, but I am not sure. Is there any way I can fix this? I have a lot of similar tables that need to be migrated. I am also curious to know the reason.

I don't know why you see this behaviour. You can try redirecting the 'provience_i18n' output to a tHashOutput (cache component), then link to a second subjob with tHashInput (referring to your tHashOutput) --> tMySQLOutput. You'll have two subjobs, one for each insertion.
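
As a rough sketch of that idea outside Talend: the parent rows are written as the source is read, the child rows are held in an in-memory list (standing in for tHashOutput), and the list is written only after the first pass finishes. The names `run_cached_job`, `write_parent` and `write_child` and the "en" language code are made up for the illustration; the two callbacks stand in for the two tMySQLOutput components.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


def run_cached_job(
    source_rows: Iterable[Row],
    write_parent: Callable[[Row], None],
    write_child: Callable[[Row], None],
) -> int:
    """Write all parent rows first, then the cached child rows."""
    child_cache: List[Row] = []  # plays the role of tHashOutput

    # Subjob 1: read the source once, insert parents, cache children.
    for row in source_rows:
        write_parent({
            "id": row["id"],
            "status": row["status"],
            "created_at": row["created_at"],
        })
        child_cache.append({
            "lang_code": "en",  # assumed language code
            "name": row["city_name"],
            "fk_city_id": row["id"],
        })

    # Subjob 2: replay the cache (tHashInput) into the child table.
    for child in child_cache:
        write_child(child)
    return len(child_cache)
```

The source is read only once, at the price of holding every child row in memory until the parent inserts are done.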

You are loading data into the parent and child tables at the same time, using one tMap. While rows are being inserted into the second table with the foreign key, insertions are also being made into the parent table it references. What you could do instead: load the data into the main CITIES table first, then, on OnSubjobOk, load the child CITIESI18N table. It should be faster. Let me know if it works.
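
The parent-first order can be tried out in a small script. This is a sketch under stated assumptions, not the Talend job: SQLite (in memory) stands in for MySQL so it runs without a server, the column types are guessed because the question gives column names only, and `open_target`, `migrate_parent_first` and the "en" language code are names invented for the example.

```python
import sqlite3

# Assumed column types; the question only lists column names.
SCHEMA = """
CREATE TABLE cities (
    id         INTEGER PRIMARY KEY,
    status     INTEGER,
    created_at TEXT
);
CREATE TABLE citiesi18n (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    lang_code  TEXT,
    name       TEXT,
    fk_city_id INTEGER NOT NULL REFERENCES cities(id)
);
"""


def open_target() -> sqlite3.Connection:
    """Create an in-memory target database with the foreign key enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def migrate_parent_first(conn, old_rows, lang_code="en"):
    """Load every parent row, commit, then load the child rows."""
    # Step 1: the parent table (first subjob).
    conn.executemany(
        "INSERT INTO cities (id, status, created_at) VALUES (?, ?, ?)",
        [(r["id"], r["status"], r["created_at"]) for r in old_rows],
    )
    conn.commit()
    # Step 2: the child table, run only after step 1 succeeded
    # (the role of the OnSubjobOk trigger).
    conn.executemany(
        "INSERT INTO citiesi18n (lang_code, name, fk_city_id) VALUES (?, ?, ?)",
        [(lang_code, r["city_name"], r["id"]) for r in old_rows],
    )
    conn.commit()
```

Because the parent rows are committed before the first child insert, every foreign key lookup finds its row already in place, and the two inserts never run against the two tables at the same time.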
