
Two tables with similar columns but different primary keys

I have two tables from two different databases, and both contain lastName and firstName columns. I need to create a JOIN relationship between the two. The lastName columns match about 80% of the time, while the firstName columns match only about 20% of the time. Each table also has a completely different personID primary key.

Generally speaking, what are some best practices or tips for adding a foreign key to one of the tables? Since I have about 4,000 distinct persons, any labor-saving tips would be greatly appreciated.

Sample mismatched data:

db1.table1_____________________  db2.table2_____________________
23    Williams       Fritz       98   Williams       Frederick
25    Wilson-Smith   James       12   Smith          James Wilson
26    Winston        Trudy       73   Winston        Gertrude

Keep in mind: sometimes they match exactly, often they don't, and sometimes two different people will have the same first and last name.

You can join on multiple fields.

select * 
  from table1
    inner join table2
      on table1.firstName = table2.firstName
        and table1.lastName = table2.lastName

From this you can determine how many 'duplicate' first name / last name combinations there are.

select table1.firstName, table2.lastName, count(*)
  from table1
    inner join table2
      on table1.firstName = table2.firstName
        and table1.lastName = table2.lastName
  group by table1.firstName, table2.lastName
  having count(*) > 1

Conversely, you can also determine the ones that match identically, and only once:

select table1.firstName, table2.lastName
  from table1
    inner join table2
      on table1.firstName = table2.firstName
        and table1.lastName = table2.lastName
  group by table1.firstName, table2.lastName
  having count(*) = 1

This last query could be the basis for performing the bulk of your foreign key updates.
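
As a rough sketch of such an update, assume table1 receives a new nullable column (called table2PersonID here, a made-up name) and that both tables expose personID, firstName, and lastName as in the question. The one-to-one matches could then be written back as shown. The syntax is kept close to standard SQL, but details such as ADD COLUMN and subqueries on the table being updated vary by engine (MySQL, for instance, rejects a subquery that reads from the update target), so treat it as a starting point only.

```sql
-- Assumed schema (not confirmed by the question beyond the column names):
--   table1(personID, lastName, firstName)
--   table2(personID, lastName, firstName)
-- table2PersonID is a hypothetical name for the new foreign key column.
alter table table1 add column table2PersonID integer;

-- Fill the key only where the first/last name pair occurs exactly once
-- in table1 and exactly once in table2, i.e. the "count(*) = 1" case.
update table1
   set table2PersonID = (
         select t2.personID
           from table2 t2
          where t2.firstName = table1.firstName
            and t2.lastName  = table1.lastName
       )
 where 1 = (
         select count(*)
           from table2 t2
          where t2.firstName = table1.firstName
            and t2.lastName  = table1.lastName
       )
   and 1 = (
         select count(*)
           from table1 t1
          where t1.firstName = table1.firstName
            and t1.lastName  = table1.lastName
       );
```

Rows that are left with a NULL table2PersonID afterwards are the ones that either did not match exactly or matched ambiguously.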

Names that match more than once between the tables will likely need some sort of manual intervention, unless there are other fields in the tables that can be used to differentiate them.
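
To make that manual pass a little easier, one option is to list every candidate pair for the ambiguous names side by side. The query below is only a sketch under the same assumed columns (personID, firstName, lastName); the aliases t1, t2, a, b, and dup are arbitrary, and any differentiating fields your tables really have would be added to the select list.

```sql
-- List all table1/table2 personID pairs for names that match more than once.
select t1.personID as table1PersonID,
       t2.personID as table2PersonID,
       t1.firstName,
       t1.lastName
  from table1 t1
    inner join table2 t2
      on t1.firstName = t2.firstName
        and t1.lastName = t2.lastName
    inner join (
      -- Same "having count(*) > 1" query as above, used as a filter.
      select a.firstName, a.lastName
        from table1 a
          inner join table2 b
            on a.firstName = b.firstName
              and a.lastName = b.lastName
        group by a.firstName, a.lastName
        having count(*) > 1
    ) dup
      on dup.firstName = t1.firstName
        and dup.lastName = t1.lastName
  order by t1.lastName, t1.firstName, t1.personID, t2.personID;
```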
