MySQL - Select the unique common column between two tables - most efficient query

Question

I have two tables:

db_contacts

Phone | Name | Last_Name
--------------------
111   | Foo  | Foo
222   | Bar  | Bar
333   | John | Smith
444   | Tomy | Smith

users_contacts

User_ID | Phone
--------------------
1       | 123
1       | 111
2       | 222
2       | 333
3       | 111
3       | 333
4       | 444

Notice from the tables above that:

User with ID 2 is the only one that has the phone number 222
User with ID 4 is the only one that has the phone number 444

I need to obtain these results with a MySQL query.

In other words: how can I select all the users that have a unique phone number, on the condition that this number exists in db_contacts?

I need my end result to look something like this:

User_ID | Phone | Name | Last_Name
------------------------------------
2       | 222   | Bar  | Bar
4       | 444   | Tomy | Smith

PS: There is no foreign key between the Phone columns, as a user can have a phone that is not in db_contacts.

In real life, db_contacts contains about 1 million records and users_contacts about 5 million records.
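
For anyone who wants to reproduce the example, a minimal setup might look like the sketch below. The column types are assumptions (the question does not state them); the rows are exactly the sample data shown above, and no foreign key is declared, as described in the PS.

```sql
-- Assumed column types; only the column names and rows come from the question.
CREATE TABLE db_contacts (
    Phone     VARCHAR(20) NOT NULL,
    Name      VARCHAR(50),
    Last_Name VARCHAR(50)
);

CREATE TABLE users_contacts (
    User_ID INT         NOT NULL,
    Phone   VARCHAR(20) NOT NULL
);

INSERT INTO db_contacts (Phone, Name, Last_Name) VALUES
    ('111', 'Foo',  'Foo'),
    ('222', 'Bar',  'Bar'),
    ('333', 'John', 'Smith'),
    ('444', 'Tomy', 'Smith');

INSERT INTO users_contacts (User_ID, Phone) VALUES
    (1, '123'),
    (1, '111'),
    (2, '222'),
    (2, '333'),
    (3, '111'),
    (3, '333'),
    (4, '444');
```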

What I tried, without success, as it takes a long time to execute:

SELECT * 
FROM users_contacts 
WHERE users_contacts.phone IN (
    SELECT users_contacts.phone 
    FROM `users_contacts`
    JOIN db_contacts ON db_contacts.phone = users_contacts.phone
    GROUP BY users_contacts.phone
    HAVING COUNT(users_contacts.phone) = 1
)

Update:

Thank you for your replies. I have posted my own solution, which fits my case perfectly.

Answer 1

I think you want:

select uc.*
from users_contacts uc
where not exists (select 1
                  from users_contacts uc2
                  where uc2.phone = uc.phone and uc2.user_id <> uc.user_id
                 );

For performance, you want an index on users_contacts(phone, user_id).

Another method is:

select max(user_id) as user_id, phone
from users_contacts
group by phone
having count(*) = 1;

The not exists version is probably going to be faster.
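
Neither query above touches db_contacts, so to get the names from the question's expected result you would still need a join. A minimal sketch of that, built on the aggregation method, could be the following; the derived-table alias `u` is just a name chosen here, and it assumes the same (user_id, phone) pair is never stored twice.

```sql
-- Phones owned by exactly one user, joined to db_contacts for the names.
-- Phones that are not in db_contacts drop out because of the inner join.
SELECT u.user_id, dc.phone, dc.name, dc.last_name
FROM (
    SELECT MAX(user_id) AS user_id, phone
    FROM users_contacts
    GROUP BY phone
    HAVING COUNT(*) = 1
) AS u
JOIN db_contacts dc ON dc.phone = u.phone;
```

If duplicate (user_id, phone) rows are possible, `HAVING COUNT(DISTINCT user_id) = 1` would be the safer condition.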

Answer 2

I would use a simple JOIN with a NOT EXISTS condition. This is usually the most efficient way to check that something has no duplicates; compared to your solution, this has the advantage of avoiding aggregation.

SELECT uc.User_ID, dc.*
FROM users_contacts uc
INNER JOIN db_contacts dc ON uc.Phone = dc.Phone
WHERE NOT EXISTS (
    SELECT 1 
    FROM users_contacts uc1 
    WHERE uc1.Phone = dc.Phone AND uc1.User_ID != uc.User_ID
)

Hint: consider creating the following indexes:

users_contacts(Phone, User_ID)
db_contacts(Phone)
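
As statements, those two indexes might be created like this. The index names are made up for illustration; only the tables and columns come from the hint.

```sql
-- Composite index: the NOT EXISTS probe filters on Phone and compares
-- User_ID, so both columns can be read from the index.
CREATE INDEX idx_users_contacts_phone_user
    ON users_contacts (Phone, User_ID);

-- Lookup index for the join on db_contacts.Phone.
CREATE INDEX idx_db_contacts_phone
    ON db_contacts (Phone);
```

Running the query with `EXPLAIN` in front of it should show whether the optimizer actually picks these indexes on your data.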

Answer 3

I would first like to thank everyone who posted solutions; they all worked.

But response time was critical for me, and the solutions provided took a long time to execute: a couple of seconds.

In case anyone has a similar problem: I ended up creating a new table called users_unique_contacts, together with an AFTER INSERT trigger on users_contacts. The trigger checks whether the newly inserted contact already exists in users_unique_contacts. If it doesn't exist, the trigger adds it; otherwise it removes it, since that means the number is no longer unique.

My trigger went like this:

BEGIN
    IF EXISTS (SELECT 1 = 1 FROM users_unique_contacts WHERE phone = new.phone LIMIT 1) THEN
        BEGIN
                DELETE FROM users_unique_contacts WHERE phone = new.phone LIMIT 1;
        END;
    ELSE
        BEGIN
                INSERT INTO users_unique_contacts (user_id,phone) VALUES (new.user_id, new.phone);
        END;
    END IF;
END

Now every time I want the unique numbers of a user, I query users_unique_contacts, and the execution time is in milliseconds.
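
One thing to watch with the trigger body above: as far as I can tell it toggles the entry on every insert, so a third user adding the same number would put it back into users_unique_contacts. A possible variant that checks users_contacts itself is sketched below. The table definition, column types and trigger name are assumptions, not taken from the original post, and `DELIMITER` is a mysql client command rather than server SQL.

```sql
-- Assumed definition of the lookup table (types are guesses).
CREATE TABLE users_unique_contacts (
    phone   VARCHAR(20) NOT NULL PRIMARY KEY,
    user_id INT         NOT NULL
);

DELIMITER //

-- Hypothetical trigger name.
CREATE TRIGGER trg_users_contacts_after_insert
AFTER INSERT ON users_contacts
FOR EACH ROW
BEGIN
    -- Remove any existing entry for this phone first.
    DELETE FROM users_unique_contacts WHERE phone = NEW.phone;

    -- Re-add it only if no other user owns the same phone.
    IF NOT EXISTS (
        SELECT 1
        FROM users_contacts
        WHERE phone = NEW.phone
          AND user_id <> NEW.user_id
    ) THEN
        INSERT INTO users_unique_contacts (user_id, phone)
        VALUES (NEW.user_id, NEW.phone);
    END IF;
END//

DELIMITER ;
```

This only covers inserts. If rows can be deleted from or updated in users_contacts, matching AFTER DELETE and AFTER UPDATE triggers would be needed to keep the lookup table accurate.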
