
Identify Smallest Number of Combinations Between 2 Users in a SQL Server Conversation Table

I have a table representing messages between users which is, roughly, as follows:

Message TABLE
(
    Id        INT (PK)
  , User_1_Id INT (FK)
  , User_2_Id INT (FK)
  , ...
)

I would like to write a query which outputs a summary of how many unique conversations were held between any two users, regardless of which direction the messages went.

To illustrate:

Let's say we have 3 users:

  • User A (Id: 1),
  • User B (Id: 2), and
  • User C (Id: 3)

In the table, we have the following entries:

Id  User_1_Id   User_2_Id   ...
1   1           2           ...
2   2           1           ...
3   1           2           ...
4   2           3           ...
5   1           2           ...

The query I desire would indicate that there were two unique conversations:

One between:

  • A) User A and User B, and
  • B) User B and User C.

What I don't want is for the query to also say that there is a conversation between:

  • C) User B and User A (the combination has already been covered by A, above - but in the reverse order).

This is easy if I'm working at the level of individual User Ids - but I can't figure out any kind of set-based method to achieve the outcome in a single query.

Currently, the best I've been able to do is isolate that messages have been sent between users in each direction (i.e. it's returning C in addition to A and B).

UPDATE

A conversation includes all messages between any two users - regardless of the order or position of the individual messages in the context of the whole table.

I'm actually aiming to build a conversation table which probably should have been included in the original database model but was sadly left out. It wouldn't make sense to make the conversation direction-specific.

The answer would appear to be equal to the number of rows returned by this query (MySQL syntax):

DROP TABLE IF EXISTS messages;

CREATE TABLE messages
(id  INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,from_user   INT NOT NULL
,to_user INT NOT NULL
,INDEX(from_user,to_user)
);

INSERT INTO messages VALUES
(1, 1, 2),
(2, 2, 1),
(3, 1, 2),
(4, 2, 3);

SELECT DISTINCT LEAST(from_user,to_user) user1,GREATEST(from_user,to_user) user2 FROM messages;
+-------+-------+
| user1 | user2 |
+-------+-------+
|     1 |     2 |
|     2 |     3 |
+-------+-------+
2 rows in set (0.00 sec)

Would something like this work for your needs?
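Since the question is about SQL Server rather than MySQL, here is a rough T-SQL sketch of the same idea. It assumes the table is named Messages with the question's User_1_Id and User_2_Id columns; the User_Low_Id and User_High_Id aliases are made up for illustration. The first form assumes LEAST and GREATEST are available (SQL Server 2022 or Azure SQL Database); the second is a fallback for older versions.

```sql
-- Assumption: table Messages(User_1_Id, User_2_Id) as described in the question.
-- Assumption: LEAST/GREATEST exist on this server (SQL Server 2022+ / Azure SQL).
SELECT DISTINCT
       LEAST(User_1_Id, User_2_Id)    AS User_Low_Id
     , GREATEST(User_1_Id, User_2_Id) AS User_High_Id
  FROM Messages;

-- Fallback for older versions: take MIN/MAX over a two-row VALUES list
-- built from the current row's two user ids.
SELECT DISTINCT
       p.User_Low_Id
     , p.User_High_Id
  FROM Messages AS m
 CROSS APPLY (
       SELECT MIN(v.Id) AS User_Low_Id
            , MAX(v.Id) AS User_High_Id
         FROM (VALUES (m.User_1_Id), (m.User_2_Id)) AS v(Id)
       ) AS p;
```

Either form should return one row per unordered pair of users, so counting the rows gives the number of conversations.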

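I would union two queries together: the first puts User_1_Id in the first column, and the second puts User_2_Id in the first column. UNION removes duplicate rows for you, so you can then count the returned rows. Note, however, that every pair then appears in both orders, so you still need to keep only one ordering per pair (for example, the rows where the first id is the smaller one) before counting.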
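A minimal sketch of the UNION idea, assuming the table is named Messages with the question's column names (the aliases are hypothetical). UNION on its own keeps both (1, 2) and (2, 1), so the outer filter keeps a single ordering per pair before counting.

```sql
-- Assumption: table Messages(User_1_Id, User_2_Id) as described in the question.
SELECT COUNT(*) AS Conversation_Count
  FROM (
        -- Each message as written...
        SELECT User_1_Id AS User_Low_Id, User_2_Id AS User_High_Id
          FROM Messages
         UNION  -- UNION (not UNION ALL) removes exact duplicate rows
        -- ...and the same message with the two ids swapped.
        SELECT User_2_Id, User_1_Id
          FROM Messages
       ) AS pairs
 WHERE pairs.User_Low_Id <= pairs.User_High_Id;  -- keep one ordering per pair
```

Using <= rather than < means a user who messaged themselves still counts as one conversation; switch to < if those should be excluded.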

The solution is to use a CASE expression which compares the two columns and returns the smaller value in the first column and the larger value in the second column. Adding DISTINCT then collapses each pair of users to a single row:

 SELECT DISTINCT
        CASE WHEN User_1_Id < User_2_Id THEN User_1_Id ELSE User_2_Id END AS User_Low_Id
      , CASE WHEN User_1_Id < User_2_Id THEN User_2_Id ELSE User_1_Id END AS User_High_Id
   FROM
        Messages
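Given the stated goal of building the missing conversation table, the smaller-id-first ordering could be used to populate it along these lines. This is only a sketch: the Conversation table, its columns, and the constraint name are hypothetical, and the source table is assumed to be Messages with the question's column names.

```sql
-- Hypothetical target table: one row per unordered pair of users.
CREATE TABLE Conversation
(
    Id            INT IDENTITY(1,1) PRIMARY KEY
  , User_Low_Id   INT NOT NULL   -- the smaller of the two user ids
  , User_High_Id  INT NOT NULL   -- the larger of the two user ids
  , CONSTRAINT UQ_Conversation_Pair UNIQUE (User_Low_Id, User_High_Id)
);

-- Normalise each message to (smaller id, larger id), then de-duplicate.
INSERT INTO Conversation (User_Low_Id, User_High_Id)
SELECT DISTINCT
       CASE WHEN User_1_Id < User_2_Id THEN User_1_Id ELSE User_2_Id END
     , CASE WHEN User_1_Id < User_2_Id THEN User_2_Id ELSE User_1_Id END
  FROM Messages;
```

The unique constraint is there to guard against the same pair being inserted twice if the load is ever re-run against new messages.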

Hat-tip to @Strawberry for the answer which pointed me in the right direction.

There might be some funny results if there are users who have messaged themselves I guess - but that shouldn't happen in practice...
