
How do I store correlation values between users in a database using MySQL and PHP?

I'm trying to make a database of users for a website that will store correlation values between all the users. What I mean by this is that for every pair of users, there is a stored value of correlation between the two users.

The correlation values will be calculated by PHP using a correlation algorithm. My question is what is the most correct way to store them in a MySQL database? I realize I could make a table like this:

         ---------------------------------
        | user1 | user2 | user3 | etc... |
 -----------------------------------------
| user1 | #val  | #val  | #val  | #val   |
 -----------------------------------------
| user2 | #val  | #val  | #val  | #val   |
 -----------------------------------------
| user3 | #val  | #val  | #val  | #val   |

etcetera. But I don't like this method because

  • It stores every value twice; for example the correlation between user1 and user3 is stored in row 1 column 3 as well as row 3 column 1.
  • I use prepared statements, which means I can't select columns named after user IDs unless I concatenate the user ID into the SQL statement, which is obviously not ideal.

What are my alternatives? If this can be done in MySQL well, how do I go about it?

If this can't be done well in MySQL, are there any other database types I should try to learn? For example, I realize a graph database system may work well for this, but I don't want to spend time learning how to use a graph database if this can be done in MySQL.

Typically you would do something like this in a JOIN table. So say you have a users table that has a user_id field and whatever other fields you need. You could build a table called user_relations or similar that would simply have two user_id foreign key fields that would relate users in some manner.

user_id_1  |   user_id_2
------------------------
1          |   2
1          |   3
2          |   1
3          |   1
...        |   ...

You would then have a compound primary key across both columns to enforce uniqueness. Note that I am assuming that the #val you mentioned in your question was simply some kind of flag (1/0) that represents the existence of a relationship. If you actually need that value to describe something about the relationship (e.g. parent/child, or a numeric value such as your correlation), then you can add a third column to this table to store the value associated with the relationship.
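A minimal sketch of that table with the optional third column, assuming the value is a floating-point correlation. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the `correlation` column name, the `name` column, and the sample rows are assumptions for illustration, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE user_relations (
    user_id_1   INTEGER NOT NULL REFERENCES users(user_id),
    user_id_2   INTEGER NOT NULL REFERENCES users(user_id),
    correlation REAL    NOT NULL,           -- assumed third column for the value
    PRIMARY KEY (user_id_1, user_id_2)      -- compound key: one row per ordered pair
);
""")
conn.executemany("INSERT INTO users (user_id, name) VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])
conn.execute("INSERT INTO user_relations VALUES (?, ?, ?)", (1, 2, 0.75))

# The compound primary key rejects a second row for the same (1, 2) pair.
try:
    conn.execute("INSERT INTO user_relations VALUES (?, ?, ?)", (1, 2, 0.10))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# User IDs are bound as parameters, never concatenated into the SQL text.
stored = conn.execute(
    "SELECT correlation FROM user_relations WHERE user_id_1 = ? AND user_id_2 = ?",
    (1, 2),
).fetchone()[0]
```

Because the user IDs are now row values rather than column names, they can be bound as ordinary prepared-statement parameters, which addresses the second concern in the question.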

When you need to query across a relation you could do it like this:

SELECT u1.*, u2.*
FROM
  users AS u1
  INNER JOIN user_relations AS ur
    ON u1.user_id = ur.user_id_1
  INNER JOIN users AS u2
    ON ur.user_id_2 = u2.user_id
WHERE u1.user_id = ? /* or whatever filter you may need to apply */

Note that depending on the type of relationship you are trying to represent (e.g. a two-way relationship), you may need two rows in the table to represent each relationship. That way you can always use the first column to look up all related users in the second column. This is shown in my example above, where the relationships of user 1 are stored in both directions in the sample rows.
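A rough sketch of the two-rows-per-relationship approach, again using Python's built-in sqlite3 module as a stand-in for MySQL. The `relate` and `related_names` helpers and the `name` column are hypothetical names introduced here for illustration; the query is the one from this answer, narrowed to a single column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE user_relations (
    user_id_1 INTEGER NOT NULL,
    user_id_2 INTEGER NOT NULL,
    PRIMARY KEY (user_id_1, user_id_2)
);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])

def relate(conn, a, b):
    """Store the relationship in both directions (hypothetical helper)."""
    conn.executemany("INSERT INTO user_relations VALUES (?, ?)", [(a, b), (b, a)])

relate(conn, 1, 2)
relate(conn, 1, 3)

QUERY = """
SELECT u2.name
FROM users AS u1
  INNER JOIN user_relations AS ur ON u1.user_id = ur.user_id_1
  INNER JOIN users AS u2 ON ur.user_id_2 = u2.user_id
WHERE u1.user_id = ?
ORDER BY u2.user_id
"""

def related_names(conn, user_id):
    """Look up related users through the first column only."""
    return [row[0] for row in conn.execute(QUERY, (user_id,))]
```

The trade-off is that every value is stored twice, so both rows have to be written together whenever a relationship changes.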

Brilliant question.

Given users A, B, C, D and E, your data set is triangular;

  A B C D E
A
B *
C * *
D * * *
E * * * *

  1. In the above matrix AA, BB, CC, DD and EE are meaningless.
  2. To avoid duplication, AB is the same as BA. CD is the same as DC, and so on.

You could save a triangular data-set in a table-orientated SQL database in this way;

id  usr1  usr2  c
-----------------
0   A     B     1
1   A     C     5
2   A     D     3
3   A     E     4
4   B     C     3

etc
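One way to keep the table triangular is to sort each pair before writing it, so AB and BA always land in the same row. The sketch below assumes this convention and uses Python's built-in sqlite3 module as a stand-in for MySQL; the `user_correlations` table name, the `usr1`/`usr2` column names, and the two helper functions are hypothetical. `INSERT OR REPLACE` is SQLite syntax; in MySQL the closest equivalent would be `INSERT ... ON DUPLICATE KEY UPDATE`.

```python
import sqlite3
from itertools import combinations

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE user_correlations (
    usr1 TEXT NOT NULL,
    usr2 TEXT NOT NULL,
    c    REAL NOT NULL,
    PRIMARY KEY (usr1, usr2),
    CHECK (usr1 < usr2)      -- one triangle only; also rules out A-A
)
""")

def set_correlation(conn, a, b, value):
    """Store one row per unordered pair by sorting the two IDs first."""
    lo, hi = sorted((a, b))
    conn.execute("INSERT OR REPLACE INTO user_correlations VALUES (?, ?, ?)",
                 (lo, hi, value))

def get_correlation(conn, a, b):
    """Read the value for a pair regardless of the order it is asked in."""
    lo, hi = sorted((a, b))
    row = conn.execute(
        "SELECT c FROM user_correlations WHERE usr1 = ? AND usr2 = ?", (lo, hi)
    ).fetchone()
    return row[0] if row else None

users = ["A", "B", "C", "D", "E"]
for a, b in combinations(users, 2):
    set_correlation(conn, a, b, 0.0)   # placeholder values, not real correlations

set_correlation(conn, "B", "A", 1.0)   # same pair as A-B, so it overwrites that row
row_count = conn.execute("SELECT COUNT(*) FROM user_correlations").fetchone()[0]
```

With five users this stores 5 × 4 / 2 = 10 rows instead of the 25 cells of the full matrix. Note that MySQL only enforces CHECK constraints from version 8.0.16 onward, so on older versions the ordering would have to be guaranteed by the application code alone.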

In my opinion the best solution is to have 2 tables: Users and UserRelations.

UserRelations:

====================================
User1Field | User2Field | ValueField
====================================
#User      | #User      | #val
------------------------------------
#User      | #User      | #val
------------------------------------
#User      | #User      | #val
------------------------------------
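If each pair appears only once in a table like this, a given user can sit in either column, so a lookup has to check both. The following is a sketch of one way to do that, assuming integer user IDs and a floating-point value; it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `correlations_for` helper and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE UserRelations (
    User1Field INTEGER NOT NULL,
    User2Field INTEGER NOT NULL,
    ValueField REAL    NOT NULL,
    PRIMARY KEY (User1Field, User2Field)
)
""")
conn.executemany("INSERT INTO UserRelations VALUES (?, ?, ?)",
                 [(1, 2, 0.5), (1, 3, -0.25), (2, 3, 0.75)])

# Pick whichever column is NOT the requested user as "the other user".
LOOKUP = """
SELECT CASE WHEN User1Field = :uid THEN User2Field ELSE User1Field END AS other_user,
       ValueField
FROM UserRelations
WHERE User1Field = :uid OR User2Field = :uid
ORDER BY other_user
"""

def correlations_for(conn, user_id):
    """Return {other_user_id: value} for every pair that includes user_id."""
    return dict(conn.execute(LOOKUP, {"uid": user_id}).fetchall())
```

The primary key already covers lookups by User1Field; a separate index on User2Field would probably be worth adding so the other half of the OR does not scan the whole table.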

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.
