
Best practice to build a followers/following MySQL database

I'm going to build a MySQL database for a social-network-style site, where users follow other users and then get updates from the users they follow.

My DB consists of one table with basic user information:

| ID | username | password | email | ... other few columns | 

'ID' is the primary key, and 'username' and 'email' are unique and indexed.

Then I have a table with each user's feed, which should be shown only to users who follow that user. 'ID' is again the primary key:

| ID | feed_to_show_in_home |

Then a table with follower statistics to speed up the user profile page:

| ID | followers_count | following_count |

And lastly, the actual follower network table, which stores who follows whom:

| ID | following |

In this table 'ID' and 'following' together form the primary key, because a user can follow another user only once.

Now I would like to ask whether my structure is good from a performance point of view. I'm especially worried about how to check whether a user is following another user, how to stop following a user, and how to display feeds only from users I'm following.

In all of those cases, the only solution I have in mind is to scan the full table every time, but I don't think that is a good choice, since this DB is planned to store more than 10,000 users.

Short answer: 10,000 is so few that any design will be "good enough".

Long answer: For more scaling, consider the following...

These designs are usually bad practice:

  • two tables in a 1:1 relationship.
  • store something that can be computed.

I say "usually" because you are reaching into cases where exceptions are warranted. But first, let me mention some other schema designs:

CREATE TABLE Follow (
    er ...,  -- user id of the follower
    ed ...,  -- user id of the followed
    PRIMARY KEY(er, ed),
    INDEX(ed, er)
) ENGINE=InnoDB;

SELECT COUNT(*) FROM Follow WHERE ed = ?; -- number of followers for `ed`.
SELECT er FROM Follow WHERE ed = ?;  -- list of such followers
-- (Similarly for the flip direction: filter on `er` and select `ed`.)
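
The three operations the question asks about (checking whether one user follows another, unfollowing, and showing feed items only from followed users) map onto this table directly. Here is a minimal sketch, using Python's standard-library sqlite3 module as a stand-in for MySQL so that it runs self-contained. The INTEGER column types, the Feed table layout, the index names, and the helper function names are assumptions made for illustration; they are not part of the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Follow (
        er INTEGER NOT NULL,  -- follower (column type is an assumption)
        ed INTEGER NOT NULL,  -- followed (column type is an assumption)
        PRIMARY KEY (er, ed)
    );
    CREATE INDEX follow_ed_er ON Follow (ed, er);

    -- Hypothetical feed table: one row per post.
    CREATE TABLE Feed (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        body    TEXT NOT NULL
    );
    CREATE INDEX feed_user ON Feed (user_id, id);
""")

def follow(er, ed):
    # The composite primary key turns a repeated follow into a no-op.
    conn.execute("INSERT OR IGNORE INTO Follow (er, ed) VALUES (?, ?)", (er, ed))

def is_following(er, ed):
    # Point lookup on the primary key (er, ed); no full-table scan needed.
    row = conn.execute(
        "SELECT 1 FROM Follow WHERE er = ? AND ed = ?", (er, ed)
    ).fetchone()
    return row is not None

def unfollow(er, ed):
    # Deletes at most one row, located through the primary key.
    conn.execute("DELETE FROM Follow WHERE er = ? AND ed = ?", (er, ed))

def home_feed(er, limit=20):
    # Only posts whose author appears in the viewer's Follow rows, newest first.
    return conn.execute(
        """SELECT f.user_id, f.body
           FROM Follow AS fo
           JOIN Feed AS f ON f.user_id = fo.ed
           WHERE fo.er = ?
           ORDER BY f.id DESC
           LIMIT ?""",
        (er, limit),
    ).fetchall()

follow(1, 2)
follow(1, 2)  # duplicate, ignored
follow(1, 3)
conn.executemany(
    "INSERT INTO Feed (user_id, body) VALUES (?, ?)",
    [(2, "post from 2"), (3, "post from 3"), (4, "post from 4")],
)
```

Each statement filters on the leading column(s) of an index, which is why none of them needs to read the whole table.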

Notes:

  • No surrogate AUTO_INCREMENT, since there is a perfectly good PK. And the queries will run faster, as we will see in a minute.
  • Until you have 100K followers, the COUNT query is "fast enough" so that you don't need to precompute counts.

If you were to count the number of "Likes", it would be prudent to have a separate table for that frequently updated value. Such a table would be 1:1 with the User table, thereby violating the first bad practice. The justification here is to separate the very high write activity in Like from the low, but important, read activity in the rest of the "user" info.
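
A minimal sketch of such a separate counter table, again using Python's sqlite3 as a stand-in for MySQL so that it runs as-is. The table and column names (User, UserLikes, like_count) and the helpers add_like and like_count are hypothetical; the answer does not name them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Rarely written, frequently read profile data.
    CREATE TABLE User (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE
    );
    -- Frequently written counter, kept 1:1 with User on purpose.
    CREATE TABLE UserLikes (
        user_id    INTEGER PRIMARY KEY REFERENCES User (id),
        like_count INTEGER NOT NULL DEFAULT 0
    );
""")

def add_like(user_id):
    # Create the counter row on first use, then increment it,
    # both inside one transaction.
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO UserLikes (user_id) VALUES (?)", (user_id,)
        )
        conn.execute(
            "UPDATE UserLikes SET like_count = like_count + 1 WHERE user_id = ?",
            (user_id,),
        )

def like_count(user_id):
    row = conn.execute(
        "SELECT like_count FROM UserLikes WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else 0

conn.execute("INSERT INTO User (id, username) VALUES (1, 'alice')")
for _ in range(3):
    add_like(1)
```

The writes touch only the narrow UserLikes rows, so the wider User rows are left alone. In MySQL the insert-then-update pair could be written as a single INSERT ... ON DUPLICATE KEY UPDATE statement.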

For things like this, I would prefer graph databases, because the real-world problem you are trying to solve has a graph as its natural structure.

From a relational point of view, your idea looks good. I'm not quite sure whether you already have all the relations you need, but with the basic concept you're probably on the right track.

To address performance, run some tests with arbitrary test data and EXPLAIN statements. Then try adding indexes on the columns you filter on and test again. Which indexes are worth adding depends heavily on your queries, and which are better left out depends on how often and how much you update or insert. There are lots of articles that explain this better than I can, so have a look at general indexing best practices and ask about specific performance problems when they actually occur.
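
As a small illustration of that test-then-index workflow, the sketch below loads arbitrary test data and asks the database how it would run a lookup before and after an index is added. It uses Python's sqlite3 and SQLite's EXPLAIN QUERY PLAN purely as a stand-in so that it runs self-contained. MySQL's EXPLAIN reports its plan in a different format, and the table layout, index name, and plan helper here are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Deliberately created without any key or index, to show the "before" state.
conn.execute("CREATE TABLE Follow (er INTEGER NOT NULL, ed INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO Follow (er, ed) VALUES (?, ?)",
    [(er, ed) for er in range(100) for ed in range(100) if er != ed],
)

QUERY = "SELECT er FROM Follow WHERE ed = ?"

def plan(sql, params):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail text.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

before = plan(QUERY, (42,))
conn.execute("CREATE INDEX follow_ed_er ON Follow (ed, er)")
after = plan(QUERY, (42,))

print("before:", before)
print("after: ", after)
```

The first plan reports a scan of the whole table, while the second reports a search through the new index. This is the kind of difference to look for in EXPLAIN output when deciding whether an index is worth its write cost.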
