How to efficiently design MySQL database for my particular case

I am developing a forum in PHP and MySQL, and I want to make it as efficient as I can.

I have made these two tables:

  1. tbl_threads
  2. tbl_comments

Now, the problem is that there is a Like and a Dislike button under each comment. I have to store the user_name of whoever clicked Like or Dislike together with the comment_id. I have added a column user_likes and a column user_dislikes to tbl_comments to store comma-separated user_names. But on this forum, I have read that this is not an efficient approach. I have been advised to create a third table to store the likes and dislikes, so that my database design complies with 1NF.

But the problem is this: suppose I make a third table tbl_user_opinion with two fields, 1. comment_id and 2. type (like or dislike).

Will I then have to run as many SQL queries as there are comments on my page to get the like and dislike data for each comment? Would that not be inefficient? I think there is some confusion on my part here. Can someone clarify this?

You have a relational schema like this:

There are two ways to solve this. The first, the "clean" one, is to build your "like" table and run COUNT(*) queries against the appropriate column.
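As a rough sketch of this first option, the counts for every comment on a page can come back from a single grouped query, so there is no need for one query per comment. The example runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL; the thread_id and user_name columns and the sample rows are assumptions made for illustration, not part of the original tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_comments (
    comment_id INTEGER PRIMARY KEY,
    thread_id  INTEGER NOT NULL,   -- assumed column linking a comment to its thread
    body       TEXT NOT NULL
);
CREATE TABLE tbl_user_opinion (
    comment_id INTEGER NOT NULL REFERENCES tbl_comments(comment_id),
    user_name  TEXT NOT NULL,      -- assumed column: who cast the vote
    type       TEXT NOT NULL CHECK (type IN ('like', 'dislike')),
    PRIMARY KEY (comment_id, user_name)
);
INSERT INTO tbl_comments VALUES (1, 10, 'first'), (2, 10, 'second'), (3, 10, 'third');
INSERT INTO tbl_user_opinion VALUES
    (1, 'alice', 'like'), (1, 'bob', 'like'), (1, 'carol', 'dislike'),
    (2, 'alice', 'dislike');
""")


def opinion_counts(conn, thread_id):
    """Return (comment_id, likes, dislikes) for every comment in one thread."""
    return conn.execute(
        """
        SELECT c.comment_id,
               COUNT(CASE WHEN o.type = 'like' THEN 1 END)    AS likes,
               COUNT(CASE WHEN o.type = 'dislike' THEN 1 END) AS dislikes
        FROM tbl_comments c
        LEFT JOIN tbl_user_opinion o ON o.comment_id = c.comment_id
        WHERE c.thread_id = ?
        GROUP BY c.comment_id
        ORDER BY c.comment_id
        """,
        (thread_id,),
    ).fetchall()


counts = opinion_counts(conn, 10)
for comment_id, likes, dislikes in counts:
    print(comment_id, likes, dislikes)
```

The LEFT JOIN keeps comments that nobody has voted on, and COUNT skips the NULLs produced by the CASE expression, so such comments report zero for both columns.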

The second is to store a counter in each comment, recording how many ups and downs it has received. If you want to check whether a specific user has voted on a comment, you only have to look up one entry. You can easily handle that as a separate query and merge the two results outside your database (for this, use a query that returns the comment_id and the kind of vote the user has cast for each comment in a specific thread).
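One way the second option might look in practice: two queries per page, one for the comments with their cached counters and one for the current user's votes in that thread, merged in application code. This is only a sketch using Python's sqlite3 module in place of PHP and MySQL; the ups and downs columns, the thread_id and user_name columns, and the helper name comments_with_user_vote are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_comments (
    comment_id INTEGER PRIMARY KEY,
    thread_id  INTEGER NOT NULL,
    ups        INTEGER NOT NULL DEFAULT 0,  -- cached counters (assumed names)
    downs      INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE tbl_user_opinion (
    comment_id INTEGER NOT NULL,
    user_name  TEXT NOT NULL,
    type       TEXT NOT NULL,
    PRIMARY KEY (comment_id, user_name)
);
INSERT INTO tbl_comments VALUES (1, 10, 2, 1), (2, 10, 0, 1);
INSERT INTO tbl_user_opinion VALUES
    (1, 'alice', 'like'), (1, 'bob', 'like'), (1, 'carol', 'dislike'),
    (2, 'alice', 'dislike');
""")


def comments_with_user_vote(conn, thread_id, user_name):
    """Two queries per page, regardless of how many comments it shows."""
    comments = conn.execute(
        "SELECT comment_id, ups, downs FROM tbl_comments "
        "WHERE thread_id = ? ORDER BY comment_id",
        (thread_id,),
    ).fetchall()
    # Map comment_id -> 'like' / 'dislike' for this user within the thread.
    votes = dict(
        conn.execute(
            """
            SELECT o.comment_id, o.type
            FROM tbl_user_opinion o
            JOIN tbl_comments c ON c.comment_id = o.comment_id
            WHERE c.thread_id = ? AND o.user_name = ?
            """,
            (thread_id, user_name),
        ).fetchall()
    )
    # Merge outside the database; None means the user has not voted.
    return [
        {"comment_id": cid, "ups": ups, "downs": downs, "my_vote": votes.get(cid)}
        for cid, ups, downs in comments
    ]


page = comments_with_user_vote(conn, 10, "bob")
print(page)
```

The counters have to be updated whenever a vote is added or removed, otherwise they drift from the rows in the opinion table.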

Your approach with a comma-separated list does not perform well, because you cannot query it without extra application logic or a large amount of string parsing. If you have a database, use it!

("One Information - One Dataset"!)

The comma-separated list violates the principle of atomicity, and therefore 1NF. You'll have a hard time maintaining referential integrity and, for the most part, querying as well.
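To make the integrity point concrete, here is a small sketch of what a separate opinion table can enforce that a comma-separated column cannot: one vote per user per comment, no votes on comments that do not exist, and automatic cleanup when a comment is deleted. It uses Python's sqlite3 module as a stand-in for MySQL; the user_name column, the ON DELETE CASCADE rule, and the helper try_insert are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE tbl_comments (
    comment_id INTEGER PRIMARY KEY,
    body       TEXT NOT NULL
);
CREATE TABLE tbl_user_opinion (
    comment_id INTEGER NOT NULL
        REFERENCES tbl_comments(comment_id) ON DELETE CASCADE,
    user_name  TEXT NOT NULL,
    type       TEXT NOT NULL CHECK (type IN ('like', 'dislike')),
    PRIMARY KEY (comment_id, user_name)  -- one vote per user per comment
);
INSERT INTO tbl_comments VALUES (1, 'hello');
INSERT INTO tbl_user_opinion VALUES (1, 'alice', 'like');
""")


def try_insert(conn, comment_id, user_name, vote_type):
    """Attempt to record a vote; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO tbl_user_opinion VALUES (?, ?, ?)",
                (comment_id, user_name, vote_type),
            )
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert(conn, 1, "alice", "dislike")  # alice already voted
orphan_ok = try_insert(conn, 99, "bob", "like")         # comment 99 does not exist
valid_ok = try_insert(conn, 1, "bob", "like")

with conn:
    conn.execute("DELETE FROM tbl_comments WHERE comment_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM tbl_user_opinion").fetchone()[0]
print(duplicate_ok, orphan_ok, valid_ok, remaining)
```

With comma-separated names in a text column, each of these rules would have to be re-implemented and policed in application code.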

Here is one way to do it in a normalized fashion:

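[Diagram not preserved: a COMMENT table plus UP_VOTE and DOWN_VOTE tables, each keyed on {COMMENT_ID, USER_ID}]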

This is very clustering-friendly: it groups up-votes belonging to the same comment physically close together (ditto for down-votes), making the following query rather efficient:

SELECT
    COMMENT.COMMENT_ID,
    <other COMMENT fields>,
    COUNT(DISTINCT UP_VOTE.USER_ID) - COUNT(DISTINCT DOWN_VOTE.USER_ID) SCORE
FROM COMMENT
    LEFT JOIN UP_VOTE
        ON COMMENT.COMMENT_ID = UP_VOTE.COMMENT_ID
    LEFT JOIN DOWN_VOTE
        ON COMMENT.COMMENT_ID = DOWN_VOTE.COMMENT_ID
WHERE
    COMMENT.COMMENT_ID = <whatever>
GROUP BY
    COMMENT.COMMENT_ID,
    <other COMMENT fields>;

[SQL Fiddle]
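For anyone who wants to try the query without the fiddle, the following is a small self-contained sketch run through Python's sqlite3 module rather than MySQL. The table definitions are a guess reconstructed from the query and the key described in this answer, the BODY column and the sample rows are invented, and the WHERE clause is dropped so that every comment is scored in one pass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE COMMENT (
    COMMENT_ID INTEGER PRIMARY KEY,
    BODY       TEXT NOT NULL           -- stands in for the other COMMENT fields
);
-- WITHOUT ROWID makes SQLite store rows in primary-key order, loosely
-- analogous to a clustered primary key in InnoDB.
CREATE TABLE UP_VOTE (
    COMMENT_ID INTEGER NOT NULL REFERENCES COMMENT(COMMENT_ID),
    USER_ID    INTEGER NOT NULL,
    PRIMARY KEY (COMMENT_ID, USER_ID)
) WITHOUT ROWID;
CREATE TABLE DOWN_VOTE (
    COMMENT_ID INTEGER NOT NULL REFERENCES COMMENT(COMMENT_ID),
    USER_ID    INTEGER NOT NULL,
    PRIMARY KEY (COMMENT_ID, USER_ID)
) WITHOUT ROWID;
INSERT INTO COMMENT VALUES (1, 'first'), (2, 'second'), (3, 'third');
INSERT INTO UP_VOTE VALUES (1, 1), (1, 2), (1, 3);
INSERT INTO DOWN_VOTE VALUES (1, 4), (2, 1);
""")

scores = conn.execute(
    """
    SELECT c.COMMENT_ID,
           c.BODY,
           COUNT(DISTINCT u.USER_ID) - COUNT(DISTINCT d.USER_ID) AS SCORE
    FROM COMMENT c
        LEFT JOIN UP_VOTE u   ON c.COMMENT_ID = u.COMMENT_ID
        LEFT JOIN DOWN_VOTE d ON c.COMMENT_ID = d.COMMENT_ID
    GROUP BY c.COMMENT_ID, c.BODY
    ORDER BY c.COMMENT_ID
    """
).fetchall()

for row in scores:
    print(row)
```

The DISTINCT matters here: joining both vote tables to the same comment multiplies the rows (three up-votes times one down-vote gives three joined rows for comment 1), and counting distinct users undoes that multiplication.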

Please measure on realistic amounts of data whether that works fast enough for you. If not, denormalize the model and cache the total score in the COMMENT table, and keep it current through triggers every time a row is inserted into or deleted from the *_VOTE tables.
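A sketch of how such triggers might look, again using Python's sqlite3 module as a stand-in. The SCORE column and the trigger names are hypothetical, and MySQL's trigger syntax differs in some details from SQLite's, so treat this as an outline of the idea rather than something to paste into MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE COMMENT (
    COMMENT_ID INTEGER PRIMARY KEY,
    SCORE      INTEGER NOT NULL DEFAULT 0   -- cached total (assumed column name)
);
CREATE TABLE UP_VOTE (
    COMMENT_ID INTEGER NOT NULL,
    USER_ID    INTEGER NOT NULL,
    PRIMARY KEY (COMMENT_ID, USER_ID)
);
CREATE TABLE DOWN_VOTE (
    COMMENT_ID INTEGER NOT NULL,
    USER_ID    INTEGER NOT NULL,
    PRIMARY KEY (COMMENT_ID, USER_ID)
);

-- An up-vote adds 1 when inserted and gives it back when deleted.
CREATE TRIGGER UP_VOTE_AI AFTER INSERT ON UP_VOTE FOR EACH ROW
BEGIN
    UPDATE COMMENT SET SCORE = SCORE + 1 WHERE COMMENT_ID = NEW.COMMENT_ID;
END;
CREATE TRIGGER UP_VOTE_AD AFTER DELETE ON UP_VOTE FOR EACH ROW
BEGIN
    UPDATE COMMENT SET SCORE = SCORE - 1 WHERE COMMENT_ID = OLD.COMMENT_ID;
END;

-- A down-vote does the opposite.
CREATE TRIGGER DOWN_VOTE_AI AFTER INSERT ON DOWN_VOTE FOR EACH ROW
BEGIN
    UPDATE COMMENT SET SCORE = SCORE - 1 WHERE COMMENT_ID = NEW.COMMENT_ID;
END;
CREATE TRIGGER DOWN_VOTE_AD AFTER DELETE ON DOWN_VOTE FOR EACH ROW
BEGIN
    UPDATE COMMENT SET SCORE = SCORE + 1 WHERE COMMENT_ID = OLD.COMMENT_ID;
END;

INSERT INTO COMMENT (COMMENT_ID) VALUES (1);
""")


def score(conn, comment_id):
    """Read the cached score straight from the COMMENT row."""
    row = conn.execute(
        "SELECT SCORE FROM COMMENT WHERE COMMENT_ID = ?", (comment_id,)
    ).fetchone()
    return row[0]


conn.executemany("INSERT INTO UP_VOTE VALUES (1, ?)", [(1,), (2,), (3,)])
conn.execute("INSERT INTO DOWN_VOTE VALUES (1, 4)")
score_after_votes = score(conn, 1)

conn.execute("DELETE FROM UP_VOTE WHERE COMMENT_ID = 1 AND USER_ID = 3")
score_after_retract = score(conn, 1)
print(score_after_votes, score_after_retract)
```

Reading a comment's score then costs nothing beyond reading the comment itself, at the price of one extra UPDATE per vote.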

If you also need to find which comments a particular user voted on, you'll need indexes on *_VOTE {USER_ID, COMMENT_ID}, i.e. the reverse of the primary/clustering key above. [1]
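As an illustration of the reverse lookup, the sketch below adds the secondary indexes and fetches everything one user has voted on. It runs on Python's sqlite3 module instead of MySQL; the index names, the helper votes_by_user, and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UP_VOTE (
    COMMENT_ID INTEGER NOT NULL,
    USER_ID    INTEGER NOT NULL,
    PRIMARY KEY (COMMENT_ID, USER_ID)
);
CREATE TABLE DOWN_VOTE (
    COMMENT_ID INTEGER NOT NULL,
    USER_ID    INTEGER NOT NULL,
    PRIMARY KEY (COMMENT_ID, USER_ID)
);
-- Reverse of the primary key: leads with USER_ID so per-user lookups can seek.
CREATE INDEX UP_VOTE_BY_USER   ON UP_VOTE   (USER_ID, COMMENT_ID);
CREATE INDEX DOWN_VOTE_BY_USER ON DOWN_VOTE (USER_ID, COMMENT_ID);

INSERT INTO UP_VOTE VALUES (1, 1), (1, 2), (3, 1);
INSERT INTO DOWN_VOTE VALUES (2, 1), (1, 4);
""")


def votes_by_user(conn, user_id):
    """Return (comment_id, 'up' | 'down') for every vote cast by one user."""
    return conn.execute(
        """
        SELECT COMMENT_ID, 'up' AS VOTE FROM UP_VOTE WHERE USER_ID = ?
        UNION ALL
        SELECT COMMENT_ID, 'down' AS VOTE FROM DOWN_VOTE WHERE USER_ID = ?
        ORDER BY COMMENT_ID
        """,
        (user_id, user_id),
    ).fetchall()


user_1_votes = votes_by_user(conn, 1)
print(user_1_votes)
```

Both columns the query needs are in the index, so each half of the UNION can be answered from the index alone without touching the table rows.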


[1] This is one of the reasons why I didn't go with just one VOTE table containing an additional field that can be either 1 (for up-vote) or -1 (for down-vote): it's less efficient to cover with secondary indexes.
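For comparison, the single-table alternative mentioned in the footnote might look like the sketch below (Python's sqlite3 module as a stand-in; the column name VOTE_VALUE and the index name are hypothetical). One plausible reading of the footnote is that a per-user index must then carry the extra vote column to stay covering, which makes it wider than the two-column indexes on separate tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VOTE (
    COMMENT_ID INTEGER NOT NULL,
    USER_ID    INTEGER NOT NULL,
    VOTE_VALUE INTEGER NOT NULL CHECK (VOTE_VALUE IN (1, -1)),
    PRIMARY KEY (COMMENT_ID, USER_ID)
);
-- To answer "what did this user vote?" from the index alone, the index
-- has to include VOTE_VALUE as a third column.
CREATE INDEX VOTE_BY_USER ON VOTE (USER_ID, COMMENT_ID, VOTE_VALUE);

INSERT INTO VOTE VALUES (1, 1, 1), (1, 2, 1), (1, 3, 1), (1, 4, -1), (2, 1, -1);
""")

# The score becomes a plain SUM, with no double join and no DISTINCT.
scores = conn.execute(
    "SELECT COMMENT_ID, SUM(VOTE_VALUE) FROM VOTE "
    "GROUP BY COMMENT_ID ORDER BY COMMENT_ID"
).fetchall()

user_1_votes = conn.execute(
    "SELECT COMMENT_ID, VOTE_VALUE FROM VOTE WHERE USER_ID = ? ORDER BY COMMENT_ID",
    (1,),
).fetchall()
print(scores, user_1_votes)
```

So the trade-off runs both ways: the single table gives a simpler score query, while the two-table design keeps the secondary indexes narrower.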
