How To Design A Database for a “Check In” Social Service

I want to build a "check in" service like FourSquare or Untappd.

How do I design a suitable database schema for storing check-ins?

For example, suppose I'm developing "CheeseSquare" to help people keep track of the delicious cheeses they've tried.

The table for the items into which one can check in is fairly simple and would look like

+----+---------+---------+-------------+--------+
| ID |  Name   | Country |    Style    | Colour |
+----+---------+---------+-------------+--------+
|  1 | Brie    | France  | Soft        | White  |
|  2 | Cheddar | UK      | Traditional | Yellow |
+----+---------+---------+-------------+--------+

I would also have a table for the users, say

+-----+------+---------------+----------------+
| ID  | Name | Twitter Token | Facebook Token |
+-----+------+---------------+----------------+
| 345 | Anne | qwerty        | poiuyt         |
| 678 | Bob  | asdfg         | mnbvc          |
+-----+------+---------------+----------------+

What's the best way of recording that a user has checked in to a particular cheese?

For example, I want to record how many French cheeses Anne has checked in to, which cheeses Bob has checked in to, whether Cersei has eaten Camembert more than 5 times, and so on.

Am I best putting this information in the user's table? Eg

+-----+------+------+--------+------+------+---------+---------+
| ID  | Name | Blue | Yellow | Soft | Brie | Cheddar | Stilton |
+-----+------+------+--------+------+------+---------+---------+
| 345 | Anne |    1 |      0 |    2 |    1 |       0 |       5 |
| 678 | Bob  |    3 |      1 |    1 |    1 |       1 |       2 |
+-----+------+------+--------+------+------+---------+---------+

That looks rather ungainly and hard to maintain. So should I have a separate table for recording check-ins?

No, don't put it into the users table. That information is better stored in a join table which represents a many-to-many relationship between users and cheeses.

The join table (we'll call it cheeses_users) must have at least two columns (user_ID, cheese_ID), but a third (a timestamp) would be useful too. If you default the timestamp column to CURRENT_TIMESTAMP, you need only insert the user_ID and cheese_ID into the table to log a checkin.

cheeses (ID) ⇒ (cheese_ID) cheeses_users (user_ID) ⇐ users (ID)

Created as:

CREATE TABLE cheeses_users (
  cheese_ID INT NOT NULL,
  user_ID INT NOT NULL,
  -- timestamp defaults to current time
  checkin_time DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  -- (add any other columns *specific to* this checkin (user+cheese+time))
  -- The primary key is the combination of all 3.
  -- It becomes impossible for the same user to log the same cheese
  -- at the same second in time...
  PRIMARY KEY (cheese_ID, user_ID, checkin_time),
  -- FOREIGN KEYs to your other tables
  FOREIGN KEY (cheese_ID) REFERENCES cheeses (ID),
  FOREIGN KEY (user_ID) REFERENCES users (ID)
) ENGINE=InnoDB; -- InnoDB is necessary for the FKs to be honored and useful

To log a checkin for Bob & Cheddar, insert with:

INSERT INTO cheeses_users (cheese_ID, user_ID) VALUES (2, 678);

To query them, you join through this table. For example, to see the number of checkins per cheese for each user, you might use:

SELECT
  u.Name AS username,
  c.Name AS cheesename,
  COUNT(*) AS num_checkins
FROM
  users u
  JOIN cheeses_users cu ON u.ID = cu.user_ID
  JOIN cheeses c ON cu.cheese_ID = c.ID
GROUP BY
  u.Name,
  c.Name

To get the 5 most recent checkins for a given user, something like:

SELECT
  c.Name AS cheesename,
  cu.checkin_time
FROM
  cheeses_users cu
  JOIN cheeses c ON cu.cheese_ID = c.ID
WHERE 
  -- Limit to Anne's checkins...
  cu.user_ID = 345
ORDER BY checkin_time DESC
LIMIT 5
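
The two examples from the question (how many French cheeses Anne has checked in to, and whether someone has eaten a given cheese more than 5 times) can be answered with the same joins. Below is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory SQLite database standing in for MySQL. The helper names checkins_by_country and checked_in_more_than and the sample checkin rows are hypothetical, made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table and column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cheeses (ID INTEGER PRIMARY KEY, Name TEXT, Country TEXT, Style TEXT, Colour TEXT);
CREATE TABLE users (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE cheeses_users (
  cheese_ID INTEGER NOT NULL REFERENCES cheeses (ID),
  user_ID INTEGER NOT NULL REFERENCES users (ID),
  checkin_time TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (cheese_ID, user_ID, checkin_time)
);
INSERT INTO cheeses VALUES (1, 'Brie', 'France', 'Soft', 'White'),
                           (2, 'Cheddar', 'UK', 'Traditional', 'Yellow');
INSERT INTO users VALUES (345, 'Anne'), (678, 'Bob');
""")

# Hypothetical sample checkins, with explicit times so the primary key never collides.
conn.executemany(
    "INSERT INTO cheeses_users (cheese_ID, user_ID, checkin_time) VALUES (?, ?, ?)",
    [
        (1, 345, "2014-05-04 19:04:38"),
        (1, 345, "2014-05-05 12:00:00"),
        (2, 345, "2014-05-08 19:04:38"),
        (1, 678, "2014-05-09 19:04:38"),
    ],
)


def checkins_by_country(conn, user_name, country):
    """Return (total checkins, distinct cheeses) for a user and a country of origin."""
    return conn.execute(
        """
        SELECT COUNT(*), COUNT(DISTINCT c.ID)
        FROM cheeses_users cu
          JOIN cheeses c ON cu.cheese_ID = c.ID
          JOIN users u ON cu.user_ID = u.ID
        WHERE u.Name = ? AND c.Country = ?
        """,
        (user_name, country),
    ).fetchone()


def checked_in_more_than(conn, user_name, cheese_name, n):
    """Return True if the user has checked in to the named cheese more than n times."""
    (count,) = conn.execute(
        """
        SELECT COUNT(*)
        FROM cheeses_users cu
          JOIN cheeses c ON cu.cheese_ID = c.ID
          JOIN users u ON cu.user_ID = u.ID
        WHERE u.Name = ? AND c.Name = ?
        """,
        (user_name, cheese_name),
    ).fetchone()
    return count > n


print(checkins_by_country(conn, "Anne", "France"))
print(checked_in_more_than(conn, "Anne", "Brie", 5))
```

Whether "how many French cheeses" means total checkins or distinct cheeses is left open in the question, so the sketch returns both.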

Let's define the problem more clearly, so you can tell me if I'm wrong:

  • Cheese instances exist and aren't divisible ("Cheddar/UK/Traditional/Yellow" is a valid checkinable cheese, but "Cheddar" isn't, nor is "Yellow" or "Cheddar/France/...")
  • Users check into a single cheese instance at a given time
  • Users can re-check into the same cheese instance at a later date.

If this is the case, then to store fully normalized data, and to be able to retrieve that data's history, you need a third relational table linking the two existing tables.

+-----+-----------+---------------------+
| uid | cheese_id | timestamp           |
+-----+-----------+---------------------+
| 345 | 1         | 2014-05-04 19:04:38 |
| 345 | 2         | 2014-05-08 19:04:38 |
| 678 | 1         | 2014-05-09 19:04:38 |
+-----+-----------+---------------------+

etc. You can add extra columns to correspond to the cheese data, but strictly speaking you don't need to.

By putting all this in a third table, you potentially improve both performance and flexibility. You can always reconstruct the additions to the users table you mooted, using aggregate queries.
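
As a rough illustration of rebuilding the wide per-user layout from the linking table, here is a minimal sketch using conditional aggregation. It assumes Python's built-in sqlite3 module with an in-memory database. The table name checkins and the column name checkin_time are assumptions for illustration (the table above is unnamed and calls its third column timestamp), and the three rows are the ones shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cheeses (ID INTEGER PRIMARY KEY, Name TEXT, Country TEXT, Style TEXT, Colour TEXT);
CREATE TABLE users (ID INTEGER PRIMARY KEY, Name TEXT);
-- Hypothetical name for the linking table described above.
CREATE TABLE checkins (uid INTEGER NOT NULL, cheese_id INTEGER NOT NULL, checkin_time TEXT NOT NULL);
INSERT INTO cheeses VALUES (1, 'Brie', 'France', 'Soft', 'White'),
                           (2, 'Cheddar', 'UK', 'Traditional', 'Yellow');
INSERT INTO users VALUES (345, 'Anne'), (678, 'Bob');
INSERT INTO checkins VALUES (345, 1, '2014-05-04 19:04:38'),
                            (345, 2, '2014-05-08 19:04:38'),
                            (678, 1, '2014-05-09 19:04:38');
""")

# One SUM(CASE ...) per column of the wide table; LEFT JOINs keep users with no checkins.
wide_rows = conn.execute("""
SELECT
  u.ID,
  u.Name,
  SUM(CASE WHEN c.Style = 'Soft' THEN 1 ELSE 0 END) AS Soft,
  SUM(CASE WHEN c.Name = 'Brie' THEN 1 ELSE 0 END) AS Brie,
  SUM(CASE WHEN c.Name = 'Cheddar' THEN 1 ELSE 0 END) AS Cheddar
FROM users u
  LEFT JOIN checkins ci ON ci.uid = u.ID
  LEFT JOIN cheeses c ON c.ID = ci.cheese_id
GROUP BY u.ID, u.Name
ORDER BY u.ID
""").fetchall()

for row in wide_rows:
    print(row)
```

The column list is fixed in the query, so a new cheese or style means editing the query rather than altering the users table.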

If you really decide you don't need the timestamps, then you'd replace them with basically the equivalent of a COUNT(*) field:

+-----+-----------+--------------+
| uid | cheese_id | num_checkins |
+-----+-----------+--------------+
| 345 | 1         | 15           |
| 345 | 2         | 3            |
| 678 | 1         | 8            |
+-----+-----------+--------------+

That would dramatically reduce the size of your joining table, although obviously there's less of a "paper trail", should you need to reconstruct your data (and possibly say to a user "oh, yeah, we forgot to record your checkin on such-a-date.")
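
With the counter variant, logging a checkin becomes "increment the row if it exists, otherwise create it". Here is a minimal sketch of that, assuming Python's built-in sqlite3 module and an in-memory database. The table name checkin_counts and the helper record_checkin are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE checkin_counts (
  uid INTEGER NOT NULL,
  cheese_id INTEGER NOT NULL,
  num_checkins INTEGER NOT NULL DEFAULT 0,
  PRIMARY KEY (uid, cheese_id)
)
""")


def record_checkin(conn, uid, cheese_id):
    """Increment the counter for (uid, cheese_id), creating the row on first checkin."""
    cur = conn.execute(
        "UPDATE checkin_counts SET num_checkins = num_checkins + 1 "
        "WHERE uid = ? AND cheese_id = ?",
        (uid, cheese_id),
    )
    if cur.rowcount == 0:
        # No existing row was updated, so this is the first checkin for the pair.
        conn.execute(
            "INSERT INTO checkin_counts (uid, cheese_id, num_checkins) VALUES (?, ?, 1)",
            (uid, cheese_id),
        )


for _ in range(3):
    record_checkin(conn, 345, 1)
record_checkin(conn, 678, 1)

counts = conn.execute(
    "SELECT uid, cheese_id, num_checkins FROM checkin_counts ORDER BY uid, cheese_id"
).fetchall()
print(counts)
```

This update-then-insert pattern is only a sketch: with several writers it would need a transaction or the database's own upsert syntax to avoid a race between the two statements.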

The entities 'User' and 'Cheese' have a many-to-many relationship. A user can check in to multiple cheeses, and a cheese can have multiple users who checked in to it.

The only right way to design this in a relational database is to store it in a separate table. There are many reasons why storing it in the users table, for instance, is a very bad idea. Read up on database normalization for more info on this.

Your table should look something like this:

CheckIns(CheeseId, UserId, (etc...))

Other useful columns might include date or rating, or whatever you want to store about a particular relationship between a user and a cheese.
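
To make the extra-column idea concrete, here is a minimal sketch of a CheckIns table carrying a date and a rating, plus an aggregate over the rating. It assumes Python's built-in sqlite3 module and an in-memory database. The column names CheckedInAt and Rating, the 1 to 5 scale, and the sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CheckIns (
  CheeseId INTEGER NOT NULL,
  UserId INTEGER NOT NULL,
  CheckedInAt TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
  -- Assumed 1-5 scale; NULL means the user did not rate this checkin.
  Rating INTEGER CHECK (Rating BETWEEN 1 AND 5)
);
-- Hypothetical sample rows.
INSERT INTO CheckIns (CheeseId, UserId, CheckedInAt, Rating) VALUES
  (1, 345, '2014-05-04 19:04:38', 4),
  (1, 678, '2014-05-09 19:04:38', 5),
  (2, 345, '2014-05-08 19:04:38', 3);
""")

# Average rating and number of checkins per cheese.
ratings = conn.execute("""
SELECT CheeseId, AVG(Rating) AS avg_rating, COUNT(*) AS num_checkins
FROM CheckIns
GROUP BY CheeseId
ORDER BY CheeseId
""").fetchall()
print(ratings)
```

Because the rating belongs to one user's checkin of one cheese, it sits on the CheckIns row rather than on the user or the cheese.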
