How should I design the database structure for this problem?

I am rebuilding the back-end system of a site with a lot of traffic.

This is the core of the application, and the way I build this part of the database is critical for a big chunk of code and upcoming work. The system described below will have to run millions of times each day. I would appreciate any input on the issue.

The background is that a user can add what he or she has been eating during the day.

Simplified, the process is more or less this:

  1. The user arrives at the site and the site lists his/her choices for the day (if entered before, as the steps below describe).
  2. The user can add a meal (consisting of one or more food items and their quantities). The meal is added through a search field and is organized into types (like 'Breakfast', 'Lunch').
  3. During the meal-building process, a list of the most commonly used food items (primarily by this user, but secondarily also by all users) will be shown for quick selection.
  4. The meals will be stored in a FoodLog table that consists of something like this: id, user_id, date, type, food_data.

What I currently have is a huge database of food items against which the search will be performed. The food items are stored with information on both the common name (like "pork cutlets") and the producer (like "coca cola"), along with other detailed information needed.

Question summary:

My problem is that I do not know the best way to store the data so that it is easily accessible in the way I need it, without the database getting out of hand.

Consider 1 million users adding 1 to 7 meals each day. Storing each food item for each meal, each day and each user would potentially create (1*avg_num_meals*avg_num_food_items) million rows each day.
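
To get a feel for the scale, the estimate above can be worked through with placeholder averages. The figures of 4 meals per user and 5 items per meal are assumptions picked only for illustration; they are not measurements from the site.

```python
# Back-of-the-envelope estimate for storing one row per food item per meal.
# avg_num_meals and avg_num_food_items are hypothetical placeholder values.
users = 1_000_000
avg_num_meals = 4        # assumed; the question only says 1 to 7
avg_num_food_items = 5   # assumed; not stated in the question

rows_per_day = users * avg_num_meals * avg_num_food_items
rows_per_year = rows_per_day * 365

print(f"{rows_per_day:,} rows per day")
print(f"{rows_per_year:,} rows per year")
```

Swapping in real averages from the existing logs would give a more useful number.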

Storing the data in some compressed way (like the food_data is a json_encoded string) would lessen the number of rows significantly, but at the same time make it hard to create the 'most used food items' list and other statistics on the fly.

Should the table be split into several tables? If this is the case, how would they interact?

The site is currently hosted on a mid-range CDN and is using a LAMP (Linux, Apache, MySQL, PHP) backbone.

Roughly, you want a fully normalized data structure for this. You want to have one table for Users, one table for Meals (one entry per meal, with a reference to User; you probably also want to have a time / date of the meal in this table), and a table for MealItems, which is simply an association table between the Meals and Food Items tables.

So when a User comes in and creates an account, you make an entry in the Users table. When a user reports a Meal they've eaten, you create a record in the Meals table, and a record in the MealItems table for every item they reported.
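
As a rough sketch of that layout, the snippet below creates the four tables and records one meal. It uses Python's built-in sqlite3 module so it runs without a server; the site itself is on MySQL, and the table names, column names (`eaten_on`, `type`, `quantity`) and the `log_meal` helper are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema; SQLite stands in for MySQL here.
SCHEMA = """
CREATE TABLE users      (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE food_items (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE meals (
    id       INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL REFERENCES users(id),
    eaten_on TEXT    NOT NULL,   -- ISO date, e.g. '2024-01-15'
    type     TEXT    NOT NULL    -- 'Breakfast', 'Lunch', ...
);
CREATE TABLE meal_items (
    id           INTEGER PRIMARY KEY,
    meal_id      INTEGER NOT NULL REFERENCES meals(id),
    food_item_id INTEGER NOT NULL REFERENCES food_items(id),
    quantity     REAL    NOT NULL
);
"""

def log_meal(conn, user_id, eaten_on, meal_type, items):
    """Insert one meals row plus one meal_items row per (food_item_id, quantity)."""
    with conn:  # commits on success, rolls back if an insert fails
        cur = conn.execute(
            "INSERT INTO meals (user_id, eaten_on, type) VALUES (?, ?, ?)",
            (user_id, eaten_on, meal_type),
        )
        meal_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO meal_items (meal_id, food_item_id, quantity) VALUES (?, ?, ?)",
            [(meal_id, food_id, qty) for food_id, qty in items],
        )
    return meal_id

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.executemany("INSERT INTO food_items (id, name) VALUES (?, ?)",
                 [(1, "pork cutlets"), (2, "coca cola")])
meal_id = log_meal(conn, 1, "2024-01-15", "Lunch", [(1, 1.0), (2, 2.0)])
```

Wrapping both inserts in one transaction keeps a meal from being stored without its items.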

This structure makes it straightforward to have a variable number of items with every meal, without wasting a lot of space. You can determine how often items appear in meals with a relatively simple query, as well as the total set of items any one user has consumed in any given timespan.
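
The two queries mentioned above might look roughly like this. The sketch again uses sqlite3 in place of MySQL, and the table names, column names, sample rows and the `most_used` / `items_eaten` helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE food_items (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE meals      (id INTEGER PRIMARY KEY, user_id INTEGER, eaten_on TEXT);
CREATE TABLE meal_items (meal_id INTEGER, food_item_id INTEGER);
INSERT INTO food_items VALUES (1, 'pork cutlets'), (2, 'coca cola'), (3, 'rice');
INSERT INTO meals VALUES (1, 1, '2024-01-15'), (2, 1, '2024-01-16'), (3, 2, '2024-01-16');
INSERT INTO meal_items VALUES (1, 1), (1, 2), (2, 2), (3, 3), (3, 2);
""")

def most_used(conn, user_id=None, limit=10):
    """Food items ranked by how many meal_items rows reference them.
    With user_id set, only that user's meals are counted."""
    sql = """
        SELECT f.name, COUNT(*) AS uses
        FROM meal_items mi
        JOIN meals m      ON m.id = mi.meal_id
        JOIN food_items f ON f.id = mi.food_item_id
        WHERE (:uid IS NULL OR m.user_id = :uid)
        GROUP BY f.id
        ORDER BY uses DESC, f.name
        LIMIT :lim
    """
    return conn.execute(sql, {"uid": user_id, "lim": limit}).fetchall()

def items_eaten(conn, user_id, start, end):
    """Distinct food names a user logged between two ISO dates (inclusive)."""
    sql = """
        SELECT DISTINCT f.name
        FROM meal_items mi
        JOIN meals m      ON m.id = mi.meal_id
        JOIN food_items f ON f.id = mi.food_item_id
        WHERE m.user_id = ? AND m.eaten_on BETWEEN ? AND ?
        ORDER BY f.name
    """
    return [row[0] for row in conn.execute(sql, (user_id, start, end))]

print(most_used(conn, user_id=1))
print(most_used(conn))
print(items_eaten(conn, 1, "2024-01-15", "2024-01-15"))
```

For the quick-selection list in the question, one option is to call `most_used` for the current user first and fill any remaining slots from the global ranking.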

This normalized table structure will support a VERY large number of records and support a large number of queries against the database.

First,

Storing the data in some compressed way (like the food_data is a json_encoded string)

is not a recommended idea. This will cause you countless headaches in the future as new requirements are added.
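
One such headache, sketched under assumptions: if each FoodLog row keeps its items in a JSON string, a "most used items" count has to fetch and decode every row in application code. The `food_log` layout and the JSON shape below are hypothetical, and sqlite3 stands in for MySQL.

```python
import json
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE food_log (id INTEGER PRIMARY KEY, user_id INTEGER, food_data TEXT)"
)
conn.executemany(
    "INSERT INTO food_log (user_id, food_data) VALUES (?, ?)",
    [
        (1, json.dumps([{"food_item_id": 1, "quantity": 1},
                        {"food_item_id": 2, "quantity": 2}])),
        (1, json.dumps([{"food_item_id": 2, "quantity": 1}])),
        (2, json.dumps([{"food_item_id": 2, "quantity": 1},
                        {"food_item_id": 3, "quantity": 1}])),
    ],
)

# Counting items this way means reading and decoding every row in the table.
usage = Counter()
for (blob,) in conn.execute("SELECT food_data FROM food_log"):
    for entry in json.loads(blob):
        usage[entry["food_item_id"]] += 1

top = usage.most_common(1)
print(top)
```

With one row per meal item, the same count becomes a plain GROUP BY that the database can answer itself.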

You should definitely have a few tables here.

Users
id, etc

Food Items
id, name, description, etc

Meals
id, user_id, category, etc

Meal Items
id, food_item_id, meal_id

The Meal Items would tie the Meals to the Food Items using ids. The Meals would be tied to Users using ids. This makes it simple to use joins in order to get detailed lists of data: totals, averages, etc. If the fields are properly indexed, this should be a great model to support a large number of records.
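
A minimal sketch of such a join, using the Meals and Meal Items columns listed above. It runs on sqlite3 rather than MySQL, and the sample rows and the choice of statistics (total items and average items per meal, per user) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meals      (id INTEGER PRIMARY KEY, user_id INTEGER, category TEXT);
CREATE TABLE meal_items (id INTEGER PRIMARY KEY, food_item_id INTEGER, meal_id INTEGER);
INSERT INTO meals VALUES (1, 1, 'Breakfast'), (2, 1, 'Lunch'), (3, 2, 'Lunch');
INSERT INTO meal_items (food_item_id, meal_id)
VALUES (1, 1), (2, 1), (3, 1), (2, 2), (2, 3), (3, 3);
""")

# Total logged items and average items per meal, for each user.
rows = conn.execute("""
    SELECT m.user_id,
           COUNT(mi.id)                              AS total_items,
           COUNT(mi.id) * 1.0 / COUNT(DISTINCT m.id) AS avg_items_per_meal
    FROM meals m
    JOIN meal_items mi ON mi.meal_id = m.id
    GROUP BY m.user_id
    ORDER BY m.user_id
""").fetchall()

for user_id, total_items, avg_items in rows:
    print(user_id, total_items, avg_items)
```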

In addition to what's been said:

  • Be judicious in your use of indexes. Properly applying these to your database could significantly speed up read access to your tables.
  • Consider using database-specific features to minimize space. You mention that you're using MySQL; consider using ENUM when appropriate (food types, meal types) to minimize database size and to simplify management.
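
To illustrate the ENUM suggestion, here is a sketch that restricts the meal type to a fixed set of values. SQLite, used here so the example runs without a server, has no ENUM type, so a CHECK constraint stands in for it and does not give ENUM's compact storage; the MySQL column definition is shown in a comment. The list of meal types beyond 'Breakfast' and 'Lunch' is an assumption.

```python
import sqlite3

# MySQL equivalent of the type column below (not executed here):
#   type ENUM('Breakfast', 'Lunch', 'Dinner', 'Snack') NOT NULL
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE meals (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        type    TEXT NOT NULL
                CHECK (type IN ('Breakfast', 'Lunch', 'Dinner', 'Snack'))
    )
""")

# A value from the allowed set is accepted.
conn.execute("INSERT INTO meals (user_id, type) VALUES (1, 'Lunch')")

# A value outside the set violates the CHECK constraint.
try:
    conn.execute("INSERT INTO meals (user_id, type) VALUES (1, 'Brunch')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("rejected:", rejected)
```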

I would split up your meal table into two tables: one table stores a single row for each meal, and the second table stores one row for each food item used in a meal, with a foreign key reference to the meal it was used in.

After that, just make sure you have indices on any table columns used in joins or WHERE clauses.
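
A small sketch of that advice, assuming hypothetical table, column and index names. It uses sqlite3 and SQLite's EXPLAIN QUERY PLAN to show which indexes a lookup would use; MySQL has its own EXPLAIN statement with different output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meals      (id INTEGER PRIMARY KEY, user_id INTEGER, eaten_on TEXT);
CREATE TABLE meal_items (id INTEGER PRIMARY KEY, meal_id INTEGER, food_item_id INTEGER);

-- Index the columns that appear in WHERE clauses and join conditions.
CREATE INDEX idx_meals_user_date ON meals (user_id, eaten_on);
CREATE INDEX idx_meal_items_meal ON meal_items (meal_id);
CREATE INDEX idx_meal_items_food ON meal_items (food_item_id);
""")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT mi.food_item_id
    FROM meals m
    JOIN meal_items mi ON mi.meal_id = m.id
    WHERE m.user_id = ? AND m.eaten_on = ?
""", (1, "2024-01-15")).fetchall()

# The last column of each plan row is a human-readable description of the step.
details = [row[-1] for row in plan]
for line in details:
    print(line)
```

Every extra index also has to be maintained on each insert, so for a write-heavy log table it is worth indexing only the columns that real queries filter or join on.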
