
MySQL: best way to store an arbitrary number of values in a table

I am wondering what the best approach would be to store, say, languages in a user table when a user can have as many languages as they wish, hopefully without using serialized data, as this field will be searched intensively.

I was thinking of limiting the number of entries, for example to a maximum of 4 languages, and having columns lang1, lang2, and so on in the user table.

Is there a better way to achieve this?

It's called database normalization. Specifically, you need to map a "many-to-many" association.

You need 3 tables.

User(id, name)
Language (id, language_name)
User_Language(id,id_user,id_language)

To get all the languages for the user with id 3:

SELECT l.language_name
FROM User u
JOIN user_language ul ON (u.id=ul.id_user)
JOIN  Language l ON (l.id = ul.id_language)
WHERE u.id = 3
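
A minimal runnable sketch of the three-table layout and the query above. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, purely so the example runs without a server. The sample rows (including the user "Ana") and the helper name languages_for_user are made up for illustration, and the surrogate id column on User_Language is left out.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Language (id INTEGER PRIMARY KEY, language_name TEXT NOT NULL);
    -- Junction table: one row per (user, language) pair.
    CREATE TABLE User_Language (
        id_user INTEGER NOT NULL REFERENCES User(id),
        id_language INTEGER NOT NULL REFERENCES Language(id),
        PRIMARY KEY (id_user, id_language)
    );
    -- Hypothetical sample data.
    INSERT INTO User VALUES (1, 'John'), (2, 'Mary'), (3, 'Ana');
    INSERT INTO Language VALUES (1, 'spanish'), (2, 'english'), (3, 'japanese');
    INSERT INTO User_Language VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 2), (3, 3);
""")


def languages_for_user(conn, user_id):
    """Return the language names linked to one user, sorted by name."""
    rows = conn.execute(
        """
        SELECT l.language_name
        FROM User u
        JOIN User_Language ul ON u.id = ul.id_user
        JOIN Language l ON l.id = ul.id_language
        WHERE u.id = ?
        ORDER BY l.language_name
        """,
        (user_id,),
    ).fetchall()
    return [name for (name,) in rows]


user3_languages = languages_for_user(conn, 3)
print(user3_languages)
```

The user id is passed as a bound parameter rather than pasted into the SQL string; the same habit applies with a MySQL driver, although the placeholder syntax may differ.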

EDIT:

Two things are important to notice, @silkAdmin. First, as @BryceAtNetwork23 noted, there's no need to put an id on the User_Language table. Second, you should learn about joins, especially MySQL joins (because SQL tends to differ between DB engines). After you dig a little deeper, you will see that joining the User table in the previous query isn't needed either; it can be simplified to:

SELECT l.language_name
FROM user_language ul
JOIN  Language l ON (l.id = ul.id_language)
WHERE ul.id_user = 3

But I included it in the first query to make things easier for you.
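
As a quick check of that claim, the sketch below runs both forms of the query against the same data and compares the results. It again uses Python's sqlite3 with an in-memory database instead of MySQL, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Language (id INTEGER PRIMARY KEY, language_name TEXT NOT NULL);
    CREATE TABLE User_Language (
        id_user INTEGER NOT NULL REFERENCES User(id),
        id_language INTEGER NOT NULL REFERENCES Language(id),
        PRIMARY KEY (id_user, id_language)
    );
    -- Hypothetical sample data.
    INSERT INTO User VALUES (3, 'Ana');
    INSERT INTO Language VALUES (1, 'spanish'), (2, 'english');
    INSERT INTO User_Language VALUES (3, 1), (3, 2);
""")

# Original form: joins User even though only its id is used.
WITH_USER = """
    SELECT l.language_name
    FROM User u
    JOIN User_Language ul ON u.id = ul.id_user
    JOIN Language l ON l.id = ul.id_language
    WHERE u.id = ?
    ORDER BY l.language_name
"""

# Simplified form: filters directly on the junction table.
WITHOUT_USER = """
    SELECT l.language_name
    FROM User_Language ul
    JOIN Language l ON l.id = ul.id_language
    WHERE ul.id_user = ?
    ORDER BY l.language_name
"""

with_user = conn.execute(WITH_USER, (3,)).fetchall()
without_user = conn.execute(WITHOUT_USER, (3,)).fetchall()
print(with_user == without_user)
```

The two forms agree as long as every id_user in the junction table points at an existing user, which is what a foreign key on that column is meant to guarantee.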

Why use the Language table

My answer just reflects the way I'd do it. There are plenty of ways to accomplish what you've asked for. That said, let me explain.

Let's think in extremes. The first extreme is to store the languages in the user table, as you suggested above. For example, we can have one column and separate the values with semicolons. Something like this:

User: (1, "John", "spanish;english;japanese")

The advantage is that you won't need any joins: given the id of a user, you can get the languages. The disadvantage is that searching on that column is really painful. How do you get all the users with the language "Spanish"? (The bottom line is that you can't index that data.) Another disadvantage, somewhat dated now, is the overuse of disk space. Back when databases and normalization were invented, disk space was really costly. So, storing this:

User: (1, "John", "spanish;english;japanese") 
User: (2, "Mary", "spanish;english")

That was something that couldn't be tolerated. So someone came along and said: "Hey, let's use ids, so we can turn it into":

User: (1, "John", "1;2;3") 
User: (2, "Mary", "1;2")

Language (1,"spanish")
Language (2,"english")
Language (3,"japanese")

For 10,000 users and just a few hundred languages, that's a huge improvement in disk usage (maybe in our time this is no longer true; I'll come to that later). That solved the disk problem, but we still have the search problem. Again, how do you get all the users with the language "Spanish"? Well, with this design, you have to iterate over the users table, get the language column, split it on ";" and look for the id 1.

That's why we started using the approach I showed you before.

So, so far so good. Pretty good explanation ;)
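
To make the cost of that scan concrete, here is a small sketch in plain Python of the "iterate, split, compare" search over the delimited column. The rows mirror the example above, plus a hypothetical third user (Ana, with language ids 2 and 13) added to show why a plain substring test is not enough; the function name is also made up.

```python
# Rows as they would come back from the denormalized user table:
# (id, name, semicolon-separated language ids).
users = [
    (1, "John", "1;2;3"),
    (2, "Mary", "1;2"),
    (3, "Ana", "2;13"),  # hypothetical row: id 13 contains the character "1"
]


def users_with_language(rows, language_id):
    """Scan every row, split the packed column and compare ids exactly."""
    wanted = str(language_id)
    matches = []
    for _user_id, name, packed in rows:
        if wanted in packed.split(";"):
            matches.append(name)
    return matches


# Correct, but touches every row in the table on every search.
spanish_speakers = users_with_language(users, 1)

# A naive substring test is shorter but wrong: "1" also matches "13".
naive_matches = [name for _user_id, name, packed in users if "1" in packed]

print(spanish_speakers)
print(naive_matches)
```

Either way the work grows with the number of users, because nothing in the packed column can be looked up directly.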

Big disclaimer

As I said before, there are several ways to do this. It depends on your case and what you want to achieve. If you want to search on that column (give me the users that speak English, for example), you should consider the design I described at the top of my answer.

Right now there is a "new wave" of data solutions called NoSQL databases (they vary) that try to denormalize data. If you're concerned about over-normalizing your schemas, you should take a look at them. I recommend MongoDB and CouchDB, because those are the easiest to start with.
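
With the normalized design, the "users that speak English" search might look like the sketch below. It uses Python's sqlite3 with an in-memory database as a stand-in for MySQL; the sample rows, the index name idx_user_language_language and the helper speakers_of are assumptions for illustration. The extra index on id_language is there because the composite primary key starts with id_user and so mainly helps lookups by user.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Language (id INTEGER PRIMARY KEY, language_name TEXT NOT NULL);
    CREATE TABLE User_Language (
        id_user INTEGER NOT NULL REFERENCES User(id),
        id_language INTEGER NOT NULL REFERENCES Language(id),
        PRIMARY KEY (id_user, id_language)
    );
    -- Hypothetical index so that searches by language do not scan the whole table.
    CREATE INDEX idx_user_language_language ON User_Language (id_language);
    -- Hypothetical sample data.
    INSERT INTO User VALUES (1, 'John'), (2, 'Mary'), (3, 'Ana');
    INSERT INTO Language VALUES (1, 'spanish'), (2, 'english'), (3, 'japanese');
    INSERT INTO User_Language VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 1);
""")


def speakers_of(conn, language_name):
    """Return the names of all users linked to the given language."""
    rows = conn.execute(
        """
        SELECT u.name
        FROM Language l
        JOIN User_Language ul ON ul.id_language = l.id
        JOIN User u ON u.id = ul.id_user
        WHERE l.language_name = ?
        ORDER BY u.name
        """,
        (language_name,),
    ).fetchall()
    return [name for (name,) in rows]


english_speakers = speakers_of(conn, "english")
print(english_speakers)
```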

About joins

Don't worry about the performance of 2 joins. If you have performance issues, this won't be the cause. DB engines are built for this purpose. With a good memory cache and index optimization, it should work smoothly.

Yes, the best way is to use an additional table with columns language_id and user_id. There you can store any number of user/language associations (one per row).

Create table user_languages:

 user_id int,
 language_id int,

with constraints:

 PRIMARY KEY (user_id, language_id),
 FOREIGN KEY (language_id) REFERENCES language(id),
 FOREIGN KEY (user_id) REFERENCES users(id)

With these constraints, a user can be assigned as many languages as you want.

I think the best way to achieve this is to have a USER table, a USER_LANGUAGES table and a LANGUAGES table. This way, a user can have as many languages as they want.
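
The effect of those constraints can be sketched as follows, again with Python's sqlite3 and an in-memory database standing in for MySQL. The users and language tables, their name columns, the sample rows and the try_insert helper are assumptions added for illustration. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; that line is specific to SQLite.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE language (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_languages (
        user_id INT,
        language_id INT,
        PRIMARY KEY (user_id, language_id),
        FOREIGN KEY (language_id) REFERENCES language(id),
        FOREIGN KEY (user_id) REFERENCES users(id)
    );
    -- Hypothetical sample data.
    INSERT INTO users VALUES (1, 'John');
    INSERT INTO language VALUES (1, 'spanish'), (2, 'english'), (3, 'japanese');
    INSERT INTO user_languages VALUES (1, 1), (1, 2);
""")


def try_insert(user_id, language_id):
    """Try to link a user to a language; report whether the row was accepted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO user_languages VALUES (?, ?)",
                (user_id, language_id),
            )
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


new_pair = try_insert(1, 3)           # a third language for the same user
duplicate_pair = try_insert(1, 1)     # same pair again: violates the primary key
unknown_language = try_insert(1, 99)  # no such language: violates the foreign key

print(new_pair, duplicate_pair, unknown_language)
```

The composite primary key is what stops the same language from being attached to a user twice, while the number of distinct languages per user stays unbounded.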

USER
user_id int
user_name varchar

USER_LANGUAGES
user_id int
lang_id int

LANGUAGES
lang_id int
lang_name varchar

USER stores the user-based fields. LANGUAGES stores data on each specific language (English, German, etc.). USER_LANGUAGES stores the association of which users know which language(s).

I think you should consider having two tables, one with users and one with languages. It is easier to maintain, and it is easier to join these tables.

