DB Structure: Multiple tables with common ID

I'm building an app that needs to manage various objects (3 at the moment, but that might increase over time). All these objects have a unique ID using the same format, but no other attribute in common.

So I created a table for every object, but I'm wondering how to do an optimized search by ID. I want to build a good process from the start, because the total number of rows could become very high, and I don't want to have to rewrite code in a couple of months because it has become too slow.

I thought of NoSQL databases, but I am required to use MySQL. The PHP code uses Laravel 4 with the Eloquent ORM.

Let's say I want the item with ID abcd-123456. I have no idea which table to query, so I thought of this:

  1. When inserting an object, store the ID along with the table name in another table (CommonIndex)
  2. When querying by ID, look up the table name in the CommonIndex table and store it in the $tableName variable
  3. Retrieve the final data by using $tableName::find('abcd-123456') in Eloquent (using models named exactly like my tables)
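
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain PDO with an in-memory SQLite database standing in for MySQL so that it runs on its own, and the object tables (widgets, gadgets), their columns, and the helper names insertObject and findById are hypothetical. In Laravel, step 3 would be the $tableName::find(...) call instead of the raw query.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL so the sketch runs standalone.
// The tables "widgets" and "gadgets" and their columns are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE CommonIndex (id VARCHAR(32) PRIMARY KEY, table_name VARCHAR(64) NOT NULL)');
$pdo->exec('CREATE TABLE widgets (id VARCHAR(32) PRIMARY KEY, color VARCHAR(32))');
$pdo->exec('CREATE TABLE gadgets (id VARCHAR(32) PRIMARY KEY, weight INTEGER)');

// Table names cannot be bound as query parameters, so only known names are accepted.
const OBJECT_TABLES = ['widgets', 'gadgets'];

// Step 1: insert the object and its index entry in one transaction.
// Column names come from the array keys, which must be written in code, never user input.
function insertObject(PDO $pdo, $table, array $row)
{
    if (!in_array($table, OBJECT_TABLES, true)) {
        throw new InvalidArgumentException("Unknown table: $table");
    }
    $columns = implode(', ', array_keys($row));
    $marks = implode(', ', array_fill(0, count($row), '?'));
    $pdo->beginTransaction();
    try {
        $pdo->prepare("INSERT INTO $table ($columns) VALUES ($marks)")
            ->execute(array_values($row));
        $pdo->prepare('INSERT INTO CommonIndex (id, table_name) VALUES (?, ?)')
            ->execute([$row['id'], $table]);
        $pdo->commit();
    } catch (Exception $e) {
        $pdo->rollBack();
        throw $e;
    }
}

// Steps 2 and 3: resolve the table through CommonIndex, then fetch the row.
function findById(PDO $pdo, $id)
{
    $lookup = $pdo->prepare('SELECT table_name FROM CommonIndex WHERE id = ?');
    $lookup->execute([$id]);
    $table = $lookup->fetchColumn();
    if ($table === false || !in_array($table, OBJECT_TABLES, true)) {
        return null; // unknown ID, or an index entry pointing at an unexpected table
    }
    $fetch = $pdo->prepare("SELECT * FROM $table WHERE id = ?");
    $fetch->execute([$id]);
    $row = $fetch->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : ['table' => $table, 'row' => $row];
}

insertObject($pdo, 'widgets', ['id' => 'abcd-123456', 'color' => 'red']);
insertObject($pdo, 'gadgets', ['id' => 'efgh-654321', 'weight' => 12]);

$found = findById($pdo, 'abcd-123456');
echo $found['table'], ': ', $found['row']['color'], "\n";
```

Both queries filter on a primary key, so each one is an index lookup rather than a table scan.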

But I'm worried this process will become sluggish when I have to search for my ID in 300k+ rows.

Any thoughts about how to improve this process, or build a new one?

Thanks!

EDIT: More information:

  • My tables are not linked to each other; each one represents a type of object
  • Each table has an ID field
  • Each object has a unique ID, but in the same format (e.g. Table 1 contains objects ab-1, de-3, hi-5; Table 2 contains cd-2, gh-4, jk-6; etc.)
  • Two objects of different types cannot have the same ID
  • I do not assign the ID; each object already has one
  • Most of the searches will be done by ID, because it's easier for the users
  • In specific conditions, users may search for a product on a field other than the ID, but I don't have the same worries about performance, because it will be very rare
  • To allow for searches on other fields, these specific fields will be indexed (1 or 2 per table)
  • The searches will be done one by one
  • If I add a batch search along the way, I will process it one ID at a time

Questions:

Sorry, I can't add comments because I don't have 50 rep yet, but I have some questions:

Where are the IDs coming from? Is it from an external system? Or are you assigning the IDs?

Why do you need to search by ID? Is it for internal purposes, or will users use those IDs?

Are the IDs really alphabetical? Numbers would be more efficient.

Will you search by multiple IDs at the same time or just one by one?


One possible solution:

One simple thing you could do (depending on your needs) is to use just one table with 2 columns:

  • Your ID
  • The PHP object stored as a string (aka serialized)

There are downsides though. Check out this URL for pros and cons: http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/

(For example, if you need to search on a property other than your ID, it won't work. If you need to update the DB often, it's not the most efficient option either.)

It's really easy to serialize and unserialize an object in PHP:

$a = $yourObject; // any serializable PHP object
$s = serialize($a);
// Save $s to the database. It is your object in string form.

// Later, retrieve the string from the database.
$s = get_from_database($id); // placeholder for your own query
$a = unserialize($s);
// $a is the object again; do whatever you want with it.

Another solution is the one you mentioned, but I wouldn't store the table name. A number is more efficient.
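
As a rough sketch of that idea, the index table could hold a small integer type code and the application could map it back to a model name. The codes and the model names (Widget, Gadget, Gizmo) below are hypothetical placeholders, and modelForType is an illustrative helper, not part of Laravel.

```php
<?php
// Hypothetical type codes and model names; neither comes from the original post.
const TYPE_TO_MODEL = [
    1 => 'Widget',
    2 => 'Gadget',
    3 => 'Gizmo',
];

// Translate the small integer stored in the index table into a model class name.
function modelForType($typeCode)
{
    if (!array_key_exists($typeCode, TYPE_TO_MODEL)) {
        throw new InvalidArgumentException("Unknown type code: $typeCode");
    }
    return TYPE_TO_MODEL[$typeCode];
}

echo modelForType(2), "\n"; // prints "Gadget"
```

With Eloquent, the returned name could then be used the same way as $tableName in the question, for example $model::find($id). A fixed map also doubles as a whitelist, so an unexpected value in the index table never reaches a query.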


Update:

Since you can't really store the serialized object, I think what you suggested is the best way. 300k rows is manageable for MySQL; just ensure that you have an index on your ID column.

Also, if users often search on a particular group of columns (for example by ID, first name and last name), you'll want a composite index covering those columns (it takes more disk space, though).

If you want to be certain that the two queries (the first to get the table, the second to get the data) will be efficient, you can easily insert 300k entries with a small PHP script (a loop with inserts) or with a data generator (I found this one: http://www.generatedata.com/).

I would insert 300k rows in 2 tables (your "index" table and one of the object tables) and test the time it takes to make 2 queries, one on the "index" table and the other on the object table.
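
A minimal version of such a test script might look like the sketch below. It assumes plain PDO with an in-memory SQLite database so that it runs standalone; against your real MySQL server you would swap the DSN, and the timings will differ. The widgets table, its label column, and the benchmarkLookup helper are hypothetical.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL so the script runs standalone.
// Table and column names are hypothetical.
function benchmarkLookup($rowCount)
{
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE CommonIndex (id VARCHAR(32) PRIMARY KEY, table_name VARCHAR(64))');
    $pdo->exec('CREATE TABLE widgets (id VARCHAR(32) PRIMARY KEY, label VARCHAR(64))');

    // Fill both tables inside one transaction; committing per row would dominate the run time.
    $pdo->beginTransaction();
    $toIndex = $pdo->prepare('INSERT INTO CommonIndex (id, table_name) VALUES (?, ?)');
    $toTable = $pdo->prepare('INSERT INTO widgets (id, label) VALUES (?, ?)');
    for ($i = 1; $i <= $rowCount; $i++) {
        $id = sprintf('ab-%d', $i);
        $toIndex->execute([$id, 'widgets']);
        $toTable->execute([$id, "widget $i"]);
    }
    $pdo->commit();

    // Time the two queries for an ID in the middle of the data set.
    $target = sprintf('ab-%d', (int) ceil($rowCount / 2));
    $start = microtime(true);

    $lookup = $pdo->prepare('SELECT table_name FROM CommonIndex WHERE id = ?');
    $lookup->execute([$target]);
    $table = $lookup->fetchColumn();

    // Only one object table exists in this test, so its name is written out directly.
    $fetch = $pdo->prepare('SELECT label FROM widgets WHERE id = ?');
    $fetch->execute([$target]);
    $label = $fetch->fetchColumn();

    $elapsedMs = (microtime(true) - $start) * 1000;

    return ['table' => $table, 'label' => $label, 'ms' => $elapsedMs];
}

$result = benchmarkLookup(300000);
printf("%s / %s in %.3f ms\n", $result['table'], $result['label'], $result['ms']);
```

Running it a few times with different target IDs, and once with the primary keys removed, should make the effect of the index visible.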


Another thing you could try is using a stored procedure (you could choose the table based on the type of the object inside the stored procedure).

You have a fundamental problem with a one-query solution: you don't know what columns are being returned for each object, because the columns depend on the type.

Your approach with the CommonIndex table sounds reasonable. Just be sure that the ID column is indexed on each of the object tables.

I do find it surprising that objects that share a common format for an ID do not have any fields in common. If there are common fields, then these would go in CommonIndex.
