
Key problem: Which key strategy should I use in my database?

Problem: When I use an auto-incrementing primary key in my database, this happens all the time:

I want to store an Order with 10 Items. The ordered Items belong to the Order. So I store the order, ask the database for the last inserted id (which is dangerous when it comes to concurrency, right?), and then store the 10 Items with the foreign key (order_id).

So I always have to do:

INSERT ...

last_inserted_id = db.lastInsertId();

INSERT ... INSERT ... INSERT ...

and I believe this prevents me from using transactions in almost all INSERT cases where I need a foreign key.

So... here are some solutions, and I don't know if they're really good:

A) Don't use auto_increment keys! Use a key table? The key table would have two fields: table_name, next_key. Every time I need a key for a table to insert a new dataset, I first ask for the next_key by calling a special static KeyGenerator class method. This does a SELECT and an UPDATE, if possible in one transaction (would that work?). Of course I would request that for every affected table. Next, I can INSERT my entire object graph in one transaction without playing ping-pong with the database, because I already know the keys in advance.
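A rough sketch of what option A could look like, assuming Python's built-in sqlite3 module as a stand-in database. The function name reserve_keys and the table name key_table are made up for illustration; only the two columns table_name and next_key come from the description above. The SELECT and the UPDATE run in one short transaction that takes the write lock first, so two clients cannot be handed the same key. On MySQL/InnoDB or PostgreSQL the equivalent would presumably be SELECT ... FOR UPDATE inside a transaction.

```python
import sqlite3


def reserve_keys(conn, table_name, count=1):
    """Reserve `count` consecutive keys for table_name and return the first one.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # BEGIN IMMEDIATE takes the write lock up front, so the SELECT and the
    # UPDATE below cannot interleave with another connection doing the same.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT next_key FROM key_table WHERE table_name = ?", (table_name,)
        ).fetchone()
        if row is None:
            raise KeyError(table_name)
        first = row[0]
        conn.execute(
            "UPDATE key_table SET next_key = ? WHERE table_name = ?",
            (first + count, table_name),
        )
        conn.execute("COMMIT")
    except BaseException:
        conn.execute("ROLLBACK")
        raise
    return first


# isolation_level=None: the sqlite3 module does not open transactions
# implicitly, so BEGIN/COMMIT above are fully under our control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE key_table (table_name TEXT PRIMARY KEY, next_key INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO key_table VALUES (?, ?)", [("orders", 1), ("items", 1)])

order_id = reserve_keys(conn, "orders")                # one key for the order
first_item_id = reserve_keys(conn, "items", count=10)  # ten keys in one round trip
next_item_id = reserve_keys(conn, "items")             # continues after the block
```

Reserving a block of keys per call (count=10) keeps the number of round trips down to one per affected table, whatever the size of the object graph.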

B) Use a GUID / UUID algorithm for keys? These are supposed to be really unique worldwide, and they're LARGE. I mean ... L_A_R_G_E. So a big amount of memory would go into these gigantic keys. Indexing will be hard, right? And data retrieval will be a pain for the database; at least I guess integer keys are much faster to handle. On the other hand, these also provide some security: visitors can no longer iterate over all orders or all users or all pictures just by incrementing the id parameter.
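To put a number on "large", here is a minimal sketch of option B, again assuming Python with sqlite3 as a stand-in database (the table layout is invented for the example). Every key is generated on the client with uuid.uuid4(), so the whole order can be written in one transaction without asking the database for anything first.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id BLOB PRIMARY KEY, customer TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE items ("
    " id BLOB PRIMARY KEY,"
    " order_id BLOB NOT NULL REFERENCES orders(id),"
    " name TEXT NOT NULL)"
)

# All keys are known before the first INSERT is sent.
order_id = uuid.uuid4()
items = [(uuid.uuid4().bytes, order_id.bytes, f"item-{i}") for i in range(10)]

# "with conn" commits on success and rolls back if an exception is raised.
with conn:
    conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id.bytes, "alice"))
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", items)

binary_size = len(order_id.bytes)  # 16 bytes when stored in binary form
text_size = len(str(order_id))     # 36 characters in the hyphenated text form
```

So a UUID costs 16 bytes per key in binary form, compared with 4 or 8 bytes for a typical integer key; storing it as text more than doubles that.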

C) Stick with auto_incremented keys? OK, if so, what about transactions like the one described in the example above? How can I solve that? Maybe by inserting a ghost row first and then doing a transaction with one UPDATE + n INSERTs?

D) What else?

When storing orders, you need transactions to prevent situations where only half your products are added to the database.

Depending on your database and your connector, the value returned by the last-insert-id function might be transaction-independent. For instance, with MySQL, mysql_insert_id returns the identifier generated by the last query from that particular client (without being affected by what other clients are doing concurrently).
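As an illustration that reading the generated id and using a transaction are not mutually exclusive, here is a minimal sketch assuming Python's built-in sqlite3 module instead of MySQL (the schema and the helper name store_order are invented for the example). The order row, the id lookup and the item rows all happen inside one transaction on one connection.

```python
import sqlite3


def store_order(conn, customer, item_names):
    """Insert one order and its items atomically; return the new order id.

    Hypothetical helper for illustration only.
    """
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        cur = conn.execute("INSERT INTO orders (customer) VALUES (?)", (customer,))
        order_id = cur.lastrowid  # id generated by this connection's own INSERT
        conn.executemany(
            "INSERT INTO items (order_id, name) VALUES (?, ?)",
            [(order_id, name) for name in item_names],
        )
    return order_id


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, customer TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE items ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " order_id INTEGER NOT NULL REFERENCES orders(id),"
    " name TEXT NOT NULL)"
)

order_id = store_order(conn, "alice", [f"item-{i}" for i in range(10)])

# A failing item (NULL name violates NOT NULL) rolls back the order row too,
# so no half-stored order is left behind.
try:
    store_order(conn, "bob", ["ok", None])
except sqlite3.IntegrityError:
    pass
```

After the failed second call the database still contains exactly one order with ten items.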

Which database are you using?

Yes, typically inserting a record and then trying to select it again to find the auto-generated key is bad, especially if you are using a naive select max(id) from table query. This is because as soon as two threads are creating records, max(id) may not actually return the last id your current thread used.
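The race can be reproduced deterministically. The sketch below assumes Python's sqlite3 module and two connections to the same temporary database file, standing in for two concurrent clients; the interleaving is forced by hand rather than by real threads.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts each connection in autocommit mode,
# so every INSERT is committed immediately.
a = sqlite3.connect(path, isolation_level=None)  # client A
b = sqlite3.connect(path, isolation_level=None)  # client B
a.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, customer TEXT)")

cur_a = a.execute("INSERT INTO orders (customer) VALUES ('alice')")  # A gets id 1
b.execute("INSERT INTO orders (customer) VALUES ('bob')")            # B gets id 2 before A reads back

# Naive approach: picks up the row that client B just inserted.
naive_id = a.execute("SELECT max(id) FROM orders").fetchone()[0]

# Per-connection approaches: unaffected by what client B did.
own_id = cur_a.lastrowid
own_id_sql = a.execute("SELECT last_insert_rowid()").fetchone()[0]
```

Here naive_id ends up as 2 (client B's row), while own_id and own_id_sql are both 1, the id that client A actually generated.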

One way to avoid this is to create a sequence in the database. From your code you select sequence.NextValue, then use that value to execute your inserts (or you can craft a more complex SQL statement that does this selection and the inserts in one go). Sequences are atomic / thread-safe.

In MySQL you can ask for the last inserted id from the execution results, which I believe will always give you the correct answer.

SQL Server supports SCOPE_IDENTITY (Transact-SQL), which should take care of your transaction issue and concurrency issue.

I would say stick with auto_increment.

(Assuming you are using MySQL)

"ask the database for the last inserted id (which is dangerous when it comes to concurrency, right?)"

If you use MySQL's last_insert_id() function, you only see what happened in your session. So this is safe. You mention this:

db.lastInsertId()

I don't know what framework or language it is, but I would assume that it uses MySQL's last_insert_id() under the covers (if not, it is a pretty useless database abstraction framework).

"I believe this prevents me from using transactions in almost all INSERT cases w..."

I don't see why. Please explain.

D) Sequence: may not be available in your DBMS, but if it is, it solves your problem elegantly.

For PostgreSQL, have a look at Sequence Functions.

There is no final and general answer to this question.

Auto-incrementing columns are easy to use when you add new records. Using them as foreign keys within the same transaction is not so straightforward: you need database-specific commands to get the newly created key. This technology is common for certain databases, for instance SQL Server.

Sequences seem to be harder to use, because you need to get a key before you insert a row, but in the end it's easier to use them as foreign keys. This technology is common for certain databases, for instance Oracle.

When you use Hibernate or NHibernate, using auto-incrementing keys is discouraged, because some optimizations are no longer possible. Using a hi-lo algorithm, which uses an additional table, is recommended.
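To give an idea of how a hi-lo generator works, here is a simplified sketch in Python with sqlite3 as a stand-in database. It is not Hibernate's or NHibernate's actual implementation; the class name HiLoGenerator, the table hilo and the column next_hi are invented for the example. Each generator fetches a "hi" value from the extra table once per block and then hands out the "lo" part from memory, so most ids cost no database round trip at all.

```python
import sqlite3


class HiLoGenerator:
    """Hands out ids in blocks; hits the database once per block (illustrative)."""

    def __init__(self, conn, block_size=100):
        self.conn = conn
        self.block_size = block_size
        self.next_id = 0
        self.limit = 0  # exclusive upper bound of the current block; 0 forces a fetch

    def _fetch_block(self):
        # Claim the next "hi" value atomically: write lock, read, increment.
        self.conn.execute("BEGIN IMMEDIATE")
        try:
            hi = self.conn.execute("SELECT next_hi FROM hilo").fetchone()[0]
            self.conn.execute("UPDATE hilo SET next_hi = ?", (hi + 1,))
            self.conn.execute("COMMIT")
        except BaseException:
            self.conn.execute("ROLLBACK")
            raise
        self.next_id = hi * self.block_size + 1
        self.limit = (hi + 1) * self.block_size + 1

    def next(self):
        if self.next_id >= self.limit:
            self._fetch_block()
        value = self.next_id
        self.next_id += 1
        return value


conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
conn.execute("CREATE TABLE hilo (next_hi INTEGER NOT NULL)")
conn.execute("INSERT INTO hilo VALUES (0)")

# Two generators stand in for two application instances.
gen_a = HiLoGenerator(conn)
gen_b = HiLoGenerator(conn)
a_ids = [gen_a.next() for _ in range(3)]  # from block 1..100
b_ids = [gen_b.next() for _ in range(2)]  # from block 101..200
```

Because the keys are known before any row is written, parent and child rows can be sent to the database together in one batch, which is the kind of optimization that auto-incrementing keys rule out.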

GUIDs are strong, for instance when sharing data between different databases, systems, disconnected scenarios, import / export etc. In many databases, most of the tables contain only a few hundred records, so memory and performance are not such an issue. When using NHibernate, you get a GUID generator which produces sequential GUIDs, because some databases perform better when keys are sequential.
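The idea behind sequential GUIDs can be sketched in a few lines of Python. This is only an illustration of the principle, not NHibernate's generator or its byte layout; the function name sequential_guid is made up. A millisecond timestamp is placed in the leading bytes and the rest is random, so values created later sort after values created earlier and new index entries land near the end of the index.

```python
import os
import time
import uuid


def sequential_guid(timestamp_ms=None):
    """Return a 128-bit id: 48-bit millisecond timestamp + 80 random bits.

    Hypothetical helper; the layout is an assumption chosen for the example.
    """
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    prefix = timestamp_ms.to_bytes(6, "big")  # big-endian, so byte order follows time order
    return uuid.UUID(bytes=prefix + os.urandom(10))


# Fixed timestamps make the ordering easy to check.
earlier = sequential_guid(1_700_000_000_000)
later = sequential_guid(1_700_000_000_001)
```

Two ids created within the same millisecond are still unique with very high probability, but their relative order is random.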

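Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.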

 