
Using composite PRIMARY KEY or UNIQUE key on INNODB table with multiple inserts

I have been trying to figure this out, but no luck so far.

Which one is better: a table with a composite PRIMARY KEY, or a table with a single-column PRIMARY KEY and a UNIQUE index?

My table looks like this:

CREATE TABLE data (
  bucket_id INTEGER,
  backend_id INTEGER,
  unique_id INTEGER,
  weight INTEGER,
  PRIMARY KEY (bucket_id, unique_id)
) ENGINE=InnoDB

I am doing multiple inserts: 6 billion+ rows are to be inserted using multi-value inserts of the form

INSERT IGNORE INTO data VALUES (x1, x2, x3, x4), (y1, y2, y3, y4), ...

with 500,000 rows in each statement (limited by the client). These are all done on startup of the application, and I currently need to speed this up as much as possible. I need (backend_id, unique_id) to be unique, but I do not control these values, so the imported data contains duplicates.

So the question is: will using a UNIQUE index instead of a composite PRIMARY KEY help me increase the speed of the insert statements? I know a lot of other factors affect this, e.g. buffer pools and so on.

I'm pretty sure that primary key constraints in all modern database management systems are implemented using unique indexes. In SQL, the declarations PRIMARY KEY and NOT NULL UNIQUE are behaviorally equivalent.
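As a rough illustration of that equivalence, the two declarations below should enforce the same rule on (backend_id, unique_id). The table names data_pk and data_uq are made up for this sketch. In InnoDB specifically, a table without a PRIMARY KEY uses its first UNIQUE index whose columns are all NOT NULL as the clustered index, so the two variants would be expected to end up with much the same physical layout.

```sql
-- Variant 1: composite primary key (columns become NOT NULL implicitly).
CREATE TABLE data_pk (
  backend_id INTEGER,
  unique_id  INTEGER,
  weight     INTEGER,
  PRIMARY KEY (backend_id, unique_id)
) ENGINE=InnoDB;

-- Variant 2: no primary key, NOT NULL columns plus a unique index.
-- InnoDB promotes this unique index to the clustered index.
CREATE TABLE data_uq (
  backend_id INTEGER NOT NULL,
  unique_id  INTEGER NOT NULL,
  weight     INTEGER,
  UNIQUE KEY (backend_id, unique_id)
) ENGINE=InnoDB;
```

If that holds for your server version, swapping one declaration for the other is unlikely to change insert speed by itself.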

Your question boils down to this: is it faster to use a surrogate key in addition to the requisite constraint on {backend_id, unique_id}? Note carefully that using a surrogate key instead of the requisite constraint on {backend_id, unique_id} isn't generally acceptable, because it omits an important business requirement.

Adding a surrogate key

  • makes the table wider,
  • increases the number of bytes that have to be written to the table, and
  • increases the number of indexes that have to be written.

So adding a surrogate key will probably slow you down. If you require concurrent access, the following is probably the best structure for your stated requirements.

CREATE TABLE data (
  backend_id INTEGER,
  unique_id INTEGER,
  weight INTEGER,
  PRIMARY KEY (backend_id, unique_id)
) ENGINE=InnoDB
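For contrast, here is a sketch of the surrogate-key variant that the list above argues against. The table name data_surrogate, the column name id, and the index name uq_backend_unique are hypothetical. BIGINT is assumed because a signed 32-bit INTEGER cannot number 6 billion+ rows.

```sql
-- Surrogate-key variant, shown only for comparison.
CREATE TABLE data_surrogate (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,  -- extra 8 bytes per row
  backend_id INTEGER NOT NULL,
  unique_id  INTEGER NOT NULL,
  weight     INTEGER,
  PRIMARY KEY (id),                                    -- clustered index
  -- Still required to enforce the business rule; this is a second
  -- index that every insert has to maintain.
  UNIQUE KEY uq_backend_unique (backend_id, unique_id)
) ENGINE=InnoDB;
```

Each insert here has to write the wider row into the clustered index and also add an entry to the secondary unique index, and InnoDB secondary index entries carry the primary key value as well.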

But if you can essentially run in single-user mode, it's fastest to load a table with no constraints, using the bulk loader. Then add the constraints later with ALTER TABLE statements.
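A minimal sketch of that approach in MySQL might look like the following. The staging table name data_staging, the file path, and the CSV layout are assumptions for illustration, not details from the question. Because the question states that the imported data contains duplicates, the final ALTER TABLE would fail with a duplicate-key error until those rows are removed, so the sketch includes a query to spot them first.

```sql
-- 1. Load into a table with no keys at all (hypothetical name).
CREATE TABLE data_staging (
  backend_id INTEGER NOT NULL,
  unique_id  INTEGER NOT NULL,
  weight     INTEGER
) ENGINE=InnoDB;

-- 2. Bulk load; the path and delimiters are placeholders.
LOAD DATA INFILE '/tmp/data.csv'
INTO TABLE data_staging
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(backend_id, unique_id, weight);

-- 3. Check for duplicate keys before adding the constraint.
SELECT backend_id, unique_id, COUNT(*) AS copies
FROM data_staging
GROUP BY backend_id, unique_id
HAVING COUNT(*) > 1
LIMIT 10;

-- 4. Once no duplicates remain, add the constraint.
ALTER TABLE data_staging
  ADD PRIMARY KEY (backend_id, unique_id);
```

An alternative that keeps the key in place is LOAD DATA INFILE ... IGNORE INTO TABLE, which skips rows that collide on a unique key, at the cost of maintaining the index during the load.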
