Store this data model as non-relational?
I am considering redesigning one of my classic relational database tables into a non-relational design, and I am not sure whether I should. The reason is that performance problems have gotten out of control.
This table has 11 columns and it is designed like this:
Id bigint (PK) -- Clustered Index
FK1 bigint (FK) -- Non-Clustered Unique Composite Index ( FK1_FK2_FK3 in this order )
FK2 bigint (FK) -- Non-Clustered Unique Composite Index ( FK1_FK2_FK3 in this order )
FK3 bigint (FK) -- Non-Clustered Unique Composite Index ( FK1_FK2_FK3 in this order )
Value1 nvarchar(100)
Value2 nvarchar(100)
Value3 nvarchar(100)
Value4 nvarchar(100)
Value5 nvarchar(100)
Value6 nvarchar(100)
Value7 nvarchar(100)
Value8 nvarchar(100)
Here are some facts:
5% of the requests look like this:
SELECT *
FROM myTable
WHERE Id = 12346 -- (works perfectly)
SELECT *
FROM myTable
WHERE Id IN (123456, 654321) -- (works OK because IN list contains only a small number of IDs)
UPDATE myTable
SET .....
WHERE Id = 123456 -- (works perfectly)
Unfortunately, 95% of the requests look like this:
SELECT *
FROM myTable
WHERE Fk1 = 123456 AND FK2 = 654321
-- works badly because it returns 100,000 - 300,000 records, and I need all of them. (Yes, the unique index is used because the index column order is correct.)
UPDATE myTable
SET Value1 = '1', Value2 = '2', Value3 = '3', Value4 = '4',
Value5 = '5', Value6 = '6', Value7 = '7', Value8 = '8'
WHERE Fk1 = 123456 AND FK2 = 654321 -- horrible because this also touches up to 300,000 rows (and yes, the unique index is used because the index column order is correct)
Instead, I would like to design it like this:
Id1 bigint (PK) -- Clustered Composite Index (former FK1 column)
Id2 bigint (PK) -- Clustered Composite Index (former FK2 column)
ContentColumn JSON -- Contains all former Value columns and the FK3 column as an array of objects. Each object holds FK3, Value1, Value2 ....
ArrayLength INT -- length of json array
So what do you guys think? Should I give it a try?
Or do you have some completely different ideas?
Thanks for any help!
Or do you have some completely different ideas?
You can do better with a better set of index designs. You want to optimize for your most expensive queries:
SELECT *
FROM myTable
WHERE Fk1 = 123456 AND FK2 = 654321
-- works badly because it returns 100,000 - 300,000 records, and I need all of them. (Yes, the unique index is used because the index column order is correct.)
UPDATE myTable
SET Value1 = '1', Value2 = '2', Value3 = '3', Value4 = '4',
Value5 = '5', Value6 = '6', Value7 = '7', Value8 = '8'
WHERE Fk1 = 123456 AND FK2 = 654321
Making FK1_FK2_FK3 the clustered index and making Id a non-clustered PK would be better. For queries that retrieve a handful of rows, a nested loop join from the non-clustered PK to the composite clustered index should be fine. But with the current design, doing 300,000 lookups from the non-clustered index into the clustered index when querying by (Fk1,Fk2) is going to be expensive. It's so expensive that these queries might be doing table scans instead.
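A minimal T-SQL sketch of that index swap might look like the following. The `dbo` schema and the names `PK_myTable`, `UQ_myTable_FK1_FK2_FK3`, and `CIX_myTable_FK1_FK2_FK3` are assumptions for illustration, since the question does not give the real constraint or index names. It also assumes the unique index was created with `CREATE INDEX` (not as a `UNIQUE` constraint) and that no foreign keys in other tables reference `Id`; either of those would need extra steps first.

```sql
-- Sketch only: schema, constraint, and index names are assumed, not taken from the post.

-- 1. Drop the existing non-clustered unique index first, so it is not
--    rebuilt when the clustered index underneath it is removed.
DROP INDEX UQ_myTable_FK1_FK2_FK3 ON dbo.myTable;

-- 2. Drop the clustered primary key on Id (the table becomes a heap).
ALTER TABLE dbo.myTable DROP CONSTRAINT PK_myTable;

-- 3. Cluster the table on the composite key used by 95% of the requests.
CREATE UNIQUE CLUSTERED INDEX CIX_myTable_FK1_FK2_FK3
    ON dbo.myTable (FK1, FK2, FK3);

-- 4. Re-create the primary key on Id as a non-clustered index.
ALTER TABLE dbo.myTable
    ADD CONSTRAINT PK_myTable PRIMARY KEY NONCLUSTERED (Id);
```

On a large table each of these steps rewrites a lot of data, so it is worth trying on a copy first and comparing the execution plans of the (Fk1,Fk2) queries before and after.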
And after clustering the table by (FK1,FK2,FK3), consider partitioning it by FK2 into 10-100 separate partitions. Then a predicate like WHERE Fk1 = 123456 AND FK2 = 654321 will only have to scan the partition containing FK2=654321, and can seek in that partition directly to the first page with FK1=123456.
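As a rough sketch of what partitioning by FK2 could look like in T-SQL: the boundary values, the object names (`pf_myTable_FK2`, `ps_myTable_FK2`, `CIX_myTable_FK1_FK2_FK3`), the `dbo` schema, and the use of the `PRIMARY` filegroup are all assumptions for illustration. The sketch assumes a unique clustered index with that name already exists on (FK1, FK2, FK3); if the table has no clustered index yet, the `WITH (DROP_EXISTING = ON)` clause should be omitted.

```sql
-- Sketch only: all names and boundary values are hypothetical.
-- Nine boundary values with RANGE RIGHT give ten partitions;
-- real boundaries should follow the actual distribution of FK2.
CREATE PARTITION FUNCTION pf_myTable_FK2 (bigint)
    AS RANGE RIGHT FOR VALUES
    (100000, 200000, 300000, 400000, 500000,
     600000, 700000, 800000, 900000);

-- Map every partition to one filegroup (PRIMARY is assumed here).
CREATE PARTITION SCHEME ps_myTable_FK2
    AS PARTITION pf_myTable_FK2 ALL TO ([PRIMARY]);

-- Rebuild the clustered index on the partition scheme, partitioned by FK2.
-- FK2 is part of the unique key, which a partitioned unique index requires.
CREATE UNIQUE CLUSTERED INDEX CIX_myTable_FK1_FK2_FK3
    ON dbo.myTable (FK1, FK2, FK3)
    WITH (DROP_EXISTING = ON)
    ON ps_myTable_FK2 (FK2);
```

A unique index on Id alone does not contain FK2, so it cannot be placed on this partition scheme; it would have to stay on a regular filegroup as a non-aligned index.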
In addition, consider ROW or PAGE compression if PAGEIOLATCH waits are a significant part of your query runtime.
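One way to check this and try compression, sketched under the assumption that the table lives in the `dbo` schema: look at the accumulated PAGEIOLATCH waits, estimate the savings, and only then rebuild. Note that `sys.dm_os_wait_stats` is instance-wide and cumulative since the last restart, so it is only a coarse indicator for a single query.

```sql
-- Sketch only: the dbo schema is an assumption.

-- 1. How much time has the instance spent waiting on data page I/O?
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGEIOLATCH%'
ORDER BY wait_time_ms DESC;

-- 2. Estimate how much space PAGE compression would save on the table.
EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'myTable',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

-- 3. If the estimate looks worthwhile, rebuild all indexes with compression.
ALTER INDEX ALL ON dbo.myTable
    REBUILD WITH (DATA_COMPRESSION = PAGE);
```

Compression trades CPU for fewer page reads, so it tends to help most when the queries are I/O-bound, which is the case the PAGEIOLATCH check is meant to confirm.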