
Hash partitioning in Oracle SQL

I have a table like this one:

CREATE TABLE "TS1" 
       (    
        "ID" VARCHAR2(32 BYTE) NOT NULL, 
        "CID" VARCHAR2(70 BYTE) NOT NULL, 
        "PID" VARCHAR2(21 BYTE) NOT NULL, 
        "LASTUSAGE" TIMESTAMP (6) NOT NULL, 
        "CREATIONTIME" TIMESTAMP (6) NOT NULL, 
        "COSTCENTER" NUMBER NOT NULL
       );

ALTER TABLE "TS1" ADD CONSTRAINT "TS1_PRIMARY" PRIMARY KEY ("ID", "CID", "PID");

I tried to find a good way to partition the table, considering the following:

  • I have no query that uses CREATIONTIME in the WHERE clause (so range partitioning on this field is probably not the best solution)
  • LASTUSAGE is updated very often (so range partitioning on this field is probably not the best solution either)
  • Most of the queries use ID, CID, PID in the WHERE clause

Given this, a good option should be HASH PARTITION on ID, CID, PID.

CREATE TABLE "TS1" 
       (    
        "ID" VARCHAR2(32 BYTE) NOT NULL, 
        "CID" VARCHAR2(70 BYTE) NOT NULL, 
        "PID" VARCHAR2(21 BYTE) NOT NULL, 
        "LASTUSAGE" TIMESTAMP (6) NOT NULL, 
        "CREATIONTIME" TIMESTAMP (6) NOT NULL, 
        "COSTCENTER" NUMBER NOT NULL
       )       
PARTITION BY HASH ("ID", "CID", "PID")
PARTITIONS N;  --N = number of partitions


ALTER TABLE "TS1" ADD CONSTRAINT "TS1_PRIMARY" PRIMARY KEY ("ID", "CID", "PID");

Is it a problem if I partition by hash using the primary key columns as the partitioning key? Supposing table TS1 holds a lot of records (millions), will I get any performance benefit from this partitioning?

"Most of the queries uses ID, CID, PID in where clause"

This means most queries are single-row lookups on the primary key, so there is no way partition elimination can make things faster. All it might do is make those few queries which don't use the key slower (because, say, reads using an index range scan might not be as performant).
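
One way to check this on your own system is to look at the execution plan of a typical lookup. A minimal sketch, assuming the unpartitioned TS1 from the question; the bind variable names are placeholders:

```sql
-- Typical lookup by the full primary key (bind names are illustrative).
EXPLAIN PLAN FOR
SELECT "LASTUSAGE", "COSTCENTER"
  FROM "TS1"
 WHERE "ID"  = :id
   AND "CID" = :cid
   AND "PID" = :pid;

-- Show the plan that was just explained.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan shows an INDEX UNIQUE SCAN on TS1_PRIMARY followed by a TABLE ACCESS BY INDEX ROWID, the query already touches only a handful of blocks, which leaves little for partition pruning to eliminate.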

There are four reasons to implement Partitioning. They are:

  • data management. We can load data into a single partition using partition exchange, or zap data using drop or truncate partition, with no impact on the rest of the table.
  • availability. We can have a separate tablespace for each partition, which localises the impact of datafile corruption or similar.
  • performance. Queries which work with the grain of the partitioning key may benefit from partition pruning. Queries which might benefit are those which will execute a range scan; if we load a million rows into a table each day and we generally want to retrieve records for a given day, we would get a lot of benefit from partitioning by day.
  • concurrent DML. If our application has a large number of users inserting, changing and deleting records, we may see e.g. waits for ITL slots or latch contention, sometimes known as hot blocks. Hash partitioning can help here by distributing inserts, and hence all other activity, across the whole table.
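
To make the data management and performance points concrete, here is a sketch using a hypothetical DAILY_LOADS table (not the TS1 table from the question; the table and column names are made up for illustration), range-partitioned by day with interval partitioning:

```sql
-- Hypothetical table: one partition per day, created automatically on insert.
CREATE TABLE daily_loads
       (
        load_date DATE NOT NULL,
        payload   VARCHAR2(100 BYTE)
       )
PARTITION BY RANGE (load_date)
INTERVAL (NUMTODSINTERVAL(1, 'DAY'))
(PARTITION p_initial VALUES LESS THAN (DATE '2020-01-01'));

-- Performance: the predicate follows the partitioning key, so only one
-- daily partition needs to be read (partition pruning).
SELECT COUNT(*)
  FROM daily_loads
 WHERE load_date >= DATE '2020-03-15'
   AND load_date <  DATE '2020-03-16';

-- Data management: remove one day's data without touching other partitions.
-- The partition must already exist, i.e. rows for that day were loaded.
ALTER TABLE daily_loads TRUNCATE PARTITION FOR (DATE '2020-03-15');
```

None of this applies to a table that is only ever read one row at a time by its full key, which is the situation described in the question.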

Partitioning by a hash of ("ID", CID, PID) won't help you with performance if the usage profile is as you describe. Nor will it give you any data management advantage. It seems unlikely you're interested in the availability benefits (because millions of rows seems too small a number to worry about).

So that leaves concurrent DML. If the performance problem you are trying to solve is writing rather than reading, and the pattern of concurrent activity aligns with some aspect of the primary key (say most DML is for the newest rows), then perhaps hash partitioning will alleviate the latch contention. If that sounds like your situation, you should test Partitioning in an environment with Production-like volumes of data and Production levels of activity. (Not always easy to do.)
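
When running such a test, two things worth looking at are how evenly the rows spread across the hash partitions and whether the contention statistics actually change. A rough sketch, assuming the hash-partitioned TS1 from the question and a session that is allowed to query V$SEGMENT_STATISTICS:

```sql
-- Refresh optimizer statistics so NUM_ROWS is populated per partition.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'TS1');
END;
/

-- Row distribution across the hash partitions.
SELECT partition_name, num_rows
  FROM user_tab_partitions
 WHERE table_name = 'TS1'
 ORDER BY partition_position;

-- Segment-level contention counters for the table and its primary key index.
SELECT object_name, subobject_name, statistic_name, value
  FROM v$segment_statistics
 WHERE owner = USER
   AND object_name IN ('TS1', 'TS1_PRIMARY')
   AND statistic_name IN ('ITL waits', 'buffer busy waits', 'row lock waits')
 ORDER BY object_name, subobject_name, statistic_name;
```

Comparing these counters before and after partitioning, under the same workload, gives a more reliable answer than reasoning about it in the abstract. Oracle recommends a power of two for the number of hash partitions so that rows distribute evenly.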

Otherwise Partitioning seems like a solution in search of a problem.

My original wrong answer:

It does not make sense to partition by the primary key because every partition will hold a single row. There is overhead associated with partitioning, so you want to keep the number of partitions to a reasonable number, like under 1000.

I think I was thinking of list partitions with your primary key values as the list values. See the comments below.

How about creating local indexes on ID, CID, PID and hash partitioning on the same columns? Wouldn't there be a benefit to the index scan, since rather than scanning the index for the complete table it only has to scan the index of an individual partition?
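
For reference, the setup this comment describes would look roughly like the sketch below, assuming the hash-partitioned TS1 from the question. Whether it helps is something to measure rather than assume.

```sql
-- Unique index partitioned the same way as the table (LOCAL).
-- This is allowed because the partitioning key columns are all index columns.
CREATE UNIQUE INDEX "TS1_PRIMARY" ON "TS1" ("ID", "CID", "PID") LOCAL;

-- Have the primary key constraint use that local index.
ALTER TABLE "TS1" ADD CONSTRAINT "TS1_PRIMARY"
  PRIMARY KEY ("ID", "CID", "PID") USING INDEX "TS1_PRIMARY";
```

Each local index partition is a smaller B-tree than one global index, but a unique lookup on a global index over millions of rows is already only a few block reads, so any saving per lookup is likely to be small.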
