Why does Ceph calculate the PG ID by object hash rather than with the CRUSH algorithm?
Ceph uses the CRUSH algorithm for the PG->OSD mapping, and it works well when OSD nodes are added or removed.
But for the obj->PG mapping, Ceph still uses a traditional hash:
pgid = hash(obj_name) % pg_num
This approach may lead to massive data migration if we change the number of PGs, and may even reduce the availability of the system.
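To make the concern concrete, here is a minimal Python sketch of the modulo mapping as written above. It is only an illustration under stated assumptions: SHA-256 stands in for Ceph's real object-name hash, and the helper names pg_of and moved_fraction are made up for this example. It counts how many object names get a different PG id after pg_num changes.

```python
import hashlib


def pg_of(obj_name: str, pg_num: int) -> int:
    """Map an object name to a PG id with plain hash-modulo.

    Assumption: SHA-256 is a stand-in hash, not the hash Ceph uses.
    """
    digest = hashlib.sha256(obj_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % pg_num


def moved_fraction(names, old_pg_num: int, new_pg_num: int) -> float:
    """Fraction of names whose PG id differs between the two PG counts."""
    moved = sum(1 for n in names if pg_of(n, old_pg_num) != pg_of(n, new_pg_num))
    return moved / len(names)


# Hypothetical object names, used only to exercise the mapping.
names = [f"obj-{i}" for i in range(20_000)]

frac_plus_one = moved_fraction(names, 8, 9)   # add a single PG
frac_double = moved_fraction(names, 8, 16)    # double the PG count

print(f"8 -> 9 PGs:  {frac_plus_one:.3f} of objects change PG id")
print(f"8 -> 16 PGs: {frac_double:.3f} of objects change PG id")
```

Under plain modulo, going from 8 to 9 PGs should remap roughly 8/9 of the names, and doubling from 8 to 16 roughly half, because a name keeps its PG id only when the hash leaves the same remainder under both moduli.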
Why doesn't Ceph use a CRUSH algorithm (say, straw2) for the obj->PG mapping, which could keep data migration close to the optimal amount when the number of PGs changes?
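For comparison, the straw2-style alternative asked about here can be sketched as follows. This is a hypothetical illustration, not Ceph code: each candidate PG gets a pseudo-random draw ln(u) / weight, where u comes from hashing the object name together with the PG id, and the PG with the largest draw wins. SHA-256 and the names uniform_draw and straw2_like_pg are assumptions made for this example.

```python
import hashlib
import math


def uniform_draw(obj_name: str, pg_id: int) -> float:
    """Deterministic pseudo-random value in (0, 1] for an (object, PG) pair.

    Assumption: SHA-256 is a stand-in; CRUSH uses its own hash.
    """
    digest = hashlib.sha256(f"{obj_name}/{pg_id}".encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") + 1) / 2**64


def straw2_like_pg(obj_name: str, pg_weights: dict) -> int:
    """Pick the PG with the largest ln(u) / weight draw (straw2-style)."""
    return max(
        pg_weights,
        key=lambda pg: math.log(uniform_draw(obj_name, pg)) / pg_weights[pg],
    )


# Hypothetical object names and equal PG weights, for illustration only.
names = [f"obj-{i}" for i in range(10_000)]
old_pgs = {pg: 1.0 for pg in range(8)}
new_pgs = {pg: 1.0 for pg in range(9)}  # one PG added

before = {n: straw2_like_pg(n, old_pgs) for n in names}
after = {n: straw2_like_pg(n, new_pgs) for n in names}

moved = [n for n in names if before[n] != after[n]]
moved_frac = len(moved) / len(names)
print(f"8 -> 9 PGs: {moved_frac:.3f} of objects change PG id")
```

With equal weights, adding a ninth PG should move only about 1/9 of the names, and every moved name goes to the new PG, which is the minimal-migration property the question refers to.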
I think these are different scenarios, and CRUSH is not a silver bullet.
This is my perception; criticism and discussion are welcome.