I would like to know the difference between these 2 rules:
# rules
rule rack_rule{
ruleset 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type rack
step emit
}
and
rule 2rack_2host{
ruleset 0
type replicated
min_size 1
max_size 10
step take default
step choose firstn 2 type rack
step chooseleaf firstn 2 type host
step emit
}
In my understanding, the first rule, rack_rule, uses the rack as the failure domain, so every PG should have OSDs from different racks. For example, with 2 racks and replication size = 2, a PG [osd.1,osd.2] should have its 2 OSDs in different racks.
In the second rule, I think it should select 2 different racks and, within each rack, 2 different hosts. So again, with 2 racks and replication size = 2, a PG [osd.1,osd.2] should have its 2 OSDs in different racks.
That is what I understood in theory, but I don't see these results in practice. With both rules, I get PGs whose OSDs are in the same rack, in a pool with replication size 2.
Your conclusion is not entirely correct. You did understand the first rule correctly:
step take default
step chooseleaf firstn 0 type rack
Ceph will choose as many racks (underneath the "default" root in the CRUSH tree) as the pool's size parameter defines. The second rule works a little differently:
step take default
step choose firstn 2 type rack
step chooseleaf firstn 2 type host
Ceph will select exactly 2 racks underneath root "default", and in each rack it will then choose 2 hosts. But this rule is designed for size = 4, not 2 (by the way, don't use size = 2). If you use this rule with size 2, you'll end up exactly as you already described: two hosts in the same rack will hold both replicas of a PG, because only the first two OSDs the rule selects are used and both come from the first rack. So if that rack fails, your PGs will become inactive and clients will encounter I/O errors until this resolves.
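To make the difference concrete, here is a minimal Python sketch of how the two rules build their result lists. It is a toy model, not CRUSH itself: the topology, the function names (`rack_rule`, `two_rack_two_host`, `rack_of`) and the "always pick the first item" behaviour are assumptions made for illustration, whereas real CRUSH picks items pseudo-randomly per PG. What it does preserve is the order in which the steps emit OSDs and the fact that only the first `size` entries end up in the PG.

```python
from typing import Dict, List

# Hypothetical topology for illustration only: rack -> host -> OSDs.
TOPOLOGY: Dict[str, Dict[str, List[str]]] = {
    "rack1": {"host1": ["osd.0", "osd.1"], "host2": ["osd.2", "osd.3"]},
    "rack2": {"host3": ["osd.4", "osd.5"], "host4": ["osd.6", "osd.7"]},
}


def rack_of(osd: str) -> str:
    """Return the rack that contains the given OSD in the toy topology."""
    for rack, hosts in TOPOLOGY.items():
        for osds in hosts.values():
            if osd in osds:
                return rack
    raise KeyError(osd)


def rack_rule(size: int) -> List[str]:
    """Model of: step chooseleaf firstn 0 type rack.

    firstn 0 means "as many as the pool size": one OSD from each of
    `size` distinct racks (fewer if the topology has fewer racks).
    """
    result = []
    for rack in list(TOPOLOGY)[:size]:
        first_host = next(iter(TOPOLOGY[rack].values()))
        result.append(first_host[0])
    return result


def two_rack_two_host(size: int) -> List[str]:
    """Model of: choose firstn 2 type rack, then chooseleaf firstn 2 type host.

    The steps produce up to 4 OSDs, grouped rack by rack; the pool size
    then keeps only the first `size` of them.
    """
    result = []
    for rack in list(TOPOLOGY)[:2]:
        for host in list(TOPOLOGY[rack])[:2]:
            result.append(TOPOLOGY[rack][host][0])
    return result[:size]


if __name__ == "__main__":
    for name, rule in (("rack_rule", rack_rule), ("2rack_2host", two_rack_two_host)):
        for size in (2, 4):
            osds = rule(size)
            print(name, "size", size, osds, [rack_of(o) for o in osds])
```

In this model, `two_rack_two_host(2)` returns two OSDs on different hosts of `rack1`, while `two_rack_two_host(4)` spreads four OSDs over both racks, which matches the explanation above.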
There's a tool called crushtool that lets you test your changes before actually implementing them. It's very helpful, try it out!
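As a rough sketch of what such a test could look like, the snippet below builds a crushtool test invocation from Python and runs it only if the tool and the map file are present. The file name `crushmap.bin`, the rule id `1` and the helper name `crushtool_test_cmd` are placeholders, not values from the question; a compiled map can be exported from a cluster with `ceph osd getcrushmap -o crushmap.bin`, and the rule id has to match the rule you want to check.

```python
import os
import shutil
import subprocess
from typing import List


def crushtool_test_cmd(map_path: str, rule_id: int, num_rep: int) -> List[str]:
    """Build a crushtool command that prints the OSD mapping for each test input.

    map_path, rule_id and num_rep are placeholders chosen by the caller.
    """
    return [
        "crushtool", "-i", map_path,
        "--test",
        "--rule", str(rule_id),
        "--num-rep", str(num_rep),
        "--show-mappings",
    ]


if __name__ == "__main__":
    # Assumed file name and rule id; adjust to your own cluster.
    cmd = crushtool_test_cmd("crushmap.bin", 1, 2)
    if shutil.which("crushtool") and os.path.exists("crushmap.bin"):
        subprocess.run(cmd, check=True)
    else:
        # Without crushtool or the map file, just show the command to run.
        print(" ".join(cmd))
```

Running the same command with `--num-rep 4` for the second rule should show whether the mappings span both racks; comparing the printed OSD ids against your CRUSH tree tells you which rack each replica lands in.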