ceph df MAX AVAIL miscalculation
My Ceph cluster shows the following weird behavior in the ceph df output:
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 817 TiB 399 TiB 418 TiB 418 TiB 51.21
ssd 1.4 TiB 1.2 TiB 22 GiB 174 GiB 12.17
TOTAL 818 TiB 400 TiB 418 TiB 419 TiB 51.15
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
pool1 45 300 21 TiB 6.95M 65 TiB 20.23 85 TiB
pool2 50 50 72 GiB 289.15k 357 GiB 0.14 85 TiB
pool3 53 64 2.9 TiB 754.06k 8.6 TiB 3.24 85 TiB
erasurepool_data 57 1024 138 TiB 50.81M 241 TiB 48.49 154 TiB
erasurepool_metadata 58 8 9.1 GiB 1.68M 27 GiB 2.46 362 GiB
device_health_metrics 59 1 22 MiB 163 66 MiB 0 85 TiB
.rgw.root 60 8 5.6 KiB 17 3.5 MiB 0 85 TiB
.rgw.log 61 8 70 MiB 2.56k 254 MiB 0 85 TiB
.rgw.control 62 8 0 B 8 0 B 0 85 TiB
.rgw.meta 63 8 7.6 MiB 52 32 MiB 0 85 TiB
.rgw.buckets.index 64 8 11 GiB 1.69k 34 GiB 3.01 362 GiB
.rgw.buckets.data 65 512 23 TiB 33.87M 72 TiB 21.94 85 TiB
As seen above, the raw available storage is 399 TiB, while MAX AVAIL in the pool list shows 85 TiB. I use 3 replicas for each replicated pool, and a 3+2 erasure code for erasurepool_data.
As far as I know, the MAX AVAIL column shows the maximum available capacity according to the replica size. So that comes to 85 * 3 = 255 TiB of raw space, while the cluster shows almost 400 TiB available.
Which should I trust? Or is this just a bug?
When you add in the erasure-coded pool's 154 TiB, you get 255 + 154 = 409 TiB, roughly the ~400 TiB the cluster reports as available.
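The replica and erasure-code overhead arithmetic behind these numbers can be sketched as follows. This is an illustrative helper, not anything Ceph provides; the usable figures (85 TiB, 154 TiB) come from the ceph df output above, and the 3+2 profile from the question:

```python
def raw_needed(usable_tib, replicas=None, k=None, m=None):
    """Raw TiB consumed to store `usable_tib` of user data."""
    if replicas is not None:            # replicated pool: one full copy per replica
        return usable_tib * replicas
    return usable_tib * (k + m) / k     # EC pool: overhead factor (k+m)/k

replicated_raw = raw_needed(85, replicas=3)   # 85 TiB usable -> 255 TiB raw
ec_raw = raw_needed(154, k=3, m=2)            # 154 TiB usable -> ~257 TiB raw

print(replicated_raw, round(ec_raw))  # 255 257
```

Note that an EC 3+2 pool only pays a 5/3 overhead, so the same raw headroom yields a larger MAX AVAIL figure than a replica-3 pool.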
The MAX AVAIL column represents the amount of data that can be used before the first OSD becomes full. It takes into account the projected distribution of data across OSDs from the CRUSH map and uses the first OSD to fill up as the target, so this does not seem to be a bug. If MAX AVAIL is not what you expect it to be, look at the data distribution using ceph osd tree and make sure you have a uniform distribution.
You can also check some helpful posts here that explain some of the miscalculations. Since you have erasure coding involved, please check this SO post:
Turns out the max available space is calculated according to the fullest OSDs in the cluster and has nothing to do with the total free space in the cluster. From what I've found, this kind of fluctuation mainly happens on small clusters.
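That "fullest OSD" behaviour can be illustrated with a toy model. The OSD sizes and usage below are invented for illustration, and Ceph's real calculation also weighs the per-pool CRUSH distribution, but the limiting principle is the same:

```python
# Toy model: MAX AVAIL is capped by the fullest OSD, not by total free space.
osds = [
    {"size_tib": 16, "used_tib": 4.0},
    {"size_tib": 16, "used_tib": 8.0},
    {"size_tib": 16, "used_tib": 12.0},  # imbalanced: this OSD is 75% full
]

def max_avail(osds, replicas):
    # The pool can only grow until the fullest OSD hits capacity, so that
    # OSD's free fraction caps the usable share of every other OSD too.
    limiting_free_ratio = min(
        (o["size_tib"] - o["used_tib"]) / o["size_tib"] for o in osds
    )
    total = sum(o["size_tib"] for o in osds)
    return total * limiting_free_ratio / replicas

print(max_avail(osds, replicas=3))  # 4.0
```

Here 24 TiB of raw space is free in total, yet MAX AVAIL for a replica-3 pool is only 4 TiB usable, because the fullest OSD has just 25% headroom. Rebalancing (e.g. with the Ceph balancer module) raises MAX AVAIL without adding any capacity.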