Why do nodes in a fully replicated Cassandra cluster have different data sizes?
I have a 3-node Cassandra cluster (version 3.11.11) with replication factor 3. Only 2 of the nodes receive client requests; Node3 only syncs with the other 2 nodes.
In theory, each node should have the same data size. But in practice, the nodes end up with different data sizes, as shown in the picture.
We run nodetool repair daily, and operations like compaction are done automatically with default settings.
What could be the reason for the size difference?
It ultimately comes down to how data gets compacted in the long run. Compaction is a local process, and there is no guarantee that SSTables stack up the same way on every node, so I don't see anything abnormal here. In theory all nodes hold the same data logically, but physically it may vary. For example, Node3 may have old SSTables that are not getting compacted because of their size (if using STCS), while on the other nodes those SSTables have already been compacted, which reduced the data size on those nodes.
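To make the "same logically, different physically" point concrete, here is a toy Python sketch. It is not Cassandra code: the dict-based SSTable model and the function names (`logical_view`, `disk_size`, `compact`) are assumptions made purely for illustration, and it does not model STCS bucketing or tombstones. It only shows that two nodes can return identical data while holding a different number of bytes on disk, depending on whether compaction has merged overwritten versions yet.

```python
# Toy model (hypothetical, not Cassandra internals):
# an SSTable is a dict mapping key -> (write_timestamp, value_size_bytes).

def logical_view(sstables):
    """Merge SSTables the way a read would: newest timestamp wins per key."""
    merged = {}
    for table in sstables:
        for key, (ts, size) in table.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, size)
    return merged

def disk_size(sstables):
    """Bytes on disk: every stored version counts, including shadowed ones."""
    return sum(size for table in sstables for (_, size) in table.values())

def compact(sstables):
    """Merge all SSTables into one, dropping shadowed (overwritten) versions."""
    return [logical_view(sstables)]

# The same writes reach both nodes: key "a" is written three times, "b" twice.
writes = [
    {"a": (1, 100), "b": (1, 100)},
    {"a": (2, 100), "b": (2, 100)},
    {"a": (3, 100)},
]

node1 = compact(writes)  # compaction has already merged everything here
node3 = list(writes)     # compaction has not picked these SSTables up yet

# Both nodes answer reads identically...
assert logical_view(node1) == logical_view(node3)

# ...but they occupy different amounts of disk space.
print(disk_size(node1), disk_size(node3))  # 200 500
```

On a real cluster, comparing the SSTable count and space used per table on each node (for example with nodetool tablestats) should show whether one node is simply carrying more uncompacted SSTables than the others.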