
Why does a fully replicated Cassandra cluster have nodes with different data sizes?

I have a 3-node Cassandra cluster (version 3.11.11) with replication factor 3. Only 2 of the nodes receive requests; Node3 only syncs with the other 2 nodes.

[Image: data size of each node]

In theory, each node should have the same data size. In practice, I end up with nodes that have different data sizes, as shown in the picture.

We run nodetool repair daily, and operations such as compaction run automatically with default settings.

What could be the reason for the size difference?

It ultimately comes down to how data gets compacted over the long run. Compaction is a local process on each node, and there is no guarantee about how SSTables stack up on any given node, so I don't see any anomaly here. The theory only says that all nodes hold the same data logically; physically, the size may vary. For example, Node3 may hold old SSTables that are not being compacted because of their size (if you use STCS), while on the other nodes the equivalent SSTables have already been compacted, which reduced the data size on those nodes.
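To make the logical-versus-physical point concrete, here is a minimal toy sketch in Python. It is not Cassandra's actual compaction code: real STCS groups SSTables into buckets by on-disk size, whereas this model simply merges every 4 SSTables of the same "tier". The function names (`merge`, `simulate`), the flush intervals, and the workload are all assumptions made up for illustration. Both toy nodes receive exactly the same writes but flush their memtables at different moments.

```python
from typing import Dict, List, Tuple

MIN_THRESHOLD = 4  # STCS min_threshold default; the rest of this model is a simplification

SSTable = Dict[str, int]  # key -> timestamp of the stored version


def merge(tables: List[SSTable]) -> SSTable:
    """Toy compaction: keep only the newest version of each key."""
    merged: SSTable = {}
    for table in tables:
        for key, ts in table.items():
            if ts >= merged.get(key, -1):
                merged[key] = ts
    return merged


def simulate(writes: List[str], flush_every: int) -> Tuple[SSTable, int]:
    """Replay writes on one hypothetical node; return (logical data, entries on disk)."""
    tiers: Dict[int, List[SSTable]] = {}
    memtable: SSTable = {}
    for ts, key in enumerate(writes, start=1):
        memtable[key] = ts
        if ts % flush_every == 0:
            # Flush the memtable as a new tier-0 SSTable.
            tiers.setdefault(0, []).append(memtable)
            memtable = {}
            # Merge a tier only once it holds MIN_THRESHOLD SSTables.
            tier = 0
            while len(tiers.get(tier, [])) >= MIN_THRESHOLD:
                tiers.setdefault(tier + 1, []).append(merge(tiers.pop(tier)))
                tier += 1
    on_disk = [t for level in tiers.values() for t in level]
    logical = merge(on_disk + [memtable])
    physical = sum(len(t) for t in on_disk) + len(memtable)
    return logical, physical


# Same workload for both nodes: 100 writes that keep overwriting 10 keys.
writes = [f"k{i % 10}" for i in range(100)]
logical_a, physical_a = simulate(writes, flush_every=5)   # assumed: flushes often
logical_b, physical_b = simulate(writes, flush_every=25)  # assumed: flushes rarely

print(logical_a == logical_b)  # True: both nodes hold the same logical data
print(physical_a, physical_b)  # 20 10: but a different number of stored entries
```

In this sketch the first node ends up with two SSTables that each hold a full copy of the 10 keys, because the tier they sit in has not reached the merge threshold, while the second node has already merged everything into one SSTable. On a real cluster, comparing the SSTable count and space used per table on each node (for example with `nodetool tablestats`) is one way to check whether something similar is going on.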
