Cassandra: Snitch vs. Gossip

Original post: 2017-09-30 01:57:27, tagged cassandra

I can't understand the difference between Snitch and Gossip in Cassandra, and I can't find a single source that discusses the subject, let alone one that provides a good answer. It seems to me that Snitch and Gossip are both inter-node communication protocols, so why do we need two of them?

I know that Gossip helps a node get information from bootstrap nodes, but that doesn't really explain the difference: when a node starts, it also needs to learn about the data centers and racks, which is supposed to be the domain of the Snitch.

3 Answers

Gossip is a protocol, and Snitch is a component that utilizes it. Snitch is a bit more than gossip: it has at least some heuristics, such as identifying data centers or racks, while gossip is a convenient tool for getting this information. Almost all gossip does is spread around the cluster, following some rules so that it covers all the necessary nodes, and collect technical data such as IP addresses, health, etc. Snitch uses this information to do something more. One of its features is identifying different data centers by analyzing the received IPs. This information is then used by other components for further actions, such as deciding where replicas are located. So they decided to give this functionality a separate name to identify it; it's really all about layering the functionality.
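To make the "analyzing received IPs" idea concrete, here is a minimal sketch. It assumes the convention I recall from Cassandra's RackInferringSnitch, where the second octet of a node's IPv4 address stands for the data center and the third for the rack. The function names are made up for illustration and are not part of any Cassandra API.

```python
from __future__ import annotations

from ipaddress import IPv4Address


def infer_location(ip: str) -> tuple[str, str]:
    """Guess (datacenter, rack) from an IPv4 address.

    Assumption: the second octet names the data center and the
    third octet names the rack.
    """
    octets = str(IPv4Address(ip)).split(".")  # raises ValueError on a bad address
    return octets[1], octets[2]


def group_by_datacenter(ips: list[str]) -> dict[str, list[str]]:
    """Group node addresses by their inferred data center."""
    groups: dict[str, list[str]] = {}
    for ip in ips:
        datacenter, _rack = infer_location(ip)
        groups.setdefault(datacenter, []).append(ip)
    return groups


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    print(group_by_datacenter(["10.20.1.5", "10.20.2.7", "10.30.1.9"]))
```

The point is only the layering: the addresses could arrive by any means (gossip included), and the grouping heuristic sits on top of them.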

Some relevant information can also be found here: https://books.google.ru/books?id=h36CCwAAQBAJ&pg=PT21&lpg=PT21&dq=snitch+gossip&source=bl&ots=fjxy_z78Gj&sig=KpqdkKaREIo2YAWyJj3yMZCyNn4&hl=ru&sa=X&ved=0ahUKEwiUktS8q8zWAhWIQZoKHTViD0U4ChDoAQhUMAc#v=onepage&q=snitch%20gossip&f=false

And here is a more detailed snitch definition (but in Scylla): https://github.com/scylladb/scylla/wiki/Snitches

Gossip is used to identify the state of machines (whether they are in the cluster, and whether they are up, down, joining, or leaving).

The snitches help map ownership to an actual machine and route queries (given these 10 nodes in the cluster, which of the 10 own the data for a given key).

Different snitches can help assign data in different ways. The simple snitch just places all instances into datacenter1/rack1 and uses the simple distributed hashtable / naive partitioner placement. The property file snitch lets you create a file that lists all of the instances and maps each instance to a datacenter/rack, ensuring that replicas always exist on different racks (and datacenters, as defined by the replication strategy).
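As a rough illustration of the property-file approach, the sketch below parses a small topology listing and looks nodes up in it. The `address=DC:RACK` layout with a `default` entry is modeled on what I remember of the PropertyFileSnitch topology file, simplified to IPv4. The addresses, data center names, and function names are all hypothetical.

```python
from __future__ import annotations

# Hypothetical topology listing, for illustration only.
SAMPLE_TOPOLOGY = """\
# address=datacenter:rack
10.0.0.1=DC1:RAC1
10.0.0.2=DC1:RAC2
10.1.0.1=DC2:RAC1
default=DC1:RAC1
"""


def parse_topology(text: str) -> dict[str, tuple[str, str]]:
    """Map each listed address (and 'default') to a (datacenter, rack) pair."""
    topology: dict[str, tuple[str, str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        address, _, location = line.partition("=")
        datacenter, _, rack = location.partition(":")
        topology[address.strip()] = (datacenter.strip(), rack.strip())
    return topology


def locate(topology: dict[str, tuple[str, str]], ip: str) -> tuple[str, str]:
    """Look a node up, falling back to the 'default' entry if it is unlisted."""
    return topology.get(ip, topology["default"])


if __name__ == "__main__":
    topology = parse_topology(SAMPLE_TOPOLOGY)
    print(locate(topology, "10.1.0.1"))
    print(locate(topology, "10.9.9.9"))  # unlisted, so it gets the default
```

Note that every node needs the same full listing for this to work, which is the contrast with the gossiping variants.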

The gossiping-property-file-snitch and the ec2 snitches are somewhat like the property file snitch in that they're rack/topology aware, but they read the local instance's topology information (either from a file or from the ec2 APIs) and then gossip it to others, so each node is responsible for broadcasting its own topology information (through gossip).
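Here is a toy simulation of that idea: each node starts out knowing only its own datacenter/rack and learns the rest by swapping views with random peers. It is a simplified model, not Cassandra's actual gossip implementation. The push-pull exchange, the seed, and the round cap are assumptions chosen to keep the example small.

```python
from __future__ import annotations

import random


def gossip_topology(
    local_info: dict[str, tuple[str, str]],
    seed: int = 0,
    max_rounds: int = 100,
) -> tuple[dict[str, dict[str, tuple[str, str]]], int]:
    """Spread each node's own (datacenter, rack) to every other node.

    Returns (views, rounds_used), where views[node] is what that node
    has learned about the cluster topology.
    """
    rng = random.Random(seed)  # seeded so repeated runs behave the same
    nodes = sorted(local_info)
    # Each node starts out knowing only its own location.
    views = {node: {node: local_info[node]} for node in nodes}
    if len(nodes) < 2:
        return views, 0  # nobody to gossip with

    for round_no in range(1, max_rounds + 1):
        for node in nodes:
            peer = rng.choice([n for n in nodes if n != node])
            # Push-pull exchange: both sides end up with the union of their views.
            merged = {**views[node], **views[peer]}
            views[node] = dict(merged)
            views[peer] = dict(merged)
        if all(len(view) == len(nodes) for view in views.values()):
            return views, round_no  # every node now knows every other node
    return views, max_rounds


if __name__ == "__main__":
    # Hypothetical cluster, for illustration only.
    cluster = {
        "10.0.0.1": ("DC1", "RAC1"),
        "10.0.0.2": ("DC1", "RAC2"),
        "10.1.0.1": ("DC2", "RAC1"),
        "10.1.0.2": ("DC2", "RAC2"),
    }
    views, rounds = gossip_topology(cluster)
    print(f"converged after {rounds} round(s)")
```

No node needs a cluster-wide file here: each one contributes a single entry and gossip does the distribution.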

Gossip is an epidemic protocol that spreads through the cluster. It transmits cluster metadata, i.e. the state of the cluster. The following information is shared as part of Gossip:

Generation: when it booted
Version: Timestamp

Application state:

Status: Normal/Joining/Leaving
DC: data center location
Rack: rack number of this node
Schema: schema version on the node
Load: disk pressure on the node
Severity: the pressure on the system from the I/O standpoint
etc...
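The fields listed above could be modeled roughly as follows. This is a sketch, not Cassandra's real classes: `EndpointState`, `is_newer_than`, and `merge` are hypothetical names. Ordering whole states by (generation, version) is a simplifying assumption; real implementations track versions at a finer granularity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class EndpointState:
    """What one node advertises about itself through gossip (simplified)."""

    generation: int  # changes each time the node boots
    version: int     # grows as the node updates its state
    app_state: dict[str, str] = field(default_factory=dict)  # STATUS, DC, RACK, ...

    def is_newer_than(self, other: EndpointState) -> bool:
        # Assumption: a later boot always wins; within one boot, the higher version wins.
        return (self.generation, self.version) > (other.generation, other.version)


def merge(
    local: dict[str, EndpointState],
    remote: dict[str, EndpointState],
) -> dict[str, EndpointState]:
    """Combine two views of the cluster, keeping the newer state per endpoint."""
    merged = dict(local)
    for endpoint, state in remote.items():
        known = merged.get(endpoint)
        if known is None or state.is_newer_than(known):
            merged[endpoint] = state
    return merged


if __name__ == "__main__":
    # Hypothetical states, for illustration only.
    mine = {"10.0.0.1": EndpointState(1, 5, {"STATUS": "NORMAL", "DC": "DC1", "RACK": "RAC1"})}
    theirs = {
        "10.0.0.1": EndpointState(2, 1, {"STATUS": "JOINING", "DC": "DC1", "RACK": "RAC1"}),
        "10.0.0.2": EndpointState(1, 3, {"STATUS": "NORMAL", "DC": "DC1", "RACK": "RAC2"}),
    }
    for endpoint, state in merge(mine, theirs).items():
        print(endpoint, state.app_state["STATUS"])
```

Because DC and Rack travel inside the application state, this is also where a gossiping snitch's topology information would ride along.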

In other words, Snitch helps map IPs to racks and data centers. It creates a topology by grouping nodes, which helps determine where data is read from. When a read request comes in, it reaches the coordinator node; the consistency level of the read request and the read_repair_chance of that column family decide how the snitch steps in. Only one node sends back the requested data, and it is up to the snitch to determine which one.
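A rough sketch of that read path follows, under stated assumptions: replicas are ranked as same rack, then same data center, then everything else; the consistency level decides how many replicas are contacted; the closest one is asked for the full data and the rest only for a digest. It ignores read_repair_chance and any dynamic latency scoring, and every name in it is hypothetical.

```python
from __future__ import annotations


def sort_by_proximity(
    coordinator: str,
    replicas: list[str],
    topology: dict[str, tuple[str, str]],
) -> list[str]:
    """Order replicas from closest to farthest, as seen from the coordinator."""
    coord_dc, coord_rack = topology[coordinator]

    def distance(ip: str) -> int:
        if ip == coordinator:
            return 0  # the coordinator itself holds a replica
        datacenter, rack = topology[ip]
        if datacenter != coord_dc:
            return 3  # different data center
        return 1 if rack == coord_rack else 2  # same rack beats same data center

    return sorted(replicas, key=distance)  # stable sort: ties keep input order


def required_responses(consistency_level: str, replication_factor: int) -> int:
    """How many replicas must answer for the given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def plan_read(
    coordinator: str,
    replicas: list[str],
    topology: dict[str, tuple[str, str]],
    consistency_level: str,
) -> dict[str, object]:
    """Pick the one replica that returns data and the ones that return digests."""
    ordered = sort_by_proximity(coordinator, replicas, topology)
    contacted = ordered[: required_responses(consistency_level, len(replicas))]
    return {"data": contacted[0], "digest": contacted[1:]}


if __name__ == "__main__":
    # Hypothetical cluster, for illustration only.
    topology = {
        "a": ("DC1", "R1"),
        "b": ("DC1", "R2"),
        "c": ("DC2", "R1"),
        "d": ("DC1", "R1"),
    }
    print(plan_read("a", ["c", "b", "d"], topology, "QUORUM"))
```

With coordinator "a" and a quorum read, replica "d" (same rack) is the one asked for data, and "b" (same data center, different rack) only has to confirm with a digest.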