
Backup and Restore Cassandra from a 4-node cluster

I have a 4-node Cassandra cluster. Each node has 50% of the data. Can anyone suggest the best way to take a backup so that a restore brings back all the data?

Thanks for your help.

Best practice is to create a snapshot, which backs up all your existing data by creating hard links to the SSTables (Cassandra's data files). What other threads don't seem to mention is that you also want to back up your schema. This can be done using cqlsh's DESCRIBE command, e.g.:

DESCRIBE TABLE system.schema_columns;

CREATE TABLE system.schema_columns (
    keyspace_name text,
// some output removed
    PRIMARY KEY (keyspace_name, columnfamily_name, column_name)
) WITH CLUSTERING ORDER BY (columnfamily_name ASC, column_name ASC)
// rest of output removed
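To capture the whole schema rather than one table, cqlsh can print it with `DESCRIBE SCHEMA`, and `-e` runs a single statement non-interactively. The following is a minimal sketch of wrapping that call; the function names, the host address, and the output file name are illustrative assumptions, not something from the answer above.

```python
import subprocess
from pathlib import Path


def schema_dump_command(host: str, statement: str = "DESCRIBE SCHEMA") -> list[str]:
    """Build the cqlsh invocation that prints the schema as CQL statements."""
    # cqlsh accepts the contact host as a positional argument and -e for a
    # single statement to execute before exiting.
    return ["cqlsh", host, "-e", statement]


def dump_schema(host: str, out_file: Path) -> None:
    """Run cqlsh against `host` and save the schema CQL to `out_file`."""
    result = subprocess.run(
        schema_dump_command(host),
        check=True,           # raise if cqlsh exits with a non-zero status
        capture_output=True,
        text=True,
    )
    out_file.write_text(result.stdout)
```

Calling `dump_schema("10.0.0.1", Path("schema.cql"))` (both values are placeholders) would leave a file that can later be replayed with cqlsh to recreate the keyspaces and tables before loading data.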

Also use a parallel ssh tool to create the snapshots on all your nodes (pssh is one of the popular tools).
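As a rough illustration, the parallel snapshot could be driven like this. The sketch only builds the command line; it assumes a hosts file listing one node per line, uses `nodetool snapshot -t <tag>` to name the snapshot, and relies on pssh's `-h` (hosts file) and `-i` (print output inline) options. The tag format and function names are made up for the example, and on some distributions the binary is installed as `parallel-ssh` rather than `pssh`.

```python
import shlex
from datetime import datetime, timezone
from typing import Optional


def snapshot_tag(now: datetime) -> str:
    """Derive a snapshot name from a timestamp so every node uses the same tag."""
    return now.strftime("backup-%Y%m%dT%H%M%SZ")


def pssh_snapshot_command(
    hosts_file: str, tag: str, keyspace: Optional[str] = None
) -> list[str]:
    """Build a pssh command that runs `nodetool snapshot` on every listed host."""
    remote = ["nodetool", "snapshot", "-t", tag]
    if keyspace:
        # Without a keyspace argument nodetool snapshots all keyspaces.
        remote.append(keyspace)
    # pssh takes the remote command as its trailing argument(s); joining it
    # into one shell-quoted string keeps the tag intact on the remote side.
    return ["pssh", "-h", hosts_file, "-i", shlex.join(remote)]


if __name__ == "__main__":
    tag = snapshot_tag(datetime.now(timezone.utc))
    # Print the command instead of running it; pass the list to
    # subprocess.run(..., check=True) once it looks right.
    print(shlex.join(pssh_snapshot_command("hosts.txt", tag)))
```

Using one shared tag across nodes makes it easy to find the matching snapshot directories later.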

So to outline the process:

  1. Back up your schema (this only needs repeating after a schema change, such as an ALTER on a table)
  2. Use pssh to create a parallel snapshot
  3. Copy the snapshots to another, non-Cassandra machine (if you have a hardware failure, leaving the snapshots on the same machine as Cassandra means you risk losing them and the node at the same time).
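For step 3, a snapshot taken with a given tag ends up under each table's directory as `<data_dir>/<keyspace>/<table>/snapshots/<tag>`. The sketch below collects those directories on one node and copies them to a destination while keeping the keyspace and table in the path. It assumes the destination is a mounted location on another machine and that the tag contains no glob characters; the function names are hypothetical, and the data directory varies by installation (`/var/lib/cassandra/data` is a common default for package installs).

```python
import shutil
from pathlib import Path


def find_snapshot_dirs(data_dir: Path, tag: str) -> list[Path]:
    """Return every <keyspace>/<table>/snapshots/<tag> directory under data_dir."""
    return sorted(p for p in data_dir.glob(f"*/*/snapshots/{tag}") if p.is_dir())


def copy_snapshots(data_dir: Path, tag: str, dest_root: Path) -> list[Path]:
    """Copy all snapshot directories for `tag` below dest_root and return the copies."""
    copied = []
    for src in find_snapshot_dirs(data_dir, tag):
        # Preserve keyspace/table in the destination path so the files can be
        # mapped back to the right table on restore.
        dest = dest_root / src.relative_to(data_dir)
        shutil.copytree(src, dest)  # creates missing parent directories
        copied.append(dest)
    return copied
```

Each node holds only part of the data, so this would need to run on every node, with a per-node `dest_root` to keep the copies apart.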

There is an overview of how to snapshot here and an overview of how to recover lost nodes using a snapshot here.
