
What is the procedure to restore a Cassandra incremental backup when I have a large number of files in the backup directory?

I'm asking this question because I don't see a specific procedure for it in the DataStax docs.

I enabled incremental backups after taking a snapshot, and now there are around 200k files in the backups directory. I'm not sure of the best way to restore them.

I copied all of them into the keyspace's table directory and ran nodetool refresh <ks> <tbl>, but it doesn't work as expected and throws an exception. Is there a workaround for this?

I'm using a 16G Xmx as of now. I see some errors in the logs, shown below. Is this something to do with the JVM params?

ERROR [gbp-cass-49] [Reference-Reaper:1] 2020-07-29 18:49:01,704 Ref.java:223 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@156d6370) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@464162733:[Memory@[0..80), Memory@[0..a00)] was not released before the reference was garbage collected

nodetool refresh threw the following error on stdout:

error: null
-- StackTrace --
java.lang.AssertionError
    at org.apache.cassandra.io.util.FileUtils.renameWithConfirm(FileUtils.java:178)
    at org.apache.cassandra.io.util.FileUtils.renameWithConfirm(FileUtils.java:173)
    at org.apache.cassandra.io.sstable.format.SSTableWriter.rename(SSTableWriter.java:273)
    at org.apache.cassandra.db.ColumnFamilyStore.loadNewSSTables(ColumnFamilyStore.java:714)
    at org.apache.cassandra.db.ColumnFamilyStore.loadNewSSTables(ColumnFamilyStore.java:658)
    at org.apache.cassandra.service.StorageService.loadNewSSTables(StorageService.java:4555)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
    at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
    at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1466)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1399)
    at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:828)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$241(TCPTransport.java:683)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$177/1629407070.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

There really isn't enough actionable information in your question to provide a meaningful answer, but I'll do my best to respond.

Incremental backups allow you to offload copies of the data to off-server storage. However, since Cassandra hard-links every SSTable flushed from a memtable into the backups/ directory, its contents can grow pretty quickly, so you need to manage it. This would explain why you ended up with 200k files.
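The hard-link behaviour is worth seeing for yourself: each name in backups/ points at the same inode as the live SSTable, so the directory grows in file count without duplicating data on disk. Here's a minimal sketch of that mechanism on a throwaway directory (the table layout and file name are stand-ins, not your actual cluster paths):

```shell
#!/bin/sh
set -e
# Throwaway directory mimicking a table dir with its backups/ subdirectory.
WORK=$(mktemp -d)
mkdir -p "$WORK/table/backups"

# Stand-in for a flushed SSTable data component.
echo data > "$WORK/table/md-1-big-Data.db"

# What Cassandra effectively does on flush when incremental backups are on:
# a hard link into backups/, not a copy.
ln "$WORK/table/md-1-big-Data.db" "$WORK/table/backups/md-1-big-Data.db"

# Both names now reference the same inode; the link count is 2.
stat -c %h "$WORK/table/md-1-big-Data.db"
```

Because these are links rather than copies, clearing backups/ after you've shipped its contents off-server is cheap and safe for the live data files.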

Incremental backups are meant to be used in conjunction with snapshots, which are the equivalent of full backups in the traditional sense that most people think of backups. Consider snapshots akin to cold (full) backups, and incremental backups as the deltas since the last snapshot.

This means that every time you take a snapshot on a node, you need to clear the incremental backups in the backups/ directory. It follows that when you restore, you restore the respective snapshot (the full backup) first, then apply the incrementals (the backups of the "deltas" taken after that snapshot).
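The restore sequence above can be sketched as a script. This is a demo against a temporary directory tree; the keyspace, table, and snapshot names are hypothetical placeholders, and on a real node the DATA root would be your actual data directory (commonly /var/lib/cassandra/data) with the final nodetool command run for real rather than echoed:

```shell
#!/bin/sh
set -e
# Demo tree standing in for <data_dir>/<keyspace>/<table-uuid>/.
DATA=$(mktemp -d)
KS=my_ks; TBL=my_tbl; SNAP=my_snapshot    # hypothetical names
TDIR="$DATA/$KS/$TBL"
mkdir -p "$TDIR/snapshots/$SNAP" "$TDIR/backups"
touch "$TDIR/snapshots/$SNAP/md-1-big-Data.db"   # stand-in snapshot SSTable
touch "$TDIR/backups/md-2-big-Data.db"           # stand-in incremental SSTable

# 1. Restore the snapshot (full backup) into the live table directory.
cp "$TDIR/snapshots/$SNAP/"*.db "$TDIR/"

# 2. Apply the incrementals (deltas flushed after that snapshot was taken).
cp "$TDIR/backups/"*.db "$TDIR/"

# 3. Tell Cassandra to pick up the newly placed SSTables.
#    Echoed here because there is no live node in this demo; on a real node
#    make sure the files are owned by the cassandra user first, then run it.
echo "nodetool refresh -- $KS $TBL"
```

With 200k incremental files, expect step 2 and the refresh to be slow; it's also a strong hint that snapshots weren't taken (and backups/ cleared) often enough.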

To respond to the other points you raised, you will need to explain what you meant by "I don't see it working as expected". Also, what is the full error message plus the full stack trace for the exception? That level of detail is required to make a meaningful diagnosis beyond "it doesn't work".

The error you posted is safe to ignore. It's just a message saying that the Reference-Reaper thread succeeded in finding orphaned references and released them back to the pool. It really should be logged at INFO rather than ERROR level.

I hope this helps. Cheers!

[EDIT] The stack trace you posted in your update looks to me like a filesystem permissions issue. Cassandra can't rename the files, so they probably (a) have the wrong ownership, (b) have incorrect permissions, or (c) both. Cheers!
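A quick sketch of the remedy, assuming the common case where the files were copied in as root (or with restrictive modes) and the node runs as the cassandra user. The demo part runs against a temporary file; the echoed commands at the end are what you would actually run on the node, with the path being a hypothetical example:

```shell
#!/bin/sh
set -e
# Demo: a copied-in SSTable component with unusable permissions.
WORK=$(mktemp -d)
touch "$WORK/md-3-big-Data.db"
chmod 000 "$WORK/md-3-big-Data.db"   # simulate a bad copy (no access at all)

# The fix: give the file sane permissions again.
chmod 644 "$WORK/md-3-big-Data.db"

# On the real node (as root), fix ownership and perms on the whole table dir
# before re-running the refresh -- path and keyspace/table are placeholders:
echo "chown -R cassandra:cassandra /var/lib/cassandra/data/my_ks/my_tbl"
echo "chmod -R u+rwX /var/lib/cassandra/data/my_ks/my_tbl"
```

The rename in the stack trace (FileUtils.renameWithConfirm) also needs write permission on the table directory itself, not just the files, so check the directory's ownership too.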
