Apache Cassandra Lunch #28: Cassandra Backup / Restore Scenarios

In case you missed it, Apache Cassandra Lunch #28 focused on specific scenarios for Cassandra’s backup and restore. We discussed some methods for restoring data to a Cassandra cluster, and covered how factors like the topology of a cluster or the need for constant uptime can affect the backup/restore process. The recording of the session, which includes a more in-depth discussion, is embedded below. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!

In Cassandra Lunch #28, we discussed a number of situations that affect the Cassandra backup and restore process. The discussion grew out of a specific question about using nodetool snapshot to restore data to a cluster.

We began by covering the way that Cassandra stores data, focusing on the folder structure within a node. This matters because a nodetool snapshot, or a directly copied data directory, holds only the SSTables for the token ranges that node owns, so it can only restore data to a cluster with the same topology. We also discussed other tools for data restoration: SSTableLoader, and external backup/restore tools such as Cassandra Medusa, DSE OpsCenter, and Tablesnap.
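To make the folder structure concrete, here is a minimal Python sketch that lists the snapshots on one node. It assumes the usual on-disk layout of <data_dir>/<keyspace>/<table>-<table_id>/snapshots/<tag>/; the function name find_snapshots and the Snapshot tuple are our own illustrative names, not part of any Cassandra tooling, and the data directory shown in the usage note is only a common package default that your cassandra.yaml may override.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class Snapshot(NamedTuple):
    """One snapshot directory on a single node (illustrative type)."""
    keyspace: str
    table: str
    tag: str
    path: Path


def find_snapshots(data_dir: Path) -> Iterator[Snapshot]:
    """Yield every snapshot found under a Cassandra data directory.

    Assumes the layout
    <data_dir>/<keyspace>/<table>-<table_id>/snapshots/<tag>/.
    """
    for snap_dir in sorted(data_dir.glob("*/*/snapshots/*")):
        if not snap_dir.is_dir():
            continue
        table_dir = snap_dir.parent.parent
        keyspace = table_dir.parent.name
        # Drop the "-<table_id>" suffix appended to the table name on disk.
        table = table_dir.name.rsplit("-", 1)[0]
        yield Snapshot(keyspace, table, snap_dir.name, snap_dir)
```

For example, list(find_snapshots(Path("/var/lib/cassandra/data"))) would return one entry per keyspace, table, and snapshot tag on that node. Each node only lists its own snapshots, which is why a full backup has to collect them from every node in the cluster.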

For an in-depth, general discussion of Cassandra’s backup/restore, see Cassandra Lunch #15, which focused on that topic. In that post, we discussed disaster avoidance, disaster recovery, and external backup/restore tools. Cassandra Lunch #28 builds on it by covering scenarios in which you may need a specific backup/restore process, either because the cluster topology has changed or because data must remain available during the restore.
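As a rough illustration of how topology changes the restore path, the Python sketch below only builds the command lines for each case and does not run anything. The helper names, paths, and default contact host are hypothetical. The underlying commands are nodetool refresh, which loads SSTable files that have been copied into a table's data directory, and sstableloader, which streams SSTables to whichever nodes currently own the data and takes the keyspace and table names from the last two components of the directory it is given.

```python
from typing import List


def same_topology_restore(
    keyspace: str, table: str, snapshot_dir: str, table_dir: str
) -> List[List[str]]:
    """Commands to run on each node, using that node's own snapshot."""
    return [
        # Copy the snapshot's SSTable files back into the live table directory.
        ["cp", "-a", f"{snapshot_dir}/.", table_dir],
        # Ask Cassandra to pick up the newly placed SSTables without a restart.
        ["nodetool", "refresh", keyspace, table],
    ]


def changed_topology_restore(
    staging_dir: str, contact_host: str = "127.0.0.1"
) -> List[List[str]]:
    """Commands for a cluster whose token ownership differs from the backup.

    staging_dir must end in <keyspace>/<table>, since sstableloader derives
    the target keyspace and table from those two path components.
    """
    return [["sstableloader", "-d", contact_host, staging_dir]]
```

In the changed-topology case the snapshot files from every original node would each need to be staged and loaded, since each node's snapshot holds only part of the data.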


Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was not only to fill the gap left by Planet Cassandra, but also to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

