In Cassandra Lunch #15, we discuss Cassandra backup and restoration: disaster avoidance, disaster recovery, and the tools that can be used to back up and restore your Cassandra data. We also walk through an example scenario of how someone has set up multi-node clusters and how they go about data backup and restoration.
Strategy
The strategy portion covers disaster avoidance, disaster recovery, and tools for Cassandra backup and restoration.
Disaster Avoidance
For disaster avoidance, we discuss strategies for using multi-datacenter, multi-region, and/or multi-cloud clusters. We also discuss examples using AWS and Google, which are covered in more depth in the video linked below.
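As a rough illustration of what multi-datacenter replication looks like at the keyspace level, the sketch below builds a CREATE KEYSPACE statement that uses NetworkTopologyStrategy with one replication factor per datacenter. The helper name, the keyspace name, and the datacenter names are hypothetical; real datacenter names depend on how the cluster's snitch is configured.

```python
def network_topology_keyspace_cql(keyspace: str, replicas_per_dc: dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement that replicates to several datacenters."""
    if not replicas_per_dc:
        raise ValueError("at least one datacenter is required")
    # NetworkTopologyStrategy takes one replication factor per datacenter name.
    options = ["'class': 'NetworkTopologyStrategy'"]
    for datacenter, replication_factor in sorted(replicas_per_dc.items()):
        options.append(f"'{datacenter}': {replication_factor}")
    joined = ", ".join(options)
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{joined}}};"


# Hypothetical names: one datacenter in each of two clouds, three replicas each.
print(network_topology_keyspace_cql("app", {"dc_aws": 3, "dc_gcp": 3}))
```

With replicas in more than one datacenter, losing a single region or cloud provider still leaves a full copy of the data available elsewhere, which is the point of disaster avoidance.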
Disaster Recovery
We discussed three methods of disaster recovery, outlined below. A more in-depth explanation can be found in the video linked below.
Cassandra Backup / Restore
Single node
Snapshot + Restore
Multi-node
Snapshot + Restore (same size cluster vs different sized cluster)
Cloud IaaS Snapshot Backup / Restore
AWS EBS
Cassandra Backup + Upload to Distributed Filesystem (S3)
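The snapshot and restore flow for a single node can be sketched roughly as follows. This is a minimal sketch, not the procedure shown in the video: the helper names are hypothetical, and it assumes the standard on-disk layout in which a snapshot is written under the table directory at snapshots/<tag>. The functions only build the nodetool command lines and move files, so they can be read without a running cluster.

```python
import shutil
from pathlib import Path


def snapshot_command(tag: str, keyspace: str) -> list[str]:
    """Command line for a named snapshot of one keyspace on the local node."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def find_snapshot_dirs(data_dir: Path, keyspace: str, tag: str) -> list[Path]:
    """Snapshots are written to <data_dir>/<keyspace>/<table>-<id>/snapshots/<tag>."""
    pattern = f"*/snapshots/{tag}"
    return sorted(p for p in (data_dir / keyspace).glob(pattern) if p.is_dir())


def restore_snapshot_files(snapshot_dir: Path) -> list[str]:
    """Copy snapshot files back into the live table directory two levels up."""
    table_dir = snapshot_dir.parent.parent
    copied = []
    for source in sorted(snapshot_dir.iterdir()):
        if source.is_file():
            shutil.copy2(source, table_dir / source.name)
            copied.append(source.name)
    return copied


def refresh_command(keyspace: str, table: str) -> list[str]:
    """Command line that asks the node to load newly placed SSTables."""
    return ["nodetool", "refresh", keyspace, table]
```

On a node where nodetool is on the PATH, the command lists can be passed to subprocess.run(..., check=True). Copying files back per node fits the same-size cluster case, where each node keeps its token ranges; for a different-sized cluster, a streaming tool such as sstableloader is commonly used instead so that data is redistributed to the right owners.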
Tools for Backup / Restoration
We also covered a few different tools that can be used for backup and restoration. A more in-depth discussion about those tools can be seen in the video linked below.
Tablesnap
Uses inotify to monitor Cassandra SSTables and upload them to AWS S3
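The watch-and-upload idea described above can be illustrated with a small sketch. Note the substitution: the tool relies on inotify, which delivers kernel events when files appear, while this sketch uses plain polling because Python's standard library has no inotify binding. The function name and the upload callback are hypothetical; in practice the callback would hand the file to an S3 client.

```python
from pathlib import Path
from typing import Callable


def poll_once(data_dir: Path, seen: set[Path], upload: Callable[[Path], None]) -> list[Path]:
    """Upload every *.db file under data_dir that has not been uploaded before.

    Simplification: a real watcher would also handle the non-.db SSTable
    components and skip files that are still being written.
    """
    fresh = [path for path in sorted(data_dir.rglob("*.db")) if path not in seen]
    for path in fresh:
        upload(path)    # hypothetical stand-in for an S3 upload
        seen.add(path)  # remember only after the upload succeeded
    return fresh
```

Calling poll_once in a loop with a sleep between iterations approximates the event-driven behaviour, at the cost of some delay and repeated directory scans.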
Cassandra Medusa
Medusa is an Apache Cassandra backup system.
Medusa is a command-line tool that offers the following features:
single node backup
single node restore
cluster-wide in place restore
cluster-wide remote restore
backup purge
support for local storage, GCS, AWS S3, and others through Apache Libcloud
support for clusters using single tokens or vnodes
full or incremental backups
Medusa currently does not support:
Cassandra deployments with multiple data folder directories
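To make the "full or incremental backups" feature more concrete: SSTables are immutable once written, so an incremental backup only needs to upload files that are not already in backup storage. The sketch below shows that selection step under simple assumptions. It is not Medusa's implementation; the function name and the name-to-size manifests are hypothetical.

```python
def files_to_upload(local_files: dict[str, int], backed_up: dict[str, int]) -> list[str]:
    """Return local file names that are missing from the backup or differ in size.

    Both arguments map a file name to its size in bytes. A full backup is the
    special case where backed_up is empty, so every local file is selected.
    """
    return sorted(
        name for name, size in local_files.items() if backed_up.get(name) != size
    )


# Hypothetical manifests: one SSTable is already stored, one is new.
local = {"nb-1-big-Data.db": 1024, "nb-2-big-Data.db": 2048}
stored = {"nb-1-big-Data.db": 1024}
print(files_to_upload(local, stored))
```

A production tool would typically compare checksums rather than sizes; size is used here only to keep the example short.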
Cassandra-Backup
Backup utility and library for Apache Cassandra
The tool is able to perform these operations:
backup of SSTables
restore of SSTables
backup of commit logs
restore of commit logs
Rubrik Mosaic (Datos.io)
Simplifies protection and data management of MongoDB, DataStax Enterprise, and Cassandra while assuring application availability.
Achieve a significant storage economy with incremental forever backup and semantic deduplication.
Mosaic always-consistent backup speeds recovery and lets you start using the application during recovery.
Mosaic is cloud-native, runs on-premises, or both.
Mosaic reduces multiple NoSQL replicas into a single always-consistent copy and stores the backup on any cloud.
Example Scenario
We also discussed an example scenario of how someone has set up multi-node clusters and how they go about data backup and restoration. A more in-depth discussion of this example can be seen in the video linked below.
Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was to not only fill the gap of Planet Cassandra, but to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.
We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!