3 Useful Apache Cassandra Tools Pt. 2

Apache Cassandra is a free and open-source NoSQL database management system that is designed to handle large amounts of data with no single point of failure. In this post, I’m going to highlight 3 different tools that make life easier when using Apache Cassandra.

The first tool that I’m going to be talking about is called Cassandra Snapshot Backup. Cassandra’s nodetool can snapshot a single node, but Cassandra does not natively provide an easy way to take, collect, and restore snapshots across a whole cluster. In Cassandra, a snapshot captures a node’s on-disk data files (SSTables) at a point in time. It is created as a set of hard links inside the node’s data directories, so it is fast to take and initially uses almost no extra disk space. This tool provides scripts that take snapshots and then restore them from the same snapshot files the tool created. To use it, you need Ansible, which requires SSH access keys to all remote hosts. The Ansible host needs boto3 for AWS S3 services, and the nodes need PyYAML.
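To make the snapshot-then-upload idea concrete, here is a minimal Python sketch of the steps such a backup might take on one node. It is not the tool’s actual code: the function names, the object key layout, and the bucket argument are assumptions made for illustration. It relies only on the standard `nodetool snapshot -t <tag> <keyspace>` command, Cassandra’s `<data_dir>/<keyspace>/<table>/snapshots/<tag>/` directory layout, and boto3’s `upload_file` call.

```python
from pathlib import Path


def snapshot_command(tag, keyspace):
    """Build the nodetool invocation that snapshots one keyspace under a tag."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def snapshot_files(data_dir, tag):
    """List files under <data_dir>/<keyspace>/<table>/snapshots/<tag>/."""
    root = Path(data_dir)
    return sorted(p for p in root.glob(f"*/*/snapshots/{tag}/*") if p.is_file())


def object_key(node, data_dir, path):
    """Hypothetical key layout: <node>/<keyspace>/<table>/snapshots/<tag>/<file>."""
    return f"{node}/{Path(path).relative_to(data_dir).as_posix()}"


def upload(bucket, node, data_dir, tag):
    """Upload every snapshot file to S3 (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    s3 = boto3.client("s3")
    for path in snapshot_files(data_dir, tag):
        s3.upload_file(str(path), bucket, object_key(node, data_dir, path))
```

On a real node you would run the command from `snapshot_command` (for example with `subprocess.run`) before calling `upload`. Keeping the node name in the key is one way to make sure each file can be restored to the node it came from.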

The second tool that I’m going to introduce is called Cassandra Operator. The Cassandra Operator is a Kubernetes operator that manages Cassandra clusters inside Kubernetes. Here are all the main features of the tool:

  • rack awareness
  • scaling out (more racks, more pods per rack)
  • scheduled backups with retention policy
  • works with official Cassandra Docker images
  • deployable per namespace with RBAC permissions limited to it
  • deployable cluster-wide
  • customizable Cassandra config (cassandra.yaml, jvm.options, extra libs)
  • customizable liveness/readiness probes
  • automated rolling update of Cassandra cluster definition changes
  • cluster and node-level metrics
  • a comprehensive e2e test suite

Currently, the project is in alpha status and can be used in development environments but is not recommended for use in production environments.
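Two of the features in the list, scaling out across racks and backups with a retention policy, come down to simple bookkeeping. The Python sketch below illustrates that bookkeeping only. It is not the operator’s implementation, and both function names and the "keep the newest N" policy are assumptions chosen for the example.

```python
from datetime import datetime


def pods_per_rack(total_nodes, racks):
    """Spread nodes as evenly as possible; earlier racks absorb any remainder."""
    base, extra = divmod(total_nodes, len(racks))
    return {rack: base + (1 if i < extra else 0) for i, rack in enumerate(racks)}


def backups_to_delete(backup_times, keep_last):
    """Return the backups that fall outside a keep-the-newest-N retention policy."""
    newest_first = sorted(backup_times, reverse=True)
    return newest_first[keep_last:]


if __name__ == "__main__":
    print(pods_per_rack(7, ["rack-a", "rack-b", "rack-c"]))
    taken = [datetime(2020, 1, day) for day in range(1, 6)]
    print(backups_to_delete(taken, keep_last=3))
```

Spreading nodes evenly matters because Cassandra places replicas on different racks, so a lopsided layout leaves some racks holding more data than others.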

The last tool that I’m going to talk about is called Slothsandra. Contrary to what the name suggests, it does not make your Cassandra instance run slower. The name combines Slack (the “Sloth” part) and Cassandra, because the tool connects the Slack API to Apache Cassandra. Slack is a great chat and collaboration application, but the free version only keeps up to 10,000 messages. This is not a problem for small teams, but large teams may quickly lose track of their messages. Slothsandra works around this limitation by storing all public chat messages in Apache Cassandra. It also provides a UI to look up all messages from the Cassandra DB. Here is an example of how it looks.
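To give a rough idea of what archiving Slack messages in Cassandra can involve, here is a small Python sketch. It is not Slothsandra’s actual schema or code: the `slack_archive.messages` table, its columns, and the `message_to_row` helper are hypothetical. The only things taken as given are the `ts`, `user`, and `text` fields that Slack includes in message objects, and the `%s` placeholder style used by the DataStax Python driver for simple statements.

```python
from datetime import datetime, timezone

# Hypothetical table: one partition per channel per day, newest messages first.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS slack_archive.messages (
    channel text,
    day date,
    ts timestamp,
    user_id text,
    body text,
    PRIMARY KEY ((channel, day), ts, user_id)
) WITH CLUSTERING ORDER BY (ts DESC, user_id ASC)
"""

INSERT = (
    "INSERT INTO slack_archive.messages (channel, day, ts, user_id, body) "
    "VALUES (%s, %s, %s, %s, %s)"
)


def message_to_row(channel, message):
    """Turn a Slack message dict into the parameter tuple for INSERT."""
    # Slack sends 'ts' as a string of epoch seconds, e.g. "1600000000.000100".
    sent = datetime.fromtimestamp(float(message["ts"]), tz=timezone.utc)
    return (channel, sent.date(), sent, message.get("user", ""), message.get("text", ""))
```

With a driver session in hand, each row would be written with something like `session.execute(INSERT, message_to_row(channel, message))`. Bucketing by channel and day is one way to keep partitions from growing without bound in a busy channel.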

Cassandra.Link is a knowledge base that our team created to act as a central POI for all things Apache Cassandra. Our goal with Cassandra.Link was not only to fill the gap left by Planet Cassandra but also to bring the Cassandra community together, no matter which variant they use. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!

Photo by Fotis Fotopoulos on Unsplash