3 Useful Apache Cassandra Tools

Apache Cassandra is a free and open-source NoSQL database management system that is designed to handle large amounts of data with no single point of failure. In this post, I’m going to highlight 3 different tools that make life easier when using Apache Cassandra.

The first tool on this list is Ansible-DSE, a playbook that automates the creation of a DataStax Enterprise (DSE) cluster. First of all, what is an Ansible playbook? According to Ansible’s documentation, “Playbooks are Ansible’s configuration, deployment, and orchestration language. They can describe a policy you want your remote systems to enforce or a set of steps in a general IT process.” Simply put, playbooks give you a simple configuration management and multi-machine deployment system that lets you deploy complex applications such as Apache Cassandra and DataStax through an automated process. Playbooks can declare configurations, orchestrate the steps of any manual, ordered process, and launch tasks synchronously or asynchronously. This specific playbook builds a DSE cluster either by pre-building a Rackspace Cloud environment or by running against an existing environment. As an added bonus, it supports multiple regions and multiple virtual data centers per region, which allows you to separate workloads.

The next tool on the list is Uber’s Peloton. This is a resource scheduler that manages resources across distinct workloads, combining separate compute clusters. It is designed for large businesses with millions of containers and tens of thousands of nodes. Its main features include elastic resource sharing, hierarchical max-min fairness, resource overcommitting, optimization for Big Data and Machine Learning workloads, high scalability, a Protobuf/gRPC-based API, and the ability to co-schedule mixed workloads. Peloton is also cloud-agnostic, which means it can run in on-premises data centers or in the cloud. Basically, if you’re a worldwide company that needs to process and manage a large amount of data simultaneously, you should look into investing in Peloton.
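To give a feel for what “max-min fairness” means in that feature list, here is a minimal Python sketch of the textbook, flat version of the idea: small demands are satisfied in full, and whatever capacity is left is split evenly among the workloads that want more. This is not Peloton’s code, and it ignores the hierarchical part entirely; the function name, workload names, and numbers are all made up for illustration.

```python
def max_min_fair(capacity, demands):
    """Textbook (flat) max-min fair allocation.

    capacity: total amount of a resource, e.g. CPU cores.
    demands:  dict mapping a hypothetical workload name to its demand.
    Illustration only; this is NOT how Peloton is implemented.
    """
    allocation = {name: 0.0 for name in demands}
    remaining = float(capacity)
    # Visit workloads from the smallest demand to the largest.
    pending = sorted(demands, key=lambda name: demands[name])

    while pending:
        # Equal share of what is left among still-unsatisfied workloads.
        share = remaining / len(pending)
        smallest = pending[0]
        if demands[smallest] <= share:
            # The smallest demand fits: grant it in full and move on.
            allocation[smallest] = float(demands[smallest])
            remaining -= demands[smallest]
            pending.pop(0)
        else:
            # Nobody left fits within the equal share: split evenly and stop.
            for name in pending:
                allocation[name] = share
            break
    return allocation


if __name__ == "__main__":
    # Hypothetical workloads competing for 10 units of a resource.
    print(max_min_fair(10, {"batch": 2, "stateless": 4, "ml-training": 8}))
```

With these made-up numbers, the two smaller workloads get everything they asked for, and the largest one receives the remainder rather than starving the others.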

The last tool is called CassandraCAS. This tool allows you to compare-and-swap (CAS) data in Cassandra. First, you need to start up a Cassandra cluster with three or more nodes. Then, run a CAS stress test by launching a program that starts n threads racing to compare-and-swap a counter from 0 up to n. Each thread pauses for m seconds between operations. You can either run multiple threads in one client process or run multiple client processes, depending on your needs. You can also validate CAS by following the steps outlined in the project documentation.
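The shape of that stress test can be sketched in a few lines of Python. To be clear, this is not CassandraCAS itself and it does not talk to a cluster: an in-memory register guarded by a lock stands in for the Cassandra row, and its compare_and_swap method plays the role of a conditional update (in CQL, an `UPDATE ... IF value = ?` lightweight transaction). The class name, function name, and parameters are assumptions made for this example.

```python
import threading
import time


class InMemoryRegister:
    """Hypothetical stand-in for a single Cassandra row holding a counter."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Set the value to `new` only if it still equals `expected`."""
        with self._lock:
            if self._value != expected:
                return False  # another thread won the race
            self._value = new
            return True


def run_cas_stress(n, pause_seconds=0.0):
    """Start n threads racing to move a counter from 0 up to n via CAS.

    Returns the final counter value and the number of successful swaps
    per thread. Each thread sleeps pause_seconds between operations.
    """
    register = InMemoryRegister(0)
    wins = [0] * n

    def worker(index):
        while True:
            current = register.read()
            if current >= n:
                return  # target reached, nothing left to do
            if register.compare_and_swap(current, current + 1):
                wins[index] += 1
            time.sleep(pause_seconds)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return register.read(), wins


if __name__ == "__main__":
    final_value, wins = run_cas_stress(n=5, pause_seconds=0.01)
    # If CAS behaves correctly, every increment was applied exactly once.
    print(final_value, sum(wins))
```

The check at the end is the interesting part: if the final counter equals n and the successful swaps across all threads also add up to n, no update was lost or applied twice. Against a real cluster you would want the same kind of validation, following the steps in the project documentation.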

Cassandra.Link is a knowledge base that our team created to act as a central hub for all things Apache Cassandra. Our goal with Cassandra.Link is not only to fill the gap left by Planet Cassandra but also to bring the Cassandra community together, no matter which variant they use. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!

Photo by Christopher Gower on Unsplash