In Cassandra Lunch #80, we discussed how DataStax Astra can be used to create and track a content management system. We also demoed a small TikTok clone built with Astra’s Document API. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!
Any discussion involving Cassandra needs to start with the data model. NoSQL systems provide fast read and write operations because tables tend to be created specifically to serve certain queries. Unlike traditional SQL schemas, joins are not allowed. This requires thinking about the data that is going to be stored and retrieved before creating a table. Special consideration needs to be given to what your primary, partition, and clustering keys are. The partition key, which is the first part of the primary key, determines which columns a table can be queried by. The clustering key determines how that data is sorted within a partition.
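To make the roles of the two keys concrete, here is a toy in-memory sketch in Python. It is not Cassandra and does not use a driver; the table name and the columns user_id and added_date are hypothetical. It only illustrates that rows are looked up by partition key and come back ordered by clustering key.

```python
class VideosByUser:
    """Toy stand-in for a query-specific table (hypothetical names).

    Partition key: user_id (the only column we can look rows up by).
    Clustering key: added_date (rows in a partition are returned newest first).
    """

    def __init__(self):
        # Maps each partition key value to the rows stored in that partition.
        self._partitions = {}

    def insert(self, row):
        # Rows are grouped by their partition key on write.
        self._partitions.setdefault(row["user_id"], []).append(row)

    def select(self, user_id):
        # A read targets exactly one partition and returns its rows
        # ordered by the clustering key, descending.
        rows = self._partitions.get(user_id, [])
        return sorted(rows, key=lambda r: r["added_date"], reverse=True)


table = VideosByUser()
table.insert({"user_id": "u1", "added_date": "2021-01-01", "title": "First clip"})
table.insert({"user_id": "u1", "added_date": "2021-03-01", "title": "Second clip"})
table.insert({"user_id": "u2", "added_date": "2021-02-01", "title": "Other user"})

for row in table.select("u1"):
    print(row["added_date"], row["title"])
```

There is deliberately no way to ask this toy table for rows by title. In Cassandra that access pattern would normally get its own table, which is why the queries have to be known before the tables are designed.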
The following Chebotko diagram demonstrates a Cassandra data model for a video application. Partition key columns are denoted with a “K”, and clustering columns are indicated with a “C” and an arrow up or down that shows ascending or descending sort order.
For the demonstration in Cassandra Lunch #80: Using Cassandra for a Content Management System, we used DataStax Astra. Astra is a Cassandra-as-a-Service in the cloud. Some great things to point out about Astra are that it is free to start, database and infrastructure administration are optional, and there are multiple options to choose from when connecting to Cassandra via API.
In our demonstration, we used Astra’s Stargate Document API. This API allows us to query and modify data stored as unstructured JSON documents in a collection, and it does not require the up-front data modeling typical of Cassandra described above. Once a namespace (the Document API’s term for a keyspace) is created, we can start adding data by connecting to the API via a URL containing the database ID, region, and keyspace name, with an authorization token (a Cassandra token) passed in a request header.
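As a rough sketch of what such a request looks like, the Python below builds (but does not send) a request that would insert one document. The URL layout and the X-Cassandra-Token header name are assumptions based on Astra’s Document API documentation and should be checked against the current docs; the function names and all argument values are placeholders.

```python
import json
import urllib.request

# Assumed URL layout for Astra's hosted Document API (verify against current docs).
BASE_URL = "https://{db_id}-{region}.apps.astra.datastax.com/api/rest/v2"


def collection_url(db_id, region, namespace, collection):
    """Build the URL for a collection from the database ID, region, and namespace."""
    base = BASE_URL.format(db_id=db_id, region=region)
    return f"{base}/namespaces/{namespace}/collections/{collection}"


def build_insert_request(db_id, region, namespace, collection, token, document):
    """Prepare a POST request that would add `document` to a collection."""
    return urllib.request.Request(
        collection_url(db_id, region, namespace, collection),
        data=json.dumps(document).encode("utf-8"),
        headers={
            "X-Cassandra-Token": token,  # assumed header name for the Astra token
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values only; nothing is sent over the network here.
request = build_insert_request(
    db_id="your-database-id",
    region="your-region",
    namespace="your_namespace",
    collection="videos",
    token="your-application-token",
    document={"title": "Example video", "likes": 0},
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen would actually send it, which requires a real database ID, region, and token.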
Multiple collections can be stored in a namespace, but a collection can only be stored in a single namespace. A collection is specified when a document is inserted, and each value in the document’s JSON object is then stored as a cell in the underlying table. Below is an image of how a table is created when a document is submitted to Stargate’s Document API.
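The idea of turning one JSON document into many cells can be sketched in a few lines of Python. This is a simplified illustration, not Stargate’s actual implementation: the function name shred and the row layout (key, path, leaf, value) are assumptions made for the example, and the real table uses its own column names and array encoding.

```python
def shred(doc_id, value, path=()):
    """Yield one row per leaf value in a JSON-like document.

    Each row records the document key, the full path to the value,
    the last path segment (the leaf), and the value itself.
    """
    if isinstance(value, dict):
        for field, child in value.items():
            yield from shred(doc_id, child, path + (field,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            # Array positions become path segments such as "[0]".
            yield from shred(doc_id, child, path + (f"[{index}]",))
    else:
        yield {
            "key": doc_id,
            "path": path,
            "leaf": path[-1] if path else "",
            "value": value,
        }


document = {
    "title": "Dance clip",
    "stats": {"likes": 3},
    "tags": ["dance", "music"],
}

for row in shred("video-1", document):
    print(row)
```

Because every leaf value lands in its own row, a single field can be read or updated without rewriting the whole document.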
One thing to note is that each write is a batch that can contain both inserts and deletes. This can cause document rows to show two different states for a JSON field. If this happens, the Document API resolves the conflict by accepting the data with the later write time. To learn more about the Stargate Document API, including how deletes are handled and how the API performs, check out Stargate’s blog post.
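The “later write time wins” rule can be illustrated with a small Python sketch. This is an assumption-laden model of the behavior described above, not the Document API’s code: the resolve function, the DELETED marker, and the (field path, value, write time) tuple layout are all hypothetical.

```python
# Marker standing in for a delete of a field (hypothetical).
DELETED = object()


def resolve(cells):
    """Pick the surviving state of each field from conflicting cells.

    `cells` is an iterable of (field_path, value, write_time) tuples.
    For each field, the cell with the latest write time wins; if that
    cell is a delete, the field is absent from the result.
    """
    latest = {}
    for path, value, write_time in cells:
        if path not in latest or write_time > latest[path][1]:
            latest[path] = (value, write_time)
    return {
        path: value
        for path, (value, _) in latest.items()
        if value is not DELETED
    }


cells = [
    ("title", "Old title", 1),
    ("title", DELETED, 2),      # the old value is removed...
    ("title", "New title", 3),  # ...and a newer value is written
    ("likes", 5, 1),
    ("likes", DELETED, 4),      # the latest state of this field is a delete
]
print(resolve(cells))
```

In this model a field whose most recent cell is a delete simply disappears from the resolved document, while a field rewritten after a delete keeps the newer value.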
To follow along and use this information to create a TikTok clone, be sure to check out the video and repository linked below. If you missed Cassandra Lunch #78: Cass Operator, it is embedded below! Additionally, all of our live events can be rewatched on our YouTube channel, so be sure to subscribe and turn on your notifications!