Apache Cassandra Lunch #93: K8ssandra on Digital Ocean

In Cassandra Lunch #93, Stefan Nikolovski discusses how to use K8ssandra on DigitalOcean. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!

Setup in Digital Ocean

  • Create a VPC
    • A Virtual Private Cloud (VPC) is a private network interface for collections of DigitalOcean resources. VPC networks provide a more secure connection between resources because the network is inaccessible from the public internet and other VPC networks. Traffic within a VPC network doesn’t count against bandwidth usage.
  • Spin up a Kubernetes cluster on DO
  • Create Spaces
    • DigitalOcean Spaces is an object storage service designed to make it easy and cost-effective to store and serve large amounts of data.
  • Create Space access keys
    • After creating the key, note the key ID and secret. We will use these values later when configuring Medusa. Failure to do so will require regenerating the secret later.

Working in console

  • Install doctl, the DigitalOcean command-line client (digitalocean/doctl on GitHub):
  • wget https://github.com/digitalocean/doctl/releases/download/v1.71.1/doctl-1.71.1-linux-amd64.tar.gz
  • tar xf doctl-1.71.1-linux-amd64.tar.gz
  • sudo mv doctl /usr/local/bin
  • doctl auth init
  • Retrieve kubeconfig
    • After provisioning the DOKS cluster we must request a copy of the kubeconfig. This provides the kubectl command with all connection information including TLS certificates and IP addresses for Kube API requests.
    • doctl kubernetes cluster list
    • doctl kubernetes cluster kubeconfig save k8ssandra
  • We can start using kubectl
    • kubectl cluster-info
    • kubectl version
  • Taint Nodes
    • Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this marks that the node should not accept any pods that do not tolerate the taints.
    • With mixed node types available, we should apply taints to each node type in order to facilitate scheduling. The larger instances will be tainted with app=cassandra:NoSchedule, effectively blocking the scheduling of Pods onto them unless the Pods have a matching toleration for app=cassandra. In our example deployment, the Cassandra nodes are part of the pool named pool-fmufo1teg.
    • kubectl get nodes
    • kubectl taint node -l doks.digitalocean.com/node-pool=pool-fmufo1teg app=cassandra:NoSchedule
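
To make the taint and toleration pairing concrete, here is a minimal sketch. The function names are hypothetical and not from the talk. The first helper lists each node's taints so you can confirm the taint landed on the intended pool; the second prints the toleration a Pod spec would need to be scheduled onto nodes tainted with app=cassandra:NoSchedule. Where that toleration is set for the Cassandra Pods depends on the chart and is not covered here.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative assumptions.

# List every node together with its taints.
show_taints() {
  kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints'
}

# Print the toleration that matches the taint app=cassandra:NoSchedule.
cassandra_toleration() {
  cat <<'EOF'
tolerations:
- key: "app"
  operator: "Equal"
  value: "cassandra"
  effect: "NoSchedule"
EOF
}
```

Calling show_taints after the kubectl taint command should list the app=cassandra taint only on the nodes in the Cassandra pool.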

Installing k8ssandra

  • Create Backup / Restore Service Account Secrets
    • In order to allow for backup and restore operations, we must give Medusa, which coordinates the movement of data to and from DigitalOcean Spaces, credentials for Spaces. A key was generated for this purpose in the Setup in Digital Ocean section. Plug the key ID and secret from that section into the following file and save it as medusa_s3_credentials.yaml. The secret is shown only once, when the key is created, so it must have been recorded at that point.
  • Secret
    • A Secret is an object that contains a small amount of sensitive data such as a password, a token, or a key. Such information might otherwise be put in a Pod specification or in a container image. Using a Secret means that you don’t need to include confidential data in your application code. Because Secrets can be created independently of the Pods that use them, there is less risk of the Secret (and its data) being exposed during the workflow of creating, viewing, and editing Pods. Kubernetes, and applications that run in your cluster, can also take additional precautions with Secrets, such as avoiding writing secret data to nonvolatile storage.
    • Create file medusa_s3_credentials.yaml with content:
        apiVersion: v1
        kind: Secret
        metadata:
          name: prod-k8ssandra-medusa-key
        type: Opaque
        stringData:
          medusa_s3_credentials: |-
            [default]
            aws_access_key_id = REDACTED
            aws_secret_access_key = REDACTED
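    • kubectl apply -f medusa_s3_credentials.yaml
    • The file above is already a complete Secret manifest, so it is applied directly. Passing it to kubectl create secret generic with --from-file would instead store the whole manifest, rather than the credentials, as the value of the key.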
    • The name of the key within the secret MUST be medusa_s3_credentials. Any other value will result in Medusa not finding the secret and backups failing.
  • Storage Class
    • A StorageClass provides a way for administrators to describe the “classes” of storage they offer. Different classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster administrators. Kubernetes itself is unopinionated about what classes represent. This concept is sometimes called “profiles” in other storage systems.
    • Each StorageClass contains the fields provisioner, parameters, and reclaimPolicy, which are used when a PersistentVolume belonging to the class needs to be dynamically provisioned. The name of a StorageClass object is significant and is how users can request a particular class. Administrators set the name and other parameters of a class when first creating StorageClass objects, and the objects cannot be updated once they are created. Administrators can specify a default StorageClass only for PVCs that don’t request any particular class to bind to.
    • K8ssandra requires a Kubernetes Storage Class that has volumeBindingMode: WaitForFirstConsumer. The default preinstalled do-block-storage storage class has volumeBindingMode: Immediate. We will create a new storage class with the required mode based on the existing version.
    • Create a file do-block-storage-wait.yaml with content:
        apiVersion: storage.k8s.io/v1
        kind: StorageClass
        metadata:
          name: do-block-storage-wait
        provisioner: dobs.csi.digitalocean.com
        reclaimPolicy: Delete
        volumeBindingMode: WaitForFirstConsumer
        allowVolumeExpansion: true
    • kubectl apply -f do-block-storage-wait.yaml
  • Helm File
    • The following values file holds the configuration options for running K8ssandra in DigitalOcean on DOKS; save it as doks.values.yaml. The memory and CPU allocations have been slimmed down for demo purposes.
        cassandra:
          # Version of Apache Cassandra to deploy
          version: "3.11.10"

          # Configuration for the /var/lib/cassandra mount point
          cassandraLibDirVolume:
            storageClass: do-block-storage-wait
            size: 5Gi

          heap:
            size: 1G
            newGenSize: 1G

          resources:
            requests:
              cpu: 1000m
              memory: 2Gi
            limits:
              cpu: 1000m
              memory: 2Gi

          # This key defines the logical topology of your cluster. The rack names and
          # labels should be updated to reflect the Availability Zones where your GKE
          # cluster is deployed.
          datacenters:
          - name: dc1
            size: 1
            racks:
            - name: rack-a

        stargate:
          enabled: true
          replicas: 1
          heapMB: 1024
          cpuReqMillicores: 1000
          cpuLimMillicores: 1000

        medusa:
          enabled: true
          storage: s3_compatible
          storage_properties:
            host: nyc3.digitaloceanspaces.com
            port: 443
            secure: "True"
          bucketName: k8ssandra-prod-backups
          storageSecret: prod-k8ssandra-medusa-key
    • helm install prod-k8ssandra k8ssandra/k8ssandra -f doks.values.yaml
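
The install steps above can be sketched as a few small shell helpers. This is a sketch under stated assumptions, not the method shown in the talk: the function names and the SPACES_KEY_ID and SPACES_SECRET environment variables are hypothetical, and the chart repository URL (https://helm.k8ssandra.io/stable) is assumed to be the one hosting the k8ssandra/k8ssandra chart, since the steps above do not show a helm repo add command. The first helper generates the Medusa credentials manifest without pasting keys into a file by hand, the second checks that a storage class delays volume binding, and the third registers the chart repository before installing.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the SPACES_* variables are hypothetical.

# Write the Medusa credentials Secret manifest to the path given as $1,
# taking the Spaces key ID and secret from the environment.
write_medusa_secret() {
  : "${SPACES_KEY_ID:?set SPACES_KEY_ID}" "${SPACES_SECRET:?set SPACES_SECRET}"
  cat > "$1" <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: prod-k8ssandra-medusa-key
type: Opaque
stringData:
  medusa_s3_credentials: |-
    [default]
    aws_access_key_id = ${SPACES_KEY_ID}
    aws_secret_access_key = ${SPACES_SECRET}
EOF
}

# Succeed only if the StorageClass named in $1 uses WaitForFirstConsumer.
check_binding_mode() {
  local mode
  mode=$(kubectl get storageclass "$1" -o jsonpath='{.volumeBindingMode}')
  [ "$mode" = "WaitForFirstConsumer" ]
}

# Register the K8ssandra chart repository (assumed URL), then install the
# release using the values file in $1 (defaults to doks.values.yaml).
install_k8ssandra() {
  helm repo add k8ssandra https://helm.k8ssandra.io/stable &&
    helm repo update &&
    helm install prod-k8ssandra k8ssandra/k8ssandra -f "${1:-doks.values.yaml}"
}
```

A typical sequence would be write_medusa_secret medusa_s3_credentials.yaml, then check_binding_mode do-block-storage-wait once the storage class exists, then install_k8ssandra. The generated file is submitted to the cluster in the same way as a hand-written one.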

Bug

During the install, attaching the PVC to the Pods may be interrupted. The vEducate.co.uk guide "Kubernetes – Kubelet Unable to attach or mount volumes – timed out waiting for the condition" covers this issue, so follow it if you run into the same problem. In short, the VolumeAttachment does not attach correctly. The cause may be related to a time zone difference, but this is unconfirmed. The workaround is to delete the stale VolumeAttachment and then the stuck Pod:

  • kubectl get volumeattachment
  • kubectl delete volumeattachment <name>
  • kubectl get pods
  • kubectl delete pod <name of pod that is in INIT state>

Retrieve K8ssandra superuser credentials

  • kubectl get secret prod-k8ssandra-superuser -o jsonpath="{.data.username}" | base64 --decode ; echo
    • prod-k8ssandra-superuser
  • kubectl get secret prod-k8ssandra-superuser -o jsonpath="{.data.password}" | base64 --decode ; echo
    • EG5kjNqH2Z0YAHi4Id4o
  • curl -L -X POST 'http://165.227.248.243:8081/v1/auth' -H 'Content-Type: application/json' --data-raw '{"username": "<k8ssandra-username>", "password": "<k8ssandra-password>"}'
    • 4d26a2e7-b103-4e41-91b3-a37ad87ea3da
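
The base64 --decode step is needed because Kubernetes returns the values under a Secret's .data field base64-encoded. The following minimal sketch wraps the lookup in a helper and shows the round trip on a sample value without needing a cluster. The helper name secret_field is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one decoded field of a Kubernetes Secret.
# $1 is the Secret name, $2 the key under .data (for example, username).
secret_field() {
  kubectl get secret "$1" -o jsonpath="{.data.$2}" | base64 --decode
}

# Round trip that needs no cluster: encode a value the way it is stored in
# .data, then decode it again.
encoded=$(printf '%s' 'prod-k8ssandra-superuser' | base64)
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$decoded"
```

With a cluster available, secret_field prod-k8ssandra-superuser password would print the same value as the second kubectl command above. Stargate's /v1/auth endpoint returns the token in a JSON body, and later REST requests typically send it in the X-Cassandra-Token header.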

Cleanup Resources

  • helm uninstall prod-k8ssandra

If you missed Apache Cassandra Lunch #93: K8ssandra on Digital Ocean, it is embedded below! Additionally, all of our live events can be rewatched on our YouTube channel, so be sure to subscribe and turn on your notifications!

Cassandra.Link

Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was to not only fill the gap of Planet Cassandra but to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!

