Apache Cassandra Lunch #16: Cassandra Anti-entropy, Repair & Synchronization

In Cassandra Lunch #16, we discuss Cassandra anti-entropy, the process of comparing the data of all replicas and updating each replica to the newest version. We also look at repair and synchronization in Cassandra and how you can prepare for the unexpected.

Hinted Handoff

Hinted handoff is a Cassandra feature that optimizes cluster consistency and anti-entropy when a replica-owning node is unavailable, due to network issues or other problems, and so cannot accept a replica from a successful write operation. Hinted handoff does not guarantee successful write operations, except when a client application uses a consistency level of ANY. You enable or disable hinted handoff with the hinted_handoff_enabled property in the cassandra.yaml file.

By default, hints are saved for three hours after a replica fails, because a replica that is down longer than that is likely permanently dead. You can configure this interval using the max_hint_window_in_ms property in the cassandra.yaml file. If the node recovers after the hint window has elapsed, run a repair to re-replicate the data written during the downtime.
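The hint-window rule above can be sketched in a few lines of Python. This is only an illustration of the decision an operator makes after an outage: the helper name needs_repair_after_downtime is hypothetical, and the only values taken from the text are the three-hour default and the millisecond unit of max_hint_window_in_ms.

```python
from datetime import timedelta

# Default hint window from the text: three hours, expressed in milliseconds
# to mirror the unit of the max_hint_window_in_ms property.
DEFAULT_MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000


def needs_repair_after_downtime(
    downtime: timedelta,
    max_hint_window_ms: int = DEFAULT_MAX_HINT_WINDOW_MS,
) -> bool:
    """Return True when an outage outlasted the hint window.

    Hypothetical helper, not part of Cassandra: once the window has
    elapsed, coordinators stop saving hints for the node, so writes made
    after that point can only be recovered by running a repair.
    """
    downtime_ms = downtime.total_seconds() * 1000
    return downtime_ms > max_hint_window_ms


if __name__ == "__main__":
    print(needs_repair_after_downtime(timedelta(hours=1)))
    print(needs_repair_after_downtime(timedelta(hours=4)))
```

A result of False does not prove the node is fully consistent, since hints are a best-effort optimization; it only means the outage stayed inside the configured window.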

During a write operation, when hinted handoff is enabled and consistency can be met, the coordinator stores a hint about dead replicas in the local system.hints table (Cassandra 3.0 and later store hints in flat files in the local hints directory instead) under either of these conditions:

  • A replica node for the row is known to be down ahead of time.
  • A replica node does not respond to the write request.

When the cluster cannot meet the consistency level specified by the client, Cassandra does not store a hint.

A hint indicates that a write needs to be replayed to one or more unavailable nodes. The hint consists of:

  • The location of the replica that is down
  • Version metadata
  • The actual data being written
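As a rough sketch of the rules above, the following Python models when a coordinator would record hints and what each hint carries. Everything here is illustrative: the Hint class, the hints_for_write function, and the integer version field are assumptions made for the example and do not reflect Cassandra's internal classes. The special case of consistency level ANY is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Hint:
    """Illustrative hint record with the three parts listed in the text."""
    target_replica: str  # location of the replica that is down
    version: int         # version metadata (a plain integer in this sketch)
    mutation: bytes      # the actual data being written


def hints_for_write(
    replica_up: Dict[str, bool],
    required_acks: int,
    mutation: bytes,
    version: int = 1,
    hinted_handoff_enabled: bool = True,
) -> List[Hint]:
    """Return the hints a coordinator would store for one write.

    replica_up maps each replica to whether it is reachable, which
    covers both conditions in the text: a node known to be down ahead
    of time and a node that did not respond to the write request.
    """
    live = [node for node, up in replica_up.items() if up]
    down = [node for node, up in replica_up.items() if not up]

    # No hint is stored when hinted handoff is disabled or when the
    # requested consistency level cannot be met by the live replicas.
    if not hinted_handoff_enabled or len(live) < required_acks:
        return []

    return [Hint(node, version, mutation) for node in down]


if __name__ == "__main__":
    replicas = {"10.0.0.1": True, "10.0.0.2": True, "10.0.0.3": False}
    print(hints_for_write(replicas, required_acks=2, mutation=b"row-data"))
```

With two of three replicas up and two acknowledgements required, the sketch yields one hint for the unavailable node; with only one replica up, it yields none because the write itself cannot succeed.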

Read Repair

Read repair is the process of repairing data replicas during a read request. If all replicas involved in a read request at the given read consistency level are consistent, the data is returned to the client and no read repair is needed. If they are not consistent, a read repair is performed to bring the replicas involved in the request back into agreement, and the most up-to-date data is returned to the client. The read repair runs in the foreground and is blocking: a response is not returned to the client until the read repair has completed and the up-to-date data has been constructed.
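The read repair flow described above can be illustrated with a small Python sketch. It assumes that each replica returns a whole value with a single write timestamp and that the newest timestamp wins; Cassandra itself reconciles at a finer granularity, so treat the Versioned type and the read_with_repair function as hypothetical names for illustration only.

```python
from typing import Dict, List, NamedTuple, Tuple


class Versioned(NamedTuple):
    value: str
    write_timestamp: int  # for example, microseconds since the epoch


def read_with_repair(
    responses: Dict[str, Versioned],
) -> Tuple[Versioned, List[str]]:
    """Pick the newest response and list the replicas that need repair.

    responses maps replica name to the data it returned and must not be
    empty. The caller is expected to write the newest version back to
    every stale replica before answering the client, which is what
    makes the repair blocking.
    """
    newest = max(responses.values(), key=lambda v: v.write_timestamp)
    stale = sorted(node for node, v in responses.items() if v != newest)
    return newest, stale


if __name__ == "__main__":
    replies = {
        "replica-a": Versioned("alice@old.example", 100),
        "replica-b": Versioned("alice@new.example", 250),
        "replica-c": Versioned("alice@new.example", 250),
    }
    newest, stale = read_with_repair(replies)
    print(newest.value, stale)
```

When every replica returns the same version, the stale list is empty and no repair writes are needed.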

Nodetool Repair

The nodetool repair command repairs one or more nodes in a cluster and provides options for restricting repair to a set of nodes (see Repairing nodes). Performing an anti-entropy node repair on a regular basis is important, especially in an environment that deletes data frequently. The main options control two things:

  • Number of nodes performing a repair:
    • Parallel runs repair on all nodes with the same replica data at the same time. (Default behavior in the DataStax Distribution of Apache Cassandra™ (DDAC).)
    • Sequential (-seq, --sequential) runs repair on one node after another.
    • Datacenter parallel (-dcpar, --dc-parallel) combines sequential and parallel by simultaneously running a sequential repair in all datacenters; a single node in each datacenter runs repair, one after another, until the repair is complete.
  • Amount of data that is repaired:
    • Full repair (default) compares all replicas of the data stored on the node where the command runs and updates each replica to the newest version. Does not mark the data as repaired or unrepaired. Default for DDAC. To switch to incremental repairs, see Migrating to incremental repairs.
    • Full repair with partitioner range (-pr, --partitioner-range) repairs only the primary replicas of the data stored on the node where the command runs. Recommended for routine maintenance.
    • Incremental repair (-inc) splits the data into repaired and unrepaired SSTables and repairs only the unrepaired data. Marks the data as repaired or unrepaired.
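To make the option matrix above concrete, here is a small Python sketch that assembles a nodetool repair command line from the two choices. It only builds the argument list and does not run anything. The function name build_repair_command and the option labels are hypothetical; the flags are the ones quoted in the list above, and both the flags and the defaults differ between Cassandra and DSE versions, so check `nodetool help repair` on your own cluster before relying on them.

```python
from typing import Dict, List, Optional

# Flags exactly as named in the text. None means "default, no flag needed"
# under the DDAC defaults the text describes (parallel, full).
_PARALLELISM_FLAGS: Dict[str, Optional[str]] = {
    "parallel": None,
    "sequential": "-seq",
    "dc_parallel": "-dcpar",
}
_SCOPE_FLAGS: Dict[str, Optional[str]] = {
    "full": None,
    "partitioner_range": "-pr",
    "incremental": "-inc",
}


def build_repair_command(
    parallelism: str = "parallel",
    scope: str = "full",
    keyspace: Optional[str] = None,
) -> List[str]:
    """Build an argument list for nodetool repair (illustrative only)."""
    cmd = ["nodetool", "repair"]
    for choice, table in ((parallelism, _PARALLELISM_FLAGS), (scope, _SCOPE_FLAGS)):
        if choice not in table:
            raise ValueError(f"unknown option: {choice!r}")
        flag = table[choice]
        if flag is not None:
            cmd.append(flag)
    if keyspace:
        # nodetool repair accepts an optional keyspace after the options.
        cmd.append(keyspace)
    return cmd


if __name__ == "__main__":
    # Routine maintenance as recommended in the text: primary range only.
    print(" ".join(build_repair_command(scope="partitioner_range", keyspace="my_keyspace")))
```

The keyspace name my_keyspace is a placeholder. Passing the resulting list to a process runner is left out on purpose, since repair is an expensive operation that should be scheduled deliberately.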

DataStax OpsCenter Repair

The Repair Service performs repair operations across a DataStax Enterprise cluster in a minimally impactful manner. The Repair Service runs in the background, repairing small chunks of a cluster to alleviate the pressure and potential performance impact of having to periodically run a repair on entire nodes.

The Repair Service cyclically repairs a DataStax Enterprise (DSE) cluster within the specified time to completion. Any anticipated overshoot of the targeted completion time is communicated with a revised estimate.

DataStax NodeSync

  • NodeSync automatically and continuously synchronizes replicas of designated keyspaces and tables of a cluster as a background process. The NodeSync Service is enabled by default in DSE version 6.0 and later, but the keyspaces and tables that the OpsCenter NodeSync Service actively monitors must be enabled explicitly. Enable keyspaces and tables with a click of a button in the Settings area of NodeSync.
  • NodeSync metrics are available to indicate progress on repaired data and objects (rows and range tombstones), validated data and objects, and processed pages details. Set up alerts and dashboard graphs based on NodeSync metrics pertinent to your production environment.
  • The NodeSync Best Practice Service rule checks that NodeSync is running on each node. The NodeSync Not Running rule in the Best Practice Service fails if the NodeSync Service is not running unless the rule has been turned off.

Cassandra Reaper

Reaper is an open-source tool that aims to schedule and orchestrate repairs of Apache Cassandra clusters.

It improves the existing nodetool repair process by:

  • Splitting repair jobs into smaller tunable segments.
  • Handling back-pressure through monitoring running repairs and pending compactions.
  • Adding the ability to pause or cancel repairs and track progress precisely.

Reaper ships with a REST API, a command line tool and a web UI.
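The idea of splitting a repair job into smaller segments, mentioned in the list above, can be sketched as dividing a token range into contiguous sub-ranges. This is a simplified illustration and not Reaper's actual algorithm: the function split_token_range is hypothetical, it assumes a single non-wrapping range, and the bounds in the usage example assume the Murmur3 partitioner's token space.

```python
from typing import List, Tuple


def split_token_range(start: int, end: int, segments: int) -> List[Tuple[int, int]]:
    """Split the token range (start, end] into contiguous sub-ranges.

    Each returned tuple is (range_start, range_end). Sub-ranges that
    would be empty, which can happen when more segments are requested
    than there are tokens, are dropped.
    """
    if segments < 1:
        raise ValueError("segments must be at least 1")
    if end <= start:
        raise ValueError("end must be greater than start (no wraparound)")

    width = end - start
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + (width * i) // segments for i in range(segments + 1)]
    return [(lo, hi) for lo, hi in zip(bounds, bounds[1:]) if lo < hi]


if __name__ == "__main__":
    # Assumed Murmur3 token space: -2**63 through 2**63 - 1.
    for lo, hi in split_token_range(-2**63, 2**63 - 1, 4):
        print(lo, hi)
```

Smaller segments mean each repair session touches less data, which is what makes it practical to pause, cancel, or throttle the overall job between segments.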

Additional Resources

Cassandra Links | Anant Corporation Project Cassandra.Link

  • https://cassandra.link/post/repair-in-cassandra
  • https://cassandra.link/post/incremental-repair-improvements-in-cassandra-4
  • https://cassandra.link/post/reaper-easy-repair-management-for-apache-cassandra
  • https://cassandra.link/post/should-you-use-incremental-repair

ICYMI

Cassandra.Link

Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was not only to fill the gap left by Planet Cassandra but also to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!