Apache Cassandra Compaction Strategies

Cassandra is optimized for writing large amounts of data very quickly. It writes all incoming data in an append-only manner to internal files called SSTables. As a result, these SSTables contain newly inserted data as well as updates and deletes of previously inserted data.

Cassandra Compaction is the process of reconciling the various copies of data spread across distinct SSTables. Cassandra performs compaction of SSTables as a background activity. Because compaction leaves Cassandra with fewer SSTables to maintain and fewer copies of each data row, read performance improves. Compaction is a crucial area of Cassandra performance and maintenance.

Compaction strategies determine how these compaction operations are handled and when they are performed. This blog post describes the available compaction strategies along with other essential details.

While regular compactions are an integral part of any healthy Cassandra cluster, the way they are configured can vary significantly depending on how a particular table is being used. It is important to take some time when designing a Cassandra table schema to think about how each table will be used, and therefore which Cassandra Compaction strategy would be most effective.

Although it is possible to change compaction strategies after the table has been created, doing so will have a significant impact on the cluster’s performance that is proportionate to the amount of data in the table. This is because Cassandra will re-write all of the data in that table using the new Compaction Strategy.

Cassandra’s Write Path

To understand the importance of compactions in Cassandra, you must first understand how Cassandra writes data to a disk. The Cassandra write path in a nutshell:

  1. Cassandra stores recent writes in memory (in a structure called the Memtable).
  2. When enough writes have been made, Cassandra flushes the Memtable to disk. Data on disk is stored in relatively simple data structures called Sorted String Tables (SSTable). At the most simplified level, an SSTable could be described as a sorted array of strings.
  3. Before writing a new SSTable, Cassandra merges and pre-sorts the data in the Memtable according to Primary Key. In Cassandra, a Primary Key consists of a Partition Key (the key that determines which node the data is stored on) and any Clustering Keys that have been defined (see the example table definition after this list).
  4. The SSTable is written to disk as a single contiguous write operation. SSTables are immutable. Once they are written to disk, they are not modified. Any updates to or deletions of data within an SSTable are written to a new SSTable. If data is updated regularly, Cassandra may need to read from multiple SSTables to retrieve a single row.
  5. Compaction operations occur periodically to re-write and combine SSTables. This is required because SSTables are immutable (no modifications once written to disk). Compactions prune deleted data and merge disparate row data into new SSTables in order to reclaim disk space and keep read operations optimized.
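
As an illustration of step 3, here is a minimal, hypothetical CQL table definition showing a compound Primary Key; the keyspace, table, and column names are examples only:

  CREATE TABLE demo.sensor_data (
      sensor_id  uuid,        -- Partition Key: determines which node stores the partition
      reading_ts timestamp,   -- Clustering Key: rows are sorted by this within a partition
      value      double,
      PRIMARY KEY ((sensor_id), reading_ts)
  );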

If you are unfamiliar with Cassandra’s write path, please read The write path to compaction from Cassandra Wiki.

Cassandra Compaction Strategies

Multiple Compaction Strategies are included with Cassandra, and each is optimized for a different use case:

SizeTiered Compaction Strategy (STCS)

This is the default compaction strategy. It triggers a compaction when multiple SSTables of a similar size are present. Additional parameters allow STCS to be tuned to increase or decrease the number of compactions it performs and to control how tombstones are handled.

When to use it: This compaction strategy is good for insert-heavy and general workloads.
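
As a sketch, STCS and its tuning parameters can be set per table in CQL; the keyspace and table names below are hypothetical, and the values shown are the documented defaults:

  ALTER TABLE demo.events
  WITH compaction = {
      'class'               : 'SizeTieredCompactionStrategy',
      'min_threshold'       : 4,    -- number of similarly sized SSTables needed to trigger a compaction
      'max_threshold'       : 32,   -- maximum number of SSTables compacted in a single operation
      'tombstone_threshold' : 0.2   -- tombstone ratio at which a single-SSTable compaction is considered
  };
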
Leveled Compaction Strategy (LCS)

This strategy groups SSTables into levels, each of which has a fixed size limit that is 10 times larger than the previous level. SSTables are of a fixed, relatively small size (160MB by default), so if Level 1 contains at most 10 SSTables, Level 2 will contain at most 100 SSTables. SSTables are guaranteed to be non-overlapping within each level; if any data overlaps when a table is promoted to the next level, the overlapping tables are re-compacted.

For example, when Level 1 is filled, any new SSTables being added to that level are compacted together with any existing tables that contain overlapping data. If these compactions result in Level 1 containing too many tables, the additional table(s) overflow to Level 2.

When to use it: This compaction strategy is best for read-heavy workloads (because tables within a level are non-overlapping, LCS guarantees that 90% of all reads can be satisfied from a single SSTable) or workloads where there are more updates than inserts.
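
A minimal sketch of enabling LCS on an existing table (keyspace and table names are hypothetical):

  ALTER TABLE demo.user_profiles
  WITH compaction = {
      'class'              : 'LeveledCompactionStrategy',
      'sstable_size_in_mb' : 160   -- target size of each SSTable; 160MB is the default
  };
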
DateTiered Compaction Strategy (DTCS)

This compaction strategy is designed for use with time-series data. DTCS stores data written within the same time period in the same SSTable. Multiple SSTables written within the same time window are compacted together, up until a certain point, after which the SSTables are no longer compacted. When the data is written with a TTL, SSTables that are older than the TTL are dropped in their entirety, incurring zero compaction overhead.

DTCS is highly performant and efficient, but only if the workload matches its strict requirements. DTCS is not designed for workloads where old data is updated or inserts arrive out of order. If your workload does not fit these requirements, you may be better off using STCS with a bucketing key (such as hour/day/week) to break up your data.

When to use it: DTCS is deprecated in Cassandra 3.0.8/3.8 and later. TWCS is an improved version of DTCS and is available in Cassandra 3.0.8/3.8 and later.

Time Window Compaction Strategy (TWCS)

The Time Window Compaction Strategy is designed to work on time-series data. It compacts SSTables within a configured time window, using STCS to perform these compactions. At the end of each time window, all of its SSTables are compacted into a single SSTable, so there is one SSTable per time window. When data is written with a time to live (TTL), TWCS also purges it efficiently by dropping complete SSTables once the TTL has expired.

TWCS is similar to DTCS except for operational improvements. Write amplification is reduced by only compacting SSTables that fall within the same time window. The maximum timestamp in an SSTable is used to decide which time window it belongs to.

When to use it: TWCS is ideal for time-series data that is immutable after a fixed time interval. Updating data after its time window has closed results in multiple SSTables referring to the same data, and those SSTables will not be compacted together.

TWCS can be used with Cassandra 2.x by adding the TWCS jar file.
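
A sketch of a TWCS time-series table, assuming daily buckets and a 30-day retention; the keyspace, table, and window settings are illustrative only:

  CREATE TABLE demo.sensor_readings (
      sensor_id  uuid,
      reading_ts timestamp,
      value      double,
      PRIMARY KEY ((sensor_id), reading_ts)
  ) WITH compaction = {
        'class'                  : 'TimeWindowCompactionStrategy',
        'compaction_window_unit' : 'DAYS',   -- unit of the time window
        'compaction_window_size' : 1         -- one window (and ultimately one SSTable) per day
    }
    AND default_time_to_live = 2592000;      -- 30 days in seconds; fully expired SSTables are dropped whole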

Configuring a Cassandra Compaction Strategy

Compaction options are configured at the table level via CQLSH. This allows each table to be optimized based on how it will be used. If a compaction strategy is not specified, SizeTieredCompactionStrategy will be used.

Take a look at the compaction subproperties documentation for more information on the options that are available to tweak for each compaction strategy. The default options are generally sufficient for most workloads. We suggest leaving the default options initially and modifying them as required if a table is not meeting performance expectations.
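
To see which strategy and subproperties are currently in effect for a table (Cassandra 3.0 and later), you can query the schema tables from CQLSH; the keyspace and table names below are placeholders:

  SELECT compaction
  FROM system_schema.tables
  WHERE keyspace_name = 'demo' AND table_name = 'events';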
