Understanding Apache Cassandra® Memory Usage


August 12, 2021
By Cameron Zemek

This article describes Apache Cassandra components that contribute to memory usage and provides some basic advice on tuning.

Cassandra memory usage is split into JVM heap and offheap.

Heap is managed by the JVM’s garbage collector.

Offheap is manually managed memory, which is used for:

Bloom filters: Used to quickly test whether an SSTable contains a partition
Index summary: A sampled lookup of partition index positions
Compression metadata
Key cache (key_cache_size_in_mb): Used to store SSTable positions for a given partition key (so reads can skip the index summary and index scan when locating data)
File cache (file_cache_size_in_mb): 32MB of this is reserved for pooling buffers. The remainder serves as a cache holding uncompressed SSTable chunks.
Row cache (row_cache_size_in_mb)
Counter cache (counter_cache_size_in_mb)
Offheap memtables (memtable_allocation_type)
Direct ByteBuffer (ByteBuffer.allocateDirect)
Memory mapped files (default disk_access_mode is mmap): Reading Data.db and Index.db files will use memory mapped files, which is where the operating system loads parts of the file into memory pages

Bloom Filters

A bloom filter is a probabilistic data structure that tests whether an element is a member of a set. For a given partition, a bloom filter lets Cassandra distinguish between two possibilities:

1. The partition definitely does not exist in the given file, or

2. The partition probably does exist in the file

If you want to make your bloom filters more accurate, configure them to consume more RAM. You do this by setting bloom_filter_fp_chance, the target false-positive chance, to a float between 0 and 1. This parameter defaults to 0.1 for tables using LeveledCompactionStrategy, and 0.01 otherwise.

Bloom filters are stored offheap in RAM. As the bloom_filter_fp_chance gets closer to 0, memory usage increases, but does not increase in a linear fashion.
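The non-linear growth can be illustrated with the textbook formula for an optimally sized bloom filter, which needs about -ln(p) / (ln 2)^2 bits per element for a false-positive chance p. This is a rough sketch, not Cassandra's exact sizing logic; the function names and the figure of 100 million partitions are assumptions made for illustration.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Bits per element for an optimally sized bloom filter (textbook formula)."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_size_mb(partitions: int, fp_chance: float) -> float:
    """Approximate filter size in MB for a given number of partition keys."""
    bits = partitions * bloom_bits_per_key(fp_chance)
    return bits / 8 / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical workload: 100 million partitions across all SSTables.
    for p in (0.1, 0.01, 0.001):
        print(f"fp_chance={p}: {bloom_bits_per_key(p):.2f} bits/key, "
              f"{bloom_size_mb(100_000_000, p):.0f} MB")
```

Under this formula, each tenfold reduction in bloom_filter_fp_chance adds roughly 4.8 bits per key, so moving from 0.1 to 0.01 about doubles the memory used.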

Values for bloom_filter_fp_chance are usually between 0.01 (a 1% chance of a false positive) and 0.1 (a 10% chance).

You should adjust the parameter for bloom_filter_fp_chance depending on your use case. If you need to avoid excess IO operations, you should set bloom_filter_fp_chance to a low number like 0.01. If you want to save RAM and care less about IO operations, you can use a higher bloom_filter_fp_chance number. If you rarely read, or read by performing range slices over all the data (like Apache Spark does when scanning a whole table), an even higher number may be optimal.

Index Summary

Cassandra stores the offsets and index entries in offheap.

Compression Metadata

Cassandra stores the compression chunk offsets in offheap.

Key Cache

The key cache saves Cassandra from having to seek for the position of a partition. It saves a good deal of time relative to how little memory it needs, so it is usually worth keeping enabled. The global limit for the key cache is controlled in cassandra.yaml by setting key_cache_size_in_mb.

There is also a per-table setting defined in the schema, in the caching property under the key keys, with the default set to ALL.

Row Cache

Compared to the key cache, the row cache saves more time but takes up more space. So the row cache should only be used for static rows or hot rows. The global limit for row cache is controlled in cassandra.yaml by setting row_cache_size_in_mb.

There is also a per-table setting defined in the schema, in the caching property under the key rows_per_partition, with the default set to NONE.
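As a small sketch of how the two per-table cache settings fit together, the helper below builds a CQL ALTER TABLE statement for the caching property. The helper name caching_cql and the keyspace and table names are hypothetical; only the caching map with its keys and rows_per_partition entries comes from the schema options described above.

```python
def caching_cql(keyspace: str, table: str, keys: str = "ALL",
                rows_per_partition: str = "NONE") -> str:
    """Build an ALTER TABLE statement that sets the per-table caching property."""
    if keys not in ("ALL", "NONE"):
        raise ValueError("keys must be 'ALL' or 'NONE'")
    # rows_per_partition accepts ALL, NONE, or a positive row count.
    if rows_per_partition not in ("ALL", "NONE") and not rows_per_partition.isdigit():
        raise ValueError("rows_per_partition must be 'ALL', 'NONE', or a number")
    return (f"ALTER TABLE {keyspace}.{table} WITH caching = "
            f"{{'keys': '{keys}', 'rows_per_partition': '{rows_per_partition}'}};")


if __name__ == "__main__":
    # Hypothetical table: keep the key cache on, cache up to 10 rows per partition.
    print(caching_cql("my_keyspace", "hot_rows", rows_per_partition="10"))
```

Calling the helper with its defaults reproduces the default settings: key caching on, row caching off.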

Counter Cache

The counter cache helps reduce counter lock contention for hot counter cells. Only the local (clock, count) tuple of a counter cell is cached, not the whole counter, so it is relatively cheap. The global limit for the counter cache is controlled in cassandra.yaml by setting counter_cache_size_in_mb.

Offheap Memtables

Since Cassandra 2.1, offheap memory can be used for memtables.

This is set via memtable_allocation_type in cassandra.yaml. If you want to have the least impact on reads, use offheap_buffers to move the cell name and value to DirectBuffer objects.

With offheap_objects, the entire cell moves offheap, leaving only a pointer to the offheap data on the heap.

Direct ByteBuffer

There are a few miscellaneous places where Cassandra allocates offheap memory directly, such as HintsBuffer. Certain compressors, such as LZ4, will also use offheap memory when the file cache is exhausted.

Memory Mapped Files

By default, Cassandra uses memory mapped files. If the operating system is unable to allocate memory to map the file to, you will see a message such as:

Native memory allocation (mmap) failed to map 12288 bytes for committing reserved memory.

If this occurs, you will need to reduce offheap usage, resize to hardware with more available memory, or enable swap.

Maximum Memory Usage Reached

This is a common log message about memory that often causes concern:

INFO o.a.c.utils.memory.BufferPool Maximum memory usage reached (536870912), cannot allocate chunk of 1048576.

Cassandra has a cache that is used to store decompressed SSTable chunks in offheap memory. In effect, this cache performs a similar job to the OS page cache, except the data doesn’t need to be decompressed every time it is fetched.

This log message just means that the cache is full. When that happens, Cassandra allocates a ByteBuffer outside the cache, which can degrade performance because the memory has to be allocated on demand. Because this is only a potential slowdown and not an error, the message is logged at INFO level rather than WARN.

The default chunk cache size is 512MB. It can be changed by setting file_cache_size_in_mb in cassandra.yaml.
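To relate the numbers in the log message to this setting, the following sketch extracts the two byte counts and converts them to megabytes. The regular expression is written against the example line shown above and is an assumption about the message layout, not something taken from Cassandra's source; the function name is hypothetical.

```python
import re

# Pattern modelled on the example log line quoted in this article.
BUFFER_POOL_RE = re.compile(
    r"Maximum memory usage reached \((\d+)\), cannot allocate chunk of (\d+)"
)


def parse_buffer_pool_message(line: str):
    """Return (limit_mb, chunk_mb) from a BufferPool message, or None if no match."""
    match = BUFFER_POOL_RE.search(line)
    if match is None:
        return None
    bytes_per_mb = 1024 * 1024
    limit_bytes, chunk_bytes = int(match.group(1)), int(match.group(2))
    return limit_bytes / bytes_per_mb, chunk_bytes / bytes_per_mb


if __name__ == "__main__":
    sample = ("INFO o.a.c.utils.memory.BufferPool Maximum memory usage reached "
              "(536870912), cannot allocate chunk of 1048576.")
    print(parse_buffer_pool_message(sample))
```

For the example line, the limit of 536870912 bytes corresponds to the 512MB default, and the requested chunk of 1048576 bytes is 1MB.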

Examining Memory Usage

Running nodetool info will provide heap and offheap memory usage.

However, comparing this with actual memory usage will usually show a discrepancy.

For example: with Cassandra running in Docker, Cassandra was using 16.8GB according to docker stats, while nodetool info reported 8GB heap and 4GB offheap. This leaves 4.8GB not accounted for. Where does the memory go?

This happens because offheap usage reported by nodetool info only includes:

Memtable offheap
Bloom filter offheap
Index summary offheap
Compression metadata offheap

Other sources of offheap usage are not included, such as file cache, key cache, and other direct offheap allocations.
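The gap in the Docker example can be worked out with simple arithmetic. The sketch below uses the figures from that example (16.8GB total, 8GB heap, 4GB offheap reported by nodetool info); the function name is hypothetical, and the result is only an estimate of what the untracked caches and direct allocations consume.

```python
def unaccounted_gb(total_gb: float, heap_gb: float, reported_offheap_gb: float) -> float:
    """Memory not covered by the heap and offheap figures from nodetool info."""
    return round(total_gb - heap_gb - reported_offheap_gb, 2)


if __name__ == "__main__":
    # Figures from the Docker example in this article.
    gap = unaccounted_gb(total_gb=16.8, heap_gb=8.0, reported_offheap_gb=4.0)
    print(f"Unaccounted memory: {gap} GB")
```

That remainder is where the file cache, key cache, and other direct offheap allocations show up.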
