An In-Memory Database or Data Structure Server?
Redis is a fast in-memory database and cache, open source under a BSD license, written in C and optimized for speed. Redis’ name comes from “REmote DIctionary Server”.
Redis is often called a data structure server because its core data types mirror those found in programming languages: strings, lists, dictionaries (or hashes), sets, and sorted sets. It also provides other data structures and features for approximate counting, geolocation, and stream processing.
Of the NoSQL databases, Redis comes closest to the native data structures that programmers most commonly use inside applications and algorithms. This familiarity makes it well suited to rapid development and fast applications, as core data structures are easily shared between processes and services.
By default Redis stores data in memory and periodically persists it to disk. Because Redis persists data to disk, it can serve as a classical database for many use cases as well as a cache. When memory is full, Redis returns an error to a client that tries to add data, but it can be configured as a cache to evict older and less important data as new data comes in. In both cases the size of available memory is the main constraint on its use.
What Is Redis Primarily Used For?
Redis is commonly used as a cache to store frequently accessed data in memory so that applications can be responsive to users. With the capacity to designate how long you want to keep data, and which data to evict first, Redis enables a series of intelligent caching patterns.
There are many reasons for intelligent caching, and it has a large impact on user experiences, user productivity, bounce rates, and revenue for retail. The research on these effects is compiled in our white paper on the advantages of in-memory databases.
Data Expiration and Eviction Policies
Data structures in Redis can be marked with a Time To Live (TTL), set in seconds, after which they are removed. A series of configurable “eviction policies” is also available. Under some of these policies, impermanent data marked with a TTL is considered for eviction before data that has no TTL, which allows the creation of a tiered hierarchy of memory objects.
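The tiering described above can be illustrated with a small pure-Python model. This is a sketch only, not how Redis is implemented: the class name `TinyStore`, the `max_keys` limit (counting keys rather than bytes), and the injectable `clock` are assumptions made for the example. It evicts the key with the nearest expiry first and never touches keys without a TTL, which loosely resembles a "volatile" style policy.

```python
import time


class TinyStore:
    """Toy key/value store with per-key TTLs and 'TTL keys first' eviction.

    Hypothetical model for illustration; capacity is counted in keys, not bytes.
    """

    def __init__(self, max_keys, clock=time.monotonic):
        self.max_keys = max_keys
        self.clock = clock
        self.data = {}     # key -> value
        self.expires = {}  # key -> absolute expiry time (only keys with a TTL)

    def _purge_expired(self):
        # Remove every key whose TTL has elapsed.
        now = self.clock()
        for key in [k for k, t in self.expires.items() if t <= now]:
            self.data.pop(key, None)
            del self.expires[key]

    def _evict_one(self):
        # Only keys with a TTL are eviction candidates.
        if not self.expires:
            raise MemoryError("store is full and no key has a TTL")
        victim = min(self.expires, key=self.expires.get)  # nearest expiry
        del self.data[victim]
        del self.expires[victim]

    def set(self, key, value, ttl=None):
        self._purge_expired()
        if key not in self.data and len(self.data) >= self.max_keys:
            self._evict_one()
        self.data[key] = value
        if ttl is None:
            self.expires.pop(key, None)  # key becomes permanent
        else:
            self.expires[key] = self.clock() + ttl

    def get(self, key):
        self._purge_expired()
        return self.data.get(key)


# Usage with a fake clock so the example does not need to sleep.
now = [0.0]
store = TinyStore(max_keys=2, clock=lambda: now[0])
store.set("config", "permanent")         # no TTL: never evicted here
store.set("session:1", "alice", ttl=30)
store.set("session:2", "bob", ttl=60)    # store is full: session:1 is evicted
```

In this model a full store that holds only permanent keys raises an error instead of evicting, which is the behaviour the tiering is meant to protect.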
In some use cases a least recently used (LRU) or least frequently used (LFU) metric makes more sense for eviction. Redis provides tunable probabilistic implementations of these cache policy options as well.
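One way to picture a probabilistic LRU is eviction by sampling: instead of tracking an exact global ordering, pick a few keys at random and evict the least recently used among them. The sketch below is a simplified model under that assumption; `SampledLRUCache`, `capacity`, and `samples` are names invented for the example, and the logical `tick` counter stands in for a real access timestamp.

```python
import random


class SampledLRUCache:
    """Toy cache that approximates LRU by sampling a few keys on eviction."""

    def __init__(self, capacity, samples=5, seed=None):
        self.capacity = capacity
        self.samples = samples        # larger sample = closer to exact LRU
        self.rng = random.Random(seed)
        self.values = {}              # key -> value
        self.last_used = {}           # key -> logical access time
        self.tick = 0

    def _touch(self, key):
        self.tick += 1
        self.last_used[key] = self.tick

    def get(self, key):
        if key not in self.values:
            return None
        self._touch(key)
        return self.values[key]

    def set(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            self._evict()
        self.values[key] = value
        self._touch(key)

    def _evict(self):
        # Sample a handful of keys and drop the one touched longest ago.
        k = min(self.samples, len(self.values))
        candidates = self.rng.sample(list(self.values), k)
        victim = min(candidates, key=self.last_used.__getitem__)
        del self.values[victim]
        del self.last_used[victim]


cache = SampledLRUCache(capacity=3, samples=3)
for name in ("a", "b", "c"):
    cache.set(name, name.upper())
cache.get("a")       # "a" is now the most recently used key
cache.set("d", "D")  # sample covers every key, so "b" (the oldest) is evicted
```

With a sample as large as the cache the result is exact LRU; smaller samples trade accuracy for less bookkeeping, which is the tunable part.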
What Are the Other Uses and Features of Redis?
The in-memory architecture of Redis is a keyspace with arbitrary in-memory objects owned by their keys. The versatile architecture of Redis has enabled it to evolve to add extra features that map to this model.
Streams and Stream Processing
Redis makes an excellent task queue and message broker with its lists and pub/sub messaging (below). Version 5.0 of Redis added streams and stream processing, inspired by the Apache Kafka project. In a similar way to Kafka topics, streams of work can have “consumer groups” whose consumers check out work and acknowledge when it has been completed. If no acknowledgement is received within a given period of time, other consumers can pick up this work to ensure it is done.
This enables Kafka-like patterns in-memory and is particularly useful for responsive non-blocking user interface experiences.
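The check-out, acknowledge, and re-claim cycle can be modelled in a few lines of plain Python. This is a toy, single-group illustration and not the Redis implementation: `ToyStream` and its method names are hypothetical, chosen to echo the real stream commands noted in the comments.

```python
import itertools
import time


class ToyStream:
    """Toy append-only stream with one consumer group and a pending list."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.entries = []      # (entry_id, payload), append only
        self.next_index = 0    # first entry not yet delivered to the group
        self.pending = {}      # entry_id -> (consumer, delivered_at)
        self._ids = itertools.count(1)

    def add(self, payload):
        # Comparable in spirit to XADD.
        entry_id = next(self._ids)
        self.entries.append((entry_id, payload))
        return entry_id

    def read_group(self, consumer, count=1):
        # Comparable in spirit to XREADGROUP: hand out new entries and
        # remember who holds them until they are acknowledged.
        batch = self.entries[self.next_index:self.next_index + count]
        self.next_index += len(batch)
        now = self.clock()
        for entry_id, _ in batch:
            self.pending[entry_id] = (consumer, now)
        return batch

    def ack(self, entry_id):
        # Comparable in spirit to XACK: the work is done, forget it.
        return self.pending.pop(entry_id, None) is not None

    def claim_idle(self, consumer, min_idle):
        # Comparable in spirit to XCLAIM / XAUTOCLAIM: take over entries
        # another consumer has held, unacknowledged, for too long.
        now = self.clock()
        payloads = dict(self.entries)
        claimed = []
        for entry_id, (owner, delivered_at) in list(self.pending.items()):
            if owner != consumer and now - delivered_at >= min_idle:
                self.pending[entry_id] = (consumer, now)
                claimed.append((entry_id, payloads[entry_id]))
        return claimed
```

Note that in this model, as with the real commands, idle work is only reassigned when another consumer explicitly asks for it.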
Publication and Subscription Messaging (Pub/Sub)
Pub/Sub messaging allows for messages to be passed to channels and for all subscribers to that channel to receive that message. This feature enables information to flow quickly through your infrastructure without using up space in the database, as messages are not stored. You can use it to make services aware of load on other pieces of infrastructure or applications, to update gaming scores, or to pass notifications.
Redis has a scripting facility which enables custom scripts to be written in the Lua language and executed on the server. This allows users to add features to Redis themselves in the form of fast-executing scripts. Lua has extremely fast initialization, enabling scripts to perform various tasks on the data without significantly slowing Redis down. Because the core Redis process executes commands on a single thread, each script runs atomically.
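The fire-and-forget nature of this model, where a message reaches whoever is subscribed at that moment and is then gone, can be sketched with an in-process broker. `ToyPubSub` is a hypothetical stand-in for illustration; it uses callbacks instead of network clients.

```python
from collections import defaultdict


class ToyPubSub:
    """Toy in-process broker: messages are delivered, never stored."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        # Deliver to current subscribers only and report how many received it.
        receivers = self.subscribers.get(channel, [])
        for callback in receivers:
            callback(message)
        return len(receivers)


broker = ToyPubSub()
scores = []

broker.publish("scores", "lost: nobody is listening yet")  # delivered to 0
broker.subscribe("scores", scores.append)
broker.publish("scores", "player1:4200")                    # delivered to 1
# scores now holds only the second message
```

A subscriber that joins late never sees earlier messages, which is the trade-off for not using any storage.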
Redis provides a series of geospatial index data structures and commands. Latitude and longitude coordinates are stored and users can query distances between objects or query for objects within a given radius of a point. These commands can return distances in a variety of units (feet, kilometers, etc.).
The speed of Redis allows these data points to be updated quickly. In a ridesharing application these features could be used to connect a user with the nearby drivers, and then provide real-time updates as they approach or during the ride. Redis is in use with major transport and delivery companies for exactly this use case.
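The two queries described above, distance between two points and everything within a radius, can be sketched with the haversine formula. This is an illustrative model rather than Redis's indexing scheme: it scans every point instead of using an index, and `EARTH_RADIUS_M`, `distance`, and `within_radius` are names and values assumed for the example. Coordinates are given as (longitude, latitude) pairs.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumption: a mean Earth radius in metres
UNITS = {"m": 1.0, "km": 1000.0, "mi": 1609.34, "ft": 0.3048}


def distance(a, b, unit="m"):
    """Great-circle distance between two (longitude, latitude) points."""
    lon1, lat1 = map(math.radians, a)
    lon2, lat2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    metres = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))
    return metres / UNITS[unit]


def within_radius(points, center, radius, unit="km"):
    """Return (name, distance) pairs inside the radius, nearest first."""
    hits = [(name, distance(center, pos, unit)) for name, pos in points.items()]
    return sorted((hit for hit in hits if hit[1] <= radius), key=lambda hit: hit[1])


# Hypothetical driver positions as (longitude, latitude).
drivers = {
    "palermo": (13.361389, 38.115556),
    "catania": (15.087269, 37.502669),
}
rider = (15.0, 37.0)
nearby = within_radius(drivers, rider, 100, unit="km")  # only "catania" qualifies
```

A real geospatial index avoids the full scan, which is what makes frequent position updates and radius queries cheap at scale.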
The HyperLogLog data structure enables approximate counting of unique items in a much smaller space than keeping a full set of those items. A simple counter can double count, and a set of user IDs or IPs would take up a large amount of space. A HyperLogLog allows a very small amount of memory to hold a good approximation of the number of unique objects.
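To make the space trade-off concrete, here is a minimal HyperLogLog written from the textbook algorithm. It is a sketch under stated assumptions, not Redis's implementation: `ToyHyperLogLog`, the SHA-1 based 64-bit hash, and the 4096 one-byte registers are choices made for the example. The real commands for this structure are `PFADD` and `PFCOUNT`.

```python
import hashlib
import math


class ToyHyperLogLog:
    """Textbook HyperLogLog with 2**precision one-byte registers."""

    def __init__(self, precision=12):
        self.p = precision
        self.m = 1 << precision           # number of registers
        self.registers = bytearray(self.m)

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash value
        index = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining 64 - p bits
        # Position of the leftmost 1-bit in the remaining bits (1-based).
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)        # bias constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


hll = ToyHyperLogLog()
for visit in range(2):                 # every user is seen twice
    for user_id in range(10_000):
        hll.add(f"user:{user_id}")
estimate = hll.count()                 # close to 10,000, not 20,000
```

The registers occupy 4096 bytes however many items are added, whereas a set of the same user IDs grows with every new user. Repeated items never raise the estimate because they hash to the same register and rank.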
Bitmaps allow for the highly efficient storage of True and False values as 1 or 0 inside Redis strings. By allowing this type of Boolean data to be stored efficiently many use cases are possible.
Bitmaps can be used to efficiently store a user’s progress through some content, like an online course or a large download. Another use is to represent the online/offline status of someone’s contacts in an application.
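The course-progress example can be sketched with a plain byte buffer. `ToyBitmap` is a hypothetical model for illustration; its methods mirror the behaviour of the `SETBIT`, `GETBIT`, and `BITCOUNT` commands, including the convention that bit offset 0 is the most significant bit of the first byte.

```python
class ToyBitmap:
    """Toy bitmap over a growable byte buffer, one bit per flag."""

    def __init__(self):
        self.buf = bytearray()

    def setbit(self, offset, value):
        """Set the bit at offset and return its previous value."""
        byte, bit = divmod(offset, 8)
        if byte >= len(self.buf):
            # Grow with zero bytes so unset bits read as 0.
            self.buf.extend(bytes(byte - len(self.buf) + 1))
        mask = 0x80 >> bit               # offset 0 is the most significant bit
        old = 1 if self.buf[byte] & mask else 0
        if value:
            self.buf[byte] |= mask
        else:
            self.buf[byte] &= ~mask & 0xFF
        return old

    def getbit(self, offset):
        byte, bit = divmod(offset, 8)
        if byte >= len(self.buf):
            return 0
        return 1 if self.buf[byte] & (0x80 >> bit) else 0

    def bitcount(self):
        return sum(bin(b).count("1") for b in self.buf)


# One bit per lesson: lessons 0, 3 and 9 are complete.
progress = ToyBitmap()
for lesson in (0, 3, 9):
    progress.setbit(lesson, 1)
completed = progress.bitcount()   # 3 lessons done, stored in 2 bytes
```

Ten lessons fit in two bytes here, and the same layout works for a contact list where each bit is one contact's online status.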
Redis’ creator Salvatore Sanfilippo, also known as antirez, originally drafted an in-memory data store in a few hundred lines of Tcl for a startup. In 2009 he released a version written in C to the open source community.
In the decade since, Redis has gone through years of development and testing and real world use in enterprise, processing trillions of transactions for the largest companies and services in the world. Redis 6.0 includes SSL/TLS encrypted communication between nodes and granular Access Control Lists (ACLs) for secure deployments.
You can try out some of these features by starting our free trial of Redis and taking it for a spin.
Who’s Using Redis?
Redis is used by many websites as a cache but also by some of the largest ridesharing companies and social networks. In the 2020 Stack Overflow developer survey, 20.5% of developers said they were currently using Redis. In the last four years of the survey Redis was also rated as the most loved database technology by developers.
It’s worth drilling down into that last point. The survey defines “love” as a developer currently using the technology and planning to use it again. Redis consistently has the highest ratio of any database technology of people who use it, know it, and like it so much they plan to use it again in the future.
A Great Interface
Part of Redis’ attraction for developers is its simple and elegant command interface. Eric Redmond and Jim Wilson summed it up in their book Seven Databases in Seven Weeks:
“It’s not simply easy to use; it’s a joy. If an API is UX for programmers, then Redis should be in the Museum of Modern Art alongside the Mac Cube.”
Redis simplifies the use of common fundamental data structures between services and processes at high speed. With Lua scripts and modules Redis becomes an extensible domain-specific language (DSL) for your data.
If It’s In-Memory Why Does It Persist to Disk?
Redis can be configured to write to disk in two formats, a binary format and an “append only file” (AOF) format. The binary format mirrors what is in memory and is on by default. The AOF file can be turned on in the configuration and is a simple log of all commands which can be replayed to return a node to its previous state. Both can be configured to write to disk more or less frequently.
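The idea behind an append-only log, recording every write so that replaying the log rebuilds the same state, can be shown with a toy store. This is a simplified sketch: `LoggedStore` and `replay` are hypothetical names, only three commands are modelled, and the log is written as JSON lines for readability, which is an assumption of the example and not the on-disk format Redis uses.

```python
import io
import json


class LoggedStore:
    """Toy store that appends every write command to a log."""

    def __init__(self, log):
        self.log = log    # any writable text file object
        self.data = {}

    def execute(self, command, key, *args, record=True):
        if command == "SET":
            self.data[key] = args[0]
        elif command == "DEL":
            self.data.pop(key, None)
        elif command == "INCR":
            self.data[key] = int(self.data.get(key, 0)) + 1
        else:
            raise ValueError(f"unsupported command: {command}")
        if record:
            # One command per line, in the order it was applied.
            self.log.write(json.dumps([command, key, *args]) + "\n")


def replay(log_text):
    """Rebuild a store by re-running every logged command in order."""
    store = LoggedStore(io.StringIO())
    for line in log_text.splitlines():
        command, key, *args = json.loads(line)
        store.execute(command, key, *args, record=False)
    return store


log = io.StringIO()
live = LoggedStore(log)
live.execute("SET", "greeting", "hello")
live.execute("INCR", "visits")
live.execute("INCR", "visits")

restored = replay(log.getvalue())   # same contents as live.data
```

Writing the log more often narrows the window of data that could be lost, at the cost of more disk activity, which is the frequency trade-off mentioned above.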
For critical caching deployments, particularly where Redis exists in front of slower infrastructure, these disk persistence facilities allow for “warm restarts”. Without a warm restart capability an empty cache that was handling traffic above the capacity of the infrastructure behind it would pass all that traffic through, potentially causing that infrastructure to be overwhelmed.
Redis as a Primary Database
If the data size and risk profile are well known and match with Redis’ disk persistence model, then it can serve as a primary database. In this scenario it’s important to think carefully about a cluster topology with data replication, your eviction policy settings, using disk persistence, and having automated backups. Disk persistence can be configured to write to disk on every new write, but this can reduce performance.
How Does It Cluster?
Redis provides multiple facilities for scaling and availability. One is a primary/secondary server setup, which can be paired with a service called Redis Sentinel for monitoring and failover. The most scalable solution is horizontal clustering, which spreads the data across the nodes in a cluster and has built-in replication.
When caching, with data stored elsewhere, it can sometimes be best to maximize storage in a cluster topology by not using replication. The warm restart capability of the cache can handle any possible node restarts while your other data store can serve as the canonical copy of your data.
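How keys are spread across nodes can be sketched from the published Redis Cluster scheme, in which each key maps to one of 16384 hash slots via CRC16 of the key modulo 16384, and a `{...}` hash tag restricts hashing to the tagged part so related keys share a slot. The functions below are an illustration, not client library code: `key_hash_slot` and `node_for_slot` are hypothetical names, and the even split of slots across nodes is an assumption, since real clusters assign slot ranges explicitly.

```python
import binascii

SLOTS = 16384


def key_hash_slot(key: str) -> int:
    """Map a key to a hash slot, honouring a non-empty {hash tag}."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]   # hash only the tag contents
    # crc_hqx with an initial value of 0 is the CRC16 (XMODEM) variant.
    return binascii.crc_hqx(data, 0) % SLOTS


def node_for_slot(slot: int, nodes: list) -> str:
    """Hypothetical even split of the slot range across the given nodes."""
    return nodes[slot * len(nodes) // SLOTS]


nodes = ["node-a", "node-b", "node-c"]
slot = key_hash_slot("foo")
owner = node_for_slot(slot, nodes)

# Keys sharing a hash tag land in the same slot, and so on the same node.
same = key_hash_slot("{user1000}.following") == key_hash_slot("{user1000}.followers")
```

Because the slot depends only on the key, any client can work out which node to contact without a central lookup.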
Is It Open Source and What Are the Alternatives?
Using pure open source in your stack is a great way to avoid vendor lock-in and still obtain highly performant technologies. Redis is completely open source under a BSD 3-clause license which no doubt has contributed to its popularity.
The Memcached project is a solid open source cache which is the closest competitor to Redis, although it is less popular in developer surveys and on GitHub. It focuses on the key/value caching use case and lacks Redis’ extra data structures or the capacity to persist to disk for warm restarts. With similar in-memory performance in the key/value use case, and the many extra features Redis supports, the more common choice is to use Redis.
When Should I Not Use Redis?
Redis Isn’t Really for Caching Static Assets
Redis isn’t really for caching static assets for websites like images, CSS, or video files. The delivery of these assets is best optimized through web server configuration or through using a content delivery network (CDN).
Where Latency Isn’t an Issue
When latency isn’t an issue at all Redis can be less attractive than databases like Apache Cassandra which can store to disk. Before you conclude response times don’t affect your use case in some way you should read our white paper on the topic which gives a complete overview of the research on latency.
For Storing Extremely Large Data Sets or Certain Types of Critical Data
Redis can hold critical data, but if the size of the data exceeds the memory capacity of your cluster it may not be the optimal choice as a data store.
For critical data Redis clusters can be configured with data replication, such that more than one copy of the data is held across the nodes. Additionally Redis’ disk persistence can be configured to occur on every write, although this will impact performance. Backups of the on-disk files can add another layer of data security. In use cases where even this is insufficient Redis can still serve as a fast cache in front of another data store.
Redis is an extremely popular, fast, flexible in-memory database with lots of great data structures. These features make it one of the most versatile NoSQL databases, with a superset of features of other in-memory competitors.
You can give Redis a try with our free trial or get in touch if you have any questions.