Today we are super excited to announce our first release of Shotover as an open source project.
Shotover is a layer 7 database proxy built to allow developers, admins, DBAs, and operators to modify in-flight database requests. So far, Shotover has been used to implement multi-region active-active capability for Redis, as well as cluster-aware routing for Redis drivers that are not cluster-aware.
We’ve been working on the concept and various iterations of Shotover for the last 2 years and it has proven to be an invaluable tool to build some interesting capabilities for our managed platform.
While Redis mirroring and cluster-hiding are the first production use cases of Shotover, it also has additional capabilities (at varying levels of maturity), including:
- Redis caching of Cassandra queries
- Field-level encryption with key storage support for AWS KMS
- Basic query routing
Why Shotover Proxy?
Many applications and services rely on a specific database to persist state. Many different databases and databases-as-a-service exist with differing characteristics such as performance, data model, guarantees, and pricing.
As businesses come to better understand application and service requirements, the original database selection may no longer be appropriate. User behavior may be different than expected, or database performance different from benchmarking or vendor advertising. Developer velocity may also face challenges.
Based on this, it is increasingly common for applications and services to not only be deployed on multiple databases to support specific use cases but for those databases to change over the course of the application’s lifecycle.
Instaclustr has both benefited and suffered from this behavior; new customers come on board with Instaclustr after facing challenges with another database, but existing customers may also leave Instaclustr after discovering Apache Cassandra (or Apache Kafka/Redis/OpenSearch/PostgreSQL) is not the right fit for them. There is a significant switching cost associated with this change, but it does happen occasionally, especially with simpler NoSQL data models and query patterns.
This problem can be experienced at a level that is incredibly broad or incredibly granular. Starting at the broad end of the spectrum and getting progressively more granular, we’ve observed the following challenges:
- Entire applications may get rewritten against a different database to solve a set of problems
- Services that make up an application change database technologies
- Changes to how data is stored are required, e.g. personally identifiable information (PII) must now be encrypted
- A specific use case, query, or dataset within an application or service may get moved to a different database, e.g. a single table gets moved
- The same database is kept, but the data model, query pattern, or other configuration options need to be changed to support requirements
- One-off events, such as service migrations, outages, or increased load may require different data storage or query patterns to handle them properly
Or more plainly:
- I need my database to behave differently, but not my application
- Some queries are slow for certain keys (customers/tenants etc.)
- Some queries could be implemented more efficiently (queries are not quite right)
- Some tables are too big or inefficient (the data model is not quite right)
- Some queries occur far more than others (hot partitions)
- I have this sinking feeling I should have chosen a different database (hmmm yeah… )
- My database slows down for a period of time (GC, autovacuum, flushes)
- I don’t understand where my queries are going or how they are performing (poor observability at the driver level)
The common theme across these scenarios is that the fundamental behavior of the application is not actually changing, yet a change to application code is still required. Usage patterns may change, but the actual business logic does not.
Shotover creates a decomposition layer that allows application owners, developers, and operations teams to separate data infrastructure changes from application changes. With Shotover, we hope to increase developer velocity by sparing developers the weeks they would otherwise spend changing their applications to accommodate database changes.
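To make the idea of a decomposition layer a little more concrete, here is a minimal Python sketch. It is not Shotover's actual API or configuration format: `Request`, `mask_fields`, `route_table`, and `run_chain` are hypothetical names invented for this illustration, and masking is only a stand-in for real field-level encryption. The point is simply that the application keeps issuing the same request while a chain of transforms in the proxy layer changes what the database sees.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Request:
    """Hypothetical, simplified view of an in-flight database request."""
    table: str
    values: Dict[str, str]
    backend: str = "primary"


# A transform takes a request and returns a (possibly rewritten) request.
Transform = Callable[[Request], Request]


def mask_fields(*names: str) -> Transform:
    """Mask the named fields (a stand-in for real field-level encryption)."""
    def apply(req: Request) -> Request:
        masked = {k: ("***" if k in names else v) for k, v in req.values.items()}
        return replace(req, values=masked)
    return apply


def route_table(table: str, backend: str) -> Transform:
    """Send requests for one table to a different backend."""
    def apply(req: Request) -> Request:
        return replace(req, backend=backend) if req.table == table else req
    return apply


def run_chain(req: Request, chain: List[Transform]) -> Request:
    """Apply each transform in order, as a proxy layer might."""
    for transform in chain:
        req = transform(req)
    return req


# The application issues the same request as before; only the chain changes.
chain = [mask_fields("email"), route_table("sessions", "cache")]
original = Request("sessions", {"user": "42", "email": "a@example.com"})
rewritten = run_chain(original, chain)
```

In this sketch, changing how PII is stored or moving a single table to another database means editing `chain`, not the code that builds `original`.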
Now, those of you who have mentally stepped through running Shotover in production may have identified an issue with this approach: we have shifted some of these problems into the configuration of a proxy tool. As time goes on, more and more fixes, changes, and routes will end up in Shotover, resulting in a significant pile-up of technical debt.
Like all good changes to the way we work, Shotover is not a silver bullet for all issues (though I would argue it is for some). Bad data models still need to be fixed, and user/tenant behavior still cannot be papered over by a data layer change.
Like other software solutions for developer velocity, such as feature flagging, it still requires discipline and effort to maintain. What Shotover does allow you to do is keep the plane flying while you fix the underlying problems during regular business hours and within your normal roadmap/sprint planning model!