ZooKeeper to KRaft Migration Process
Introduction
Apache Kafka historically relied on ZooKeeper for metadata management and cluster coordination. The KIP-500 initiative introduced KRaft (Kafka Raft) as an experimental alternative in Kafka 2.8, allowing Kafka to manage its metadata internally. Kafka 4.0 and later (Kafka 4.0 Release Announcement) use KRaft exclusively, so ZooKeeper-based clusters must migrate to KRaft before they can upgrade to the latest versions of Kafka.
Migration Overview
The ZooKeeper to KRaft migration allows you to transition your Apache Kafka cluster to KRaft (Kafka Raft) for streamlined metadata management, eliminating the need for a separate ZooKeeper ensemble. This process is designed to minimize disruption, enabling the transition without requiring a full Kafka cluster shutdown.
Instaclustr’s managed platform customers can request migration on clusters running Kafka version 3.9.x. Support-only customers can initiate a self-managed migration on any Kafka bridge release, following the process outlined in the Apache Kafka documentation.
Manual Migration
Support-only customers must manually initiate a self-managed migration process by following the Apache Kafka Documentation. This involves configuring KRaft controllers, migrating metadata, and decommissioning ZooKeeper, with the option to roll back to ZooKeeper at specific stages. The process is time-intensive and requires significant effort. For our managed platform customers, we have automated much of the process, including health checks to ensure the migration works correctly. Please refer to the following sections for more details.
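For a self-managed migration, the "configuring KRaft controllers" step can be sketched as follows. This is a minimal illustration in Python, assuming the property names used in the Apache Kafka migration documentation. The function name, host names, node IDs, and port are hypothetical placeholders, and a real controller needs further settings (for example security and log directories) that are not shown here.

```python
def controller_migration_properties(node_id, voters, zookeeper_connect, port=9093):
    """Render a KRaft controller config with ZooKeeper migration enabled.

    voters maps each controller node ID to its "host:port" address.
    All values passed in are illustrative; only the property names
    come from the Apache Kafka documentation.
    """
    if node_id not in voters:
        raise ValueError(f"node {node_id} is not listed in the controller quorum")

    # Quorum voters are written as "id@host:port", comma separated.
    quorum = ",".join(f"{nid}@{addr}" for nid, addr in sorted(voters.items()))

    props = {
        "process.roles": "controller",
        "node.id": str(node_id),
        "controller.quorum.voters": quorum,
        "controller.listener.names": "CONTROLLER",
        "listeners": f"CONTROLLER://:{port}",
        # These two settings put the controller into migration mode.
        "zookeeper.metadata.migration.enable": "true",
        "zookeeper.connect": zookeeper_connect,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())
```

Calling `controller_migration_properties(3000, {3000: "ctrl-1:9093"}, "zk-1:2181")` returns the text of a properties file for a single hypothetical controller. Always check the generated settings against the Apache Kafka Documentation for your exact Kafka version before using them.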
Automated Migration
Instaclustr’s automated migration (available only for our managed platform customers) is managed by our 24/7 support team and streamlines the migration process outlined in the Apache Kafka Documentation to ensure no disruption throughout the process.
Prerequisites for Migration
Cluster Requirements
Before initiating the migration, ensure that your Kafka cluster meets the following prerequisites:
- Kafka Version: The cluster must be a ZooKeeper-based Kafka cluster running version 3.9.x.
- Add-on Compatibility: ZooKeeper-dependent add-ons (e.g., Confluent Schema Registry, Confluent REST Proxy) are not compatible with KRaft and must be replaced with their Karapace equivalents (Karapace Schema Registry, Karapace REST Proxy) before migration. Please reach out to our support team to organise this transition.
- API Endpoints: Instaclustr’s managed platform customers must use the current (not deprecated) API endpoints for Kafka KRaft cluster management operations.
Contact our support team for tailored guidance in getting your cluster ready for migration.
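The version prerequisite above can be expressed as a simple check. The following Python sketch is illustrative only: the function name and its inputs are hypothetical, and it does not query a real cluster or cover the add-on and API endpoint prerequisites.

```python
import re


def is_migration_eligible(kafka_version, uses_zookeeper):
    """Return True when a cluster is ZooKeeper-based and runs Kafka 3.9.x.

    kafka_version is a string such as "3.9.1"; uses_zookeeper states whether
    the cluster currently stores its metadata in ZooKeeper.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\.(\d+))?", kafka_version.strip())
    if match is None:
        raise ValueError(f"unrecognised Kafka version: {kafka_version!r}")

    major, minor = int(match.group(1)), int(match.group(2))
    # Only the 3.9 release line qualifies, and a cluster that is already
    # on KRaft has nothing to migrate.
    return uses_zookeeper and (major, minor) == (3, 9)
```

For example, a ZooKeeper-based cluster on 3.9.1 passes the check, while one on 3.8.0 would first need to be upgraded to 3.9.x.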
Scheduling a Maintenance Window
To minimise disruption, schedule a maintenance window with our support team for the migration. Consider the following:
- Timing: Choose a low-traffic period to reduce the impact on clients and applications.
- Duration: Our support team will assist you in developing a tailored plan based on your cluster’s size and complexity. Larger clusters may require multiple maintenance windows with the migration being carried out in smaller chunks in each available window.
What Goes on During the Automated Migration
The automated migration, managed by our support team, involves the following phases:
- Backup Creation: A backup of all ZooKeeper data is created to ensure data integrity.
- KRaft Setup: The cluster is prepared for KRaft by setting up new KRaft controller nodes (for dedicated ZooKeeper clusters) or designating existing brokers to handle controller tasks (for co-located clusters).
- Metadata Migration: Metadata is transferred from ZooKeeper to KRaft, allowing the cluster to operate in a hybrid mode where both ZooKeeper and KRaft are temporarily active to ensure stability.
- Controller Transition: KRaft controllers take over metadata management, ensuring seamless coordination across the cluster.
- Broker Update: Brokers are updated to use KRaft exclusively, removing reliance on ZooKeeper.
- ZooKeeper Removal: ZooKeeper is decommissioned by either shutting down dedicated ZooKeeper nodes or stopping ZooKeeper processes on co-located brokers, followed by the complete removal of all ZooKeeper data.
- Validation and Cleanup: The support team verifies that the cluster is functioning correctly, with partitions and replication intact, and removes any residual configurations.
Note: For larger clusters, if the maintenance window is too short, the migration can be paused between certain steps while the cluster remains fully operational. Our support team can then resume the process in a new window coordinated with you, without impacting the availability of your cluster.
Clients may experience brief resource usage spikes during transitions, but the cluster remains operational throughout, including during paused periods.
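The ordering of the phases and the pause/resume behaviour can be pictured with a small model. The Python sketch below is purely conceptual and is not Instaclustr's implementation: the class and method names are hypothetical, and it makes the simplifying assumption that a pause may fall after any phase, whereas the real process can only be paused between certain steps.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Migration phases in the order they are listed above."""
    BACKUP_CREATION = 1
    KRAFT_SETUP = 2
    METADATA_MIGRATION = 3
    CONTROLLER_TRANSITION = 4
    BROKER_UPDATE = 5
    ZOOKEEPER_REMOVAL = 6
    VALIDATION_AND_CLEANUP = 7


class MigrationRun:
    """Tracks which phases are done across one or more maintenance windows."""

    def __init__(self):
        self.completed = []

    def is_finished(self):
        return len(self.completed) == len(Phase)

    def run_window(self, max_phases):
        """Complete up to max_phases pending phases, then pause.

        Returns True once every phase has been completed.
        """
        pending = [phase for phase in Phase if phase not in self.completed]
        for phase in pending[:max_phases]:
            # A real migration would perform the work and run health checks
            # here; this model only records the phase as done.
            self.completed.append(phase)
        return self.is_finished()
```

In this model, a first window with `run_window(3)` stops after the metadata migration phase, and a later window with `run_window(4)` completes the remaining phases in order.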
Limitations
The following limitations are inherent to the Apache Kafka project’s ZooKeeper to KRaft migration process:
- No Downgrade Path: After migrating to KRaft, reverting to ZooKeeper is not supported.
- Performance Overhead: Temporary increases in CPU and memory usage may occur during the metadata transition:
  - Kafka brokers operate in dual mode (ZooKeeper and KRaft), requiring additional CPU and memory for metadata synchronization and processing.
  - During steps such as broker configuration and controller configuration, Kafka brokers and controllers undergo changes and restarts, leading to temporary CPU and memory spikes.
- Metadata Version Updates: While a cluster is being migrated from ZK mode to KRaft mode, Kafka does not support changing the metadata version (also known as the inter.broker.protocol.version).
Note: While pause/resume allows splitting the process across multiple maintenance windows, it is strongly recommended to complete the migration as a cohesive process to maintain a stable cluster state; partial migrations should be avoided.
FAQs
Q: Is the Instaclustr managed automated migration available for managed platform customers only?
A: Yes, Instaclustr’s automated migration is exclusively available for our managed platform customers. Support-only customers must perform a self-managed manual migration, as outlined in the Apache Kafka Documentation.
Q: Is data restored or transferred from the old ZooKeeper cluster to the new KRaft cluster?
A: Data is not “restored” but migrated in place. The migration process transfers metadata from ZooKeeper to KRaft while preserving all topic data and configurations on existing brokers. Backups are created as a precaution, but restoring from them is only needed if an issue occurs.
Q: Will my existing Kafka tools and add-ons work with KRaft?
A: Most Kafka tools should function, but ZooKeeper-dependent add-ons on our managed platform, such as Confluent Schema Registry and Confluent REST Proxy, are not supported with KRaft. It is advisable to work with our support team to transition to Karapace Schema Registry and Karapace REST Proxy before migration.
Q: What happens if the maintenance window is not long enough to complete a full migration of my cluster?
A: For larger clusters, the migration can be paused, keeping the cluster operational in migration mode. You can resume in a new maintenance window coordinated with our support team, ensuring minimal disruption and flexibility for complex migrations. Please note, however, that the pause mechanism is designed for short pauses only. Our support team will work with you to determine what is suitable for your cluster.
Q: What if the migration fails?
A: If the migration fails, the process supports rollbacks to ZooKeeper at specific stages, keeping the cluster operational in ZooKeeper or migration mode until issues are resolved. Backups ensure data can be restored if needed.
Q: Is ZooKeeper removed after migration?
A: Yes, ZooKeeper is decommissioned, and the cluster operates fully in KRaft mode.
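For self-managed clusters, one way to confirm that no ZooKeeper settings remain after decommissioning is to scan each broker and controller properties file. The Python sketch below is a hedged illustration: the function name is hypothetical, and it assumes the file uses the plain `key=value` form with `#` or `!` comment lines.

```python
def residual_zookeeper_properties(properties_text):
    """Return the names of zookeeper.* keys still present in a properties file."""
    leftovers = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", "!")):
            continue
        key = line.split("=", 1)[0].strip()
        if key.startswith("zookeeper."):
            leftovers.append(key)
    return leftovers
```

An empty list means no `zookeeper.*` keys such as `zookeeper.connect` or `zookeeper.metadata.migration.enable` were found in that file.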
Q: What are my rollback options if I need to revert to ZooKeeper?
A: Due to Apache Kafka project constraints, rollbacks to ZooKeeper are possible only at specific stages during the migration. They require significant manual intervention by our support team and are rarely needed. Once the migration is complete, reverting to ZooKeeper is not supported by the Kafka project.
For support with migrating your Kafka cluster, or for related questions, contact Instaclustr Support.