What Is Apache Kafka?
Apache Kafka is an open source, distributed event streaming platform developed by the Apache Software Foundation. It handles real-time data feeds, providing a high-throughput, low-latency solution for streaming data. Kafka is used for building real-time data pipelines and streaming applications, and is popular in use cases involving real-time analytics, monitoring, and log aggregation.
Kafka’s architecture revolves around producers, brokers, and consumers, facilitating fast and reliable message exchange between systems. Messages are stored in topics, which are split into partitions; messages that share a key are routed to the same partition, which preserves ordering per key (keys need not be unique). Its distributed, scalable nature allows it to handle large volumes of data across many nodes.
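The key-to-partition routing described above is what gives Kafka its per-key ordering guarantee. Here is a minimal sketch of the idea; note that the CRC32 hash and the partition count are illustrative assumptions, as Kafka's default partitioner actually applies murmur2 to the serialized key bytes:

```python
# Sketch of Kafka-style key-based partitioning (illustrative only).
# Real Kafka uses murmur2 over the serialized key; CRC32 is used here
# so the example is deterministic and self-contained.
import zlib

NUM_PARTITIONS = 3  # assumed topic partition count

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """All messages with the same key map to the same partition,
    so consumers see them in the order they were produced."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Messages for one key always land on one partition:
keys = ["order-42", "order-42", "order-7", "order-42"]
partitions = [partition_for(k) for k in keys]
assert partitions[0] == partitions[1] == partitions[3]  # same key, same partition
```

Because the mapping is a pure function of the key, ordering is guaranteed per key within a partition, not across the whole topic.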
To download the latest Kafka versions, see the official downloads page.
The history of Apache Kafka versions
Apache Kafka follows a time-based release plan, with minor releases roughly every four months. Here’s a summary of the active Kafka releases and their support timelines, as of the time of this writing:
- Version 3.8: Released in July 2024, currently supported.
- Version 3.7: Released in March 2024, support ends in January 2025.
- Version 3.6: Released in October 2023, support ends in August 2025.
- Version 3.5: Released in May 2023, support ends in February 2025.
- Version 3.4: Released in February 2023, support ends in May 2025.
Users are encouraged to stay current with Kafka versions to take advantage of new advancements and maintain a secure data streaming environment.
Tips from the expert
Andrew Mills
Senior Solution Architect
Andrew Mills is an industry leader with extensive experience in open source data solutions and a proven track record in integrating and managing Apache Kafka and other event-driven architectures.
In my experience, here are tips that can help you better manage Apache Kafka upgrades and ensure a successful transition:
- Use canary testing to detect anomalies early: Before fully committing to an upgrade, deploy a subset of brokers with the new version in a canary group. This approach helps identify compatibility issues, performance degradations, or unexpected behaviors early in the process, allowing you to roll back with minimal impact.
- Monitor critical metrics before, during, and after the upgrade: Establish a baseline of critical Kafka metrics like broker CPU usage, network I/O, disk I/O, and request latency before starting the upgrade. Continuously monitor these metrics during and after the upgrade to detect any performance regressions or unusual patterns.
- Set up chaos engineering experiments: Introduce controlled failures during the upgrade process using chaos engineering tools like Chaos Monkey. This practice helps you understand Kafka’s resilience and fault tolerance under real-world conditions, ensuring that your cluster can handle unexpected issues during and after the upgrade.
- Enable and validate log compaction settings: Upgrades can sometimes affect log compaction behavior, especially when format versions change. Review your log compaction settings, perform tests in a staging environment, and ensure that critical topic compaction works as expected post-upgrade to avoid data inconsistencies.
- Plan downtime windows around high-traffic periods: Although rolling upgrades aim to minimize downtime, prepare contingency plans for short downtime windows during off-peak hours, particularly around critical business periods. This minimizes impact and allows for more controlled and safe upgrades if immediate rollbacks are needed.
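The baseline-monitoring tip above can be automated: capture metrics before the upgrade, then compare post-upgrade readings against that baseline. The metric names and the 20% tolerance below are illustrative assumptions, not Kafka defaults:

```python
# Sketch: flag upgrade-induced regressions by comparing post-upgrade
# metrics against a pre-upgrade baseline. Metric names and the 20%
# threshold are illustrative assumptions.
def find_regressions(baseline: dict, current: dict, tolerance: float = 0.20) -> list:
    """Return metric names that worsened by more than `tolerance` vs baseline."""
    regressions = []
    for name, base_value in baseline.items():
        cur = current.get(name)
        if cur is not None and base_value > 0 and (cur - base_value) / base_value > tolerance:
            regressions.append(name)
    return regressions

baseline = {"request_latency_ms_p99": 40.0, "disk_io_util_pct": 55.0}
current  = {"request_latency_ms_p99": 62.0, "disk_io_util_pct": 57.0}
print(find_regressions(baseline, current))  # ['request_latency_ms_p99']
```

Latency rose about 55% against baseline and is flagged; disk utilization moved only ~4% and passes. In practice the inputs would come from your monitoring stack rather than hard-coded dictionaries.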
Apache Kafka: Upgrading from previous versions
Upgrading to 3.8.0 from any version 0.8.x through 3.7.x
When upgrading to Apache Kafka 3.8.0 from any version between 0.8.x and 3.7.x, there are several critical changes and new features to be aware of:
- Updates in MirrorMaker 2, which now supports emitting checkpoints for offsets mirrored before the start of the checkpoint task. This enhancement improves offset translation but requires that MirrorMaker 2 has READ authorization for the checkpoint topic. If this authorization is not granted, checkpointing will only apply to offsets mirrored after the task’s initiation.
- JBOD (just a bunch of disks) configuration in KRaft, which was previously in early access, is now fully supported. For those using tiered storage, which remains in early access, it now accommodates clusters configured with multiple log directories.
Upgrading to 3.7.1 from any version 0.8.x through 3.6.x
Upgrading to Kafka 3.7.1 involves several steps that vary depending on whether you are using ZooKeeper-based or KRaft-based clusters.
For ZooKeeper-based clusters, it is crucial to note that if you are upgrading from a version earlier than 2.1.x, the schema used to store consumer offsets has changed, which will prevent downgrading to a version before 2.1.x once the inter-broker protocol version is updated.
To perform a rolling upgrade:
- Update the `server.properties` file on all brokers to reflect the current Kafka and message format versions.
- Upgrade each broker individually by shutting it down, applying the update, and restarting it. During this process, you can still downgrade if issues arise.
- Once all brokers are updated and their performance is verified, update `inter.broker.protocol.version` to 3.7, and restart the brokers again to finalize the upgrade.
- If the message format version was overridden, perform another rolling restart after updating `log.message.format.version` to 3.7.
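The configuration steps above are simple key-value edits to each broker's `server.properties`. A hedged sketch of applying such an edit in place (the file path in the example comment is an assumption; the helper is illustrative, not Kafka tooling):

```python
# Sketch: set or update a key in a Kafka server.properties file.
# In a rolling upgrade you would apply this on each broker before
# restarting it. The path below is an illustrative assumption.
def set_property(path: str, key: str, value: str) -> None:
    """Rewrite `key=value` in a properties file, appending if absent."""
    with open(path) as f:
        lines = f.readlines()
    prefix = key + "="
    for i, line in enumerate(lines):
        if line.strip().startswith(prefix):
            lines[i] = f"{key}={value}\n"
            break
    else:
        lines.append(f"{key}={value}\n")
    with open(path, "w") as f:
        f.writelines(lines)

# Example: finalize the upgrade once every broker runs 3.7
# set_property("/etc/kafka/server.properties", "inter.broker.protocol.version", "3.7")
```

Whatever tooling you use, the important part is the ordering: upgrade all broker binaries first, verify, and only then bump `inter.broker.protocol.version`, since that bump is what closes the door on a downgrade.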
For KRaft-based clusters, the process is similar, but you must be cautious about metadata version changes introduced after 3.3.0, as these changes prevent downgrades.
Upgrading to 3.6.2 from any version 0.8.x through 3.5.x
The upgrade to Kafka 3.6.2 from versions between 0.8.x and 3.5.x also requires careful attention, particularly regarding changes in ZooKeeper-based clusters.
For ZooKeeper-based clusters:
- As with other upgrades, adjust `server.properties` to ensure compatibility with the current Kafka and message format versions.
- Conduct a rolling upgrade by updating and restarting brokers one at a time.
- After verifying the performance, update `inter.broker.protocol.version` to 3.6, and then restart each broker to apply the new protocol version.
- If the message format was overridden, a final rolling restart is required after updating `log.message.format.version` to 3.6.
For KRaft-based clusters, the upgrade process is similar, with a critical focus on metadata version changes. It’s important to note that once you update the metadata version to 3.6, downgrading to versions prior to 3.3-IV0 is not possible due to the metadata changes introduced there.
Best practices for upgrading Apache Kafka
Perform a Rolling Upgrade
A rolling upgrade is a method that allows you to update your Kafka cluster without downtime. This involves incrementally upgrading each broker one at a time, ensuring continuous availability of the Kafka service. During this process, it is crucial to maintain client compatibility and prevent disruption to the message flow. Start by upgrading the Kafka software on each broker and then proceed to update configuration files step-by-step.
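The rolling-upgrade procedure is essentially a loop: upgrade one broker, wait for the cluster to be healthy again, then move on. The sketch below captures that control flow; the broker names, and the upgrade and health-check callables, are placeholders for your own tooling (e.g. configuration management plus a check that under-replicated partitions have returned to zero):

```python
# Sketch of a rolling-upgrade loop: one broker at a time, with a
# health gate between steps. upgrade_one and is_healthy are stubs
# standing in for real automation.
from typing import Callable, List

def rolling_upgrade(brokers: List[str],
                    upgrade_one: Callable[[str], None],
                    is_healthy: Callable[[], bool]) -> List[str]:
    """Upgrade brokers sequentially; stop early if a health check fails."""
    upgraded = []
    for broker in brokers:
        upgrade_one(broker)      # shut down, apply update, restart
        if not is_healthy():     # e.g. under-replicated partitions == 0
            break                # leave the remaining brokers on the old version
        upgraded.append(broker)
    return upgraded

done = rolling_upgrade(["kafka-1", "kafka-2", "kafka-3"],
                       upgrade_one=lambda b: None,  # stub
                       is_healthy=lambda: True)     # stub
print(done)  # ['kafka-1', 'kafka-2', 'kafka-3']
```

Stopping at the first failed health check is the design choice that keeps a bad upgrade contained to a single broker, where a downgrade is still straightforward.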
Compatibility Considerations
When upgrading, it’s important to ensure that your existing clients and applications remain compatible with the new version. Kafka is backward compatible, meaning newer brokers can communicate with clients running older versions. However, it is advised to update clients to the latest version to utilize new features fully.
Before performing an upgrade, review the release notes and KIPs to understand potential compatibility issues. Consider staging the update in a test environment that mimics production settings. Validate your applications’ interaction with the updated Kafka to avoid unforeseen issues.
Backup and Data Retention
Prior to any upgrade, it is vital to back up metadata and relevant data associated with Kafka. This includes configurations, partition logs, and ZooKeeper data to mitigate potential data loss. Implement a consistent data retention policy that defines how long data is stored and ensure it aligns with compliance requirements.
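Of the items above, the broker configuration files are the simplest to snapshot. A minimal sketch, assuming conventional paths (partition logs and ZooKeeper snapshots need their own tooling and are not covered here):

```python
# Sketch: back up Kafka broker config files into a timestamped folder
# before an upgrade. The example paths in the comment are assumptions.
import os
import shutil
import time

def backup_configs(config_dir: str, backup_root: str) -> str:
    """Copy the config directory into a timestamped backup directory
    and return the backup path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = os.path.join(backup_root, f"kafka-config-{stamp}")
    shutil.copytree(config_dir, dest)
    return dest

# Example (paths are illustrative):
# backup_configs("/etc/kafka", "/var/backups")
```

Keeping the backup timestamped makes it easy to correlate a configuration state with a specific upgrade attempt if a rollback is ever needed.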
Use the KIP-Upgrade Process
The KIP (Kafka Improvement Proposal) process is a community-driven guide that defines methods to enhance Kafka features. Each KIP is aimed at improving Kafka’s functionality without disrupting existing operations. Keeping abreast of KIPs related to upgrades can significantly ease the transition from an older version to a newer one.
Implementing KIPs involves understanding the changes proposed in each upgrade and how they interact with current infrastructure. As Kafka continuously evolves, following these proposals ensures that deployments gain enhancements while maintaining maximum compatibility.
Post-Upgrade Validation
After an upgrade, validation ensures that your Kafka cluster is functioning correctly. This involves checking system logs, monitoring performance metrics, and verifying data integrity across the cluster. Ensure that all services reliant on Kafka operate seamlessly and produce expected outcomes.
Conduct thorough tests to validate that all Kafka components, such as brokers, producers, and consumers, efficiently handle operational workloads. Validate configurations and evaluate if they align with business use cases before fully committing the updated version to production.
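One part of the log-checking step above is easy to automate: scanning broker logs for error-level lines after the restart. The log line format below follows Kafka's default log4j layout, but the exact path and pattern in your deployment are assumptions to verify:

```python
# Sketch: scan broker log output for ERROR/FATAL lines after an
# upgrade. Assumes Kafka's default log4j line layout, e.g.
# "[2024-07-30 10:00:01,000] INFO Kafka version: 3.8.0".
def find_problem_lines(log_text: str) -> list:
    """Return log lines at ERROR or FATAL severity."""
    return [line for line in log_text.splitlines()
            if " ERROR " in line or " FATAL " in line]

sample = (
    "[2024-07-30 10:00:01,000] INFO Kafka version: 3.8.0\n"
    "[2024-07-30 10:00:05,123] ERROR Error processing fetch request\n"
)
print(len(find_problem_lines(sample)))  # 1
```

A check like this complements, rather than replaces, end-to-end validation that producers and consumers are flowing messages correctly.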
Instaclustr's comprehensive support for multiple versions of Apache Kafka
Instaclustr offers comprehensive support for different versions of Apache Kafka. As a distributed streaming platform, Apache Kafka has evolved over time with new features, enhancements, and bug fixes introduced in each version. Instaclustr recognizes the importance of providing support for multiple Kafka versions to cater to the diverse needs of its customers.
Instaclustr ensures that customers have the flexibility to choose the Kafka version that best suits their requirements. Whether it’s the latest stable release or a specific version preferred by the customer, Instaclustr offers support for a wide range of Kafka versions. This allows customers to leverage the features and improvements introduced in newer versions or maintain compatibility with existing applications built on older versions.
The support for different Kafka versions provided by Instaclustr includes assistance with installation, configuration, and ongoing management of Kafka clusters. Instaclustr’s experienced team of engineers is well-versed in the intricacies of each Kafka version and can provide expert guidance and troubleshooting for any version-specific issues that may arise.
Furthermore, Instaclustr ensures that its support for Kafka is backed by rigorous testing and validation. Before making a new Kafka version available to customers, Instaclustr thoroughly tests it in various scenarios to ensure stability, performance, and compatibility with other components of the Kafka ecosystem. This meticulous testing process helps minimize any potential risks or compatibility issues for customers when upgrading or deploying Kafka clusters.