Instaclustr Product Update: December 2023

There have been a lot of exciting things happening at Instaclustr! Here’s a roundup of the updates to our platform over the last few months. 

As always, if you have a specific feature request or improvement you’d love to see, please get in contact with us. 

Major Announcements 

Cadence®: 

  • Instaclustr Managed Platform adds HTTP API for Cadence®

The Instaclustr Managed Platform now provides an HTTP API that allows interaction with Cadence® workflows from virtually any language. For further information or a technical briefing, contact an Instaclustr Customer Success representative.

  • Cadence® on the Instaclustr Managed Platform Achieves PCI Certification 

We are pleased to announce that Cadence® 1.0 on the Instaclustr Managed Platform is now PCI Certified on AWS and GCP, demonstrating our company’s commitment to stringent data security practices and architecture. For further information or a technical briefing, contact an Instaclustr Customer Success representative.

Apache Cassandra®: 

  • Debezium Connector Cassandra is Released for General Availability  

Instaclustr’s managed Debezium connector for Apache Cassandra makes it easy to export a stream of data changes from an Instaclustr Managed Cassandra cluster to an Instaclustr Managed Kafka cluster. Try creating a Cassandra cluster with Debezium Connector on our console.

Ocean for Apache Spark™:  

  • Interactive data engineering at scale

Jupyter Notebooks, embedded in the Ocean for Apache Spark UI, are now generally available in Ocean Spark. Our customers’ administrators no longer need to install components, since Ocean Spark now hosts the Jupyter infrastructure. Users no longer need to switch applications: Ocean for Apache Spark can process massive amounts of data in the cloud while they view live analytic results on their screen.

Other Significant Changes 

Apache Cassandra®: 

  • Apache Cassandra® Version 3.11.16 was released as generally available on Instaclustr Platform 
  • Zero-downtime restore has been released for Apache Cassandra®, eliminating downtime when an in-place restore operation is performed on a Cassandra cluster
  • Added support for customer-initiated resize on GCP and Azure, allowing customers to vertically scale their Cassandra cluster and select any larger production disk or instance size when scaling up. Instance sizes can also be scaled down through the Instaclustr Console or API.

Apache Kafka®: 

  • Support for Apache ZooKeeper™ version 3.8.2 released in general availability 
  • Altered the maximum heap size allocated to Instaclustr for Kafka and ZooKeeper on smaller node sizes to improve application stability. 

Cadence®: 

  • Cadence 1.2.2 is now in general availability 

Instaclustr Managed Platform: 

  • Added support for new AWS regions: UAE, Zurich, Jakarta, Melbourne and Osaka

OpenSearch®: 

  • OpenSearch 1.3.11 and OpenSearch 2.9.0 are now in general availability on the Instaclustr Platform 
  • Dedicated Ingest Nodes for OpenSearch® has been released in public preview

PostgreSQL®: 

  • PostgreSQL 15.4, 14.9, 13.12, and 16 are now in general availability 

Ocean for Apache Spark™: 

  • New memory metrics charts in our UI give our users the self-help they need to write better big data analytical applications. Our UI-experience-tracking software has shown that this new feature has already helped some customers to find the root cause of failed applications. 
  • Memory can now be autotuned downward for cost savings, and customers can use rules to mitigate out-of-memory errors.
  • Cost metrics improved in both API and UI: 
    • API: users can now make REST calls to retrieve Ocean Spark usage charges one month at a time 
    • UI: metrics that estimate cloud provider costs have been enhanced to account for differences among on-demand instances, spot instances, and reserved instances.  

Upcoming Releases 

Apache Cassandra®: 

  • AWS PrivateLink with Instaclustr for Apache Cassandra® will soon be released to general availability. The AWS PrivateLink offering provides our AWS customers with a simpler and more secure option for network connectivity. Log into the Instaclustr Console with just one click to trial the AWS PrivateLink public preview release with your Cassandra clusters today. Alternatively, the AWS PrivateLink public preview for Cassandra is available through the Instaclustr API.
  • AWS 7-series nodes for Apache Cassandra® will soon be generally available on Instaclustr’s Managed Platform. The AWS 7-series nodes introduce new hardware, with Arm-based AWS Graviton3 processors and Double Data Rate 5 (DDR5) memory.

Apache Kafka®: 

  • Support for custom Subject Alternative Names (SAN) on AWS will be added early next year, making it easier for our customers to use internal DNS to connect to their managed Kafka clusters without needing to use IP addresses.
  • Kafka 3.6.x will soon be introduced in general availability. This will be the first Managed Kafka release on our platform to have KRaft available in GA, giving our customers full coverage under our extensive SLAs.

Cadence®: 

  • User authentication through open source is under development to enhance the security of our Cadence architecture. It will also help our customers more easily pass security compliance checks for applications running on Instaclustr Managed Cadence.

Instaclustr Managed Platform: 

  • Instaclustr Managed Services will soon be available to purchase through the Azure Marketplace. This will allow you to allocate your Instaclustr spend towards any Microsoft commit you might have. The integration will first be rolled out to Run In Instaclustr’s Account (RIIA) clusters, then to a full Run In Your Own Account (RIYOA) setup. Both self-serve and private offers will be supported.
  • The upcoming release of Signup with Microsoft integration marks another step in enhancing our platform’s accessibility. Users will have the option to sign up and sign in using their Microsoft Account credentials. This addition complements our existing support for Google and GitHub SSO, as well as the ability to create an account using email, ensuring a variety of convenient access methods to suit different user preferences.  

OpenSearch®: 

  • The searchable snapshots feature will soon be released to general availability. This feature will make it possible to search indexes that are stored as snapshots within remote repositories (e.g., S3) without the need to download all the index data to disk ahead of time, allowing customers to save time and leverage cheaper storage options.
  • Continuing on from the public preview release of dedicated ingest nodes for Managed OpenSearch, we’re now working on making it generally available within the coming months. 

PostgreSQL®: 

  • The Azure NetApp Files (ANF) Fast Forking feature for PostgreSQL will be available soon. By leveraging the ANF filesystem, customers can quickly and easily create an exact copy of their PostgreSQL cluster on additional infrastructure.
  • Release of the pgvector extension is expected soon. This extension lets you store and search vector embeddings in your PostgreSQL server to take advantage of the latest in Large Language Models (LLMs).
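
For readers new to vector embeddings, here is a rough sketch of the kind of nearest-neighbor search that pgvector performs inside PostgreSQL. It is plain Python rather than pgvector itself, and the document names and three-dimensional vectors are made up for illustration; real embeddings usually have hundreds or thousands of dimensions. The ranking mirrors what pgvector's `<->` (Euclidean distance) operator does in an `ORDER BY` clause.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, items, k=2):
    """Return the names of the k items whose embeddings are closest to query."""
    ranked = sorted(items.items(), key=lambda kv: l2_distance(query, kv[1]))
    return [name for name, _ in ranked[:k]]


# Hypothetical embeddings, chosen only to make the example easy to follow.
documents = {
    "kafka-intro": [0.9, 0.1, 0.0],
    "spark-tuning": [0.1, 0.8, 0.3],
    "kafka-connect": [0.8, 0.2, 0.1],
}

print(nearest([1.0, 0.0, 0.0], documents))  # ['kafka-intro', 'kafka-connect']
```

With pgvector, the same ranking would be expressed in SQL and executed by the database, so the embeddings can live alongside the rest of your PostgreSQL data.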

Ocean for Apache Spark™: 

  • We will soon be announcing SOC 2 certification by an external auditor since we have implemented new controls and processes to better protect data. 
  • We are currently developing new log collection techniques and UI charts to accommodate longer running applications since our customers are running more streaming workloads on our platform. 

Did you know? 

It has become more common for customers to create many accounts, each managing smaller groups of clusters. The account search functionality helps answer several questions: where the cluster is across all accounts, where the account is, and what it owns. 

Combining the real-time data streaming capabilities of Apache Kafka with the powerful distributed computing capabilities of Apache Spark can simplify your end-to-end big data processing and machine learning pipelines. Explore a real-world example in the first part of a blog series on Real Time Machine Learning using NetApp’s Ocean for Apache Spark and Instaclustr for Apache Kafka.

Apache Kafka cluster visualization: AKHQ is an open source GUI for Apache Kafka that helps with managing topics, consumer groups, Schema Registry, Kafka® Connect, and more. In this blog we detail the steps needed to get AKHQ, an open source project licensed under Apache 2, working with Instaclustr for Apache Kafka. This is the first in a planned series of blogs in which we’ll take you through similar steps for other popular open source Kafka user interfaces and show how to use them with Instaclustr for Apache Kafka.

Authors

Varun Ghai Product Manager
Ving Ngo Lead Software Engineer