Harness the power of Apache Spark from your app
Apache Spark is a high-performance engine for large-scale analytics and data processing. While Spark can provide advanced analytics capabilities, it requires a fast, distributed back-end data store. Apache Cassandra is the most modern, reliable and scalable choice for that data store.
Instaclustr’s Managed Apache Spark, including Spark Jobserver, provides a reliable, managed platform, colocated with your Apache Cassandra data store, so your application can leverage the power of Apache Spark for stream or batch analytics.
Colocated Data Engine
Your Spark engine runs right where your operational database resides. There is no need to extract, transform and load data into a separate environment.
We provide Apache Cassandra as the underlying data store. Spark fully integrates with the key components of Cassandra and provides the resilience and scale required.
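As a rough illustration of what reading colocated Cassandra data from Spark can look like, here is a minimal PySpark sketch. It assumes the open-source Spark Cassandra Connector is available on the cluster and that `spark` is an existing SparkSession; the keyspace and table names (`shop`, `orders`) and both helper functions are hypothetical and exist only for this example.

```python
def cassandra_read_options(keyspace, table):
    """Return the options the Spark Cassandra Connector uses to locate a table."""
    return {"keyspace": keyspace, "table": table}


def load_cassandra_table(spark, keyspace, table):
    """Load a Cassandra table as a Spark DataFrame.

    `spark` is assumed to be a SparkSession already configured with the
    Cassandra connection host for your cluster.
    """
    return (
        spark.read.format("org.apache.spark.sql.cassandra")
        .options(**cassandra_read_options(keyspace, table))
        .load()
    )


# Hypothetical usage inside a Spark job or notebook:
#   orders = load_cassandra_table(spark, "shop", "orders")
#   orders.groupBy("customer_id").count().show()
```

Because the data is read in place, the job works directly against the operational store rather than a copy loaded into a separate environment.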
Spark Jobserver and Apache Zeppelin
To provide easy access to your Spark processing engine, Instaclustr’s Spark cluster can include Spark Jobserver (REST API) and Apache Zeppelin (analyst notebook UI).
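For example, an application might submit a job through the Spark Jobserver REST API along the lines of the sketch below. It uses the open-source Jobserver's `POST /jobs` endpoint with `appName` and `classPath` query parameters; the host name, application name, job class and job configuration are all hypothetical, and a managed cluster will typically also require HTTPS and credentials that are not shown here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the Jobserver address of your own cluster.
JOBSERVER_URL = "http://jobserver.example.com:8090"


def build_job_request(app_name, class_path, job_config="", base_url=JOBSERVER_URL):
    """Build (but do not send) a POST /jobs request for Spark Jobserver.

    app_name   -- name under which the job binary was previously uploaded
    class_path -- fully qualified class name of the job to run
    job_config -- job parameters in Typesafe Config syntax, sent as the body
    """
    query = urlencode({"appName": app_name, "classPath": class_path})
    return Request(
        url=f"{base_url}/jobs?{query}",
        data=job_config.encode("utf-8"),
        method="POST",
    )


# Hypothetical job: names and config are placeholders for illustration.
request = build_job_request(
    "wordcount", "com.example.WordCount", "input.string = a b c a"
)

# To actually submit the job against a reachable Jobserver:
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the sketch runnable without a cluster and makes it easy to add authentication headers before submission.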
Spark has an advanced DAG (directed acyclic graph) execution engine that supports acyclic data flow and in-memory computing. It can run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.
Managed for Reliability
Instaclustr’s focus is supporting application reliability at scale. Our Spark architecture and support offering enable you to use the power of Spark from your application, with confidence that it will meet your availability and processing requirements.