Media: Apache Spark™ - Hadoop friend or foe?

February 06, 2015
By Instaclustr

The Spark project encompasses no fewer than four other components, including a stream analytics engine and a structured query interface. It provides out-of-the-box capabilities that previously required learning, setting up and maintaining a mishmash of different technologies. And once users inevitably start shifting their attention away from third-party components to Spark, vendors will follow suit.

“So it wouldn’t be a huge surprise if more organizations start to use Spark with Cassandra and other file systems, perhaps eroding Hadoop’s position as the dominant Big Data platform in the enterprise,” said Ben Bromhead, co-founder and CTO of Instaclustr Pty Ltd.

“If you’re using Hadoop just against HDFS, you still need a data store or database to run your day-to-day transactional operations,” he said. “Spark and Cassandra’s integration make it really simple because you can directly run your queries against your operational database, and you don’t have to run an entire cluster just dedicated to analytics.”
