Connecting to Spark via JDBC/ODBC Thrift Server

This page walks you through connecting to the Spark Thrift Server over JDBC so that you can query your Spark cluster.

Setting Up Your Environment

Create and set up your Spark cluster

To start, follow the first three steps in Getting Started with Instaclustr Spark & Cassandra:

  1. Provision a cluster with Cassandra
  2. Set up a Spark Client (but change the configuration to “AMI: Ubuntu Server 16.04 LTS (HVM), SSD Volume Type”)
  3. Configure Client Network Access

Java 8

Ensure that your Spark client machine has Java 8 installed and selected as the preferred Java version.

sudo apt update
sudo apt install openjdk-8-jdk
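
If more than one JDK is installed, it may be worth confirming that Java 8 is the active version before continuing. The following is a minimal sketch: the helper name is_java8 is hypothetical, and it assumes that java -version reports a version string beginning with 1.8, as OpenJDK 8 builds do.

```shell
# Hypothetical helper: succeeds when the given `java -version` text reports a 1.8.x release.
is_java8() {
  case "$1" in
    *'version "1.8.'*) return 0 ;;
    *) return 1 ;;
  esac
}

# `java -version` writes to stderr, so redirect it into the capture.
if command -v java >/dev/null 2>&1; then
  if is_java8 "$(java -version 2>&1)"; then
    echo "Java 8 is active"
  else
    echo "Java 8 is not the default; try: sudo update-alternatives --config java"
  fi
else
  echo "java not found on PATH"
fi
```

On Ubuntu, sudo update-alternatives --config java lists the installed JDKs and lets you pick the preferred one.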

Install Spark 2.1.1:

wget https://archive.apache.org/dist/spark/spark-2.1.1/spark-2.1.1-bin-hadoop2.7.tgz
tar -xf spark-2.1.1-bin-hadoop2.7.tgz
cd spark-2.1.1-bin-hadoop2.7

 

Start the Thrift Server

./sbin/start-thriftserver.sh --master spark://<spark_master_IP1>:7077,<spark_master_IP2>:7077,<spark_master_IP3>:7077
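
The --master value is a single spark:// URL listing every master as host:port, separated by commas. The sketch below assembles that URL from a list of addresses and prints the resulting command for review rather than running it. It assumes the script is run from the extracted Spark directory, where start-thriftserver.sh lives under sbin/. The helper name build_master_url and the 192.0.2.x addresses are placeholders, not values from your cluster. The --hiveconf option sets the listening port explicitly. 10000 is the default, so it can be omitted.

```shell
# Hypothetical helper: join master IPs into one spark:// URL (port 7077 on each).
build_master_url() {
  url="spark://"
  sep=""
  for ip in "$@"; do
    url="${url}${sep}${ip}:7077"
    sep=","
  done
  printf '%s\n' "$url"
}

# Placeholder addresses from the documentation range; substitute your own master IPs.
MASTER_URL="$(build_master_url 192.0.2.11 192.0.2.12 192.0.2.13)"

# Print the command rather than run it, so the values can be checked first.
echo "./sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.port=10000 --master ${MASTER_URL}"
```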

 

Query using Beeline

Start Beeline (included with Spark) from the Spark directory:

./bin/beeline

Once Beeline starts, connect to the Thrift Server:

!connect jdbc:hive2://localhost:10000

Username: ubuntu
Password: <empty> (just press Enter when prompted).

 

Now run your queries as you wish!
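
For scripted use, the interactive steps can likely be collapsed into a single Beeline invocation. This is a sketch under a few assumptions: the Thrift Server is listening on localhost at its default port of 10000, the ubuntu user connects with an empty password, and the command is run from the Spark directory. The helper name thrift_jdbc_url is hypothetical, and SHOW DATABASES is only a sample statement.

```shell
# Hypothetical helper: build the HiveServer2-style JDBC URL that Beeline expects.
thrift_jdbc_url() {
  host="${1:-localhost}"
  port="${2:-10000}"   # 10000 is the Thrift Server's default port
  printf 'jdbc:hive2://%s:%s\n' "$host" "$port"
}

# -u gives the JDBC URL, -n the username, -e a statement to run before exiting.
if [ -x ./bin/beeline ]; then
  ./bin/beeline -u "$(thrift_jdbc_url)" -n ubuntu -e "SHOW DATABASES;"
else
  echo "Run from the Spark directory: ./bin/beeline -u $(thrift_jdbc_url) -n ubuntu -e 'SHOW DATABASES;'"
fi
```

Passing a host and port to thrift_jdbc_url targets a Thrift Server running elsewhere or on a non-default port.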

By Instaclustr Support