Apache Cassandra Connector

The bundled Cassandra connector included in Kafka Connect is an open source connector developed by lenses.io under an Apache 2.0 license. The documentation below describes how to configure it on your Instaclustr Managed Kafka Connect cluster. More details on the connector can be found at https://docs.lenses.io/connectors/sink/cassandra.html and https://docs.lenses.io/connectors/source/cassandra.html.

The connector uses JSON as the data format for the messages read from Kafka and uses Cassandra's JSON insert functionality to insert the rows into Cassandra. A similar conversion occurs in the other direction. However, the Cassandra source connector is not covered under the Instaclustr Support policy, because the approach it uses to read data from Cassandra is not suitable for a production deployment. If used, it could lead to significant performance issues, and the read query can eventually fail once it spans multiple partitions.

An example use case for this connector is taking a stream of data from IoT devices and moving it to long-term storage for later analysis. In this scenario, we can use the sink connector to read the IoT data from the Kafka cluster and push it to the Cassandra database.

The connectors use a query language called KCQL (Kafka Connect Query Language), which is specified in the connector config to insert into Cassandra (sink) or to select rows from Cassandra (source). Keep in mind that the connectors transfer the data in its entirety between Cassandra and Kafka, so there is no filtering capability. This is reflected in the fact that KCQL has no WHERE clause.
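To make the shape of a KCQL statement more concrete, here is a minimal Python sketch that builds one statement per direction. The statement forms follow our reading of the lenses.io documentation linked above and should be checked against the connector version in use. The helper names sink_kcql and source_kcql are hypothetical and exist only for illustration, and the PK and INCREMENTALMODE keywords for the source direction are an assumption taken from that documentation.

```python
def sink_kcql(table: str, topic: str) -> str:
    """Sink direction: copy every field of a Kafka topic into a Cassandra table."""
    return f"INSERT INTO {table} SELECT * FROM {topic}"


def source_kcql(topic: str, table: str, pk_column: str, mode: str = "TIMEUUID") -> str:
    """Source direction: copy rows of a Cassandra table into a Kafka topic.

    Assumption: the source connector tracks its position through the column
    named after PK, using the mode given by INCREMENTALMODE.
    """
    return f"INSERT INTO {topic} SELECT * FROM {table} PK {pk_column} INCREMENTALMODE={mode}"


# Note that neither statement has a WHERE clause: all rows/messages are copied.
print(sink_kcql("orders", "orders-topic"))
print(source_kcql("orders-topic", "orders", "created"))
```

In both directions the statement reads as "insert into the target, select from the origin", so the table and topic swap places between sink and source.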

Setting up the connector is straightforward. The main properties that we need to set are connect.cassandra.kcql, which specifies the query, and the connection properties for Cassandra. There will be more concrete examples when we discuss the source and sink in more detail.

For a running example, we assume that we have the following:

  1. A keyspace named demo in the Cassandra database.
  2. A table (column family) named orders in that keyspace, with a column named created of type TIMEUUID.
  3. A Kafka cluster with a topic named orders-topic.

Example CQL queries and a command to set up the keyspace, table, and Kafka topic described above:
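The original statements are not reproduced here, so the following Python sketch only generates one plausible set of them. Apart from the demo keyspace, the orders table, the created TIMEUUID column, and the orders-topic topic, everything is an assumption: the other columns (id, product, qty, price), the primary key, the replication settings, the partition count, and the broker address placeholder are hypothetical and should be adapted to your cluster.

```python
import shlex

KEYSPACE = "demo"
TABLE = "orders"
TOPIC = "orders-topic"


def setup_cql(keyspace: str = KEYSPACE, table: str = TABLE, rf: int = 3) -> list:
    """Return CQL statements creating the keyspace and table of the running example."""
    return [
        # Assumption: SimpleStrategy is only used to keep the sketch short.
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'SimpleStrategy', 'replication_factor': {rf}}};",
        # Only the 'created timeuuid' column comes from the text; the rest is hypothetical.
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} ("
        "id int, created timeuuid, product text, qty int, price float, "
        "PRIMARY KEY (id, created)) WITH CLUSTERING ORDER BY (created ASC);",
    ]


def topic_command(topic: str = TOPIC, bootstrap: str = "<kafka-broker>:9092") -> list:
    """Return the kafka-topics.sh invocation that creates the topic."""
    return [
        "kafka-topics.sh", "--create",
        "--topic", topic,
        "--partitions", "3",
        "--replication-factor", "3",
        "--bootstrap-server", bootstrap,
    ]


for statement in setup_cql():
    print(statement)
print(shlex.join(topic_command()))
```

The CQL statements can be pasted into cqlsh, and the printed command run from a machine with the Kafka command line tools. A secured cluster will most likely also need client authentication settings passed to kafka-topics.sh.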

Make sure that you can access your Kafka, Kafka Connect, and Cassandra clusters. You also need to ensure that the Kafka Connect cluster can communicate with the Cassandra cluster, so pay attention to the firewall rules.

Sink Connector

This connector is used to write to a Cassandra database. For full descriptions of the options, consult https://docs.lenses.io/connectors/sink/cassandra.html. This is an example of the connector config that reads data from a topic named orders-topic in Kafka and pushes it into the Cassandra database described above.
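As a rough sketch of such a config, the Python snippet below assembles the JSON body for the running example. The property names and the connector class are taken from our reading of the lenses.io documentation linked above and should be verified against the connector version deployed on your cluster. The function name, the connector name cassandra-sink, the tasks.max value, port 9042, and the placeholder contact point and credentials are assumptions.

```python
import json


def cassandra_sink_config(contact_points: str, username: str, password: str,
                          name: str = "cassandra-sink") -> dict:
    """Build a Kafka Connect connector definition for the Cassandra sink."""
    return {
        "name": name,
        "config": {
            # Assumed class name of the lenses.io (Stream Reactor) sink connector.
            "connector.class": "com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraSinkConnector",
            "tasks.max": "1",
            "topics": "orders-topic",
            # KCQL: write every field of orders-topic into the orders table.
            "connect.cassandra.kcql": "INSERT INTO orders SELECT * FROM orders-topic",
            # Connection properties for Cassandra.
            "connect.cassandra.contact.points": contact_points,
            "connect.cassandra.port": "9042",
            "connect.cassandra.key.space": "demo",
            "connect.cassandra.username": username,
            "connect.cassandra.password": password,
        },
    }


print(json.dumps(cassandra_sink_config("<cassandra-node-ip>", "<user>", "<password>"), indent=2))
```

The printed JSON is the kind of body that is sent in a POST request to the /connectors endpoint of the Kafka Connect REST API to create the connector.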

To test this, create a Kafka producer that produces some values in the form of JSON strings. This is an example:
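A minimal sketch of generating such a value in Python is shown below. Only the created field is grounded in the running example. The remaining fields (id, product, qty, price) and the sample values are hypothetical and must match the columns of your own orders table. A version 1 UUID from the standard library is used for created because it is time-based, which is what Cassandra's TIMEUUID type expects.

```python
import json
import uuid


def order_message(order_id: int, product: str, qty: int, price: float) -> str:
    """Return one order as a JSON string, ready to be produced to orders-topic."""
    return json.dumps({
        "id": order_id,
        # uuid1() is a time-based (version 1) UUID, suitable for a TIMEUUID column.
        "created": str(uuid.uuid1()),
        "product": product,
        "qty": qty,
        "price": price,
    })


# Each printed line is one message value.
print(order_message(1, "OP-DAX-P-20150201-95.7", 100, 94.2))
print(order_message(2, "OP-DAX-C-20150201-100", 100, 99.5))
```

Each printed line can be pasted into a console producer for orders-topic, or sent with any Kafka client as the message value.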

Then check that the produced values have been written to Cassandra.

Using Other Converters

By default, our Kafka Connect instance uses org.apache.kafka.connect.storage.StringConverter. We can set the connector to use JsonConverter (org.apache.kafka.connect.json.JsonConverter) instead by overriding the converter properties in the configuration for the connector.
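The override might look like the following sketch, which adds the converter properties to an existing connector config dictionary. The helper name with_json_converter is hypothetical. Whether value.converter.schemas.enable should be true or false depends on whether your messages carry the schema and payload envelope that JsonConverter expects when schemas are enabled, so the default chosen here is an assumption. The key converter can be overridden the same way with key.converter.

```python
def with_json_converter(config: dict, schemas_enable: bool = False) -> dict:
    """Return a copy of a connector config that uses JsonConverter for message values."""
    updated = dict(config)  # leave the caller's dictionary untouched
    updated.update({
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        # Kafka Connect config values are strings, hence "true"/"false".
        "value.converter.schemas.enable": str(schemas_enable).lower(),
    })
    return updated


base = {"topics": "orders-topic"}
print(with_json_converter(base))
```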

And it will yield the following output:

We can also use the AvroConverter if we are using Schema Registry. Add the following lines to the configuration for the connector:
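A sketch of those lines, again expressed as a small Python helper, is given below. The converter class and the schema.registry.url property are the ones used by the Confluent Avro converter. The helper name with_avro_converter and the placeholder URL are hypothetical, and a Schema Registry that requires authentication will most likely need additional properties that are not shown.

```python
def with_avro_converter(config: dict, schema_registry_url: str) -> dict:
    """Return a copy of a connector config that uses AvroConverter for message values."""
    updated = dict(config)  # leave the caller's dictionary untouched
    updated.update({
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        # The converter looks up and registers schemas at this address.
        "value.converter.schema.registry.url": schema_registry_url,
    })
    return updated


base = {"topics": "orders-topic"}
print(with_avro_converter(base, "https://<schema-registry-host>:8085"))
```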

And it will yield the following output:

By Instaclustr Support