Using the Kafka Schema Registry


This example shows how to use the Confluent Kafka Schema Registry to store data schemas for Kafka topics, which we will define using Apache Avro. It also demonstrates how to use the Schema Registry to produce and consume generated Apache Avro objects with an Instaclustr Kafka cluster.

Prerequisites

The Confluent Kafka Schema Registry is a standalone application that runs separately from Kafka but uses it as its storage backend. To use the Schema Registry you will need to download and build the source code, available here. Building it and its dependencies requires Maven and Gradle (1.7+) installed. To install Maven, run:

sudo apt-get install maven

To install Gradle, download a Gradle release from here (for example, 4.8.1) and install it:
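A minimal sketch of one way to do this, assuming the standard Gradle distribution zip and an /opt/gradle install location:

```bash
wget https://services.gradle.org/distributions/gradle-4.8.1-bin.zip
sudo mkdir /opt/gradle
sudo unzip -d /opt/gradle gradle-4.8.1-bin.zip
# Make the gradle binary available on the PATH for this shell session
export PATH=$PATH:/opt/gradle/gradle-4.8.1/bin
```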

Before you can build the Schema Registry you first need to clone and build Kafka, confluent-common, and confluent-rest-utils:

Clone and install Kafka if you do not already have a local Kafka installation:
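For example (repository URL as published by Apache; the initial gradle invocation bootstraps the Gradle wrapper, and the installAll task is assumed to install the Kafka artifacts into your local Maven repository, per the Confluent build instructions of the time):

```bash
git clone https://github.com/apache/kafka.git
cd kafka
gradle                # bootstrap the Gradle wrapper
./gradlew installAll  # build Kafka and install it into the local Maven repository
cd ..
```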

Next, clone and install confluent-common and confluent-rest-utils if you do not already have them installed:
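For example (repository URLs as published by Confluent):

```bash
git clone https://github.com/confluentinc/common.git
cd common
mvn install    # build confluent-common and install it into the local Maven repository
cd ..

git clone https://github.com/confluentinc/rest-utils.git
cd rest-utils
mvn install    # build confluent-rest-utils and install it into the local Maven repository
cd ..
```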

Finally, clone and install the confluent Schema Registry:
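For example:

```bash
git clone https://github.com/confluentinc/schema-registry.git
cd schema-registry
mvn package    # build the Schema Registry and its startup scripts
```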

Schema Registry Setup

Kafka Schema Registry Configuration

Before you can use the Kafka Schema Registry you first need to configure a number of things. Basic configuration requires the following options; see here for a full list. Create a file named schema-registry.properties with the following content:
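A minimal sketch, assuming the Schema Registry should listen on port 8081 and store schemas in the default _schemas topic:

```properties
# Address the Schema Registry REST API listens on
listeners=http://0.0.0.0:8081
# Kafka topic the Schema Registry uses to store schemas
kafkastore.topic=_schemas
debug=false
```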

In order to use the Schema Registry with Instaclustr Kafka you also need to provide authentication credentials, and if your cluster has client ⇆ broker encryption enabled, encryption information as well. For more information on using certificates with Kafka and where to find them, see here. If your cluster has client ⇆ broker encryption enabled, add the following to your schema-registry.properties file, ensuring the password, truststore location, and bootstrap servers are correct:
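A sketch of the relevant properties, assuming SCRAM-SHA-256 authentication and placeholder addresses and credentials, and a Schema Registry version that supports kafkastore.bootstrap.servers:

```properties
kafkastore.bootstrap.servers=SASL_SSL://<node-1-public-ip>:9092,SASL_SSL://<node-2-public-ip>:9092,SASL_SSL://<node-3-public-ip>:9092
kafkastore.security.protocol=SASL_SSL
kafkastore.ssl.truststore.location=/path/to/truststore.jks
kafkastore.ssl.truststore.password=<truststore password>
kafkastore.sasl.mechanism=SCRAM-SHA-256
kafkastore.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
    username="<username>" \
    password="<password>";
```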

If your cluster does not have client ⇆ broker encryption enabled, instead add the following to your schema-registry.properties file, ensuring the password and bootstrap servers are correct:
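The same sketch without the SSL settings, using SASL_PLAINTEXT instead:

```properties
kafkastore.bootstrap.servers=SASL_PLAINTEXT://<node-1-public-ip>:9092,SASL_PLAINTEXT://<node-2-public-ip>:9092,SASL_PLAINTEXT://<node-3-public-ip>:9092
kafkastore.security.protocol=SASL_PLAINTEXT
kafkastore.sasl.mechanism=SCRAM-SHA-256
kafkastore.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
    username="<username>" \
    password="<password>";
```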

Note: To connect to your Kafka cluster over the private network, use port 9093 instead of 9092.

Start the Schema Registry

Once the configuration files are created, use the following command to start the Schema Registry:
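Assuming you are in the root of the schema-registry repository and schema-registry.properties is in the current directory:

```bash
bin/schema-registry-start schema-registry.properties
```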

If everything was set up correctly you should see output similar to the following:
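The exact output depends on your version and configuration, but the final log line should indicate that the REST server is listening, along the lines of:

```
INFO Server started, listening for requests... (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain)
```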

Using the Schema Registry

Now that the Schema Registry is up and running, you can use it in your applications to store data schemas for your Kafka topics. The following example is a Java application that uses the Schema Registry and Apache Avro to produce and consume some simulated product order events.

Dependencies

Add the kafka_2.12, avro, and kafka-avro-serializer packages to your application. These packages are available via Maven (kafka_2.12, avro, kafka-avro-serializer). To add these dependencies using Maven, add the following to your pom.xml file:
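A sketch of the dependency section; the version numbers are placeholders to adjust to your environment, and the repository entry is there because kafka-avro-serializer is hosted on Confluent's Maven repository rather than Maven Central:

```xml
<repositories>
    <repository>
        <id>confluent</id>
        <url>https://packages.confluent.io/maven/</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.12</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro</artifactId>
        <version>1.8.2</version>
    </dependency>
    <dependency>
        <groupId>io.confluent</groupId>
        <artifactId>kafka-avro-serializer</artifactId>
        <version>5.0.0</version>
    </dependency>
</dependencies>
```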

You will also need the avro-tools utility in order to compile the data schema into a Java class. The avro-tools utility is available here.

Create the Avro Schema

Before you can produce or consume messages using Avro and the Schema Registry you first need to define the data schema. Create a file orderEventSchema.avsc with the following content:
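A sketch of such a schema; the avro namespace and the exact field names are assumptions that the rest of this example carries through:

```json
{
  "namespace": "avro",
  "type": "record",
  "name": "OrderEvent",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "timestamp", "type": "string" },
    { "name": "product", "type": "string" },
    { "name": "price", "type": "double" }
  ]
}
```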

This file specifies a simple OrderEvent data serialization schema for product orders, with each OrderEvent containing an id, timestamp, product name, and price. For more information on the Avro serialization format see the documentation here.

Generate the Avro Object Class

With the schema file created, use the avro-tools utility to compile the schema file into an actual Java class:
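Assuming avro-tools version 1.8.2 and the schema file in the current directory:

```bash
java -jar avro-tools-1.8.2.jar compile schema orderEventSchema.avsc src/main/java
```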

Note: The src/main/java file path at the end of the command can be wherever you want, just make sure the generated class will be accessible by your application code. An example file structure is:
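With the sketched schema above, avro-tools places the generated class in a directory matching the schema's namespace:

```
src
└── main
    └── java
        └── avro
            └── OrderEvent.java
```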

Create Kafka Topic

Use the guide here to create a new topic called orders.
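Alternatively, a sketch using the kafka-topics tool shipped with Kafka; the flags vary by Kafka version (older versions take --zookeeper instead of --bootstrap-server), and a secured cluster also needs --command-config pointing at a properties file containing your credentials:

```bash
bin/kafka-topics.sh --create --topic orders \
    --partitions 3 --replication-factor 3 \
    --bootstrap-server <node-public-ip>:9092 \
    --command-config producer.properties
```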

Producing Avro Objects

Client configuration

Before creating a Kafka producer client, you first need to define the configuration properties for the producer client to use. In this example we provide only the required properties for the producer client. See here for the full list of configuration options.

If your cluster has client ⇆ broker encryption enabled, create a new file named producer.properties with the following content, ensuring the password, truststore location, and bootstrap servers list are correct:
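A sketch with placeholder addresses and credentials, assuming SCRAM-SHA-256 authentication and a Schema Registry running locally on port 8081:

```properties
bootstrap.servers=<node-1-public-ip>:9092,<node-2-public-ip>:9092,<node-3-public-ip>:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
schema.registry.url=http://localhost:8081
security.protocol=SASL_SSL
ssl.truststore.location=/path/to/truststore.jks
ssl.truststore.password=<truststore password>
sasl.mechanism=SCRAM-SHA-256
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
    username="<username>" \
    password="<password>";
```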

If your cluster does not have client ⇆ broker encryption enabled, create a new file named producer.properties with the following content, ensuring the password and bootstrap servers list are correct:
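The same sketch without the SSL settings, using SASL_PLAINTEXT instead:

```properties
bootstrap.servers=<node-1-public-ip>:9092,<node-2-public-ip>:9092,<node-3-public-ip>:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
schema.registry.url=http://localhost:8081
security.protocol=SASL_PLAINTEXT
sasl.mechanism=SCRAM-SHA-256
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
    username="<username>" \
    password="<password>";
```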


Note: To connect to your Kafka cluster over the private network, use port 9093 instead of 9092.

Java Code

Now that the configuration properties have been set up, you can create a Kafka producer.

First, load the properties:
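For example (assuming producer.properties is in the working directory):

```java
// Load the producer configuration created earlier
Properties props = new Properties();
try (FileReader reader = new FileReader("producer.properties")) {
    props.load(reader);
}
```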

Once you’ve loaded the properties you can create the producer itself:
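The generic type parameters match the serializers configured above: String keys and Avro-generated OrderEvent values:

```java
KafkaProducer<String, OrderEvent> producer = new KafkaProducer<>(props);
```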

Next, create some OrderEvents to produce:
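A sketch using the all-arguments constructor that avro-tools generates from our assumed schema (id, timestamp, product, price); the product names and prices are arbitrary:

```java
List<OrderEvent> orders = new ArrayList<>();
orders.add(new OrderEvent("1", getTimestamp(), "Laptop", 1299.99));
orders.add(new OrderEvent("2", getTimestamp(), "Phone", 849.50));
orders.add(new OrderEvent("3", getTimestamp(), "Headphones", 99.95));
```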

Where the getTimestamp() function is:
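A simple implementation that formats the current time (the exact format is an arbitrary choice):

```java
private static String getTimestamp() {
    return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(new Date());
}
```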

Now turn each OrderEvent into a ProducerRecord to be produced to the orders topic, and send them:
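For example, using the order id as the record key and the OrderEvent itself as the value:

```java
for (OrderEvent order : orders) {
    ProducerRecord<String, OrderEvent> record =
            new ProducerRecord<>("orders", order.getId().toString(), order);
    producer.send(record);
}
```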

Finally, use the producer’s flush() method to ensure all messages get sent to Kafka:
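flush() blocks until all buffered records have been sent; closing the producer afterwards releases its resources:

```java
producer.flush();
producer.close();
```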

Full code example:
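A complete, self-contained sketch under the assumptions above (schema namespace avro; the class name OrderEventProducer is an arbitrary choice):

```java
import java.io.FileReader;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import avro.OrderEvent;

public class OrderEventProducer {

    public static void main(String[] args) throws IOException {
        // Load the producer configuration created earlier
        Properties props = new Properties();
        try (FileReader reader = new FileReader("producer.properties")) {
            props.load(reader);
        }

        KafkaProducer<String, OrderEvent> producer = new KafkaProducer<>(props);

        // Create some simulated product orders
        List<OrderEvent> orders = new ArrayList<>();
        orders.add(new OrderEvent("1", getTimestamp(), "Laptop", 1299.99));
        orders.add(new OrderEvent("2", getTimestamp(), "Phone", 849.50));
        orders.add(new OrderEvent("3", getTimestamp(), "Headphones", 99.95));

        // Turn each OrderEvent into a ProducerRecord for the orders topic and send it
        for (OrderEvent order : orders) {
            producer.send(new ProducerRecord<>("orders", order.getId().toString(), order));
        }

        // Block until all buffered records have been sent, then release resources
        producer.flush();
        producer.close();
    }

    private static String getTimestamp() {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(new Date());
    }
}
```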

Consuming Avro Objects

Client configuration

As in the producer example, before creating a Kafka consumer client you first need to define the configuration properties for the consumer client to use. In this example we provide only the required properties for the consumer client. See here for the full list of configuration options.

If your cluster has client ⇆ broker encryption enabled, create a new file named consumer.properties with the following content, ensuring the password, truststore location, and bootstrap servers list are correct:
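A sketch with placeholder addresses and credentials; note specific.avro.reader=true, which tells the Avro deserializer to return generated OrderEvent objects rather than generic records, and group.id, whose value here is an arbitrary choice:

```properties
bootstrap.servers=<node-1-public-ip>:9092,<node-2-public-ip>:9092,<node-3-public-ip>:9092
group.id=orders-consumers
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
specific.avro.reader=true
schema.registry.url=http://localhost:8081
security.protocol=SASL_SSL
ssl.truststore.location=/path/to/truststore.jks
ssl.truststore.password=<truststore password>
sasl.mechanism=SCRAM-SHA-256
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
    username="<username>" \
    password="<password>";
```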

If your cluster does not have client ⇆ broker encryption enabled, create a new file named consumer.properties with the following content, ensuring the password and bootstrap servers list are correct:
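The same sketch without the SSL settings, using SASL_PLAINTEXT instead:

```properties
bootstrap.servers=<node-1-public-ip>:9092,<node-2-public-ip>:9092,<node-3-public-ip>:9092
group.id=orders-consumers
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
specific.avro.reader=true
schema.registry.url=http://localhost:8081
security.protocol=SASL_PLAINTEXT
sasl.mechanism=SCRAM-SHA-256
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
    username="<username>" \
    password="<password>";
```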


Note: To connect to your Kafka cluster over the private network, use port 9093 instead of 9092.

Java Code

Now that the configuration properties have been set up, you can create a Kafka consumer.

First, load the properties:
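As in the producer example, assuming consumer.properties is in the working directory:

```java
// Load the consumer configuration created earlier
Properties props = new Properties();
try (FileReader reader = new FileReader("consumer.properties")) {
    props.load(reader);
}
```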

Once you’ve loaded the properties you can create the consumer itself:
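Again, the generic type parameters match the configured deserializers:

```java
KafkaConsumer<String, OrderEvent> consumer = new KafkaConsumer<>(props);
```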

Before you can consume messages, you need to subscribe the consumer to the topic(s) you wish to receive messages from, in this case the orders topic:
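For example:

```java
consumer.subscribe(Collections.singletonList("orders"));
```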

Finally, continually poll Kafka for new messages, and print each OrderEvent received:
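A sketch of the poll loop; poll(Duration) requires Kafka clients 2.0+, while older clients use the deprecated poll(long) instead:

```java
while (true) {
    ConsumerRecords<String, OrderEvent> records = consumer.poll(Duration.ofMillis(1000));
    for (ConsumerRecord<String, OrderEvent> record : records) {
        System.out.println(record.value());
    }
}
```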

Full code example:
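A complete, self-contained sketch under the same assumptions (the class name OrderEventConsumer is an arbitrary choice):

```java
import java.io.FileReader;
import java.io.IOException;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import avro.OrderEvent;

public class OrderEventConsumer {

    public static void main(String[] args) throws IOException {
        // Load the consumer configuration created earlier
        Properties props = new Properties();
        try (FileReader reader = new FileReader("consumer.properties")) {
            props.load(reader);
        }

        KafkaConsumer<String, OrderEvent> consumer = new KafkaConsumer<>(props);

        // Subscribe to the orders topic
        consumer.subscribe(Collections.singletonList("orders"));

        // Continually poll Kafka for new messages and print each OrderEvent received
        try {
            while (true) {
                ConsumerRecords<String, OrderEvent> records = consumer.poll(Duration.ofMillis(1000));
                for (ConsumerRecord<String, OrderEvent> record : records) {
                    System.out.println(record.value());
                }
            }
        } finally {
            consumer.close();
        }
    }
}
```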

Putting Them Together

Now that you have a consumer and producer set up, it’s time to combine them.

Start the Consumer

Start the consumer before starting the producer, because by default, consumers only consume messages that were produced after the consumer started.
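One hypothetical way to run it, assuming the exec-maven-plugin is available in your project and the class name from the sketch above:

```bash
mvn compile exec:java -Dexec.mainClass="OrderEventConsumer"
```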

Start the Producer

Now that the consumer is set up and ready to consume messages, you can start your producer.
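Under the same assumptions:

```bash
mvn compile exec:java -Dexec.mainClass="OrderEventProducer"
```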

If the consumer and producer are setup correctly the consumer should output the messages sent by the producer shortly after they were produced, for example:
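With the sketched schema, an Avro-generated record prints as JSON, so the consumer's output would look something like this (timestamps illustrative):

```
{"id": "1", "timestamp": "2018-09-10 11:32:40.192", "product": "Laptop", "price": 1299.99}
{"id": "2", "timestamp": "2018-09-10 11:32:40.196", "product": "Phone", "price": 849.5}
{"id": "3", "timestamp": "2018-09-10 11:32:40.197", "product": "Headphones", "price": 99.95}
```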
