Use IAM roles and policies to provide the Kafka Connect cluster with access to AWS S3 bucket

In this document we will look at how to set up IAM roles and policies to provide access for your Kafka Connect cluster to your AWS S3 bucket.

If you want to provide the Kafka Connect cluster with direct access to the S3 bucket, see
Provide access to S3 bucket using permission policy
If you want to manage the access to the S3 bucket in a separate role, see
Provide access to S3 bucket using role

Provide access to S3 bucket using permission policy

This is the recommended method for RIYOA accounts, where both the cluster instances and the S3 bucket are in the same AWS account. It adds a permission policy to the instance role, and is useful when you want to give the Kafka Connect cluster direct access to the S3 bucket.

Follow Custom Kafka Connect Connectors until step 9 to create a Kafka Connect cluster with custom connectors

If you use the Instaclustr Console, in the Custom Connector Configuration section, choose “Add permission policy to instance role later”

If you use the Provisioning API, specify the S3 bucket name in the body of the POST request without any further access details. For example:

{
    "clusterName": "<cluster-name>",
    "bundles": [
        {
            "bundle": "KAFKA_CONNECT",
            "version": "3.1.1",
            "options": {
                "targetKafkaClusterId": "<target-kafka-cluster-id>",
                "vpcType": "KAFKA_VPC",
                "s3.bucket.name": "<s3-bucket-name>"
            }
        }
    ],
    "provider": {
        "name": "AWS_VPC",
        "accountName": null
    },
    "nodeSize": "n1-standard-2-10",
    "dataCentre": "us-central1",
    "clusterNetwork": "10.225.0.0/16",
    "privateNetworkCluster": "false",
    "rackAllocation": {
        "numberOfRacks": "3",
        "nodesPerRack": "1"
    }
}
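
The request body above can also be assembled in a script. The following is a minimal Python sketch, not an official client: the function name `build_connect_body` and its parameters are hypothetical, the fixed field values simply mirror the example above, and sending the POST request is deliberately left out.

```python
import json


def build_connect_body(cluster_name, target_kafka_cluster_id, bucket_name,
                       node_size, data_centre):
    """Build the request body as a dict, mirroring the example above.

    Only the bucket name is supplied for S3. No access details are included,
    because the permission policy is added to the instance role later.
    """
    return {
        "clusterName": cluster_name,
        "bundles": [
            {
                "bundle": "KAFKA_CONNECT",
                "version": "3.1.1",
                "options": {
                    "targetKafkaClusterId": target_kafka_cluster_id,
                    "vpcType": "KAFKA_VPC",
                    "s3.bucket.name": bucket_name,
                },
            }
        ],
        "provider": {"name": "AWS_VPC", "accountName": None},
        "nodeSize": node_size,
        "dataCentre": data_centre,
        "clusterNetwork": "10.225.0.0/16",
        "privateNetworkCluster": "false",
        "rackAllocation": {"numberOfRacks": "3", "nodesPerRack": "1"},
    }


# Placeholder values; replace them with your own before use.
body = build_connect_body(
    "<cluster-name>", "<target-kafka-cluster-id>", "<s3-bucket-name>",
    "<node-size>", "<data-centre>",
)
# json.dumps turns None into null, matching "accountName": null above.
print(json.dumps(body, indent=4))
```

Building the body as a dict and serialising it with `json.dumps` avoids hand-editing mistakes such as unbalanced brackets.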

After the Kafka Connect cluster reaches the PROVISIONING state, open the cluster's Details tab and copy the Data Center Id. This Id is also the name of the AWS instance role that the cluster uses.

Add a permission to allow access to the S3 bucket
If you use AWS console

From the dashboard, go to IAM services, Roles
Find the instance role by the Data Center Id, click on the role
From here you can either edit the existing policy (s3-access-policy) and add the permission or add a new policy to the role

To edit the existing policy, click on s3-access-policy

You can use the visual editor to add the permissions: click Add additional permissions, then add these 2 permissions:

Service: S3; Action: List – ListBucket, Read – GetBucketLocation
Service: S3; Action: Read – GetObject
For the first permission, specify the bucket as the resource; for the second, specify the objects in the bucket (<S3 bucket name>/*).

OR you can go to JSON tab and add the statements below into the JSON policy

{
    "Sid":"<new unique statement name>",
    "Effect":"Allow",
    "Action":["s3:ListBucket","s3:GetBucketLocation"],
    "Resource":"arn:aws:s3:::<S3 bucket name>"
},
{
    "Sid":"<new unique statement name>",
    "Effect":"Allow",
    "Action":"s3:GetObject",
    "Resource":"arn:aws:s3:::<S3 bucket name>/*"
}

If you use AWS CLI

Retrieve the existing policy:
aws iam get-role-policy --role-name <Data Center Id> --policy-name s3-access-policy

Copy the policy part from the output of the command to a text editor. The policy should be similar to:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "1",
      "Effect": "Allow",
      "Action": ["s3:ListAllMyBuckets"],
      "Resource": ["arn:aws:s3:::*"]
    },
    {
      "Sid": "2",
      "Effect": "Allow",
      "Action": ["s3:GetBucketLocation"],
      "Resource": ["arn:aws:s3:::$BACKUP_BUCKET"]
    },
    {
      "Sid": "3",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::$BACKUP_BUCKET"],
      "Condition": {
        "StringLike": {
          "s3:prefix": ["$CLUSTER_ID/*"]
        }
      }
    },
    {
      "Sid": "allows3",
      "Effect": "Allow",
      "Action": ["s3:*"],
      "Resource": ["arn:aws:s3:::$BACKUP_BUCKET/$CLUSTER_ID/*"]
    }
  ]
}
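
Instead of copying the policy part by hand, it can be pulled out of the command output with a short script. This is a minimal Python sketch under two assumptions: the CLI returned JSON, and the policy sits under the `PolicyDocument` key of that output. The function name `extract_policy_document` and the shortened sample output are illustrative only.

```python
import json


def extract_policy_document(cli_output: str) -> dict:
    """Return the policy part of the JSON printed by `aws iam get-role-policy`."""
    response = json.loads(cli_output)
    return response["PolicyDocument"]


# Shortened, made-up sample of the command output, for illustration only.
sample_output = """
{
  "RoleName": "<Data Center Id>",
  "PolicyName": "s3-access-policy",
  "PolicyDocument": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "1",
        "Effect": "Allow",
        "Action": ["s3:ListAllMyBuckets"],
        "Resource": ["arn:aws:s3:::*"]
      }
    ]
  }
}
"""

policy = extract_policy_document(sample_output)
print(json.dumps(policy, indent=2))
```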

Edit the policy, add the statements that allow bucket access, and save it as a JSON file at FILE_PATH

{
    "Sid":"<new unique statement name>",
    "Effect":"Allow",
    "Action":["s3:ListBucket","s3:GetBucketLocation"],
    "Resource":"arn:aws:s3:::<S3 bucket name>"
},
{
    "Sid":"<new unique statement name>",
    "Effect":"Allow",
    "Action":"s3:GetObject",
    "Resource":"arn:aws:s3:::<S3 bucket name>/*"
}
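
The edit can also be scripted. The sketch below appends the two statements to an existing policy and keeps every Sid unique. It is an illustration under stated assumptions, not the only way to do this: the function name `add_bucket_access` and the Sid names `ConnectorBucketList` and `ConnectorBucketRead` are hypothetical.

```python
import json


def add_bucket_access(policy: dict, bucket_name: str) -> dict:
    """Return a copy of `policy` with two statements granting read access
    to `bucket_name`. The input policy is not modified."""
    existing_sids = {s.get("Sid") for s in policy["Statement"]}

    def unique_sid(base: str) -> str:
        # Each Sid must be unique within a policy, so append a number
        # if the preferred name is already taken.
        sid, n = base, 1
        while sid in existing_sids:
            n += 1
            sid = f"{base}{n}"
        existing_sids.add(sid)
        return sid

    new_statements = [
        {
            "Sid": unique_sid("ConnectorBucketList"),
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
            "Resource": f"arn:aws:s3:::{bucket_name}",
        },
        {
            "Sid": unique_sid("ConnectorBucketRead"),
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket_name}/*",
        },
    ]
    return {**policy, "Statement": policy["Statement"] + new_statements}


# A cut-down existing policy, for illustration only.
existing = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Action": ["s3:ListAllMyBuckets"],
            "Resource": ["arn:aws:s3:::*"],
        }
    ],
}

updated = add_bucket_access(existing, "<S3 bucket name>")
print(json.dumps(updated, indent=2))
```

Writing `updated` to FILE_PATH with `json.dump` produces the file that the next step passes to the CLI.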

Delete the old policy, where $CDC_ID is the Data Center Id copied earlier:
aws iam delete-role-policy --role-name $CDC_ID --policy-name s3-access-policy
Add the edited policy:
aws iam put-role-policy --role-name $CDC_ID --policy-name new-s3-access-policy --policy-document file://FILE_PATH

After the policies are set up correctly and the cluster reaches the RUNNING state, go to the Managing Custom Connectors section on the Connectors page and press Sync to load the custom connectors.

Once loaded successfully, they should be visible under the Available Connectors section.

Provide access to S3 bucket using role

This is the recommended method for customers who use a RIIA account and have an S3 bucket in their own AWS account, but it can also be used for RIYOA clusters. It uses a separate role with access to the S3 bucket, referred to here as the S3 access role, and allows the instance role to assume that role to gain access to the bucket. It is useful when you want to manage the S3 access role separately from the instance. You can do this using the AWS CLI or the AWS Console.

Using AWS CLI

Create a policy that allows access to the S3 bucket

Prepare a JSON file that contains the policy. It should be similar to:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "1",
      "Effect": "Allow",
      "Action": ["s3:ListBucket","s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::<S3 bucket name>"
    },
    {
      "Sid": "2",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::<S3 bucket name>/*"
    }
  ]
}
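
Before passing the file to the CLI, it can help to check that it is well-formed, since a single missing brace makes the whole document unusable. The sketch below is a basic structural check only and does not replace validation by AWS. The function name `check_policy` is hypothetical, and the rules assume the shape used in this document (a `Statement` list whose entries each carry `Effect`, `Action` and `Resource`).

```python
import json

REQUIRED_STATEMENT_KEYS = {"Effect", "Action", "Resource"}


def check_policy(text: str) -> list:
    """Return a list of problems found in a policy document (empty if none)."""
    try:
        policy = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(policy, dict):
        return ["top level must be a JSON object"]

    problems = []
    if policy.get("Version") != "2012-10-17":
        problems.append('expected "Version": "2012-10-17"')

    statements = policy.get("Statement")
    if not isinstance(statements, list) or not statements:
        problems.append('"Statement" must be a non-empty list')
        return problems

    for index, statement in enumerate(statements):
        if not isinstance(statement, dict):
            problems.append(f"statement {index} is not a JSON object")
            continue
        missing = REQUIRED_STATEMENT_KEYS - set(statement)
        if missing:
            problems.append(f"statement {index} is missing {sorted(missing)}")
    return problems


policy_text = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "1",
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::<S3 bucket name>"
    },
    {
      "Sid": "2",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::<S3 bucket name>/*"
    }
  ]
}
"""

for problem in check_policy(policy_text):
    print(problem)
```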

Create the policy:
aws iam create-policy --policy-name <policy-name> --policy-document file://FILE_PATH
Copy the ARN of the policy from the output of the command. It should be similar to:
"Arn": "arn:aws:iam::<aws-account-id>:policy/<policy-name>"

Prepare a trust policy document file for the S3 access role. The Principal is left empty at this point and is filled in after the cluster has been provisioned:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {},
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}

Create a role with the trust policy
aws iam create-role --role-name <role-name> --assume-role-policy-document file://FILE_PATH
Attach the S3 access policy:
aws iam attach-role-policy --role-name <role-name> --policy-arn <policy-arn>

Provision the Kafka Connect cluster using your preferred method:

If you use the Instaclustr Console, in the Custom Connector Configuration section,
choose “Use IAM role”. Input the S3 access role ARN

If you use the Provisioning API, specify the S3 bucket name with the S3 access role ARN.
For example:

{
  "clusterName": "<cluster-name>",
  "bundles": [
    {
      "bundle": "KAFKA_CONNECT",
      "version": "3.1.1",
      "options": {
        "targetKafkaClusterId": "<target-kafka-cluster-id>",
        "vpcType": "KAFKA_VPC",
        "s3.bucket.name": "<s3-bucket-name>",
        "aws.s3.role.arn": "<s3-access-role-arn>"
      }
    }
  ],
  "provider": {
    "name": "AWS_VPC",
    "accountName": null
  },
  "nodeSize": "n1-standard-2-10",
  "dataCentre": "us-central1",
  "clusterNetwork": "10.225.0.0/16",
  "privateNetworkCluster": "false",
  "rackAllocation": {
    "numberOfRacks": "3",
    "nodesPerRack": "1"
  }
}
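
Because both a policy ARN and a role ARN are copied during these steps, it is easy to paste the wrong one into `aws.s3.role.arn`. The following Python sketch adds the two S3 options to an options dict and rejects values that do not look like an IAM role ARN. The helper name `with_s3_role` and the deliberately loose pattern are assumptions for illustration, not part of the Provisioning API.

```python
import re

# Loose pattern: arn:<partition>:iam::<12-digit account id>:role/<path and name>
ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")


def with_s3_role(options: dict, bucket_name: str, role_arn: str) -> dict:
    """Return a copy of `options` with the S3 bucket name and role ARN added."""
    if not ROLE_ARN_PATTERN.match(role_arn):
        raise ValueError(f"not an IAM role ARN: {role_arn!r}")
    return {
        **options,
        "s3.bucket.name": bucket_name,
        "aws.s3.role.arn": role_arn,
    }


# Made-up account id and role name, for illustration only.
options = with_s3_role(
    {"targetKafkaClusterId": "<target-kafka-cluster-id>", "vpcType": "KAFKA_VPC"},
    "<s3-bucket-name>",
    "arn:aws:iam::123456789012:role/example-s3-access-role",
)
print(options)
```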

After the cluster reaches the PROVISIONING state, copy the Data Center Id.

In your trust policy document, add the Data Center Id of the cluster so that the instance role can assume the S3 access role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<cluster-aws-account-id>:role/<data-center-id>"
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}
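
The trust policy above can be generated from the two values that change per cluster. This is a minimal Python sketch, assuming the instance role is named after the Data Center Id as described earlier. The function name `build_trust_policy` and the sample values are hypothetical.

```python
import json


def build_trust_policy(cluster_aws_account_id: str, data_center_id: str) -> dict:
    """Build a trust policy that lets the cluster's instance role assume
    the S3 access role. The account id is kept as a string so that any
    leading zeros survive."""
    instance_role_arn = (
        f"arn:aws:iam::{cluster_aws_account_id}:role/{data_center_id}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": instance_role_arn},
                "Action": "sts:AssumeRole",
                "Condition": {},
            }
        ],
    }


# Made-up values, for illustration only.
trust_policy = build_trust_policy("012345678901", "<data-center-id>")
print(json.dumps(trust_policy, indent=2))
```

Saving the printed JSON to FILE_PATH gives the document used when updating the trust policy of the S3 access role.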

Update the trust policy of S3 access role
aws iam update-assume-role-policy --role-name <role-name> --policy-document file://FILE_PATH

After the roles and policies are set up correctly and the cluster reaches the RUNNING state, go to the Managing Custom Connectors section on the Connectors page and press Sync to load the custom connectors.

Once loaded successfully, they should be visible under the Available Connectors section.

Using AWS Console

Create an S3 access policy

Go to the IAM dashboard, switch to Policies and click on Create Policy
You can either use the Visual editor to specify 2 permissions:
- Service: S3; Action: List – ListBucket, Read – GetBucketLocation; Resource: <S3-bucket-arn>
- Service: S3; Action: Read – GetObject; Resource: <S3-bucket-arn>/*

OR you can switch to the JSON tab and specify the permission using JSON format. For example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "1",
      "Effect": "Allow",
      "Action": ["s3:ListBucket","s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::<S3 bucket name>"
    },
    {
      "Sid": "2",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::<S3 bucket name>/*"
    }
  ]
}

Then click Next: Tags and optionally provide tags for the policy
Then review, name the policy and create it

Create the S3 access role with the S3 access policy and copy its ARN
- Go to the IAM dashboard, switch to Roles and click on Create Role
- Select Custom trust policy as the trusted entity
- Click Next, then find and select the S3 access policy you just created
- Then click Next to review and create the role

Provision the Kafka Connect cluster using your preferred method with option “Use IAM role” and copy its Data Center Id
If you use the Instaclustr Console, in the Custom Connector Configuration section,
choose “Use IAM role”. Input the S3 access role ARN

If you use the Provisioning API, specify the S3 bucket name with the S3 access role ARN.
For example:

{
  "clusterName": "<cluster-name>",
  "bundles": [
    {
      "bundle": "KAFKA_CONNECT",
      "version": "3.1.1",
      "options": {
        "targetKafkaClusterId": "<target-kafka-cluster-id>",
        "vpcType": "KAFKA_VPC",
        "s3.bucket.name": "<s3-bucket-name>",
        "aws.s3.role.arn": "<s3-access-role-arn>"
      }
    }
  ],
  "provider": {
    "name": "AWS_VPC",
    "accountName": null
  },
  "nodeSize": "n1-standard-2-10",
  "dataCentre": "us-central1",
  "clusterNetwork": "10.225.0.0/16",
  "privateNetworkCluster": "false",
  "rackAllocation": {
    "numberOfRacks": "3",
    "nodesPerRack": "1"
  }
}

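Update the trust policy of the S3 access role
- Go to the role and open the Trust relationships tab. Click Edit trust policy
- Change the policy statement to allow the Kafka Connect cluster to assume the S3 access role, by setting the AWS principal to the instance role ARN: arn:aws:iam::<cluster-aws-account-id>:role/<data-center-id>
- Click Update policy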

After the roles and policies are set up correctly and the cluster reaches the RUNNING state, go to the Managing Custom Connectors section on the Connectors page and press Sync to load the custom connectors.

Once loaded successfully, they should be visible under the Available Connectors section.
