AWS FSx for NetApp ONTAP Integration
Overview
NetApp Instaclustr-managed ClickHouse clusters provisioned within AWS RIYOA accounts can be integrated with Amazon FSx for NetApp ONTAP configured to host S3-compatible services. The integration works with either a newly provisioned or an existing FSx for ONTAP file system.
The FSx for ONTAP file system can reside in the same VPC as the NetApp Instaclustr-managed ClickHouse cluster, or in a different VPC and, if required, a different AWS account, provided the appropriate networking configuration is in place. Once the FSx for ONTAP file system is integrated, the NetApp Instaclustr-managed ClickHouse cluster can use S3 table engines and table functions to query data from, and write data to, the file system. You can read more about FSx for ONTAP here.
This guide provides common concepts, requirements, and shared usage instructions for FSx for ONTAP integrations.
Integration works with any FSx for ONTAP configuration
NetApp Instaclustr-managed ClickHouse with FSx for ONTAP is a full-lifecycle solution that removes the burden of self-management. The integration supports any reasonable FSx for ONTAP setup, whether it is fully managed by NetApp Instaclustr Support or configured by the customer.
Implementation steps differ for customer-managed FSx for ONTAP: in that case, the customer is responsible for the AWS-side configuration of FSx for ONTAP.
Before proceeding, please note the following:
NetApp Instaclustr creates and manages AWS resources on your behalf for ClickHouse FSx for NetApp ONTAP integrations. Please do not modify any AWS resources directly in your AWS account if they were provisioned via the NetApp Instaclustr platform, as changes made outside of the NetApp Instaclustr platform may cause unexpected issues with your cluster.
Limitations
- FSx for NetApp ONTAP integration is only supported for NetApp Instaclustr-managed ClickHouse clusters provisioned within AWS RIYOA accounts. Contact NetApp Instaclustr Support if you would like FSx for NetApp ONTAP integration with NetApp Instaclustr RIIA accounts.
- Ensure that each NetApp Instaclustr-managed ClickHouse cluster and FSx for ONTAP uses a unique CIDR range. The CIDR ranges must not overlap.
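The non-overlap requirement above can be checked before provisioning. The following is a minimal sketch using Python's standard `ipaddress` module; the range names and CIDR values are hypothetical and chosen only for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlaps(cidrs):
    """Return every pair of named CIDR ranges that overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical ranges, for illustration only.
planned = {
    "clickhouse-cluster": "10.224.0.0/16",
    "fsx-for-ontap": "10.225.0.0/16",
    "second-cluster": "10.224.128.0/20",  # falls inside 10.224.0.0/16
}

print(find_overlaps(planned))
```

An empty list means every range is unique; any pair returned needs a different CIDR range before the integration is set up.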
Prerequisites
- The AWS account hosting your FSx for ONTAP must be configured in accordance with the RIYOA setup guides available in the NetApp Instaclustr Console.
  These guides can be accessed by clicking Directory in the top-left corner, then expanding RIYOA Setup under Guides in the sidebar:
  - AWS Standard Setup
  - AWS IAM Roles Explanation:
    The IAM policy required for integrating FSx for ONTAP file systems looks like this:

        {
          "Sid": "AllowFSX",
          "Effect": "Allow",
          "NotAction": ["fsx:DeleteBackup"],
          "Resource": "arn:aws:fsx:::*"
        }
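In an IAM policy document, a statement like the one above appears inside the `Statement` element. As a rough sketch, the following Python checks whether a policy document exported as JSON contains that statement. The function name `has_fsx_statement`, the exact-match comparison, and the example policy are assumptions made for illustration; the complete policy in the RIYOA setup guides contains additional statements.

```python
import json

# The statement shown in this guide.
EXPECTED_FSX_STATEMENT = {
    "Sid": "AllowFSX",
    "Effect": "Allow",
    "NotAction": ["fsx:DeleteBackup"],
    "Resource": "arn:aws:fsx:::*",
}


def has_fsx_statement(policy_document):
    """Return True if the policy contains the FSx statement exactly as shown."""
    statements = policy_document.get("Statement", [])
    # IAM allows "Statement" to be a single object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]
    return any(statement == EXPECTED_FSX_STATEMENT for statement in statements)


# Hypothetical policy document, for illustration only.
example_policy = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowFSX",
      "Effect": "Allow",
      "NotAction": ["fsx:DeleteBackup"],
      "Resource": "arn:aws:fsx:::*"
    }
  ]
}
""")

print(has_fsx_statement(example_policy))
```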
How to Enable
First, ensure that you have completed the prerequisites above, then refer to one of the following guides for step-by-step integration instructions:
- How to integrate NetApp Instaclustr-managed ClickHouse with FSx for ONTAP
  This guide provides instructions for both NetApp Instaclustr-managed ClickHouse and FSx for ONTAP. The AWS-side configuration of FSx for ONTAP is performed via the NetApp Instaclustr Console, API, or Terraform.
- How to integrate customer‑managed FSx for ONTAP with NetApp Instaclustr-managed ClickHouse
  This guide provides integration instructions only for NetApp Instaclustr-managed ClickHouse. The AWS-side configuration of FSx for ONTAP is the customer's responsibility.
How to Use ClickHouse S3 Table Engine with FSx for ONTAP
The S3 Table Engine allows ClickHouse to query data stored in S3‑compatible object storage, such as FSx for NetApp ONTAP. This makes it easy to query external data directly, without needing to load it into ClickHouse first, and enables flexible data sharing between ClickHouse and other systems that use S3 storage. For detailed information, refer to the official S3 Table Engine documentation.
Creating an S3 Table
Create a table using the S3 Table Engine, specifying the Named Collection of the integration (copy it from the S3 FSx for ONTAP Integrations table in the NetApp Instaclustr Console) and the file you wish to access through ClickHouse. For example:

    CREATE TABLE s3_table (id Int32, name String)
    ENGINE = S3(<NAMED COLLECTION>, filename='<S3 BUCKET NAME>/<FILE NAME>');
Note: Omit the <S3 BUCKET NAME>/ prefix from the filename if the file system is created and managed by NetApp Instaclustr.
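The two filename forms described in the note can be sketched with a small helper that renders the statement from a script. This is only an illustration: the function names, the Named Collection `fsx_collection`, the bucket `my-bucket`, and the file `people.csv` are all hypothetical, and the engine name is written as `S3`, as in the ClickHouse documentation. The helper uses plain string formatting, so it should only be given trusted values.

```python
def s3_filename(file_name, bucket_name=None):
    """Build the filename argument, with or without a bucket prefix."""
    if bucket_name:
        return f"{bucket_name}/{file_name}"
    return file_name


def create_s3_table_sql(table, columns, named_collection, file_name, bucket_name=None):
    """Render a CREATE TABLE statement for the S3 Table Engine."""
    filename = s3_filename(file_name, bucket_name)
    return (
        f"CREATE TABLE {table} ({columns}) "
        f"ENGINE = S3({named_collection}, filename='{filename}');"
    )


# File system created and managed by NetApp Instaclustr: no bucket name.
managed = create_s3_table_sql(
    "s3_table", "id Int32, name String", "fsx_collection", "people.csv"
)

# Customer-managed file system: bucket name included in the filename.
customer = create_s3_table_sql(
    "s3_table", "id Int32, name String", "fsx_collection", "people.csv",
    bucket_name="my-bucket",
)

print(managed)
print(customer)
```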
Loading Data
Load data into the S3 table by inserting it directly. For example:

    INSERT INTO s3_table VALUES (1, 'Alice'), (2, 'Bob');
Querying Data
Query data from the S3 table as you would with any other table. For example:

    SELECT * FROM s3_table;
As an alternative to first creating a new table using the S3 Table Engine, data can also be queried directly from the file system with the S3 Table Function. For example:

    SELECT * FROM s3(<NAMED COLLECTION>, filename='<S3 BUCKET NAME>/<FILE NAME>');
Questions
Please contact NetApp Instaclustr Support for any further inquiries.