Complete guide to OpenSearch in 2025

What is OpenSearch

OpenSearch is an open source search and analytics suite forked from Elasticsearch 7.10. Originally maintained by Amazon, it is now a project of the Linux Foundation. It is utilized for numerous purposes, including full-text search, log analytics, and application monitoring. With a RESTful API and extensive JSON support, OpenSearch provides a versatile platform for enterprise-level search and data visualization needs.

It includes features like data processing, indexing, and real-time search capabilities. Its architecture is scalable, allowing it to handle vast amounts of data efficiently. Users can leverage its security model and easy-to-use interface to create dashboards for faster data insights.

This is part of an extensive series of guides about data security.

Brief history of OpenSearch

OpenSearch was born out of a fork of Elasticsearch 7.10 and Kibana 7.10 after Elastic changed the license to a more restrictive one in January 2021. Amazon Web Services (AWS) and several other companies focused on maintaining and advancing search and analytics capabilities created this open-source project.

The project has quickly evolved, incorporating community feedback to offer enhanced features. Its commitment to open-source principles ensures that OpenSearch remains free to use, modify, and distribute, promoting wider adoption and continual improvement.

Key features of OpenSearch

Built-In Search Capabilities

OpenSearch offers built-in search capabilities that support complex queries and real-time indexing. These capabilities enable users to perform efficient full-text searches, aggregations, and geospatial queries quickly. The system’s scalability ensures that it can manage vast datasets without compromising performance.

In addition to its foundational search features, OpenSearch also supports various search plugins and modules, extending its functionality.
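As a rough illustration of how these capabilities can be combined in one request, the sketch below builds a search body that mixes a full-text match, a geospatial filter, and a terms aggregation. The helper name and the field names (description, location, category) are hypothetical; the sketch assumes location is mapped as a geo_point and category as a keyword field.

```python
import json


def build_store_search(text, lat, lon, radius_km=10):
    """Build a search body: full-text match + geo filter + terms aggregation.

    Field names (description, location, category) are hypothetical.
    """
    return {
        "size": 10,
        "query": {
            "bool": {
                # Scored full-text clause
                "must": [{"match": {"description": text}}],
                # Non-scoring filter: keep documents within the radius
                "filter": [
                    {
                        "geo_distance": {
                            "distance": f"{radius_km}km",
                            "location": {"lat": lat, "lon": lon},
                        }
                    }
                ],
            }
        },
        # Bucket the matching documents by category
        "aggs": {"by_category": {"terms": {"field": "category", "size": 5}}},
    }


body = build_store_search("espresso machine", 47.61, -122.33)
print(json.dumps(body, indent=2))
```

A body like this would typically be sent to the _search endpoint of an index. Placing the geospatial clause under filter keeps it out of relevance scoring, so only the text match influences ranking.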

Data Prepper

Data Prepper simplifies data ingestion by transforming raw data into structured formats compatible with OpenSearch. This feature streamlines the data onboarding process, making it easier to index and analyze data. Data Prepper offers support for common data sources and preprocessing tasks, enhancing its utility in data workflows.

This feature integrates with various data pipelines, enabling automatic data transformation and normalization. This interoperability ensures that data is consistently prepared and optimized for indexing, reducing the complexity and overhead associated with data management.

Trace Analytics

Trace Analytics in OpenSearch allows for the detailed analysis of distributed trace data, helping users understand application behavior and performance. By collecting and visualizing traces, this feature aids in identifying bottlenecks and improving application debugging and optimization processes.

Trace Analytics is particularly useful in microservices architectures where tracing inter-service communication is challenging.

Index Management

Index management in OpenSearch ensures that data indexing remains efficient and organized. This feature includes policies for automated index handling, such as rollover, retention, and deletion, enabling smarter resource utilization and data lifecycle management.

Through configuration rules, users can automate index operations, optimizing system performance. These policies help in managing indices based on size, age, and other metrics, simplifying administrative tasks and maintaining a streamlined search environment.
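To make the idea of a lifecycle policy concrete, here is a minimal sketch that builds an Index State Management (ISM) policy body with two states: one that rolls an index over and one that deletes it. The helper name, the logs-* pattern, and every threshold are illustrative assumptions, not recommended values.

```python
import json


def build_log_retention_policy(index_pattern, rollover_age="1d",
                               rollover_size="50gb", delete_after="30d"):
    """Build an ISM policy body: roll over while hot, delete when old.

    All thresholds and the index pattern are illustrative assumptions.
    """
    return {
        "policy": {
            "description": "Roll over active indices, then delete old ones",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    # Rollover conditions based on index age and size
                    "actions": [
                        {"rollover": {"min_index_age": rollover_age,
                                      "min_size": rollover_size}}
                    ],
                    # Move to the delete state once the index is old enough
                    "transitions": [
                        {"state_name": "delete",
                         "conditions": {"min_index_age": delete_after}}
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
            # Attach the policy to newly created indices matching the pattern
            "ism_template": {"index_patterns": [index_pattern], "priority": 100},
        }
    }


policy = build_log_retention_policy("logs-*")
print(json.dumps(policy, indent=2))
```

A body like this would typically be sent with PUT to _plugins/_ism/policies/<policy_id>, where the policy ID is a name you choose. Rollover also expects the index to be written through a rollover alias or a data stream, which this sketch does not set up.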

Latest OpenSearch Versions

The latest OpenSearch releases focused on improving vector search, AI-driven capabilities, and observability:

Version 3.6.0 (07 April 2026) adds Agent-v2 with token usage tracking, observability traces, and relevance-tuning capabilities. It also introduces Application Performance Monitoring (APM) features, including service dependencies and performance metrics. Pull-based ingestion is now generally available, with warmup settings and adaptive shard selection. Vector search is expanded with 1-bit scalar quantization, while Query Insights gains more detailed grouping and failure tracking.

Version 3.5.0 (10 February 2026) adds support for conversational memory and context handling, which helps build AI-driven applications. It also improves vector search performance using SIMD optimizations for FP16 operations, increasing throughput significantly. Observability is expanded with better Prometheus integration and enhanced query capabilities. The release also introduces improvements to query insights and experimental HTTP/3 support.

Version 3.4.0 (16 December 2025) introduces alerting features for PPL, allowing users to monitor and manage execution statistics. It enhances anomaly detection with a daily insights view and improves flow frameworks with better agent summaries and server support. It also includes optimizations for vector search, such as memory-efficient warmup and FP16 scoring, along with infrastructure upgrades like JDK 25.

Version 3.3.x focuses on stability and incremental improvements. Version 3.3.0 (14 October 2025) introduces a redesigned Discover interface with support for log analytics and distributed tracing. It also makes agent-based search and memory APIs generally available and improves neural search performance. Minor releases 3.3.1 (22 October 2025) and 3.3.2 (30 October 2025) address compatibility issues and bug fixes while maintaining performance gains.

Earlier in the 3.x series, versions 3.2.0 (19 August 2025) and 3.1.0 (24 June 2025) introduced features such as GPU acceleration for vector indexing, semantic search enhancements, and tools for improving search relevance. These versions also expanded observability and workload management capabilities.

Version 3.0.0 (06 May 2025) marked a major upgrade with Lucene 10, improving indexing and vector search. It introduced experimental gRPC support, GPU-based vector operations, and new agent-based workflows. It also strengthened security and enhanced query capabilities with more advanced commands.

OpenSearch vs Elasticsearch: What are the differences?

While OpenSearch and Elasticsearch share a common origin, significant differences have emerged since the fork in 2021. Here are the key distinctions:

1. Licensing

OpenSearch is fully open source under the Apache License 2.0, including its core search engine, dashboards, and project components. This makes it attractive for organizations that want permissive licensing, broad redistribution rights, and fewer restrictions around managed-service or embedded use cases. Elasticsearch changed its licensing model in 2021, moving Elasticsearch and Kibana from Apache 2.0 to SSPL and Elastic License 2.0, and later added AGPLv3 as an option for the free portions of the source code. However, Elastic’s default distribution remains governed by Elastic’s licensing model rather than Apache 2.0, which can make license review more important for commercial or cloud-service use cases.

2. Governance and Community

OpenSearch is now governed through the OpenSearch Software Foundation under the Linux Foundation, with a Technical Steering Committee guiding the project’s technical direction. This gives the project a more vendor-neutral governance structure than its early AWS-led phase and is intended to support broader participation from companies and independent contributors. Elasticsearch, by contrast, is primarily developed and directed by Elastic, which gives it a more centralized product roadmap, tighter integration with Elastic’s commercial platform, and a single-vendor decision-making model.

3. Features and Plugins

Both platforms support full-text search, analytics, vector search, observability, dashboards, alerting, and security features, but their feature sets have diverged since the fork. OpenSearch has increasingly emphasized open-source search, vector search, observability, security analytics, anomaly detection, SQL/PPL, ingestion, and AI/agent-oriented workflows through Apache-licensed plugins and project components. Elasticsearch has focused heavily on the Elastic Stack and Search AI Platform, including advanced vector retrieval, semantic search, RAG-oriented tooling, machine learning, observability, and security features that are tightly integrated with Kibana, Elastic Agent, and Elastic Cloud.

4. Compatibility and Ecosystem

OpenSearch began as a fork of Elasticsearch 7.10.2, so many older Elasticsearch APIs, clients, and index patterns are familiar to Elasticsearch users. However, compatibility has decreased over time as both projects have evolved independently.

OpenSearch’s documentation notes that legacy Elasticsearch clients may work with OpenSearch 1.x, but for OpenSearch 2.0 and later, Elasticsearch clients are not fully compatible and mixing client/server versions carries a high risk of errors.

Elasticsearch has the advantage of Elastic’s official client libraries, Kibana integrations, Elastic Agent/Fleet ecosystem, and Elastic Cloud services, while OpenSearch has its own clients, Dashboards, plugins, AWS ecosystem support, and growing Linux Foundation-backed community ecosystem.

5. Performance and Scalability

Both OpenSearch and Elasticsearch are distributed search and analytics engines built on Apache Lucene and designed for large-scale indexing, querying, and analytics workloads. Performance differences depend heavily on version, workload type, hardware, shard design, query patterns, vector dimensions, and operational tuning.

OpenSearch 3.x has added major vector-search and indexing improvements such as GPU acceleration, SIMD and FP16 optimizations, scalar quantization, warmup improvements, and adaptive shard selection.

Elasticsearch also continues to optimize for production-scale search, vector retrieval, semantic search, and relevance, with Elastic positioning Elasticsearch as a scalable vector database and search engine for AI applications. In practice, benchmark results should be validated against the specific workload rather than assuming one platform is universally faster.

6. Support and Commercial Offerings

OpenSearch can be self-managed as open-source software, adopted through community resources, or consumed through managed and commercial offerings from vendors such as AWS and other OpenSearch ecosystem providers. Amazon OpenSearch Service remains one of the most common managed options and supports upgrades across OpenSearch and legacy Elasticsearch versions.

Elasticsearch is commercially supported by Elastic through Elastic Cloud and self-managed subscription tiers, with commercial packaging around search, observability, security, machine learning, and enterprise support.

The practical difference is that OpenSearch offers a more open, multi-vendor path, while Elasticsearch offers a more integrated first-party commercial platform with Elastic-led support, roadmap, and cloud services.

How OpenSearch works: Architecture and components

OpenSearch is built on a distributed, scalable architecture that ensures high availability and fault tolerance. Its key components include clusters, nodes, indices, and shards.

Clusters

An OpenSearch cluster is a collection of one or more nodes that work together to store and search data. Clusters provide redundancy and load balancing, ensuring that the failure of a single node does not compromise the system’s overall performance. Each cluster has a unique identifier, known as the cluster name, which helps in managing and connecting to the cluster.

Nodes

Nodes are the individual servers that make up a cluster. Depending on the roles assigned to it, a node may store data, take part in indexing and search, or coordinate work across the cluster. OpenSearch has several node types:

Cluster manager nodes (called master nodes before OpenSearch 2.0): Responsible for managing the cluster’s metadata and state, including creating and deleting indices and tracking node availability.
Data nodes: Store data and perform data-related operations like indexing and searching.
Coordinating nodes (sometimes called client nodes): Receive search requests, distribute them to the appropriate data nodes, and combine the results. They do not store data themselves.

Indices

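Indices are logical namespaces that hold related documents in OpenSearch. An index is loosely comparable to a table in a relational database management system (RDBMS): it holds documents that share a mapping, which plays the role of a schema. Older Elasticsearch versions allowed multiple mapping types per index, but types were removed in OpenSearch 2.0, so each index now holds a single collection of documents.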

Shards

Shards are the fundamental units of storage in OpenSearch. Each index is divided into smaller units called shards, which can be distributed across multiple nodes. Sharding allows OpenSearch to handle large volumes of data efficiently by distributing the load across the cluster.

There are two types of shards:

Primary shards: The original shards where data is initially written.
Replica shards: Copies of primary shards that provide redundancy and increase search performance by allowing read operations to be distributed.
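The relationship between primary and replica shards can be sketched with a little arithmetic. The example below builds index settings using the number_of_shards and number_of_replicas settings and computes how many shard copies the cluster would hold; the helper names and the values 3 and 1 are arbitrary choices for illustration.

```python
import json


def index_settings(primaries, replicas):
    """Settings body for creating an index with the given shard layout."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,    # primary shards
                "number_of_replicas": replicas,   # replica copies per primary
            }
        }
    }


def total_shard_copies(primaries, replicas):
    """Each primary has `replicas` copies, so the total is primaries * (1 + replicas)."""
    return primaries * (1 + replicas)


settings = index_settings(3, 1)
print(json.dumps(settings))
print(total_shard_copies(3, 1))  # 3 primaries + 3 replicas = 6
```

Because the replica count applies to every primary, raising it by one adds a full extra copy of the index, which is worth keeping in mind when estimating storage.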

Indexing and Searching

OpenSearch uses an indexing mechanism that allows data to be quickly ingested and made searchable. When a document is indexed, it is parsed and stored in the appropriate index and shard. The indexing process includes tokenization, where text is broken down into searchable terms, and various analyzers can be applied to process and store these terms efficiently.
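The following toy example illustrates the general idea of analysis and an inverted index. It is a deliberately simplified sketch, not how OpenSearch or Lucene implement it: the stopword list, the regular expression, and the function names are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stopword list, far smaller than a real analyzer's
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is"}


def analyze(text):
    """Toy analyzer: lowercase, split on non-alphanumerics, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


docs = {
    1: "The quick brown fox",
    2: "A quick search of the logs",
    3: "Brown bears search for food",
}
index = build_inverted_index(docs)
print(sorted(index["quick"]))   # [1, 2]
print(sorted(index["search"]))  # [2, 3]
```

Looking up a term becomes a dictionary access rather than a scan over every document, which is the core reason term-based search stays fast as the data grows.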

Searching in OpenSearch is efficient due to its distributed nature. Queries are distributed across the relevant shards, and results are aggregated and returned to the user. OpenSearch supports various types of queries, including term queries, range queries, and full-text searches.
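The distribute-then-merge pattern described above can be sketched as follows. This is a toy model only: scoring here is a plain term count rather than the relevance scoring OpenSearch actually uses, and the shard contents and function names are made up for the example.

```python
import heapq


def search_shard(shard_docs, term, k):
    """Score documents on one shard by term count and return its local top k."""
    hits = []
    for doc_id, text in shard_docs.items():
        score = text.lower().split().count(term)
        if score > 0:
            hits.append((score, doc_id))
    return heapq.nlargest(k, hits)


def coordinate(shards, term, k=3):
    """Send the query to every shard, then merge the per-shard results."""
    per_shard = [search_shard(docs, term, k) for docs in shards]
    merged = heapq.nlargest(k, (hit for hits in per_shard for hit in hits))
    return [doc_id for _, doc_id in merged]


# Two hypothetical shards, each holding part of the index
shards = [
    {"a1": "error error timeout", "a2": "all good"},
    {"b1": "error in login", "b2": "error error error"},
]
print(coordinate(shards, "error", k=2))  # ['b2', 'a1']
```

Each shard only needs to return its own best k hits, so the coordinating step merges a small number of candidates regardless of how many documents the shards hold.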

Security and Monitoring

OpenSearch includes security features such as fine-grained access controls, encryption, and auditing. Users can define detailed permissions to control access to indices and documents, ensuring data security. Additionally, OpenSearch provides monitoring and alerting capabilities.
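As a hedged sketch of what fine-grained access control can look like, the example below builds a role definition that grants read access to an index pattern, limits the visible fields, and optionally restricts the visible documents. The helper name, the logs-* pattern, the field names, and the team value are hypothetical.

```python
import json


def build_read_only_role(index_pattern, visible_fields, dls_query=None):
    """Build a role body with read access plus field- and document-level limits."""
    permission = {
        "index_patterns": [index_pattern],
        "allowed_actions": ["read"],
        # Field-level security: only these fields are returned
        "fls": list(visible_fields),
    }
    if dls_query is not None:
        # Document-level security is expressed as a query serialized to a string
        permission["dls"] = json.dumps(dls_query)
    return {"cluster_permissions": [], "index_permissions": [permission]}


role = build_read_only_role(
    "logs-*",
    ["message", "@timestamp"],
    dls_query={"term": {"team": "payments"}},
)
print(json.dumps(role, indent=2))
```

A body like this would typically be sent with PUT to _plugins/_security/api/roles/<role-name> and then mapped to users or backend roles in a separate step.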

Tips from the expert

Kassian Wren

Open Source Technology Evangelist

Kassian Wren is an Open Source Technology Evangelist specializing in OpenSearch. They are known for their expertise in developing and promoting open-source technologies, and have contributed significantly to the OpenSearch community through talks, events, and educational content.

In my experience, here are tips that can help you better utilize OpenSearch:

Leverage custom analyzers: Create custom analyzers to improve the search relevancy based on your specific data and use cases. Use token filters and character filters to fine-tune the search behavior.
Implement index lifecycle management: Design and implement index lifecycle policies to manage the size and performance of your indices. Automate index rollover, deletion, and other maintenance tasks to optimize storage and performance.
Optimize shard allocation: Balance the number of primary and replica shards based on your query load and data redundancy needs. Avoid over-sharding to reduce overhead and improve search performance.
Utilize bulk indexing: Use the bulk API for indexing large datasets. This approach minimizes the overhead of individual indexing requests and significantly improves indexing speed.
Monitor cluster health with custom dashboards: Create custom dashboards in OpenSearch Dashboards to monitor critical metrics such as indexing rate, query latency, and resource usage. This helps in proactively managing cluster health.
Implement custom PKI for SSL/TLS: For enhanced security in OpenSearch, use your own public key infrastructure (PKI) to set up SSL/TLS. This approach requires some initial effort, but it gives you control over certificate issuance and rotation for both node-to-node and REST-layer communications.
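To illustrate the bulk indexing tip, the sketch below builds a request body in the newline-delimited JSON format that the bulk API expects: an action line followed by the document source, with a trailing newline at the end. The helper name, index name, and sample documents are hypothetical.

```python
import json


def build_bulk_body(index_name, docs):
    """Build a newline-delimited bulk body from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action line: index this document under the given ID
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline
    return "\n".join(lines) + "\n"


payload = build_bulk_body(
    "app-logs",
    [
        ("1", {"level": "error", "message": "connection timeout"}),
        ("2", {"level": "info", "message": "retry succeeded"}),
    ],
)
print(payload, end="")
```

A payload like this would typically be sent with POST to the _bulk endpoint. Batching many documents into one request avoids paying the per-request overhead for each document.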

What is Amazon OpenSearch Service?

Amazon OpenSearch Service is a managed service provided by AWS that simplifies deploying, operating, and scaling OpenSearch clusters in the cloud. It eliminates the complexity of managing infrastructure, allowing users to focus on their core applications and data analytics.

The service offers automated provisioning, software patching, backup, recovery, and monitoring, ensuring that OpenSearch clusters are secure and performant. Users can scale their clusters up or down based on demand, taking advantage of AWS’s infrastructure.

Amazon OpenSearch Service also integrates with other AWS services, such as AWS Lambda, Amazon Kinesis, and Amazon CloudWatch, enabling data ingestion, processing, and monitoring workflows. Additionally, it supports features like anomaly detection, alerting, and machine learning integration to enhance data analysis and insights.

What is Amazon OpenSearch Serverless?

Amazon OpenSearch Serverless is a serverless option within the Amazon OpenSearch Service that allows users to run search and analytics workloads without managing any servers. This model abstracts the underlying infrastructure, providing automatic scaling and high availability without the need for manual configuration.

In a serverless setup, users define their data sources and search requirements, and the service automatically allocates resources to meet performance and capacity needs. This approach simplifies operations, reduces costs by charging only for actual usage, and eliminates the overhead of provisioning and maintaining clusters.

Amazon OpenSearch Serverless is ideal for dynamic or unpredictable workloads where traffic patterns can vary significantly. It ensures consistent performance by dynamically adjusting resources in real-time, providing a hassle-free experience for managing search and analytics applications.

Tutorial: Getting started with OpenSearch

In this tutorial, you will learn how to set up and run an OpenSearch cluster using Docker. This guide will take you through the necessary steps, from preparing your environment to accessing OpenSearch Dashboards.

Prerequisites

Before starting, ensure you have Docker and Docker Compose installed on your machine. You can download and install them from their respective websites.

Step 1: Disable Memory Paging and Swapping

To improve performance, you should disable memory paging and swapping on your host machine. Follow these commands:

Disable memory swapping:

sudo swapoff -a

Edit the sysctl configuration file to set the maximum map count:

sudo vi /etc/sysctl.conf

Add the following line to the file:

vm.max_map_count=262144

Reload the kernel parameters:

sudo sysctl -p

Step 2: Download the Docker Compose File

You will need a Docker Compose file to define and create the containers in your cluster. Download the sample Compose file provided by the OpenSearch Project:

Using curl:

curl -O https://raw.githubusercontent.com/opensearch-project/documentation-website/2.15/assets/examples/docker-compose.yml

Using wget:

wget https://raw.githubusercontent.com/opensearch-project/documentation-website/2.15/assets/examples/docker-compose.yml

Step 3: Start Your OpenSearch Cluster

Navigate to the directory containing the downloaded docker-compose.yml file. Set up a custom admin password by editing the docker-compose.yml file and then start the cluster:

Open the docker-compose.yml file and set the admin password by adding a strong password of your choice after the equals sign:

environment:
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=

Create and start the cluster as a background process:

docker-compose up -d

Step 4: Verify the Cluster

To confirm that the containers are running, use the following command:

docker-compose ps

You should see output similar to this:

Name                    Command               State           Ports
----------------------------------------------------------------------
opensearch-node1   /usr/local/bin/docker-entr ...   Up      0.0.0.0:9200->9200/tcp,0.0.0.0:9600->9600/tcp
opensearch-dashboards   /usr/local/bin/dumb-init -- / ...   Up      0.0.0.0:5601->5601/tcp

Step 5: Query the OpenSearch REST API

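Verify that the service is running by querying the OpenSearch REST API. The -k flag tells curl to skip TLS certificate verification, which is needed here because the sample configuration uses self-signed demo certificates. The -u flag passes the admin username and, after the colon, the password you set in Step 3: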

curl https://localhost:9200 -ku admin:

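The response should confirm that the installation was successful. The version fields reflect the release you deployed, so your values will differ from this sample, which was captured from an older 1.x build: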

{
  "name" : "opensearch-node1",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "XXXXXXXXXXXXXX",
  "version" : {
    "number" : "1.2.3",
    "build_type" : "tar",
    "build_hash" : "XXXXXXXXXXXXXXXXXXX",
    "build_date" : "2021-XX-XXTXX:XX:XX.XXXXXXZ",
    "build_snapshot" : false,
    "lucene_version" : "8.8.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "The OpenSearch Project: https://opensearch.org/"
}

Step 6: Access OpenSearch Dashboards

Open a web browser and go to http://localhost:5601 (the sample Compose file serves OpenSearch Dashboards over plain HTTP). Log in using the default username admin and the password you set in the docker-compose.yml file.

By following these steps, you will have a fully operational OpenSearch cluster running on your machine, ready for you to explore and utilize its search and analytics capabilities.

Empowering organizations with comprehensive support for OpenSearch

At Instaclustr, our mission is to empower organizations with the most comprehensive support for OpenSearch. We believe that successful implementation of this open-source search and analytics engine lies at the intersection of world-class managed services and expert assistance.

Managed Services: We take over the management of your OpenSearch clusters’ underlying infrastructure to ensure high availability, scalability, and security. This means that you can dedicate your resources to your core business objectives instead of infrastructure management.
Expert Assistance: From cluster configuration to performance tuning, our team of experienced engineers is ready to help. We are well-versed in OpenSearch and can provide valuable insights and recommendations to optimize your clusters, whether it’s fine-tuning query performance, optimizing index settings, or resolving stability problems.
24×7 Monitoring and Support: With round-the-clock monitoring and support, we detect and address any potential issues promptly, minimizing downtime and ensuring smooth operation of your OpenSearch clusters.

Experience the Instaclustr difference today. Schedule a free consultation with our OpenSearch experts and let us help you optimize your OpenSearch environment.
