OpenSearch 3.7 reached GA on June 10, 2026, bringing performance and feature improvements that directly affect AI practitioners. Storage options, query improvements, observability updates: there’s a lot to cover. This post focuses on three changes that matter for AI search and hybrid retrieval pipelines: improvements to the Search Relevance Workbench, the expansion of the Hybrid Optimizer to include z-score and RRF, and a faster path to vector retrieval via doc values. Each one is usable immediately, without reindexing.
Quick summary: OpenSearch 3.7 expands the Hybrid Optimizer in the Search Relevance Workbench (SRW) from 66 to 82 variants per query, adding z-score normalization and Reciprocal Rank Fusion (RRF) to the automated sweep. It also introduces docvalue_fields retrieval for k-NN vectors. The Hybrid Optimizer requires no index or query changes. The docvalue_fields feature requires no reindexing, but you’ll need to update search requests to fetch vectors via docvalue_fields instead of _source.
What changed in the Search Relevance Workbench in OpenSearch 3.7?
The Search Relevance Workbench (SRW) is a toolset for running labelled query sets against your index, sweeping normalization and combination techniques, and comparing variants on NDCG (Normalized Discounted Cumulative Gain), MRR (Mean Reciprocal Rank), and MAP (Mean Average Precision) without writing dozens of search pipeline configurations by hand.
The main SRW change in 3.7 is in the Hybrid Optimizer experiment type. The core implementation is in search-relevance#465, with the full design in search-relevance#473. The short version: the Hybrid Optimizer now tests 82 variants per query, up from 66, by adding z-score normalization and RRF to the sweep. This can improve relevance when one of the new variants performs better on your data and you adopt it in production. Nothing changes, however, until you run experiments and apply the winning configuration.
How does the expanded Hybrid Optimizer test 82 variants per query?
Before 3.7, the optimizer’s default matrix covered:
- Normalizations: min_max, l2
- Combinations: arithmetic_mean, geometric_mean, harmonic_mean
- Weights: 11 points from 0.0 to 1.0 in 0.1 steps
That’s 2 x 3 x 11 = 66 variants per query. Useful coverage, but it left out two techniques that appear regularly in production hybrid search.
Z-score normalization standardizes scores to zero mean and unit variance: (score - μ) / σ. It’s a natural fit when sub-query score distributions differ in shape rather than just scale. There’s an important constraint, though: z-score can produce negative values, which breaks both geometric_mean and harmonic_mean. The optimizer accounts for this by pairing z-score only with arithmetic_mean, adding 11 new variants.
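To make the constraint concrete, here is a minimal Python sketch of z-score normalization followed by arithmetic-mean combination. The score lists are hypothetical, the helper names are illustrative, and the use of the population standard deviation is an assumption; this is not the plugin's implementation.

```python
import math
import statistics

def z_score(scores):
    """Standardize scores to zero mean and unit variance: (score - mu) / sigma."""
    mu = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)  # population std dev (an assumption)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    # Only defined when every input is non-negative.
    if any(v < 0 for v in values):
        raise ValueError("geometric mean is undefined for negative inputs")
    return math.prod(values) ** (1.0 / len(values))

# Hypothetical sub-query scores for the same four documents.
bm25_scores = [12.4, 9.1, 3.3, 2.8]
knn_scores = [0.91, 0.52, 0.88, 0.47]

bm25_z = z_score(bm25_scores)
knn_z = z_score(knn_scores)

# Below-average documents get negative z-scores, so only the
# arithmetic mean can combine every pair safely.
combined = [arithmetic_mean(pair) for pair in zip(bm25_z, knn_z)]
print(combined)
```

Calling geometric_mean on the last document's pair raises the ValueError, which is the failure mode the optimizer avoids by skipping those combinations.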
Reciprocal Rank Fusion (RRF) takes a different approach entirely. Rather than normalizing and combining raw scores, RRF fuses ranked lists by document position:
RRF score(d) = Σ_i 1 / (k + rank_i(d))
Where k is the rank constant and rank_i(d) is document d’s position in the i-th sub-query result. Because RRF is rank-based, it doesn’t use a normalization step. In OpenSearch, RRF routes through the score-ranker-processor rather than the normalization-processor, so SRW now generates two distinct pipeline JSON shapes to cover both paths. This makes tuning coverage more realistic, reduces alarms from invalid configurations, and helps OpenSearch projects reach clearer, faster decisions.
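The formula can be sketched in a few lines of Python. This is an illustration of the fusion arithmetic only, assuming 1-based ranks and hypothetical document IDs; it is not the score-ranker-processor code.

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists of doc IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks are 1-based here (an assumption about the convention).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

lexical = ["doc_a", "doc_b", "doc_c"]    # hypothetical lexical ranking
semantic = ["doc_c", "doc_a", "doc_d"]   # hypothetical vector ranking

fused = rrf_fuse([lexical, semantic], k=60)
for doc_id, score in fused:
    print(doc_id, round(score, 6))
```

Documents that appear in both lists accumulate two terms, so doc_a and doc_c rise above documents that only one sub-query returned.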
The optimizer tests five rank_constant values: {1, 5, 10, 20, 60}. These weren’t picked arbitrarily. The SRW team ran retrieval-equivalence testing on the ESCI dataset (150 queries, 100,000 documents) and identified which k values produce meaningfully distinct top-10 result sets. The five chosen constants each come from a behaviorally different region of the curve: k = 1 is the most aggressive setting, k = 5 sits at the sharpest transition point, and k = 60 (the de facto industry default from Cormack, Clarke, and Büttcher’s 2009 paper) represents the entire NDCG equivalence plateau that spans k = 40 through k = 1,000 on the benchmark corpus. From k = 40 onward, zero of the 150 test queries showed any change in NDCG@10. Adding more values in that range would only create duplicate variants.
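One way to build intuition for why large constants behave alike is to compare how much a rank-1 hit outweighs a rank-10 hit at each tested k. The sketch below is plain arithmetic on the RRF formula; it does not reproduce the ESCI analysis, and the choice of ranks 1 and 10 is only illustrative.

```python
RANK_CONSTANTS = [1, 5, 10, 20, 60]

def contribution(k, rank):
    """Score a single ranked list contributes for a document at `rank`."""
    return 1.0 / (k + rank)

def top_vs_tenth_ratio(k):
    """How many times more a rank-1 hit contributes than a rank-10 hit."""
    return contribution(k, 1) / contribution(k, 10)

ratios = {k: round(top_vs_tenth_ratio(k), 2) for k in RANK_CONSTANTS}
print(ratios)
```

The ratio shrinks toward 1 as k grows, so rank differences matter less and less, which is consistent with neighboring large constants producing the same top-10 ordering.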
The full default sweep now looks like this:
| Path | Variants |
| {min_max, l2} x {arithmetic_mean, geometric_mean, harmonic_mean} x 11 weights | 66 |
| {z_score} x {arithmetic_mean} x 11 weights | 11 |
| RRF x 5 rank constants | 5 |
| Total | 82 |
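The table can be reproduced by enumerating the matrix. The sketch below also attaches a search pipeline body to each variant, following the documented request shapes of the normalization-processor and score-ranker-processor as I understand them. The [w, 1 - w] weight pair assumes exactly two sub-queries, and the function names are hypothetical; the JSON that SRW generates internally may differ.

```python
from itertools import product

WEIGHTS = [round(i / 10, 1) for i in range(11)]  # 0.0, 0.1, ..., 1.0
SCORE_NORMALIZATIONS = ["min_max", "l2"]
SCORE_COMBINATIONS = ["arithmetic_mean", "geometric_mean", "harmonic_mean"]
RANK_CONSTANTS = [1, 5, 10, 20, 60]

def score_pipeline(normalization, combination, weight):
    # Score-based path. The weight pair assumes two sub-queries (assumption).
    return {"phase_results_processors": [{"normalization-processor": {
        "normalization": {"technique": normalization},
        "combination": {
            "technique": combination,
            "parameters": {"weights": [weight, round(1 - weight, 1)]},
        },
    }}]}

def rrf_pipeline(rank_constant):
    # Rank-based path: no normalization step.
    return {"phase_results_processors": [{"score-ranker-processor": {
        "combination": {"technique": "rrf", "rank_constant": rank_constant},
    }}]}

def default_sweep():
    variants = [score_pipeline(n, c, w) for n, c, w in
                product(SCORE_NORMALIZATIONS, SCORE_COMBINATIONS, WEIGHTS)]
    # z_score is paired with arithmetic_mean only.
    variants += [score_pipeline("z_score", "arithmetic_mean", w) for w in WEIGHTS]
    variants += [rrf_pipeline(k) for k in RANK_CONSTANTS]
    return variants

sweep = default_sweep()
print(len(sweep))  # 66 + 11 + 5 = 82
```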
You don’t have to run all 82 every time. The optimizer exposes normalizationTechniques, combinationTechniques, and rankConstants as experiment parameters, so you can scope a run to specific techniques. To benchmark only RRF against your current baseline, set combinationTechniques: ["rrf"] and skip the full sweep. Experiments created under the 66-variant regime remain fully readable; no migration is required.
Why is vector retrieval faster in OpenSearch 3.7?
This improvement lands in the k-NN plugin via k-NN#3315. It addresses a specific speed cost in reranking and batch retrieval pipelines: fetching vector values from search results.
Before 3.7, the typical path to retrieve a knn_vector field used _source. The problem is that _source stores the full original JSON document. To return a single vector field, OpenSearch had to decompress the entire stored document, deserialize it, inject the vector values in the right position, and re-serialize the result. For high-dimensional vectors at scale, especially when _source is large or derived source is in use, that overhead compounds quickly.
OpenSearch 3.7 lets you retrieve k-NN vectors via docvalue_fields instead. Doc values are OpenSearch’s column-oriented, on-disk storage format. When you request a field via docvalue_fields, OpenSearch reads only that column, with a single seek per document using the existing KNNVectorValues iterator. No full-document decompression, no re-serialization of the surrounding document.
What the benchmarks actually show
Benchmarks used 768-dimensional Cohere vectors at k = 1,000, on a single-node MacBook with a 4 GB JVM (k-NN#3315). The speedup depends heavily on response format and client transport:
| Retrieval method | Format | Speedup vs. _source |
| docvalue_fields (JSON array) | JSON / urllib | ~1.4x |
| docvalue_fields (binary, default) | JSON / urllib | ~2.7x |
| docvalue_fields | CBOR / urllib | ~5.6x |
| docvalue_fields | SMILE / urllib | ~5.8x |
| docvalue_fields (JSON array) | JSON / opensearch-py | ~1.1x |
The binary format (base64-encoded, Little Endian) is the default. It encodes each float as four bytes rather than as a JSON number string, which cuts response size by roughly 50% in JSON transport and accounts for the largest speedups. A binary-format response looks like this:
{
  "fields": {
    "my_vector": ["AACAPwAAAEAAAEBAAACAQA=="]
  }
}
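On the client side, the binary value has to be decoded back into floats. Here is a minimal Python sketch, assuming the payload is a base64 string of little-endian 32-bit floats as described above; decode_vector is a hypothetical helper name, not part of any client library.

```python
import base64
import struct

def decode_vector(encoded):
    """Decode a base64 string of little-endian float32 values into a list."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("expected a whole number of 4-byte floats")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))

# The response fragment from the example above.
response_fragment = {"fields": {"my_vector": ["AACAPwAAAEAAAEBAAACAQA=="]}}

vector = decode_vector(response_fragment["fields"]["my_vector"][0])
print(vector)  # [1.0, 2.0, 3.0, 4.0]
```

The 24-character string carries 16 bytes, which is four floats at four bytes each.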
The opensearch-py result (~1.1x) is a useful reminder: client and transport overhead matter. The _source decompression cost that docvalue_fields avoids is server-side, but if your bottleneck is network transfer or client-side deserialization, the gains will be smaller.
Enabling this feature doesn’t require reindexing. Existing knn_vector indexes support doc value retrieval already. You update the search request to fetch vectors via docvalue_fields instead of _source:
{
  "query": { "knn": { "embedding": { "vector": [...], "k": 10 } } },
  "docvalue_fields": ["embedding"],
  "_source": false
}
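From a Python client, the same request could be assembled with the standard library, roughly as follows. The host, index name, field name, and helper names are hypothetical. Passing a different Accept value such as application/cbor is an assumption about how to ask for a non-JSON response format, and decoding that response would need a third-party CBOR library.

```python
import json
import urllib.request

def build_search_body(field, query_vector, k):
    """k-NN query that returns the vector from doc values, not _source."""
    return {
        "query": {"knn": {field: {"vector": query_vector, "k": k}}},
        "docvalue_fields": [field],
        "_source": False,
    }

def build_request(base_url, index, body, accept="application/json"):
    """Prepare (but do not send) a POST to the _search endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": accept},
        method="POST",
    )

def extract_field_values(response, field):
    """Map each hit's _id to the values returned under fields[field]."""
    return {hit["_id"]: hit["fields"][field] for hit in response["hits"]["hits"]}

body = build_search_body("embedding", [0.1, 0.2, 0.3], k=10)
request = build_request("http://localhost:9200", "my-index", body)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.

# Hypothetical response, shaped like the binary-format example above.
sample_response = {"hits": {"hits": [
    {"_id": "1", "fields": {"embedding": ["AACAPwAAAEAAAEBAAACAQA=="]}},
]}}
print(extract_field_values(sample_response, "embedding"))
```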
When does docvalue_fields retrieval help most for k-NN?
Doc value retrieval offers the clearest gains in two patterns:
- Batch nearest-neighbor lookups. When you need vectors for a large result set, say fetching embeddings for the top 1,000 k-NN hits to feed a cross-encoder, the _source decompression cost per document is the bottleneck. Doc values cut that cost significantly.
- Reranking pipelines. The typical RAG flow is: k-NN retrieval, then vector fetch, then reranking, then generation. Doc values reduce the latency of the fetch step, especially at larger k values.
It’s worth benchmarking carefully if you need both vectors and text fields in the same response. When other fields require _source reconstruction anyway, the gains from avoiding _source decompression for the vector field may be smaller. The feature is most impactful when you can set "_source": false entirely.
What to do next with OpenSearch 3.7
These changes matter most in two places: hybrid relevance tuning and high-throughput vector retrieval. In practice, that means less guesswork when balancing lexical and semantic signals, and lower latency when fetching large candidate sets for reranking in RAG pipelines.
For AI search practitioners, OpenSearch 3.7 makes the iteration loop tighter: SRW now evaluates a broader set of fusion strategies (including z_score and rrf) in one experiment, while docvalue_fields retrieval removes a common vector-fetch bottleneck without requiring reindexing. Together, those upgrades help teams move from “it works” to “it’s tuned” faster and with fewer manual experiments.
Looking ahead, the direction is clear: more data-driven optimization in relevance tooling, and deeper performance work in vector retrieval paths. Expect future releases to keep reducing operational friction between offline relevance experiments and production-grade AI search latency.
Two concrete next steps: First, if you’re running hybrid search in production, rerun SRW’s Hybrid Optimizer with the 3.7 default sweep. Z-score and RRF may outperform your current min_max + arithmetic_mean baseline on specific query segments, and the expanded sweep gives you that comparison without any manual pipeline configuration. Second, if your reranking pipeline fetches vectors at k >= 100, test docvalue_fields with binary format and CBOR or SMILE transport on your target hardware. The published benchmarks are single-node; your production environment will give you the number that actually matters.
For further reading:
- Onboard z-score normalization and RRF to the Hybrid Optimizer (search-relevance#465)
- RFC: Onboard z-score and RRF to SRW Hybrid Optimizer (search-relevance#473)
- Support docvalue_fields for KNN vector retrieval (k-NN#3315)
Frequently asked questions
Do I need to reindex my data to use the OpenSearch 3.7 Hybrid Optimizer changes?
No. The Hybrid Optimizer runs experiments at query time; it doesn’t modify your index. Existing 66-variant experiments remain readable by the new code with no changes.
Do I need to reindex to use docvalue_fields for vector retrieval?
No reindexing is required. knn_vector fields already store doc values. To use the new retrieval path, update your search request to include docvalue_fields for the vector field and, where appropriate, set "_source": false.
Why is z-score normalization paired only with arithmetic_mean in the Hybrid Optimizer?
Z-score normalizes to zero mean and unit variance, which means scores can be negative. geometric_mean is undefined for negative inputs (it takes an n-th root of a product), and harmonic_mean produces meaningless results when values are near zero or negative. The optimizer filters both combinations at generation time.
Can I run the Hybrid Optimizer with fewer than 82 variants?
Yes. Pass explicit values for normalizationTechniques, combinationTechniques, or rankConstants in the experiment parameters. For example, "combinationTechniques": ["rrf"] runs only the five RRF variants.
How much does the response format affect docvalue_fields performance?
Significantly. At k = 1,000 with 768-dimensional vectors, JSON array retrieval via urllib shows roughly 1.4x speedup over _source. Binary doc values with CBOR or SMILE transport show 5.6–5.8x. If your client uses opensearch-py with JSON, expect around 1.1x due to transport overhead. Binary format with CBOR or SMILE is the best choice for batch and reranking pip