What is pgvector?
Pgvector adds vector similarity search capabilities to PostgreSQL, and its functionality can be used directly from Python applications. It is an open source PostgreSQL extension that adds a column type for storing and querying vectors, typically used in machine learning and AI applications. Pgvector provides the vector data type, which holds dense arrays of floating-point numbers. These vectors often represent embeddings generated by models such as large language models or image encoders.
The extension includes indexing and similarity search capabilities. It supports operators for cosine similarity, inner product, and Euclidean distance, which allow efficient nearest-neighbor searches. Pgvector integrates directly into PostgreSQL’s query engine, so you can combine vector search with relational queries in a single database.
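pgvector exposes these measures as SQL operators: <-> for Euclidean (L2) distance, <#> for negative inner product, and <=> for cosine distance. As a quick orientation, this plain-NumPy sketch shows what each operator computes (the example vectors are arbitrary):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([3.0, 1.0, 2.0])

# <->  Euclidean (L2) distance
l2 = float(np.linalg.norm(a - b))

# <#>  negative inner product (negated so that smaller values
#      mean "more similar", matching ORDER BY ... ASC semantics)
neg_ip = float(-np.dot(a, b))

# <=>  cosine distance, i.e. 1 - cosine similarity
cos = float(1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Because all three operators return a distance (lower is closer), each can be used directly in an ORDER BY clause for nearest-neighbor queries.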
Because it runs inside PostgreSQL, pgvector avoids the need for a separate specialized vector database. This makes it useful for applications that want to manage structured data and embeddings together, without adding new infrastructure. It is widely used for semantic search, recommendation systems, and retrieval-augmented generation.
This is part of a series of articles about vector databases.
Benefits of using pgvector with Python
When combined with Python, pgvector becomes easier to integrate into machine learning and AI workflows. Python libraries like psycopg2, SQLAlchemy, or async drivers make it simple to connect to PostgreSQL and run vector queries directly from applications. This setup allows developers to manage embeddings and structured data in one place while keeping the workflow consistent with the rest of the Python ecosystem.
Key benefits include:
- Integration with ML libraries: Python code can generate embeddings using libraries like Hugging Face Transformers, TensorFlow, or PyTorch and store them directly in PostgreSQL with pgvector.
- Unified data and vector search: Both relational data and embeddings can be queried together, removing the need for a separate vector database.
- Efficient similarity search: Operators for cosine similarity, inner product, and Euclidean distance are directly accessible from Python queries.
- Index support for speed: Pgvector provides approximate indexes (IVFFlat, HNSW) for faster nearest-neighbor searches, which can be created and used by issuing SQL from Python without extra tooling.
- Reduced infrastructure complexity: A single PostgreSQL instance can serve both transactional queries and vector search, simplifying deployment for Python-based applications.
- Flexible query composition: Python applications can combine embeddings with SQL conditions, joins, and filters to build more advanced retrieval pipelines.
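To make the last point concrete, here is what a combined query such as SELECT * FROM items WHERE category = 'shoes' ORDER BY embedding <-> %s LIMIT 2 does, restated as plain Python over made-up rows (the items data and category column are illustrative, not from a real schema):

```python
import numpy as np

# hypothetical rows as PostgreSQL would hold them: (id, category, embedding)
items = [
    (1, 'shoes', np.array([0.9, 0.1])),
    (2, 'hats',  np.array([0.8, 0.2])),
    (3, 'shoes', np.array([0.1, 0.9])),
    (4, 'shoes', np.array([0.7, 0.3])),
]
query = np.array([1.0, 0.0])

# WHERE category = 'shoes'  ORDER BY embedding <-> query  LIMIT 2
matches = sorted(
    (row for row in items if row[1] == 'shoes'),
    key=lambda row: float(np.linalg.norm(row[2] - query)),
)[:2]
ids = [row[0] for row in matches]  # nearest filtered rows first
```

In PostgreSQL the filtering, ordering, and limiting all happen inside the database, so only the nearest matching rows cross the wire.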
What is pgvector-python?
Pgvector-python is the official Python library for working with pgvector. It provides utilities to map Python data structures, such as lists or NumPy arrays, to the PostgreSQL vector type. This makes it easier to insert, update, and query embeddings from Python code without manually handling type conversions.
The library integrates with PostgreSQL drivers like psycopg2 and asyncpg, adding adapters and type casters so that vectors can be passed as query parameters or read as native Python objects. It also supports ORMs such as SQLAlchemy, letting developers define vector columns in models and run similarity searches using familiar query interfaces.
With pgvector-python, developers can focus on generating and querying embeddings while relying on the library to handle database compatibility. It is commonly used in machine learning pipelines where embeddings are stored in PostgreSQL for tasks like semantic search, recommendations, and retrieval-augmented generation.
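For intuition, pgvector's text representation of a vector is a bracketed list such as [1,2,3], and the library's adapters translate between that format and Python objects. A simplified sketch of the round trip (to_vector_literal and from_vector_literal are illustrative helper names, not part of the pgvector-python API, which also supports a more efficient binary format):

```python
import numpy as np

def to_vector_literal(values):
    """Render a Python sequence in pgvector's text format, e.g. '[1.0,2.0,3.0]'."""
    return '[' + ','.join(str(float(v)) for v in values) + ']'

def from_vector_literal(text):
    """Parse pgvector's text output back into a float32 NumPy array."""
    return np.array([float(v) for v in text.strip('[]').split(',')], dtype=np.float32)
```

With the real library, register_vector (or the ORM column types) installs equivalent conversions on the connection, so vectors can be bound as query parameters and read back as native Python objects.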
Tips from the expert
Perry Clark
Professional Services Consultant
Perry Clark is a seasoned open source consultant with NetApp. Perry is passionate about delivering high-quality solutions and has a strong background in various open source technologies and methodologies, making him a valuable asset to any project.
In my experience, here are tips that can help you better apply and optimize pgvector with Python in real-world ML and AI workflows:
- Use typed NumPy arrays to prevent hidden type casting: Always ensure vectors passed into queries are float32 NumPy arrays. This avoids implicit conversion by the driver, improves memory efficiency, and ensures compatibility with pgvector's expected binary format.
- Bulk load embeddings using COPY with binary mode: For high-volume ingestion, use the COPY command in binary mode (via psycopg or asyncpg) rather than INSERT. This drastically reduces load time for large embedding datasets, especially during ETL or inference stages.
- Pre-normalize embeddings for cosine similarity outside the DB: Normalize vectors in Python before inserting when using cosine similarity. This allows PostgreSQL to skip normalization at query time, speeding up searches and ensuring consistent distance behavior across libraries.
- Isolate embedding logic with repository/service patterns: Abstract pgvector access into repository or service layers in your Python application. This encapsulates vector indexing/querying logic, improves testability, and allows easier substitution if the backend changes (e.g., switching to a vector store).
- Run end-to-end tests with synthetic embedding generators: Use tools like faker and sentence-transformers to simulate real embedding workflows for testing. This ensures your vector pipelines behave consistently under various conditions without needing full ML inference.
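The pre-normalization tip can be sketched as follows: once each float32 embedding is scaled to unit length in Python, cosine distance collapses to 1 minus the inner product, so the database no longer needs to divide by vector norms at query time:

```python
import numpy as np

def normalize(batch):
    """L2-normalize a batch of embeddings row by row, as float32."""
    batch = np.asarray(batch, dtype=np.float32)
    return batch / np.linalg.norm(batch, axis=1, keepdims=True)

vectors = normalize(np.random.rand(4, 8))

# every row now has unit length, so for any two rows
# cosine distance == 1 - inner product (up to float32 precision)
a, b = vectors[0], vectors[1]
cosine_distance = 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
```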
Quick tutorial: Getting started with pgvector-python and supported libraries
1. Using Django
Start by enabling the pgvector extension with a migration:
```python
from django.db import migrations
from pgvector.django import VectorExtension

class Migration(migrations.Migration):
    operations = [
        VectorExtension()
    ]
```
Define a model with a vector field:
```python
from django.db import models
from pgvector.django import VectorField

class Item(models.Model):
    embedding = VectorField(dimensions=3)
```
Insert data and run similarity search:
```python
item = Item(embedding=[1, 2, 3])
item.save()

from pgvector.django import L2Distance

neighbors = Item.objects.order_by(L2Distance('embedding', [3, 1, 2]))[:5]
```
You can also apply filters, compute averages, or define approximate indexes like HNSW and IVFFlat directly in the model’s Meta class.
2. Using SQLAlchemy
Enable the extension:
```python
session.execute(text('CREATE EXTENSION IF NOT EXISTS vector'))
```
Define a mapped class:
```python
from pgvector.sqlalchemy import Vector

class Item(Base):
    embedding = mapped_column(Vector(3))
```
Insert and query:
```python
item = Item(embedding=[1, 2, 3])
session.add(item)
session.commit()

neighbors = session.scalars(
    select(Item).order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5)
)
```
To speed up queries, add an approximate index:
```python
from sqlalchemy import Index

index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='hnsw',
    postgresql_with={'m': 16, 'ef_construction': 64},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)
index.create(engine)
```
3. Using Psycopg
The snippets below use Psycopg 3; pgvector-python also provides a pgvector.psycopg2 module for the older psycopg2 driver. Enable the extension:
```python
conn.execute('CREATE EXTENSION IF NOT EXISTS vector')
```
Register vector types:
```python
from pgvector.psycopg import register_vector

register_vector(conn)
```
Create and use a table:
```python
import numpy as np

conn.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')
conn.execute('INSERT INTO items (embedding) VALUES (%s)', (np.array([1, 2, 3]),))

neighbors = conn.execute(
    'SELECT * FROM items ORDER BY embedding <-> %s LIMIT 5',
    (np.array([3, 1, 2]),)
).fetchall()
```
Add an index for performance:
```python
conn.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
```
4. Using asyncpg
Enable the extension and register vector types asynchronously:
```python
await conn.execute('CREATE EXTENSION IF NOT EXISTS vector')

from pgvector.asyncpg import register_vector

await register_vector(conn)
```
For connection pools, initialize with:
```python
async def init(conn):
    await register_vector(conn)

pool = await asyncpg.create_pool(..., init=init)
```
Create a table and insert vectors:
```python
import numpy as np

await conn.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')

embedding = np.array([1, 2, 3])
await conn.execute('INSERT INTO items (embedding) VALUES ($1)', embedding)
```
Query for nearest neighbors:
```python
await conn.fetch('SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 5', embedding)
```
Add an index:
```python
await conn.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
# or
await conn.execute('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')
```
5. Using pg8000
Enable the extension and register types:
```python
conn.run('CREATE EXTENSION IF NOT EXISTS vector')

from pgvector.pg8000 import register_vector

register_vector(conn)
```
Create a table and insert data:
```python
import numpy as np

conn.run('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')

embedding = np.array([1, 2, 3])
conn.run('INSERT INTO items (embedding) VALUES (:embedding)', embedding=embedding)
```
Search for nearest vectors:
```python
conn.run('SELECT * FROM items ORDER BY embedding <-> :embedding LIMIT 5', embedding=embedding)
```
Create an index for performance:
```python
conn.run('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
# or
conn.run('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')
```
6. Using Peewee
Add a vector field to your model:
```python
from pgvector.peewee import VectorField

class Item(BaseModel):
    embedding = VectorField(dimensions=3)

    class Meta:
        indexes = (
            (('embedding',), False, 'USING hnsw (embedding vector_l2_ops)'),
        )
```
Insert and query:
```python
item = Item.create(embedding=[1, 2, 3])
neighbors = Item.select().order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5)
```
Other supported distance metrics include cosine, inner product, and Hamming.
Filter by distance:
```python
Item.select().where(Item.embedding.l2_distance([3, 1, 2]) < 5)
```
Compute averages:
```python
from peewee import fn

avg_vector = Item.select(fn.avg(Item.embedding).coerce(True)).scalar()
```
7. Using SQLModel
Enable the extension and define your model:
```python
session.exec(text('CREATE EXTENSION IF NOT EXISTS vector'))

from typing import List, Optional

from pgvector.sqlalchemy import Vector
from sqlmodel import Field, SQLModel

class Item(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    embedding: List[float] = Field(sa_type=Vector(3))
```
Insert and search:
```python
item = Item(embedding=[1, 2, 3])
session.add(item)
session.commit()

neighbors = session.exec(
    select(Item).order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5)
)
```
Filter and aggregate:
```python
session.exec(select(Item).filter(Item.embedding.l2_distance([3, 1, 2]) < 5))

from pgvector.sqlalchemy import avg

session.exec(select(avg(Item.embedding))).first()
```
Add approximate indexes:
```python
from sqlmodel import Index

index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='hnsw',
    postgresql_with={'m': 16, 'ef_construction': 64},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)
index.create(engine)
```
These integrations make it easy to bring vector search into your existing Python stack without rewriting your database layer or switching to specialized infrastructure.
Unlocking AI potential with Instaclustr for PostgreSQL and pgvector
Vector databases are quickly becoming the backbone of modern AI applications. If you are building generative AI, recommendation engines, or semantic search tools, you need a way to store and query high-dimensional data efficiently. That is exactly where pgvector comes in. By turning the trusted PostgreSQL database into a powerful vector store, pgvector allows you to keep your operational data and your vector embeddings in one place. But managing this at scale can be tricky. This is where Instaclustr for PostgreSQL steps in to simplify the complexity.
Instaclustr takes the heavy lifting out of deploying and managing PostgreSQL with pgvector. We provide a fully managed environment that ensures your database is always performant, secure, and ready to handle the demands of machine learning workloads. Instead of wrestling with infrastructure configurations or worrying about uptime, your team can focus on building the innovative AI features that drive your business forward. We handle the provisioning, monitoring, and automated maintenance so you don’t have to.
Scalability is often the biggest hurdle when moving AI projects from prototype to production. As your dataset of embeddings grows, query performance can suffer if the underlying infrastructure isn’t optimized. Instaclustr for PostgreSQL is built to scale with you. Our platform ensures that as your vector data expands, your search latency remains low and your reliability remains high. We offer robust replication strategies and high-availability configurations that are essential for mission-critical applications relying on real-time similarity search.
Choosing Instaclustr means you aren’t just getting a database; you’re gaining a partner dedicated to your success. We combine the flexibility of open source technology with enterprise-grade support. With our expertise backing your PostgreSQL deployment, you can confidently leverage pgvector to unlock new insights and capabilities, knowing that the foundation of your data architecture is solid, secure, and built for the future.