Late interaction embedding support

Mar 14, 2026
Confidence Score: 54%

Problem

Hi, I wanted to ask a question: does pgvector currently support working with late interaction text embeddings, like the ones that come from a ColBERT model for example? This is an example of a vector that I would be referring to: [code block] Currently when inserting such a vector I seem to get `ValueError: expected ndim to be 1`, with this usage: [code block]

Error Output

ValueError: expected ndim to be 1

1 Fix

Enable Late Interaction Embedding Support in pgvector

Medium Risk

The error `ValueError: expected ndim to be 1` occurs because the pgvector Python client serializes each value for a `vector` column from a one-dimensional array. Late interaction models such as ColBERT produce one vector per token, so a single text yields a two-dimensional array of shape (tokens, dimensions), which the client rejects. Flattening the array to one dimension before insertion removes the error. Note that the flattened length is tokens × dimensions, so it must match the dimension declared on the column, and the per-token structure is no longer visible to the database.
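To make the shape mismatch concrete, here is a minimal sketch that uses random numbers in place of real ColBERT output. The (32, 128) shape and the `require_1d` helper are assumptions for illustration; the helper only mimics the kind of check that produces the error and is not the client's actual code.

```python
import numpy as np

def require_1d(value):
    """Hypothetical stand-in for the client's dimensionality check."""
    arr = np.asarray(value)
    if arr.ndim != 1:
        raise ValueError("expected ndim to be 1")
    return arr

# Stand-in for a late interaction embedding: one 128-dim vector per token.
# The (32, 128) shape is an assumption, not taken from a real model.
rng = np.random.default_rng(0)
token_embeddings = rng.random((32, 128), dtype=np.float32)

try:
    require_1d(token_embeddings)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

flat = require_1d(token_embeddings.flatten())
print(error_message)  # the 2-D array is rejected
print(flat.shape)     # the flattened array holds 32 * 128 values
```

The flattened array passes the check, but its length depends on the token count, so texts of different lengths would produce vectors of different sizes.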

Awaiting Verification

  1.

    Flatten the Embedding Vector

    Before inserting the embedding, flatten it to a one-dimensional array. NumPy's `ndarray.flatten()` returns a one-dimensional copy in row-major order, so the token vectors end up concatenated one after another.

    python
    import numpy as np
    
    # Example of flattening a multi-dimensional vector
    embedding_vector = np.array([[0.1, 0.2], [0.3, 0.4]])
    flattened_vector = embedding_vector.flatten()
  2.

    Insert Flattened Vector into pgvector

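    Once the vector is flattened, insert it into a table with a `vector` column through your PostgreSQL driver. The pgvector Python package does not expose a top-level `insert` function. The example assumes psycopg 3 with `register_vector`, a database where the `vector` extension is already enabled, and a hypothetical connection string and `items` table whose `embedding` column is declared as `vector(4)`.

    python
    import psycopg
    from pgvector.psycopg import register_vector
    
    # Assumed setup: hypothetical DSN and table items(embedding vector(4))
    conn = psycopg.connect("dbname=example", autocommit=True)
    register_vector(conn)
    conn.execute("INSERT INTO items (embedding) VALUES (%s)", (flattened_vector,))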
  3.

    Test Insertion with Sample Data

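    Create a test case with a known multi-dimensional embedding and verify that the insertion completes without raising an error. The flattened length must equal the dimension declared on the column (4 in this example).

    python
    test_vector = np.array([[0.1, 0.2], [0.3, 0.4]])
    flattened_test_vector = test_vector.flatten()
    assert flattened_test_vector.ndim == 1
    # Assumes `conn` is a psycopg connection with register_vector(conn) applied
    # and a hypothetical table items(embedding vector(4))
    conn.execute("INSERT INTO items (embedding) VALUES (%s)", (flattened_test_vector,))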
  4.

    Update Documentation for Future Reference

    Document the changes made to support late interaction embeddings, including the flattening process. This will help future developers understand the requirement for one-dimensional vectors when using pgvector.
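The steps above can be sketched as one helper that flattens an embedding and checks it against the column dimension before anything is sent to the database. The names `prepare_for_vector_column`, `to_vector_literal`, and `column_dim` are hypothetical and not part of pgvector. The bracketed text form is the literal syntax pgvector accepts for a `vector` value.

```python
import numpy as np

def prepare_for_vector_column(embedding, column_dim):
    """Flatten an embedding and check it fits a vector(column_dim) column.

    Hypothetical helper: the name and the column_dim parameter are
    illustrative and not part of pgvector.
    """
    flat = np.asarray(embedding, dtype=np.float32).flatten()
    if flat.size != column_dim:
        raise ValueError(
            f"flattened length {flat.size} does not match column dimension {column_dim}"
        )
    return flat

def to_vector_literal(flat):
    """Format a 1-D array as bracketed text, e.g. [0.5,1.0]."""
    return "[" + ",".join(str(float(x)) for x in flat) + "]"

# Values chosen to be exactly representable in float32.
embedding_vector = np.array([[0.5, 0.25], [1.0, 2.0]])
flat = prepare_for_vector_column(embedding_vector, column_dim=4)
literal = to_vector_literal(flat)
print(literal)  # [0.5,0.25,1.0,2.0]
```

Checking the length up front turns a dimension mismatch into a clear Python error. This matters when token counts vary between texts, because each one flattens to a different length.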

Validation

To confirm the fix worked, attempt to insert a late interaction embedding vector into pgvector after flattening it. If the insertion is successful without any errors, the fix is validated. Additionally, run unit tests that include various embedding shapes to ensure robustness.
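A minimal sketch of such a shape test, which needs no database, might look like the following. The shapes are illustrative assumptions: a plain vector, a token-by-dimension matrix, and a batch of such matrices.

```python
import numpy as np

# Illustrative shapes, not taken from a real model.
shapes = [(8,), (4, 8), (2, 4, 8)]

results = []
for shape in shapes:
    original = np.arange(np.prod(shape), dtype=np.float32).reshape(shape)
    flat = original.flatten()
    # Flattening must yield one dimension and keep every element
    # in row-major order, so reshaping restores the original.
    assert flat.ndim == 1
    assert flat.size == original.size
    assert np.array_equal(flat.reshape(shape), original)
    results.append((shape, flat.shape))

print(results)
```

This only covers the flattening step. The insertion itself still has to be exercised against a real table.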

Submitted by Alex Chen (2450 rep)

Tags

pgvector, embeddings, vector-search