
Question Setting ef_search to different values does not affect number of results retrieved (Django 4.2 + PGVector + HNSW index)

Mar 14, 2026

Problem

Hi all, I have this simple Django model setup: [code block]

I am trying to do a basic cosine vector search on the model with a simple question/embedding, using the following code: [code block]

No matter what I set ef_search to (1, 40, 60, or 100), I always get the same number of results. What is the correct way to set ef_search?

Regards, Rob

1 Fix

Adjust ef_search Parameter for Effective Vector Search in Django with PGVector

Medium Risk

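In pgvector, hnsw.ef_search is a session setting that controls the size of the candidate list the HNSW index keeps while searching (40 by default). It mainly trades recall for speed; it is not a result-count parameter. An index scan returns at most about ef_search rows, so the count only changes when the query's LIMIT is larger than ef_search. If the count stays the same even with ef_search set to 1, the likely causes are that the setting never reaches the session that runs the query (it has to be applied with SET hnsw.ef_search or SET LOCAL hnsw.ef_search on the same database connection, before the query), or that the planner is not using the HNSW index at all.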
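To make the interaction between LIMIT and ef_search concrete, here is a deliberately simplified model. It is not pgvector's code: it assumes an HNSW index scan yields at most ef_search candidates, that iterative scans and WHERE filters are not involved, and the function name is hypothetical.

```python
def expected_row_count(limit, ef_search, table_rows, index_used=True):
    """Simplified model of how many rows an ORDER BY ... LIMIT query returns."""
    if not index_used:
        # A sequential scan is exact, so hnsw.ef_search plays no role.
        return min(limit, table_rows)
    # Assumption: an HNSW index scan yields at most ef_search candidates.
    return min(limit, ef_search, table_rows)


# With LIMIT 10, only an ef_search below 10 changes the count in this model.
for ef in (1, 40, 60, 100):
    print(ef, expected_row_count(limit=10, ef_search=ef, table_rows=5000))
```

Under this model, getting the same count at ef_search = 1 as at 100 suggests the setting is not being applied or the index is not being used.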

  1. Verify HNSW Index Configuration

    Ensure that an HNSW index exists on the vector field and that its operator class matches the distance you query with. For cosine distance that is vector_cosine_ops; an index built with a different operator class, such as vector_l2_ops, will not be used for a cosine query, and ef_search will then have no effect.

    sql
    CREATE INDEX ON your_table USING hnsw (your_vector_column vector_cosine_ops) WITH (m = 16, ef_construction = 200);
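  2. Set ef_search on the Connection Before Querying

    ef_search is not part of the query itself, so it cannot be passed through the ORM's filter() or extra(). Set it with a SQL statement on the same connection that runs the search. SET LOCAL limits the change to the current transaction, so wrap both statements in transaction.atomic() and evaluate the queryset inside that block.

    python
    from django.db import connection, transaction
    from pgvector.django import CosineDistance

    with transaction.atomic(), connection.cursor() as cursor:
        # SET LOCAL applies only to this transaction, on this connection
        cursor.execute("SET LOCAL hnsw.ef_search = 60")
        results = list(
            YourModel.objects.order_by(CosineDistance('your_vector_column', your_embedding))[:10]
        )
  3. Test with Different ef_search Values

    Run the search with varying ef_search values (e.g., 1, 40, 60, 100) and a LIMIT larger than the biggest value, then compare the number of rows returned. With a LIMIT of 10, any ef_search of 10 or more is expected to return the same 10 rows.

    python
    for ef in [1, 40, 60, 100]:
        with transaction.atomic(), connection.cursor() as cursor:
            cursor.execute(f"SET LOCAL hnsw.ef_search = {int(ef)}")
            results = list(YourModel.objects.order_by(CosineDistance('your_vector_column', your_embedding))[:200])
        print(f'ef_search: {ef}, results: {len(results)}')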
  4. Check the Query Plan

    Inspect the execution plan for the search. It should show an Index Scan on the HNSW index; if it shows a sequential scan followed by a sort, the search is exact and ef_search is ignored. This can happen on small tables or when the query's operator does not match the index's operator class. Note that cosine distance uses the <=> operator; <-> is L2 distance.

    sql
    EXPLAIN ANALYZE SELECT * FROM your_table ORDER BY your_vector_column <=> your_embedding LIMIT 10;
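The steps above can be sketched as two small helpers that do not need a database to try out: one builds the SET statement with basic validation, the other checks EXPLAIN output for an index scan. Both function names, the index name, and the sample plan text are hypothetical, and the 1 to 1000 range is an assumption about the values pgvector accepts for hnsw.ef_search.

```python
def ef_search_statement(ef_search, local=True):
    """Build the SET statement for hnsw.ef_search, rejecting non-integers."""
    if isinstance(ef_search, bool) or not isinstance(ef_search, int):
        raise TypeError("ef_search must be an int")
    # Assumption: pgvector accepts values from 1 to 1000 for this setting.
    if not 1 <= ef_search <= 1000:
        raise ValueError("ef_search must be between 1 and 1000")
    scope = "SET LOCAL" if local else "SET"
    return f"{scope} hnsw.ef_search = {ef_search}"


def plan_uses_index(plan_text, index_name):
    """Return True if an EXPLAIN plan mentions an index scan on index_name."""
    return any(
        "Index Scan" in line and index_name in line
        for line in plan_text.splitlines()
    )


# Hypothetical plan text, shaped like PostgreSQL's EXPLAIN output.
sample_plan = """Limit
  ->  Index Scan using your_table_embedding_idx on your_table
        Order By: (your_vector_column <=> '[0.1,0.2]'::vector)"""

print(ef_search_statement(60))
print(plan_uses_index(sample_plan, "your_table_embedding_idx"))
```

In a Django project the statement would be passed to cursor.execute() inside transaction.atomic(), and the plan text would come from running EXPLAIN through the same cursor.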

Validation

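Set LIMIT above ef_search (for example, LIMIT 200 with ef_search at 40) and confirm that the row count now tracks ef_search. With a LIMIT at or below ef_search the count should stay constant, which is expected; in that case, compare the results against an exact search to see the effect on recall. Also check the query execution plan to confirm that the HNSW index is being used.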
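Since ef_search mostly affects which rows come back rather than how many, recall against an exact search is a more telling check than the row count. Below is a minimal sketch of that comparison; the function name and the id lists are made up for illustration, and the exact ids are assumed to come from the same query run without the index (for example, inside a transaction with SET LOCAL enable_indexscan = off).

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k ids that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


# Made-up primary keys: the exact top 5 versus what an HNSW search returned.
exact = [11, 7, 42, 3, 19]
approx = [11, 7, 3, 25, 19]
print(recall_at_k(approx, exact))
```

Recall that rises as ef_search increases, with the row count unchanged, indicates the setting is taking effect.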

Submitted by

Alex Chen

2450 rep

Tags

pgvector, embeddings, vector-search