Question Setting ef_search to different values does not affect number of results retrieved (Django 4.2 + PGVector + HNSW index)
Problem
Hi all, I have this simple Django model setup: [code block] I am trying to do a basic vector search (cosine) on the model with a simple question/embedding, using the following code: [code block] No matter what I set ef_search to, whether it is 1, 40, 60, or 100, I always get the same number of results. What is the correct way to set ef_search? Regards, Rob
1 Fix
Adjust ef_search Parameter for Effective Vector Search in Django with PGVector
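In pgvector, ef_search is the PostgreSQL setting hnsw.ef_search (default 40). It sets the size of the dynamic candidate list that an HNSW index scan keeps while searching, so higher values trade speed for better recall. It is not a queryset option: it has to be set on the database session with SET hnsw.ef_search = <value> (or SET LOCAL inside a transaction) before the query runs. It changes the number of rows returned only indirectly. Unless iterative index scans (pgvector 0.8.0 and later) are enabled, an HNSW index scan returns at most about ef_search rows, so the count drops only when the query's LIMIT is larger than ef_search. If the setting never reaches the session, if the query lacks an ORDER BY on the distance with a LIMIT, or if the planner chooses a sequential scan (common on small tables), changing ef_search has no visible effect.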
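Because the setting lives on the database session rather than on the queryset, one way to scope it is a small context manager around a cursor. This is a minimal sketch, assuming a DB-API style cursor on a PostgreSQL connection with pgvector installed (for example the one returned by Django's connection.cursor()). The helper name hnsw_ef_search is hypothetical, and the 1 to 1000 range check is an assumption based on pgvector's documented limits.

```python
from contextlib import contextmanager


@contextmanager
def hnsw_ef_search(cursor, value):
    """Set hnsw.ef_search for the session behind `cursor`, then reset it.

    `cursor` is any DB-API cursor on a PostgreSQL connection.
    The helper name and the range check are illustrative assumptions.
    """
    value = int(value)
    if not 1 <= value <= 1000:  # assumed pgvector limits for hnsw.ef_search
        raise ValueError(f"ef_search out of range: {value}")
    # set_config() takes the value as text, so it can be passed as a bound
    # parameter, which a plain SET statement may not accept.
    cursor.execute(
        "SELECT set_config('hnsw.ef_search', %s, false)", [str(value)]
    )
    try:
        yield value
    finally:
        # Return the setting to the session's default value.
        cursor.execute("RESET hnsw.ef_search")
```

With Django this could be used as: with connection.cursor() as cur, hnsw_ef_search(cur, 100): rows = list(queryset[:10]). The queryset has to be evaluated inside the block so that it runs while the setting is in effect on that connection.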
- 1
Verify HNSW Index Configuration
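Make sure an HNSW index exists on the vector column and that its operator class matches the distance operator used in the query. For cosine distance (the <=> operator) that is vector_cosine_ops. An index built with a different operator class cannot serve a cosine query, and without an index scan ef_search is never consulted. The m and ef_construction options affect how the graph is built, not how many rows a search returns.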
sql
CREATE INDEX ON your_table USING hnsw (your_vector_column vector_cosine_ops) WITH (m = 16, ef_construction = 200);
- 2
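Set ef_search on the Database Session
ef_search cannot be passed through the queryset; adding it with .extra(select=...) only adds a constant column to each row. Execute SET hnsw.ef_search on the same connection before evaluating the queryset, and order by cosine distance with a slice (LIMIT) so that the HNSW index can be used.
python
from django.db import connection
from pgvector.django import CosineDistance

# ef_search is a session setting, so set it on the connection before querying.
with connection.cursor() as cursor:
    cursor.execute("SET hnsw.ef_search = 60")

results = (
    YourModel.objects
    .annotate(distance=CosineDistance("your_vector_column", your_embedding))
    .order_by("distance")[:10]
)
- 3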
Test with Different ef_search Values
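Run the search with several ef_search values (e.g., 1, 40, 60, 100) and a LIMIT larger than the largest value, then compare the number of rows returned. If the index is used, the setting is applied, and the table holds enough rows, the count should follow ef_search. With a small LIMIT such as 10, only ef_search values below the LIMIT change the count.
python
from django.db import connection
from pgvector.django import CosineDistance

for ef in [1, 40, 60, 100]:
    with connection.cursor() as cursor:
        cursor.execute(f"SET hnsw.ef_search = {int(ef)}")
    # Slice above the largest ef_search so the cap on returned rows is visible.
    results = list(
        YourModel.objects.order_by(
            CosineDistance("your_vector_column", your_embedding)
        )[:200]
    )
    print(f"ef_search: {ef}, results: {len(results)}")
- 4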
Check for Query Optimization
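Check the execution plan to confirm that the query actually uses the HNSW index. PostgreSQL does not cache query results, so a constant row count usually has another cause: the planner picked a sequential scan (typical for small tables), the ORDER BY uses a different operator than the index's operator class, or a WHERE clause filters rows after the index scan.
sql
-- Replace your_embedding with a vector literal such as '[0.1, 0.2, 0.3]'.
EXPLAIN ANALYZE SELECT * FROM your_table ORDER BY your_vector_column <=> your_embedding LIMIT 10;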
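The plan check can also be scripted. The sketch below assumes the plan is available as plain text lines in PostgreSQL's default EXPLAIN format (for instance from Django's QuerySet.explain()); the function name plan_uses_index and the index name in the usage note are hypothetical.

```python
import re


def plan_uses_index(plan_lines, index_name=None):
    """Return True if the EXPLAIN output contains an index scan node.

    `plan_lines` is an iterable of text lines from EXPLAIN.
    If `index_name` is given, the scan must be on that index.
    """
    pattern = re.compile(r"Index Scan using (\S+) on ")
    for line in plan_lines:
        match = pattern.search(line)
        if match and (index_name is None or match.group(1) == index_name):
            return True
    return False
```

For example, plan_uses_index(queryset.explain().splitlines(), "your_hnsw_index") returning False would suggest the query is not going through the HNSW index, in which case ef_search cannot influence it.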
Validation
Confirm that, with a LIMIT above ef_search, the number of rows returned follows the ef_search value. With a LIMIT at or below ef_search the count stays the same and only recall and latency change, which is expected. Also check the execution plan for an Index Scan on the HNSW index rather than a Seq Scan.
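As a rough mental model of what to expect, the row count can be bounded as follows. This is a simplified sketch, not pgvector's actual logic: it assumes no WHERE filter and no iterative index scans, and the function name max_rows_returned is hypothetical.

```python
def max_rows_returned(limit, ef_search, table_rows, index_used=True):
    """Upper bound on rows from an ORDER BY <distance> LIMIT query (simplified).

    Assumes no WHERE clause and no iterative index scans.
    """
    if not index_used:
        # A sequential scan ignores ef_search entirely.
        return min(limit, table_rows)
    # An HNSW index scan keeps at most ef_search candidates.
    return min(limit, ef_search, table_rows)


for ef in (1, 40, 60, 100):
    small = max_rows_returned(limit=10, ef_search=ef, table_rows=5000)
    large = max_rows_returned(limit=200, ef_search=ef, table_rows=5000)
    print(f"ef_search={ef}: LIMIT 10 -> {small}, LIMIT 200 -> {large}")
```

Under these assumptions, a LIMIT of 10 only shows a difference for ef_search = 1, while a LIMIT of 200 shows a different bound for every value. If the index is not used, the bound is the same for all values, which matches the symptom described in the question.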
Submitted by
Alex Chen
2450 rep