HNSW testing

Mar 14, 2026

Problem

_Originally posted by @alanwli in https://github.com/pgvector/pgvector/issues/181#issuecomment-1693821662_

@ankane, not sure if you've run into this in your testing. When I ran it with 86c29b3bf038de50bb2aec21b6d896823ff1fbbe on a use case, I hit what looks like a race condition in the index. I simplified it down to the following Python script: a table holds 1k vectors, and an insert+delete workload always keeps the table at 1k vectors. But when I run an index scan with ef_search set to 1k, the index does not return all 1k vectors. On quite a number of occasions it is missing 10%+ of what should be there, and the worst I saw was ~24% missing. Note that I don't see this with ivfflat (uncomment the ivfflat line). Is this expected?

[code block]

Example output:

[code block]

1 Fix

Fix Race Condition in HNSW Indexing for pgvector

Medium Risk

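The missing rows show up when inserts and deletes modify the HNSW index while index scans are running, so a scan can observe index state that does not match the table. Note that PostgreSQL serves each connection from its own backend process, so any interference happens between processes rather than between threads inside the index. A likely contributor is that deleted vectors remain in the index until they are vacuumed: an HNSW scan considers at most hnsw.ef_search candidates, and the pgvector documentation notes that dead tuples can reduce the number of results returned. That may explain why fewer than 1k rows come back even with ef_search set to 1k.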
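One way to see how an insert+delete workload could shrink the result set is a toy model. The sketch below is an illustration under stated assumptions, not pgvector's actual scan code: it assumes the scan collects at most `ef_search` index entries, that deleted entries still occupy slots until they are cleaned up, and that non-visible rows are dropped afterwards. The names `simulate_scan` and `missing_fraction` are hypothetical.

```python
import random


def simulate_scan(live, dead, ef_search, seed=0):
    """Toy model: pick at most ef_search index entries, then drop dead ones.

    Returns the number of live rows the scan would hand back.
    """
    rng = random.Random(seed)
    # Each entry is tagged as live (row still visible) or dead (row deleted).
    entries = ["live"] * live + ["dead"] * dead
    rng.shuffle(entries)
    candidates = entries[:ef_search]
    return sum(1 for kind in candidates if kind == "live")


def missing_fraction(live, dead, ef_search, seed=0):
    """Fraction of live rows that the modeled scan fails to return."""
    returned = simulate_scan(live, dead, ef_search, seed)
    return 1 - returned / live


if __name__ == "__main__":
    for dead in (0, 100, 300):
        frac = missing_fraction(live=1000, dead=dead, ef_search=1000)
        print(f"dead entries={dead}: {frac:.1%} of live rows missing")
```

In this model the missing fraction is zero when there are no dead entries and grows as dead entries accumulate, which is at least consistent with the report that the loss appears only under a delete-heavy workload.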

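  1. Implement Locking Mechanism

    Serialize index modifications and scans in the test client so that no scan runs while an insert or delete is in progress. Within a single Python process this can be done with a threading lock. The lock only coordinates threads in that process; it does not affect other database connections.

    python
    import threading
    
    # Shared by the writer and reader threads of this process only
    index_lock = threading.Lock()
    
    def run_locked(operation):
        """Run an insert, delete, or scan callable while holding the lock."""
        with index_lock:
            return operation()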
    
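  2. Increase ef_search Parameter

    If your test uses a low value, temporarily raise ef_search to see whether the issue persists. This helps diagnose whether the missing rows come from a candidate list that is too small. In pgvector the setting is hnsw.ef_search and its maximum is 1000, the value the original report already used, so it cannot be raised beyond that.

    python
    ef_search = 1000  # Maximum accepted by hnsw.ef_search; 2000 would be rejected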
  3. Add Logging for Index State

    Log the state of the index before and after each modification, for example the table row count next to the number of rows an index scan returns. This makes it easier to see when the two diverge.

    python
    import logging
    
    logging.basicConfig(level=logging.INFO)
    
    # Hypothetical snapshot; replace with values queried from your database
    index_state = {"table_rows": 1000, "index_scan_rows": 1000}
    logging.info('Index state before modification: %s', index_state)
    
  4. Run Concurrent Tests

    Create a series of concurrent tests in which multiple threads perform inserts and deletes while index scans run at the same time. This helps reproduce the race condition reliably.

    python
    from concurrent.futures import ThreadPoolExecutor
    
    def concurrent_modifications():
        # Placeholder: run one batch of inserts and deletes here
        pass
    
    with ThreadPoolExecutor(max_workers=5) as executor:
        # Submit one task per worker so the modifications actually overlap
        futures = [executor.submit(concurrent_modifications) for _ in range(5)]
        for future in futures:
            future.result()  # Re-raise any exception from the worker
    
  5. Review and Optimize HNSW Parameters

    Review the HNSW build parameters, which pgvector exposes as the index options m and ef_construction (M and efConstruction in the HNSW literature), and check that they suit your workload. Adjusting them trades build time and index size against recall.

    python
    m = 16  # Max connections per layer (pgvector default)
    ef_construction = 200  # Candidate list size during build (pgvector default is 64)
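The locking and concurrent-test steps above can be combined into one small harness. The sketch below is only an illustration: `FakeIndex` is a hypothetical in-memory stand-in for the table and its index, not pgvector, and `run_workload` is an assumed helper name. In a real test, `replace_one` and `count` would issue the insert, delete, and index-scan queries against the database.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeIndex:
    """Hypothetical in-memory stand-in for a table of vectors plus its index."""

    def __init__(self, size):
        self._size = size
        self._ids = set(range(size))
        self._next_id = size
        self._lock = threading.Lock()

    def replace_one(self):
        # Insert one new id and delete the oldest one as a single locked step,
        # so the row count is the same before and after.
        with self._lock:
            self._ids.add(self._next_id)
            self._ids.remove(self._next_id - self._size)
            self._next_id += 1

    def count(self):
        # A "scan" takes the same lock, so it never sees a half-applied change.
        with self._lock:
            return len(self._ids)


def run_workload(index, writers=4, ops_per_writer=200, scans=200):
    """Run writer threads and one reader thread concurrently; return scan counts."""

    def writer():
        for _ in range(ops_per_writer):
            index.replace_one()

    def reader():
        return [index.count() for _ in range(scans)]

    with ThreadPoolExecutor(max_workers=writers + 1) as executor:
        writer_futures = [executor.submit(writer) for _ in range(writers)]
        reader_future = executor.submit(reader)
        for future in writer_futures:
            future.result()  # Surface any exception raised in a writer
        return reader_future.result()


if __name__ == "__main__":
    counts = run_workload(FakeIndex(1000))
    print("min rows seen:", min(counts), "max rows seen:", max(counts))
```

Because every scan holds the same lock as the writers, each count equals the table size. Removing the lock from `count` is a quick way to check that the harness can detect an inconsistent read at all.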

Validation

To confirm the fix worked, rerun the modified script with concurrent inserts and deletes while performing index scans. Check the logs for missing vectors and confirm that the index state is consistent across multiple runs. The percentage of missing vectors should drop well below the 10%+ (worst case ~24%) seen in the original report.
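To make "missing vector percentage" concrete across runs, a small helper can turn the row counts returned by each index scan into percentages. This is a minimal sketch; the function names `missing_percentage` and `summarize` are hypothetical, and the counts passed in are assumed to come from your own test runs.

```python
def missing_percentage(expected_rows, returned_rows):
    """Percentage of expected rows that an index scan did not return."""
    if expected_rows <= 0:
        raise ValueError("expected_rows must be positive")
    missing = max(expected_rows - returned_rows, 0)
    return 100.0 * missing / expected_rows


def summarize(expected_rows, returned_counts):
    """Summarize several scans: number of runs, worst and mean missing percentage."""
    percentages = [missing_percentage(expected_rows, c) for c in returned_counts]
    return {
        "runs": len(percentages),
        "worst": max(percentages),
        "mean": sum(percentages) / len(percentages),
    }


if __name__ == "__main__":
    # Example counts only; substitute the values your scans actually return.
    print(summarize(1000, [1000, 900, 760]))
```

Comparing the `worst` value before and after a change gives a single number to track against the ~24% worst case in the original report.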

Submitted by Alex Chen (2450 rep)

Tags

pgvector, embeddings, vector-search