
HNSW + Filter inconsistent result issue (based on HNSW index used vs unused)

Mar 14, 2026

Problem

Postgres returns an inconsistent result count depending on whether the HNSW index is used. I am using pgvector with an HNSW index defined as:

    USING hnsw (embedding vector_cosine_ops) WITH (m='16', ef_construction='32');

I am facing an issue with the sample query below:

    WITH filtered_opportunities AS (
        SELECT sf.notice_id
        FROM opportunity_filter sf
        WHERE sf.full_parent_path_name IN ('ABC SERVICE')
        -- AND has_related_award = true  -- line 2
    )
    SELECT sv.notice_id
    FROM semantic_vector sv
    JOIN filtered_opportunities fo ON fo.notice_id = sv.notice_id
    ORDER BY sv.embedding <=> cast('[-0.5078125]' AS vector)
    LIMIT 200

With a single filter condition, this query returns 30 rows. If I add a second AND condition (uncomment line 2), it returns 200 rows, even though adding more WHERE conditions should ideally shrink the result set.

The difference is in the plan. The first query uses the HNSW index, so Postgres fetches 1000 records from the vector table and then applies the filter on top of them, which reduces the result set. The second query does not use the HNSW index, so Postgres fetches more records while applying the filter at the same time.

This is a major issue in our application, because the filters behave differently depending on whether the HNSW index is used. Is this a Postgres bug? Is there any way to solve it?
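One way to confirm which plan each variant gets is to run it under `EXPLAIN`. The following is a minimal sketch reusing the table and column names from the question; the vector literal is the truncated placeholder from the post and would need to be replaced with a full-dimension embedding before it runs.

```sql
-- Sketch only: '[-0.5078125]' is a placeholder, not a real embedding.
EXPLAIN (ANALYZE, BUFFERS)
WITH filtered_opportunities AS (
    SELECT sf.notice_id
    FROM opportunity_filter sf
    WHERE sf.full_parent_path_name IN ('ABC SERVICE')
    -- AND has_related_award = true  -- line 2
)
SELECT sv.notice_id
FROM semantic_vector sv
JOIN filtered_opportunities fo ON fo.notice_id = sv.notice_id
ORDER BY sv.embedding <=> cast('[-0.5078125]' AS vector)
LIMIT 200;
```

If the HNSW index is used, the plan should show an Index Scan on `semantic_vector` ordered by the `<=>` expression. If it is not, the plan typically shows an explicit Sort node above the join instead.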


1 Fix

Canonical Fix
Moderate Confidence Fix
84% confidence, 100% success rate, 1 verification, last verified Mar 14, 2026

Solution: HNSW + Filter inconsistent result issue (based on HNSW index used vs unused)

Low Risk

There is a long discussion about this here: https://github.com/pgvector/pgvector/issues/244. In short, the index returns its nearest candidates first and the filter is applied to them afterwards, so candidates that don't match the filters are discarded and fewer rows than the LIMIT can come back. In addition to ongoing development, there are a number of ways to handle this, including:

1. Use a different index for the filtering (e.g. B-tree [the default PostgreSQL index], GIN, etc.), chosen based on selectivity

2. Set `hnsw.ef_search` to a higher value to allow more results to be returned
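The two options above could look roughly like this. It is a sketch rather than a tuned configuration: the table and column names come from the question, while the index name and the `ef_search` value are illustrative assumptions.

```sql
-- Option 1: a B-tree index on the filter column, so the planner can
-- filter first when the condition is selective.
-- The index name is hypothetical.
CREATE INDEX opportunity_filter_path_idx
    ON opportunity_filter (full_parent_path_name);

-- Option 2: enlarge the HNSW candidate list for this transaction only.
-- 400 is an illustrative value; pgvector's default is 40.
BEGIN;
SET LOCAL hnsw.ef_search = 400;
-- ... run the filtered ORDER BY embedding <=> ... LIMIT 200 query here ...
COMMIT;
```

`SET LOCAL` limits the change to the current transaction, so other queries keep the default. A larger `hnsw.ef_search` leaves more candidates to survive the filter, at the cost of slower searches. It is worth choosing a value at least as large as the `LIMIT`.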

Validation

Resolved in pgvector/pgvector GitHub issue #671. Community reactions: 0 upvotes.


Submitted by Alex Chen (2450 rep)

Tags

pgvector, embeddings, vector-search