Interest in DiskANN

Question

Interest in DiskANN

Accepted Answer

My 2¢: Interested? Yes, at least academically.

I personally still think there is room to improve the current HNSW/IVFFLAT support in pgvector before adding another algorithm, esp. one that has more parameters for the user to tune. My personal list, which is influenced both by folks using and evaluating pgvector and by personal experimentation:

- Quantization, esp. product quantization (I do think we also need scalar quantization …)
- Better multi-column filtering techniques (see discussion in https://github.com/pgvector/pgvector/issues/244)
- Support different data types (https://github.com/pgvector/pgvector/tree/hnsw-array lays the groundwork for this -- supporting float2, uint8, etc. will help shrink down some index sizes)
- More parallelism
  - Parallel build for HNSW
  - Parallel query for both IVFFLAT/HNSW (more emphasis on IVFFLAT)
- Incorporating elements of SPANN into IVFFLAT (e.g. overlapping neighborhood searches)

...all while maintaining the relative simplicity of pgvector's implementation + usability.
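
To make the quantization and smaller-data-type points concrete, here is a minimal sketch of scalar quantization in plain Python. It is not pgvector code: the function names are hypothetical, and the fixed value range of [-1, 1] is an assumption made for illustration (a real implementation would have to pick or learn the range). It only shows why storing one uint8 per dimension instead of one 4-byte float cuts the vector payload to a quarter, at the cost of a bounded rounding error.

```python
import random
from array import array


def scalar_quantize(vec, lo, hi):
    """Map each float to one of 256 uint8 levels over [lo, hi]; out-of-range values are clamped."""
    scale = 255.0 / (hi - lo)
    return array("B", (min(255, max(0, round((x - lo) * scale))) for x in vec))


def dequantize(codes, lo, hi):
    """Approximately reconstruct the floats from their uint8 codes."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in codes]


random.seed(0)
dim = 128  # assumed dimensionality, for illustration only
vec = [random.uniform(-1.0, 1.0) for _ in range(dim)]

full = array("f", vec)                   # 4 bytes per dimension
codes = scalar_quantize(vec, -1.0, 1.0)  # 1 byte per dimension

approx = dequantize(codes, -1.0, 1.0)
max_err = max(abs(a - b) for a, b in zip(vec, approx))

full_bytes = full.itemsize * len(full)
code_bytes = codes.itemsize * len(codes)
print(f"float32: {full_bytes} bytes, uint8: {code_bytes} bytes")
print(f"max reconstruction error: {max_err:.5f}")
```

The rounding error per dimension is at most half a quantization step, which is why the range matters: a wider range means coarser steps. Product quantization goes further by encoding groups of dimensions against learned codebooks, which this sketch does not attempt.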
