OpenAI's text-embedding-3-large model not compatible with pgvector 0.5.1
Problem
OpenAI says its new "text-embedding-3-large" embedding model, which has 3072 dimensions, is its "new best performing model", with significantly higher scores than "ada v2". See: https://openai.com/blog/new-embedding-models-and-api-updates

I use only HNSW indexing with cosine distance. I gave it a try by modifying src/hnsw.h as follows:

-#define HNSW_MAX_DIM 2000
+#define HNSW_MAX_DIM 8192

But when I insert records, I get the following error:

Database error: failed to add index item to "xyz_cosine_idx"

The string "failed to add index item to" is referenced only in the following files:

./src/ivfbuild.c
./src/ivfinsert.c
./src/hnswinsert.c
./src/hnswvacuum.c
./src/hnswbuild.c

They point to BlockNumber- and OffsetNumber-based functions, which I'm not familiar with. If someone could develop a patch against pgvector 0.5.1, I can give it a try and report back here.

Best regards
Sagara
Error Output
error: Database error: failed to add index item to "xyz_cosine_idx"
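A likely reason that raising HNSW_MAX_DIM alone does not help is that an index item has to fit in a single PostgreSQL page. The sketch below is only back-of-the-envelope arithmetic, not pgvector code. It assumes the default 8 KB block size and the storage figure from the pgvector documentation (4 bytes per dimension plus 8 bytes). The function names are made up for illustration.

```python
# Rough size check: can one single-precision vector fit in one PostgreSQL page?
# Assumptions (not taken from the original post):
#   - PostgreSQL is built with its default 8192-byte block size
#   - a pgvector "vector" value takes 4 * dimensions + 8 bytes of storage

PAGE_SIZE_BYTES = 8192
VECTOR_HEADER_BYTES = 8
FLOAT4_BYTES = 4


def vector_storage_bytes(dimensions: int) -> int:
    """Approximate on-disk size of one vector value."""
    return FLOAT4_BYTES * dimensions + VECTOR_HEADER_BYTES


def fits_in_one_page(dimensions: int) -> bool:
    """Optimistic check: ignores the page header and per-tuple overhead,
    so the real limit is somewhat lower than this comparison suggests."""
    return vector_storage_bytes(dimensions) <= PAGE_SIZE_BYTES


if __name__ == "__main__":
    for dims in (1536, 2000, 3072):
        print(dims, vector_storage_bytes(dims), fits_in_one_page(dims))
```

Under these assumptions a 2000-dimension vector needs 8008 bytes and just fits, while a 3072-dimension vector needs 12296 bytes and cannot fit in a single page regardless of the value of HNSW_MAX_DIM.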
1 Fix
Solution: OpenAI's text-embedding-3-large model not compatible with pgvector 0.5.1
Yes, dimensions above 2,000 are still not supported. The decision to ignore the elephant in the room (OpenAI's text-embedding-3-large) is puzzling.
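One possible workaround, not proposed in the thread itself, is to shorten the embedding before storing it. The OpenAI blog post linked in the question says the new models can be shortened by dropping values from the end of the vector, either through the API's dimensions parameter or by hand. The sketch below does the by-hand version: truncate, then re-normalize to unit length so that cosine distance still behaves as expected. The helper name shorten_embedding and the default target of 2000 dimensions are assumptions for illustration, and the effect on retrieval quality is not measured here.

```python
import math
from typing import List, Sequence


def shorten_embedding(embedding: Sequence[float], target_dims: int = 2000) -> List[float]:
    """Keep the first target_dims values and rescale the result to unit L2 norm.

    Hypothetical helper: target_dims=2000 is chosen only because it matches
    the index limit discussed above.
    """
    if target_dims <= 0 or target_dims > len(embedding):
        raise ValueError("target_dims must be between 1 and len(embedding)")

    truncated = list(embedding[:target_dims])
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")

    return [x / norm for x in truncated]


if __name__ == "__main__":
    # Tiny stand-in vector; a real text-embedding-3-large vector has 3072 values.
    short = shorten_embedding([3.0, 4.0, 12.0], target_dims=2)
    print(len(short), short)
```

The column and its HNSW index would then be declared with the shortened dimension count instead of 3072.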
Validation
Resolved in pgvector/pgvector GitHub issue #442. Community reactions: 1 upvote.
Submitted by
Alex Chen
2450 rep