Support Vector Sum Aggregation
Problem
First, I love this extension and I'm very happy RDS started supporting it. Apologies if this already exists and I missed it, but it would be great to add support for an aggregate sum function. There are use cases where it is preferable to sum embedding vectors. It would also be very convenient to use this extension for general-purpose vector operations, rather than writing custom array_agg-based functions and aggregates.
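For context, the kind of workaround referred to above might look like the following sketch. It assumes pgvector's cast between vector and real[] is available; the items table and its embedding column are hypothetical names used only for illustration.

```sql
-- Hypothetical scratch table; not part of pgvector.
CREATE TEMP TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');

-- Sum per dimension by unnesting each vector, then reassemble the result.
SELECT array_agg(total ORDER BY idx)::vector AS embedding_sum
FROM (
    SELECT u.idx, sum(u.val) AS total
    FROM items
    CROSS JOIN LATERAL unnest(items.embedding::real[])
        WITH ORDINALITY AS u(val, idx)
    GROUP BY u.idx
) AS per_dimension;
```

This works, but every query has to repeat the unnest-and-regroup pattern, which is what a dedicated aggregate would avoid.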
1 Fix
Implement Aggregate Sum Function for Vector Operations
The current implementation of the pgvector extension does not support an aggregate sum function for embedding vectors, which limits its usability for certain applications that require summing vector embeddings. This is likely due to the design focus on individual vector operations rather than aggregate functions.
- 1
Define Aggregate Sum Function
Create a new aggregate function that can sum embedding vectors. This involves defining a state transition function that takes two vectors and returns their sum.
-- No INITCOND: a STRICT vector_add lets the first non-null row seed the state.
CREATE AGGREGATE vector_sum(vector) (
    SFUNC = vector_add,
    STYPE = vector
);
- 2
Implement Vector Addition Function
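Implement the vector_add function that will handle the addition of two vectors. This function should ensure that the vectors are of the same dimension and return their sum. If the installed pgvector version already provides a vector_add(vector, vector) function, CREATE FUNCTION will fail with a duplicate-function error; in that case reuse the existing function.
CREATE FUNCTION vector_add(v1 vector, v2 vector) RETURNS vector AS $$
BEGIN
    IF vector_dims(v1) <> vector_dims(v2) THEN
        RAISE EXCEPTION 'vector dimensions differ';
    END IF;
    -- vector has no subscripts, so add element-wise over real[] casts
    RETURN ARRAY(SELECT a + b FROM unnest(v1::real[], v2::real[])
                 WITH ORDINALITY AS t(a, b, i) ORDER BY i)::vector;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;
- 3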
Test the Aggregate Function
Run tests to ensure that the new vector_sum function works correctly. Create sample data and verify that the sum of vectors is calculated accurately.
SELECT vector_sum(v) FROM (VALUES ('[1,2,3]'::vector), ('[4,5,6]'::vector)) AS t(v);  -- expected: [5,7,9]
- 4
Document Usage
Update the documentation for the pgvector extension to include examples of how to use the new aggregate sum function for embedding vectors.
Validation
To confirm the fix worked, execute the vector_sum function with multiple embedding vectors and verify that the output matches the expected summed vector. Additionally, check the documentation for clarity and completeness.
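One way to run that check is sketched below. It assumes the vector_sum aggregate from step 1 is installed; the items table, its columns, and the expected values are hypothetical and used only for illustration.

```sql
-- Hypothetical scratch table; not part of pgvector.
CREATE TEMP TABLE items (category text, embedding vector(3));

INSERT INTO items (category, embedding) VALUES
    ('a', '[1,2,3]'),
    ('a', '[4,5,6]'),
    ('b', '[1,1,1]');

-- Each row should report matches = true if vector_sum adds element-wise.
SELECT s.category,
       s.total,
       s.total = e.expected AS matches
FROM (
    SELECT category, vector_sum(embedding) AS total
    FROM items
    GROUP BY category
) AS s
JOIN (VALUES ('a', '[5,7,9]'::vector),
             ('b', '[1,1,1]'::vector)) AS e(category, expected)
  USING (category)
ORDER BY s.category;
```

Grouping by category exercises the aggregate on both a multi-row group and a single-row group, which also covers the case where the first row alone seeds the state.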
Environment
Submitted by
Alex Chen
2450 rep