Manhattan distance

Mar 14, 2026
Confidence Score: 52%

Problem

Could Manhattan distance be added to this extension? I'm asking because if a maintainer thinks it's easy to add and is willing to do it, I won't need to work out how to add it myself (since Euclidean distance is already implemented, I'm guessing it might be almost trivial). Also, FYI, with 10 or more dimensions, L1 is often better suited than L2: the more dimensions you add, the smaller the differences in relative distances under L2. Beyond some radius, points tend to end up at similar distances as the dimensionality increases.
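
The concentration effect described above can be checked empirically. The following is a minimal sketch, not part of the extension: it assumes points drawn uniformly from the unit hypercube, and `minkowski_distance` and `relative_contrast` are hypothetical helper names introduced here. It measures how much farther the farthest sample is from a query point than the nearest one, (d_max - d_min) / d_min, under L1 (p=1) and L2 (p=2).

```python
import random


def minkowski_distance(a, b, p):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def relative_contrast(dim, p, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for random points around a random query.

    Assumption: coordinates are uniform in [0, 1); real embeddings may behave differently.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        minkowski_distance(query, [rng.random() for _ in range(dim)], p)
        for _ in range(n_points)
    ]
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the contrast for L1 and L2 side by side at several dimensions.
    for dim in (2, 10, 100, 1000):
        l1 = relative_contrast(dim, p=1)
        l2 = relative_contrast(dim, p=2)
        print(f"dim={dim:5d}  L1 contrast={l1:.3f}  L2 contrast={l2:.3f}")
```

On uniform data the contrast generally shrinks for both norms as the dimension grows. The exact figures depend on the sample, so treat the output as illustrative rather than as evidence about any particular dataset.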

Implement Manhattan Distance Functionality

Medium Risk

The extension currently supports only Euclidean distance (L2), which may not be optimal for high-dimensional data. Manhattan distance (the L1 norm, i.e. the sum of absolute coordinate differences) can be a better choice there, because L2 distances between points become increasingly similar as the number of dimensions grows, which makes near and far neighbors harder to tell apart.

  1.

    Define Manhattan Distance Function

    Create a new function to calculate the Manhattan distance between two points in n-dimensional space. This function will sum the absolute differences of their coordinates.

    python
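    def manhattan_distance(point1, point2):
        # L1 distance: sum of absolute coordinate differences.
        # strict=True (Python 3.10+) raises ValueError if the lengths differ.
        return sum(abs(a - b) for a, b in zip(point1, point2, strict=True))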
  2.

    Integrate Manhattan Distance into Existing Codebase

    Extend the existing distance calculation module with a Manhattan option. Keep the same call signature as the Euclidean distance function so that callers can switch metrics with a single argument.

    python
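    def calculate_distance(point1, point2, metric='euclidean'):
        # euclidean_distance is assumed to be the existing implementation.
        if metric == 'manhattan':
            return manhattan_distance(point1, point2)
        if metric == 'euclidean':
            return euclidean_distance(point1, point2)
        # Reject unknown metric names instead of silently falling back to Euclidean.
        raise ValueError(f"Unknown metric: {metric!r}")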
  3.

    Update Documentation

    Revise the documentation to include details about the new Manhattan distance function, including usage examples and performance considerations for high-dimensional data.

  4.

    Write Unit Tests

    Create unit tests to validate the correctness of the Manhattan distance implementation. Ensure tests cover various scenarios, including edge cases with different dimensions.

    python
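    def test_manhattan_distance():
        assert manhattan_distance([1, 2], [4, 6]) == 7
        assert manhattan_distance([0, 0, 0], [1, 1, 1]) == 3
        # Edge cases: identical points and negative coordinates.
        assert manhattan_distance([5, -3], [5, -3]) == 0
        assert manhattan_distance([-1, -2], [1, 2]) == 6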
  5.

    Deploy and Monitor

    Deploy the updated extension with the new functionality and monitor for any issues or feedback from users regarding the Manhattan distance calculations.
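
Steps 1 and 2 can also be combined into a small metric registry, so that adding another distance later is a one-line change. This is a hedged sketch only: the `METRICS` table, the `_check_same_length` helper, and the `euclidean_distance` body are assumptions made for illustration and do not reflect the extension's actual code.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def _check_same_length(a: Vector, b: Vector) -> None:
    """Raise ValueError when the two points have different dimensions."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")


def manhattan_distance(a: Vector, b: Vector) -> float:
    """L1 distance: sum of absolute coordinate differences."""
    _check_same_length(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a: Vector, b: Vector) -> float:
    """L2 distance (stand-in for the existing implementation)."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical registry mapping metric names to implementations.
METRICS: Dict[str, Callable[[Vector, Vector], float]] = {
    "euclidean": euclidean_distance,
    "manhattan": manhattan_distance,
}


def calculate_distance(a: Vector, b: Vector, metric: str = "euclidean") -> float:
    """Dispatch to the requested metric; unknown names raise ValueError."""
    try:
        distance_fn = METRICS[metric]
    except KeyError:
        raise ValueError(
            f"unknown metric {metric!r}; expected one of {sorted(METRICS)}"
        ) from None
    return distance_fn(a, b)
```

Compared with an if/else chain, the lookup table fails loudly on a misspelled metric name rather than quietly returning a Euclidean result.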

Validation

To confirm the fix worked, run the unit tests for the Manhattan distance function and verify that they pass. Additionally, test the integration in the main application by calculating distances using both metrics and comparing results in high-dimensional scenarios.
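
Beyond fixed-value unit tests, the comparison of both metrics can be automated with a few mathematical invariants that hold in any dimension: L1 is non-negative and symmetric, satisfies the triangle inequality, and is bounded by L2 <= L1 <= sqrt(n) * L2. The sketch below is one possible way to do this, assuming plain Python lists as vectors. `check_metric_properties` is a hypothetical helper, and both distance functions are redefined so the block runs on its own.

```python
import math
import random


def manhattan_distance(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """L2 distance (stand-in for the existing implementation)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def check_metric_properties(dim, trials=100, seed=0):
    """Check L1 invariants on random vectors; raises AssertionError on failure."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Three random points with coordinates in [-1, 1].
        p, q, r = ([rng.uniform(-1, 1) for _ in range(dim)] for _ in range(3))
        l1 = manhattan_distance(p, q)
        l2 = euclidean_distance(p, q)
        tol = 1e-9 * max(1.0, l1)  # allowance for floating-point rounding

        assert l1 >= 0                                    # non-negativity
        assert abs(l1 - manhattan_distance(q, p)) <= tol  # symmetry
        # Triangle inequality through the third point r.
        assert l1 <= manhattan_distance(p, r) + manhattan_distance(r, q) + tol
        assert l2 <= l1 + tol                             # L2 never exceeds L1
        assert l1 <= math.sqrt(dim) * l2 + tol            # L1 <= sqrt(n) * L2
    return True


if __name__ == "__main__":
    for dim in (2, 16, 256, 1024):
        check_metric_properties(dim)
        print(f"dim={dim}: all invariants hold")
```

If any invariant fails, the implementation of one of the two metrics is likely wrong, regardless of how many dimensions are involved.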

Submitted by

Alex Chen

2450 rep

Tags

pgvector, embeddings, vector-search