Bug: Adding a custom metric in Rust converts index dimensions to zero #515

Open
3 tasks done
mattheww95 opened this issue Oct 31, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@mattheww95
Describe the bug

When I change the metric used by usearch, the index shrinks and its dimensions become 0. I did some digging for a solution or an example in another language, but could not identify the cause.

Steps to reproduce

The relevant code to recreate the issue is attached below:

use usearch::{Index, new_index, IndexOptions, MetricKind, ScalarKind};

pub fn custom_metric(){
    let example_data: Vec<Vec<f32>> = vec![
        vec![0.0, 0.0, 0.0, 0.0],
        vec![0.0, 0.0, 0.0, 0.0],
        vec![0.0, 0.0, 0.0, 0.0],
        vec![0.0, 0.0, 0.0, 0.0]
    ];
    
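    // Length of each vector (4). `example_data.len()` is the number of vectors,
    // which only happens to be 4 as well in this example.
    let dimensions = example_data[0].len();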
    let custom_hamming_distance = Box::new(
        move |a: *const f32, b: *const f32| unsafe {
            // Both pointers are assumed to reference `dimensions` contiguous f32 values.
            let a_slice = std::slice::from_raw_parts(a, dimensions);
            let b_slice = std::slice::from_raw_parts(b, dimensions);
            let mut hamming_dist: f32 = 0.0;
            for (x, y) in a_slice.into_iter().zip(b_slice) {
                if *x != *y {
                    hamming_dist += 1.0;
                }
            }
            hamming_dist
    });

    let index_options = IndexOptions {
        dimensions,
        metric: MetricKind::Pearson,
        quantization: ScalarKind::F32,
        connectivity: 0,
        expansion_add: 0,
        expansion_search: 0,
        multi: false, // need to look into these options further
    };

    let mut index: Index = new_index(&index_options).unwrap();
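    // Passes: the freshly created index reports 4 dimensions.
    assert_eq!(index.dimensions(), 4);
    index.change_metric(custom_hamming_distance);
    // Fails: after change_metric the index reports 0 dimensions.
    assert_eq!(index.dimensions(), 4);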

    index.reserve(example_data.len()).unwrap();
    for (k, v) in example_data.iter().enumerate(){
        index.add(k as u64, v).unwrap();
    }
}

After changing the metric, the index reports 0 dimensions instead of 4. This prevents adding any data to the index afterwards, because the vectors no longer match the dimensions the index reports.
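To rule out the distance logic itself as the cause, the closure body can be checked in isolation, without usearch. The following is a minimal sketch under that assumption; the function names `hamming_distance` and `hamming_distance_raw` are hypothetical and not part of the usearch API.

```rust
/// Counts the positions at which two equal-length slices differ.
/// (Hypothetical helper mirroring the closure body from the reproducer.)
fn hamming_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    a.iter().zip(b).filter(|(x, y)| x != y).count() as f32
}

/// Raw-pointer wrapper with the same argument shape as the custom metric closure.
///
/// # Safety
/// `a` and `b` must each point to at least `dimensions` readable `f32` values.
unsafe fn hamming_distance_raw(a: *const f32, b: *const f32, dimensions: usize) -> f32 {
    // Rebuild slices from the raw pointers, as the closure in the report does.
    let a_slice = unsafe { std::slice::from_raw_parts(a, dimensions) };
    let b_slice = unsafe { std::slice::from_raw_parts(b, dimensions) };
    hamming_distance(a_slice, b_slice)
}
```

Since every vector in the example data is all zeros, this distance is 0 for any pair of them, so the reproducer does not depend on the metric's values, only on the call to change_metric.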

Expected behavior

The index dimensions should remain unchanged after changing the metric.

The relevant line from my Cargo.toml, including the additionally enabled features, is: usearch = { version = "2.16.0", features = ["simsimd", "openmp"] }

USearch version

2.16.0

Operating System

Ubuntu 22.04.3 LTS on Windows 10 x86_64

Hardware architecture

x86

Which interface are you using?

Other bindings

Contact Details

[email protected]

Are you open to being tagged as a contributor?

  • I am open to being mentioned in the project .git history as a contributor

Is there an existing issue for this?

  • I have searched the existing issues

Code of Conduct

  • I agree to follow this project's Code of Conduct
@mattheww95 mattheww95 added the bug Something isn't working label Oct 31, 2024
@ashvardanian
Contributor

@mattheww95, should be an easy patch, will check in a few hours. Thank you!

@mattheww95
Author

@ashvardanian Any update on this issue?
