
hanabi1224 / ruannoy

Rust port of annoy (https://github.com/spotify/annoy)

License: MIT License



RuAnnoy


This library is a Rust port of spotify/annoy. Currently only index serving is supported.

A live demo using WebAssembly is available at https://annoy-web-demo.vercel.app/

It also provides FFI bindings for the JVM, .NET, and Dart.

Metric Serve Build jvm binding dotnet binding dart binding WASM support
Angular
Euclidean
Manhattan
Dot
Hamming
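
For reference, here is a minimal sketch of the five metrics listed above, written in plain Rust. It assumes the definitions used by upstream Annoy (for example, angular distance as sqrt(2 * (1 - cos(u, v)))). The function names are illustrative and are not part of the annoy-rs API, and the Hamming variant assumes bit-packed u64 words.

```rust
/// Dot product of two equal-length vectors.
fn dot(u: &[f32], v: &[f32]) -> f32 {
    u.iter().zip(v).map(|(a, b)| a * b).sum()
}

/// Euclidean (L2) distance.
fn euclidean(u: &[f32], v: &[f32]) -> f32 {
    u.iter().zip(v).map(|(a, b)| (a - b).powi(2)).sum::<f32>().sqrt()
}

/// Manhattan (L1) distance.
fn manhattan(u: &[f32], v: &[f32]) -> f32 {
    u.iter().zip(v).map(|(a, b)| (a - b).abs()).sum()
}

/// Angular distance, assumed here to be sqrt(2 * (1 - cos(u, v))).
/// A zero-length vector is treated as having cosine 0.
fn angular(u: &[f32], v: &[f32]) -> f32 {
    let pq = dot(u, v);
    let ppqq = dot(u, u) * dot(v, v);
    let d = if ppqq > 0.0 { 2.0 - 2.0 * pq / ppqq.sqrt() } else { 2.0 };
    d.max(0.0).sqrt()
}

/// Hamming distance over bit-packed words (assumption: one bit per dimension).
fn hamming(u: &[u64], v: &[u64]) -> u32 {
    u.iter().zip(v).map(|(a, b)| (a ^ b).count_ones()).sum()
}

fn main() {
    let (u, v) = ([1.0_f32, 0.0], [0.0_f32, 1.0]);
    println!("dot       = {}", dot(&u, &v));
    println!("euclidean = {}", euclidean(&u, &v));
    println!("manhattan = {}", manhattan(&u, &v));
    println!("angular   = {}", angular(&u, &v));
    println!("hamming   = {}", hamming(&[0b1011], &[0b0010]));
}
```

Note that a larger dot product means more similar vectors, so it ranks in the opposite direction from the other four, which are distances.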

Install via crates.io


# Cargo.toml
[dependencies]
annoy-rs = "0.1"

Usage

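use annoy_rs::*;

// Load a 10-dimensional angular index from disk.
let index = AnnoyIndex::load(10, "index.ann", IndexType::Angular).unwrap();
// Fetch the stored vector for item 0.
let v0 = index.get_item_vector(0);
// Query the 5 nearest neighbors of v0.
let nearest = index.get_nearest(v0.as_ref(), 5, -1, true);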
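
Because the index returns approximate neighbors, an exact linear scan over a small sample can serve as a baseline when sanity-checking results. The sketch below is standalone Rust and does not use annoy-rs. The function name `brute_force_nearest` and the sample data are hypothetical, and Euclidean distance is assumed.

```rust
/// Exact k-nearest-neighbor search by linear scan, using Euclidean distance.
/// Returns (item index, distance) pairs sorted by ascending distance.
fn brute_force_nearest(items: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = items
        .iter()
        .enumerate()
        .map(|(i, item)| {
            let d = item
                .iter()
                .zip(query)
                .map(|(a, b)| (a - b).powi(2))
                .sum::<f32>()
                .sqrt();
            (i, d)
        })
        .collect();
    // Stable sort keeps the lower index first when distances tie.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}

fn main() {
    let items = vec![
        vec![0.0, 0.0],
        vec![1.0, 0.0],
        vec![0.0, 3.0],
        vec![5.0, 5.0],
    ];
    for (id, dist) in brute_force_nearest(&items, &[0.0, 0.0], 2) {
        println!("item {id} at distance {dist}");
    }
}
```

Comparing the ids from such a scan with the ids returned by the index gives a rough recall estimate for a given query.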

SIMD support

SIMD is supported via std::simd on nightly Rust. Note that AVX intrinsics must be enabled explicitly by setting your CPU features in the RUSTFLAGS environment variable.

RUSTFLAGS="-Ctarget-feature=+avx" cargo +nightly build --release
# or
RUSTFLAGS="-Ctarget-cpu=native" cargo +nightly build --release
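
To confirm that the flags took effect, a small check like the following can help. It is a sketch that uses only the standard library (`cfg!` and `is_x86_feature_detected!`). The function names are hypothetical and not part of annoy-rs.

```rust
/// True when this binary was compiled with the AVX target feature enabled,
/// e.g. via RUSTFLAGS="-Ctarget-feature=+avx".
fn compiled_with_avx() -> bool {
    cfg!(target_feature = "avx")
}

/// True when the CPU running this binary reports AVX support.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn cpu_supports_avx() -> bool {
    is_x86_feature_detected!("avx")
}

/// On non-x86 targets AVX does not exist, so report false.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn cpu_supports_avx() -> bool {
    false
}

fn main() {
    println!("compiled with AVX: {}", compiled_with_avx());
    println!("CPU supports AVX:  {}", cpu_supports_avx());
}
```

The first function is resolved at compile time and the second at run time, so a binary built without the flag on an AVX-capable machine reports false and then true.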

WASM support

Install wasm-pack

wasm-pack build
wasm-pack test --node

Chrome supports simd128 by default.

To enable simd128, build with the following command:

RUSTFLAGS="-Ctarget-feature=+simd128" cargo +nightly build --release --target wasm32-unknown-unknown

An example site is deployed at https://annoy-web-demo.vercel.app/

Source code is under example/web

FFI support

kotlin/java

It uses JNI bindings to the Rust crate and is ~5-10x faster than the pure Java implementation in the benchmark scenario.

Note that the prebuilt dynamically linked libraries are built with SIMD support, so the AVX CPU feature is required.

Install via jitpack.io


repositories {
  mavenCentral()
  maven { url 'https://jitpack.io' }
}

dependencies {
  implementation 'com.github.hanabi1224:RuAnnoy:<tag>'
}

Usage

val index = AnnoyIndex.tryLoad("index.5d.ann", 5, IndexType.Angular)

dotnet

NuGet packages:

  • RuAnnoy
  • RuAnnoy-Batteries-Windows-x64
  • RuAnnoy-Batteries-Linux-x64
  • RuAnnoy-Batteries-Darwin-x64

Install via NuGet

  <ItemGroup>
    <PackageReference Include="RuAnnoy" Version="*" />
    <PackageReference Include="RuAnnoy-Batteries-Windows-x64" Version="*" />
  </ItemGroup>

Usage

var index = AnnoyIndex.Load("index.5d.ann", 5, IndexType.Angular);

dart

Install via pub.dev

# pubspec.yaml
dependencies:
  dart_native_annoy: ^0.1.0

Usage

import 'dart:ffi';
import 'package:dart_native_annoy/annoy.dart';

/// Create a factory from a DynamicLibrary
final indexFactory = AnnoyIndexFactory(lib: DynamicLibrary.open('libannoy_rs_ffi.so'));

/// Load index
final index = indexFactory.loadIndex(
      'index.euclidean.5d.ann', 5, IndexType.Euclidean)!;

print('size: ${index.size}');

final v3 = index.getItemVector(3);

final nearest = index.getNearest(v3, 5, includeDistance: true);

TODO

  • Index building support
  • CLI tool to build index from file


ruannoy's Issues

[feature request] building index on the client

Hello again! I've come across the need again to build an index on the client, and am wondering if there has been any reconsideration of this as a feature?

My particular use case is for implementing a retrieval/memory system for OpenCharacters - it's a free, fully local/client-side application (other than calls which go directly to OpenAI), so there's no server to handle index creation.

It looks like Voy may be an option for this once it stabilises a bit, but for now I haven't been able to get it working.

Seems that currently Voy and RuAnnoy are the only two options in this space - will benchmark once I manage to get Voy working. But for RuAnnoy to be practically useful on the client for many cases, I think it probably needs ability to create/update indices.

If this is completely out of the question, feel free to close this. Thanks!

Wasm/JS build?

Hey, I was just looking into getting Annoy running in the browser and came across your repo. Wondering if you've considered adding a Wasm build? Not sure how much work it'd be, but the first issue I ran into (as a wasm-pack and Rust newbie) was that, IIUC, wasm-pack doesn't have an "out of the box" virtual filesystem that an index file could be loaded into. But if that hurdle can be passed (e.g. with a new AnnoyIndex::load_from_buffer() function?), then perhaps a Wasm build would be quite easy?

No worries if this isn't at all on the road map - just thought I'd ask to see what you think.
