Comments (5)

emmanuel-keller commented on May 27, 2024

The result you get is consistent with the current query logic, in the sense that each predicate returns a subset of the records, and the result is the intersection of these two subsets:

  • flag = true returns [pts:1, pts:3, pts:5, pts:7]
  • point <|2|> $pt returns [pts:4, pts:5]

The intersection of these two subsets is [pts:5], which is correct from a technical point of view.

That said, it is true that we don't currently have a way to apply the filter inside the KNN operation, so that you get the N nearest results that also match the filter.
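To make the difference concrete, here is a small Python sketch of the two behaviours. It is only a model, not how SurrealDB is implemented: the `records` list is reconstructed from the example data in this thread (ids 1 to 7, odd ids flagged true), and the `knn` helper is a hypothetical brute-force stand-in for the `<|2|>` operator.

```python
# Toy model of the data in this thread: ids 1..7, 1-D points, odd ids flagged true.
records = [{"id": i, "point": float(i), "flag": i % 2 == 1} for i in range(1, 8)]
pt = 4.5


def knn(rows, target, k):
    """Hypothetical helper: the k rows closest to target (brute force)."""
    return sorted(rows, key=lambda r: abs(r["point"] - target))[:k]


# Current behaviour: each predicate is evaluated on its own, then intersected.
flag_ids = {r["id"] for r in records if r["flag"]}
knn_ids = {r["id"] for r in knn(records, pt, 2)}
post_filtered = sorted(flag_ids & knn_ids)

# Requested behaviour: filter first, then take the k nearest of what remains.
flagged = [r for r in records if r["flag"]]
pre_filtered = [r["id"] for r in knn(flagged, pt, 2)]

print(post_filtered)  # [5]
print(pre_filtered)   # [5, 3]
```

The first result has only one record because pts:4 is among the two nearest neighbours but fails the flag check; the second is what the issue asks for.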

As a workaround, you could store two tables. One table would store points where the flag is true, and the other would store the points where the flag is false.

DELETE FROM pts_true;
DELETE FROM pts_false;

DEFINE INDEX mt_pt1 ON pts_true FIELDS point MTREE DIMENSION 1;
DEFINE INDEX mt_pt1 ON pts_false FIELDS point MTREE DIMENSION 1;

INSERT INTO pts_true [
	{ id: pts_true:1, point: [ 1f ], flag: true },
	{ id: pts_true:3, point: [ 3f ], flag: true },
	{ id: pts_true:5, point: [ 5f ], flag: true },
	{ id: pts_true:7, point: [ 7f ], flag: true }
];

INSERT INTO pts_false [
	{ id: pts_false:2, point: [ 2f ], flag: false },
	{ id: pts_false:4, point: [ 4f ], flag: false },
	{ id: pts_false:6, point: [ 6f ], flag: false }
];

LET $pt = [4.5f];

SELECT
    id,
    vector::similarity::cosine(point, $pt) AS similarity
FROM
    pts_true
WHERE
    point <|2|> $pt
ORDER BY
    similarity DESC;

That would work for this simple example, but I agree that it would not be suitable for something more complex involving more intricate filters.

To meet this requirement we could introduce the following syntax:

SELECT
    id,
    flag,
    vector::similarity::cosine(point, $pt) AS similarity
FROM
    pts
WHERE
    flag = true &&
    point <||> $pt
ORDER BY
    similarity DESC
LIMIT 2

In this case, the KNN operator would not cap the number of results itself. It would keep providing nearest neighbours until the query's LIMIT is reached. That would be compatible with the way SurrealDB executes queries, allowing for any complexity in the filtering as well as pagination.
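A rough Python model of that idea, under stated assumptions: `neighbours_by_distance` and `filtered_knn` are hypothetical names, the data mirrors the example in this thread, and the full sort stands in for an index that would walk neighbours incrementally. The point is only that the filter and the limit sit downstream of an unbounded, distance-ordered stream.

```python
from itertools import islice

# Same toy data as the example in this thread: odd ids are flagged true.
records = [{"id": i, "point": float(i), "flag": i % 2 == 1} for i in range(1, 8)]
pt = 4.5


def neighbours_by_distance(rows, target):
    """Yield (row, distance) pairs, nearest first, with no built-in cap."""
    for row in sorted(rows, key=lambda r: abs(r["point"] - target)):
        yield row, abs(row["point"] - target)


def filtered_knn(rows, target, predicate, limit, start=0):
    """Apply the filter to the stream, then stop once `limit` rows are found."""
    matches = (pair for pair in neighbours_by_distance(rows, target)
               if predicate(pair[0]))
    return list(islice(matches, start, start + limit))


first_page = [(row["id"], dist)
              for row, dist in filtered_knn(records, pt, lambda r: r["flag"], 2)]
second_page = [(row["id"], dist)
               for row, dist in filtered_knn(records, pt, lambda r: r["flag"], 2, start=2)]

print(first_page)   # [(5, 0.5), (3, 1.5)]
print(second_page)  # [(7, 2.5), (1, 3.5)]
```

Because the stream is consumed lazily, any predicate can be plugged in, and pagination is just a different slice of the same stream.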

Would that work for you? Would you be happy with this syntax?

orimay commented on May 27, 2024

Yes, this would be amazing, thank you! And thank you for mentioning it in the stream :)

Will it replace the current syntax, or exist alongside it? Having it work only by LIMIT might simplify things.

emmanuel-keller commented on May 27, 2024

We have started working on the implementation. For this to work efficiently, we need to make sure that the query is primarily ordered by the KNN distance. So, we will add a vector::distance::knn() function that must appear in the ORDER BY clause. It will also return the computed distance, so the distance does not need to be recomputed.

SELECT
    id,
    flag,
    vector::distance::knn() AS distance
FROM
    pts
WHERE
    flag = true &&
    point <||> $pt
ORDER BY
    vector::distance::knn() DESC
LIMIT 2

I think both syntaxes may coexist.

phughk commented on May 27, 2024

Assigning to @emmanuel-keller who is able to answer this better.

orimay commented on May 27, 2024

Awesome! Will it be possible to order by the distance alias? Will it work for cosine? And will it be possible to build an index on multiple fields, e.g. vector + boolean?
