GithubHelp home page GithubHelp logo

Comments (10)

mertalev avatar mertalev commented on August 16, 2024 1

You can get it to use the index and still do pre-filtering with a query like this:

SELECT
    segment.*,
    embedding <=> :embedding AS distance
FROM
    segment
INNER JOIN
    asset ON segment.asset_id = asset.id
WHERE
    distance < 0.99
    AND segment.category = 'visual'
    AND asset.collection_id = :collection_id
ORDER BY
    embedding <=> :embedding
LIMIT 100

This will still be slower than the post-filtering approach because it needs to evaluate the cosine distance for (potentially many) assets in the WHERE condition, but the recall should be correct.

I used embedding <=> :embedding instead of distance for ordering because I remember this being necessary for the index to get used. If the query plan is the same with distance, feel free to use that instead.

from pgvecto.rs.

mertalev avatar mertalev commented on August 16, 2024 1

I think it's because the limit is 100, but there are only 96 results that meet the threshold. It will end up checking every embedding in the table to find another 4 matches. Setting the limit to 96 should be much faster.

I wonder if a way to stop early would help performance here. The ordering should be informative of when there's no longer a point to keep searching.

from pgvecto.rs.

VoVAllen avatar VoVAllen commented on August 16, 2024

The last SQL should be the right one. Does it still return only one result? Or multiple results but to be slow? And vector index is an approximate index which means the results might not be the exact same as the exact one (without using the index).'

For your cases, if you don't have too many categories, you can put category_id and vector in one table and create partial index on each category. This will accelerate the filter condition.

from pgvecto.rs.

VoVAllen avatar VoVAllen commented on August 16, 2024

Unfortunately it's hard to co-optimize the vector search and filter condition together. We have some internal experiments and only a very limited cases can have improvments.

from pgvecto.rs.

rsomani95 avatar rsomani95 commented on August 16, 2024

@mertalev Have you had success running that exact query? When I tried running it, I get this error saying "distance" is not defined:

ERROR:  column "distance" does not exist
LINE 10:     distance < 0.99

I also tried another variant of what you suggested but ran into the same error:

WITH distances AS (
    SELECT
        segment.*,
        embedding <=> :embedding
            AS distance
    FROM
        segment
    INNER JOIN
        asset ON segment.asset_id = asset.id
    WHERE
        distance < 0.99
        AND segment.category = 'visual'
        AND asset.collection_id = :collection_id
    ORDER BY
        embedding <=> :embedding
)
SELECT distances.id, distances.distance, distances.category
FROM distances
LIMIT 100
;

from pgvecto.rs.

rsomani95 avatar rsomani95 commented on August 16, 2024

@VoVAllen the last SQL I'd shared in the opening comment does give correct number of results, and yes I'm keeping in mind that the index gives approx matches so the results won't necessarily be identical to brute force.
It takes ~16.2s to run

Here's the query plan visualised:

CleanShot 2024-06-14 at 11 28 54@2x

PS: You can click on "Raw" to see the exact query planner output and "Query" for the query that generated it.
https://explain.dalibo.com/plan/345ga89gc5edefae

Unfortunately it's hard to co-optimize the vector search and filter condition together. We have some internal experiments and only a very limited cases can have improvments.

I see. I'm slightly confused by this, because I was under the impression that this was one of the promises of the VBASE algorithm? Are there ways you would adjust what my query is trying to achieve to leverage the indexes and vbase correctly that I'm missing out on?

from pgvecto.rs.

rsomani95 avatar rsomani95 commented on August 16, 2024

Just a thought: are compound indexes with a vector + non-vector column supported? If so, perhaps that could help?

if you don't have too many categories, you can put category_id and vector in one table and create partial index on each category. This will accelerate the filter condition.

Unfortunately, I don't think this will work for our use case

from pgvecto.rs.

rsomani95 avatar rsomani95 commented on August 16, 2024

Did some further experiments, and we found that returning the distance itself in a new column has negligible cost, and the part of the query responsible for the slowdown is the WHERE distance < 0.99

To be more speicifc, the commented out line is responsible for this query to go from 269ms -> 16200ms:

WITH distances AS (
    SELECT
        segment.id, segment.category,
        embedding <=> '[0.048492431640625,0.004207611083984375,0.020538330078125,-0.06622314453125,-0.009918212890625,0.0819091796875,-0.0247344970703125,-0.01090240478515625,0.0679931640625,-0.040924072265625,-0.0034122467041015625,0.072021484375,0.0008335113525390625,-0.006832122802734375,0.00824737548828125,-0.03314208984375,0.025909423828125,0.04852294921875,-0.0171661376953125,0.0048370361328125,-0.0230712890625,-0.01190185546875,-0.0043792724609375,0.01007080078125,-0.052276611328125,0.0180816650390625,-0.00858306884765625,0.0078582763671875,-0.050811767578125,0.007328033447265625,0.0712890625,0.033416748046875,-0.03240966796875,-0.0029735565185546875,-0.0399169921875,-0.07537841796875,-0.0631103515625,-0.02191162109375,-0.0292510986328125,0.063232421875,0.093017578125,0.022674560546875,-0.0186920166015625,0.0041046142578125,-0.0300140380859375,-0.0941162109375,0.038970947265625,0.036651611328125,-0.0223388671875,0.002079010009765625,0.00818634033203125,-0.0187225341796875,0.031982421875,-0.01050567626953125,-0.00460052490234375,-0.0517578125,-0.0269317626953125,-0.038055419921875,0.007781982421875,0.038482666015625,-0.05145263671875,0.00878143310546875,0.021575927734375,0.053375244140625,0.0005512237548828125,0.03448486328125,-0.01157379150390625,0.0237274169921875,-0.05712890625,0.099365234375,-0.0019664764404296875,0.056243896484375,-0.00019276142120361328,-0.027069091796875,0.1966552734375,-0.01096343994140625,0.00528717041015625,-0.0244140625,-0.0211944580078125,0.0116424560546875,-0.053863525390625,-0.042266845703125,-0.0241546630859375,0.016571044921875,0.01535797119140625,0.0203704833984375,-0.036376953125,0.029815673828125,-0.048858642578125,-0.027984619140625,0.0044097900390625,0.049072265625,-0.06146240234375,0.002292633056640625,-0.01023101806640625,-0.0404052734375,0.03216552734375,-0.0093536376953125,-0.01009368896484375,0.03692626953125,-0.0257568359375,0.0193939208984375,0.004894256591796875,0.005229949951171875,-0.007106781005859375,0.06427001953125,0.007354736328125,0.050048828125,0.05267333984375,0.044952392578125,-0.0908203125,-0.02679443359375,0.01000213623046875,0.0247955322265625,0.0106658935546875,-0.0261688232421875,0.0207672119140625,0.045806884765625,0.035491943359375,-0.014556884765625,0.0421142578125,-0.045654296875,-0.048065185546875,0.0234832763671875,-0.14111328125,-0.004230499267578125,0.024566650390625,-0.0244293212890625,0.00693511962890625,0.01145172119140625,0.040283203125,-0.06060791015625,0.00921630859375,0.029083251953125,-0.09417724609375,0.028656005859375,-0.00201416015625,0.09478759765625,-0.01023101806640625,0.0249786376953125,0.018310546875,-0.02783203125,-0.022979736328125,-0.032073974609375,-0.0009412765502929688,0.00971221923828125,0.0071563720703125,-0.06329345703125,-0.01146697998046875,0.0149078369140625,0.0007505416870117188,0.01476287841796875,-0.0672607421875,-0.05224609375,-0.004695892333984375,0.006256103515625,0.00579833984375,0.0203704833984375,-0.01812744140625,-0.003818511962890625,0.007167816162109375,0.041107177734375,-0.01318359375,0.01480865478515625,-0.05267333984375,-0.0017557144165039062,0.011199951171875,-0.0153656005859375,0.0550537109375,0.03387451171875,-0.0196990966796875,0.033782958984375,-0.0158233642578125,0.0,0.037567138671875,0.0020351409912109375,-0.0206451416015625,-0.0242767333984375,0.037933349609375,0.03271484375,-0.035003662109375,-0.0303802490234375,-0.0282440185546875,0.0411376953125,0.023040771484375,0.183349609375,0.034759521484375,-0.01788330078125,0.07275390625,-0.025360107421875,0.0416259765625,0.037567138671875,-0.04315185546875,-0.06610107421875,0.001766204833984375,0.06866455078125,0.0164642333984375,-0.014373779296875,-0.033905029296875,-0.01079559326171875,-0.03521728515625,0.07550048828125,-0.01229095458984375,0.0278778076171875,0.039764404296875,-0.046722412109375,-0.0186309814453125,0.033660888671875,0.00748443603515625,0.0007295608520507812,-0.0947265625,0.04132080078125,-0.0140380859375,0.01447296142578125,-0.051544189453125,-0.055419921875,0.02691650390625,0.0095062255859375,-0.01568603515625,0.004299163818359375,-0.041656494140625,-0.0280303955078125,0.07684326171875,0.00970458984375,0.04351806640625,-0.078125,-0.0384521484375,0.0222930908203125,-0.003314971923828125,-0.055999755859375,0.01373291015625,0.01485443115234375,-0.04168701171875,0.04571533203125,-0.04693603515625,-0.0166473388671875,-0.072998046875,-0.0240478515625,0.002040863037109375,0.0220794677734375,-0.0009369850158691406,-0.027069091796875,0.04248046875,-0.0750732421875,-0.0960693359375,-0.03765869140625,-0.0173187255859375,-0.0325927734375,0.050750732421875,-0.052520751953125,-0.038818359375,-0.177001953125,-0.0188751220703125,0.0130462646484375,0.04058837890625,0.042236328125,-0.03338623046875,-0.031982421875,0.04229736328125,-0.05474853515625,0.0287017822265625,0.007640838623046875,-0.033447265625,0.0033664703369140625,-0.055694580078125,-0.0714111328125,-0.036346435546875,0.00142669677734375,-0.03070068359375,0.0299530029296875,-0.03369140625,-0.00966644287109375,0.005718231201171875,0.0187225341796875,-0.0213775634765625,-0.0185546875,-0.04180908203125,-0.0450439453125,-0.0229034423828125,0.04693603515625,0.036163330078125,0.061004638671875,-0.0672607421875,0.0009226799011230469,-0.046600341796875,-0.0242767333984375,-0.058624267578125,-0.1256103515625,0.023223876953125,0.0295867919921875,-0.046234130859375,0.0163421630859375,0.050384521484375,-0.025115966796875,0.037994384765625,0.027984619140625,0.0261383056640625,-0.0117645263671875,-0.004352569580078125,0.00960540771484375,0.07293701171875,0.062225341796875,-0.01007080078125,-0.02685546875,-0.0005059242248535156,-0.0122833251953125,-0.024749755859375,0.036529541015625,-0.03192138671875,0.015869140625,0.0777587890625,0.0109405517578125,-0.050018310546875,0.06787109375,0.08477783203125,-0.035736083984375,0.01482391357421875,0.05218505859375,-0.004047393798828125,-0.0165252685546875,0.1019287109375,-0.04510498046875,0.022186279296875,0.005382537841796875,0.004451751708984375,0.01129913330078125,-0.048980712890625,0.047027587890625,-0.0146026611328125,-0.012939453125,-0.0210418701171875,-0.0308990478515625,-0.057220458984375,0.0005764961242675781,0.01168060302734375,-0.004848480224609375,0.01641845703125,-0.0226898193359375,0.0187835693359375,0.1494140625,0.055328369140625,-0.0178985595703125,-0.0169219970703125,-0.043792724609375,-0.0238800048828125,-0.013580322265625,0.0104522705078125,0.0526123046875,-0.033050537109375,0.00775909423828125,-0.025238037109375,0.00724029541015625,-0.025909423828125,-0.00861358642578125,0.040679931640625,0.1619873046875,0.0263519287109375,-0.0229034423828125,0.0101165771484375,-0.041107177734375,-0.0301361083984375,0.006916046142578125,-0.0175323486328125,0.01502227783203125,-0.0161895751953125,-0.01323699951171875,-0.004383087158203125,-0.0087432861328125,-0.0511474609375,0.00356292724609375,0.011383056640625,-0.0548095703125,-0.039947509765625,0.045318603515625,-0.01392364501953125,-0.01788330078125,0.025634765625,0.00868988037109375,-0.00244903564453125,0.0030956268310546875,0.0032634735107421875,0.06744384765625,0.0064697265625,0.005268096923828125,-0.014190673828125,-0.01392364501953125,0.01105499267578125,0.0307769775390625,-0.017425537109375,0.002033233642578125,0.034698486328125,-0.0250244140625,-0.00201416015625,-0.0233154296875,-0.019287109375,0.0181884765625,0.008575439453125,-0.03564453125,0.045379638671875,0.003162384033203125,0.0053863525390625,0.011322021484375,-0.01247406005859375,-0.04608154296875,0.02166748046875,0.0238189697265625,0.022125244140625,0.0035915374755859375,0.022064208984375,0.00432586669921875,-0.03546142578125,0.01241302490234375,0.02740478515625,0.004619598388671875,-0.09027099609375,-0.052764892578125,0.0325927734375,-0.02593994140625,0.0094146728515625,-0.01071929931640625,0.01453399658203125,0.00960540771484375,0.05364990234375,0.003635406494140625,-0.020416259765625,-0.0279083251953125,-0.004726409912109375,-0.007053375244140625,0.004291534423828125,-0.00629425048828125,-0.0200958251953125,-0.049163818359375,0.02899169921875,-0.06494140625,-0.0117034912109375,-0.04913330078125,0.0148468017578125,-0.070556640625,0.04052734375,0.0009059906005859375,0.021759033203125,-0.0352783203125,-0.0341796875,-0.050262451171875,0.00033593177795410156,0.043792724609375,-0.024658203125,0.0182952880859375,-0.01245880126953125,-0.01078033447265625,-0.0157012939453125,0.045257568359375,0.0224609375,-0.0308074951171875,0.06298828125,0.0051727294921875,0.00559234619140625,-0.025604248046875,-0.0242919921875,0.01508331298828125,-0.01375579833984375,0.0262298583984375,-0.043365478515625,-0.032958984375,-0.0101776123046875,-0.0250244140625,-0.0118408203125,0.0208282470703125,-0.006626129150390625,0.047698974609375,0.0027217864990234375,-0.023956298828125,-0.0016622543334960938,0.051513671875,-0.001529693603515625,-0.0250396728515625,-0.0016460418701171875,-0.0206451416015625,0.047576904296875,-0.06134033203125,0.03338623046875,-0.0251922607421875,0.17041015625,0.04351806640625,-0.07464599609375,0.03369140625,1.4543533325195312e-05,-0.0011463165283203125,-0.048004150390625,0.00016164779663085938,0.0247344970703125,-0.0215301513671875,0.03076171875,-0.016937255859375,-0.0943603515625,-0.026824951171875,0.01163482666015625,0.039276123046875,-0.01502227783203125,0.00400543212890625,0.01165771484375,0.0169677734375,-0.033050537109375,-0.03863525390625,-0.09918212890625,0.04339599609375,-0.04229736328125,-0.005825042724609375,-0.030731201171875,-0.0289306640625,0.0068206787109375,0.0158538818359375,0.043853759765625,0.005344390869140625,-0.0231170654296875,0.0287017822265625,0.01654052734375,0.041961669921875,0.005596160888671875,0.0307769775390625,0.0102081298828125,0.03472900390625,0.0164337158203125,0.00878143310546875,0.061492919921875,-0.0435791015625,-0.08380126953125,0.0120391845703125,-0.020660400390625,-0.049346923828125,0.03900146484375,-0.03497314453125,0.00788116455078125,-0.053192138671875,0.0191497802734375,0.02227783203125,0.0333251953125,-0.00919342041015625,-0.104248046875,-0.049774169921875,-0.0645751953125,0.028350830078125,-0.03839111328125,-0.01654052734375,-0.00263214111328125,-0.007259368896484375,0.017120361328125,-0.00391387939453125,-0.0848388671875,0.006214141845703125,-0.0175018310546875,-0.051971435546875,-0.048828125,-0.08807373046875,0.00550079345703125,0.030303955078125,0.00983428955078125,-0.0084686279296875,0.01213836669921875,0.01532745361328125,0.003978729248046875,0.03924560546875,-0.02044677734375,0.0391845703125,-0.051513671875,-0.02532958984375,-0.0233612060546875,0.03936767578125,-0.0286712646484375,-0.037811279296875,0.05712890625,0.0418701171875,0.0294036865234375,0.006137847900390625,-0.0252532958984375,0.003887176513671875,-0.0084991455078125,0.003604888916015625,-0.01824951171875,0.04150390625,-0.08380126953125,-0.00284576416015625,0.01535797119140625,0.0009307861328125,-0.00376129150390625,0.01319122314453125,-0.057586669921875,0.0261383056640625,0.01068878173828125,0.0295867919921875,-0.00908660888671875,-0.0201873779296875,-0.01419830322265625,0.0159149169921875,0.06597900390625,-0.0137176513671875,0.014495849609375,0.01540374755859375,0.01279449462890625,-0.0262298583984375,-0.0182952880859375,0.04644775390625,-0.00547027587890625,-0.0021648406982421875,-0.03570556640625,-0.04400634765625,0.00457000732421875,0.01441192626953125,0.0352783203125,0.043182373046875,-0.0010404586791992188,-0.032196044921875,-0.0079345703125,-0.043792724609375,-0.016204833984375,0.05029296875,0.004558563232421875,-0.055328369140625,-0.0265960693359375,0.028350830078125,0.03192138671875,0.029693603515625,0.040435791015625,0.004878997802734375,-0.034515380859375,0.025146484375,-0.01326751708984375,-0.0140838623046875,-0.0008788108825683594,-0.0230865478515625,0.0088653564453125,0.00522613525390625,-0.0179290771484375,0.0303802490234375,0.032928466796875,-0.0184326171875,0.033935546875,-0.002887725830078125,-0.0090484619140625,0.0176849365234375]'
            AS distance
    FROM
        segment
    INNER JOIN
        asset ON segment.asset_id = asset.id
    WHERE
        segment.category = 'visual'
        AND asset.collection_id = 'ed0bbe8d-ce06-4b49-9e01-0d763439fa8a'
    ORDER BY
        distance
)
SELECT distances.id, distances.distance, distances.category
FROM distances
-- WHERE distances.distance < 0.99
LIMIT 100
;

from pgvecto.rs.

mertalev avatar mertalev commented on August 16, 2024

@rsomani95 Sorry, the earlier one was from memory. The corrected query is this:

SELECT
    segment.*
FROM
    segment
INNER JOIN
    asset ON segment.asset_id = asset.id
WHERE
    embedding <=> :embedding < 0.99
    AND segment.category = 'visual'
    AND asset.collection_id = :collection_id
ORDER BY
    embedding <=> :embedding
LIMIT 100;

from pgvecto.rs.

rsomani95 avatar rsomani95 commented on August 16, 2024

@mertalev thank you.

This query seems to have similar results to my last comment, where it takes ~16.2s, and removing the embedding <=> :embedding < 0.99 takes ~375ms

from pgvecto.rs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.