Comments (3)

danbohus commented on June 13, 2024

I think I have an idea of what might be going on: I suspect the images are getting queued inside the Join component. To perform synchronization, Join internally holds its own queues of the messages it receives on both the primary and secondary streams (you can take a look at the implementation here). These internal queues live inside the component and are separate from the normal psi delivery queue on each stream (which is controlled by the delivery policy). They are necessary for synchronizing correctly based on originating times.

Imagine you have images (your secondary stream) arriving with originating times 0, 1, 2, 3, ... 100, and the primary stream (the clock, in your case) has originating times 0, 10, 20, 30, ... 100. After first pairing (0, 0), Join internally queues the images with originating times 1, 2, 3, ... until it receives the next clock message, the one with originating time 10. It has to hold on to images 1, 2, 3, etc. because it does not know the originating time of the next clock message that will arrive. In principle, the clock message with originating time 10 could arrive well after "time 10" because of latencies. Join has to queue and operate this way in order to synchronize correctly. It releases secondary messages from its internal queue only once it can prove that they will never be needed to synchronize with any other primary message.
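To make the queueing behavior concrete, here is a small Python simulation. It is a simplified model written for this explanation, not the actual psi Join implementation: the function names, the single-threaded arrival loop, and the assumption that each stream delivers messages in originating-time order are all illustrative.

```python
from collections import deque


def make_arrivals(last_time, primary_period):
    """Build an arrival order: one image (secondary) per tick, and a clock
    (primary) message every `primary_period` ticks, arriving just after the
    image with the same originating time."""
    arrivals = []
    for t in range(last_time + 1):
        arrivals.append(("secondary", t))
        if t % primary_period == 0:
            arrivals.append(("primary", t))
    return arrivals


def simulate_exact_join(arrivals):
    """Simplified exact-match join. Returns (pairs, peak secondary queue size)."""
    primary = deque()
    secondary = deque()
    pairs = []
    peak = 0
    for stream, t in arrivals:
        (primary if stream == "primary" else secondary).append(t)
        while primary and secondary:
            p = primary[0]
            # Secondary messages older than the oldest pending primary
            # can never be matched, so they can be released.
            while secondary and secondary[0] < p:
                secondary.popleft()
            if not secondary:
                break  # keep waiting for a secondary message at time >= p
            if secondary[0] == p:
                pairs.append((p, secondary.popleft()))
            # If secondary[0] > p, no exact match exists for p; either way
            # this primary message is resolved.
            primary.popleft()
        peak = max(peak, len(secondary))
    return pairs, peak


for period in (1, 10, 100):
    pairs, peak = simulate_exact_join(make_arrivals(100, period))
    print(f"primary every {period} ticks: {len(pairs)} pairs, peak queue {peak}")
```

In this model the peak queue size equals the gap between consecutive primary messages (1, 10, and 100 images for the three periods above), which is why a sparse primary stream makes memory grow.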

Now, as to how to change your code to address the growing memory use: my suggestion would be to use a dense clock stream. (As a more general comment, you generally want to join dense streams.) From your comments, it looks like your clock stream is sparse, perhaps signaling that a face was detected. Instead, can you construct a dense boolean stream (false when no face is detected, true when one is), do the join, and then use a Where() operator to filter out the resulting tuples that do not correspond to a face?
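The shape of that suggestion can be sketched in plain Python. This is only an illustration of the data flow (dense flags, exact-time join, then a filter), with hypothetical function names and list-based "streams"; it does not use the psi API.

```python
def dense_flags(detection_times, all_times):
    """Turn sparse detection times into one boolean per tick."""
    detected = set(detection_times)
    return [(t, t in detected) for t in all_times]


def join_then_where(flags, images):
    """Pair each flag with the image at the same originating time,
    then keep only the pairs whose flag is true."""
    images_by_time = dict(images)
    joined = [
        (t, flag, images_by_time[t]) for t, flag in flags if t in images_by_time
    ]
    return [(t, image) for t, flag, image in joined if flag]


ticks = range(6)
images = [(t, f"image-{t}") for t in ticks]        # dense secondary stream
flags = dense_flags(detection_times=[2, 5], all_times=ticks)
print(join_then_where(flags, images))
```

Because the boolean stream has a message for every image, each image finds its match right away instead of waiting for the next sparse detection event.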

Another alternative, though it loses the exact-match synchronization based on originating times, is the Pair operator, or a Fuse with the Available.LastOrDefault interpolator. You can read more about it here, if you look for the explanations of Pair and Available.LastOrDefault. In this case the secondary messages are not queued internally; only the last secondary message is remembered. However, that also means the pairing is no longer a guaranteed synchronization on originating times: the results depend on the wall-clock time of arrival (and hence the latency) of the messages.
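The dependence on arrival order can be shown with a toy Python model. It is an assumption-laden sketch rather than psi's implementation: it remembers only the most recent secondary message and returns a default when none has been seen yet, loosely mirroring the last-or-default behavior described above.

```python
def simulate_pair(arrivals, default=None):
    """Pair each primary message with whichever secondary message
    arrived most recently (wall-clock order), or `default` if none."""
    last_secondary = default
    output = []
    for stream, originating_time in arrivals:
        if stream == "secondary":
            last_secondary = originating_time  # only one message is kept
        else:
            output.append((originating_time, last_secondary))
    return output


# Same messages, different arrival orders (the primary is delayed in the second).
on_time = [("secondary", 9), ("secondary", 10), ("primary", 10), ("secondary", 11)]
delayed = [("secondary", 9), ("secondary", 10), ("secondary", 11), ("primary", 10)]

print(simulate_pair(on_time))
print(simulate_pair(delayed))
```

With the on-time order the primary message at time 10 is paired with secondary 10; when the same primary message is delayed it is paired with secondary 11. Memory stays constant, but the output is no longer reproducible across runs with different latencies.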

Hope this helps, but let us know what you find.

from psi.

sandrist commented on June 13, 2024

One approach for getting to the bottom of this is to use Pipeline Diagnostics. Create your pipeline with enableDiagnostics set to true, persist the Diagnostics stream to a store, and then visualize it in PsiStudio. You'll be able to visualize all stream connections to quickly pinpoint, e.g., if you're missing a LatestMessage delivery policy anywhere. You can also inspect all the delivery queue sizes to see exactly where things are filling up. Take a look and let us know if that reveals anything.

KanaHayama commented on June 13, 2024

Thank you. A missing LatestMessage delivery policy might be the reason. I will reply with my findings.
