Comments (7)

veltman commented on May 27, 2024

After testing out some contrived audio + frames with FFmpeg, the sync seems fine even for very long audio, which leads me to believe the problem is with waveform or the math of splitting up the samples. I'm not sure how it picks samples when the number of samples to pull isn't an even factor of the total number of samples, which could explain some of the drift.
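
As a rough illustration of how uneven division alone could produce drift, the sketch below assumes each frame is given a truncated whole number of samples. The 44,100 Hz sample rate and the 24 fps figure are assumptions picked for the example, and this is not a claim about what waveform does internally.

```python
def truncation_drift_seconds(sample_rate, fps, duration_seconds):
    """Seconds of audio left uncovered when every frame gets a truncated sample count."""
    total_samples = sample_rate * duration_seconds
    frame_count = fps * duration_seconds
    # Hypothetical naive split: drop the fractional part of samples-per-frame.
    samples_per_frame = total_samples // frame_count
    covered_samples = samples_per_frame * frame_count
    return (total_samples - covered_samples) / sample_rate


# 44100 / 24 = 1837.5, so half a sample is lost on every frame.
drift = truncation_drift_seconds(44100, 24, 300)
```

When the sample rate divides evenly by the frame rate (for example 44,100 Hz at 20 fps), this particular source of error disappears, so it can only be part of the story.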

Going to try extracting the raw PCM data and manually getting the samples to see whether that fixes the alignment.

from audiogram.

veltman commented on May 27, 2024

Another test confirms the drift is coming from waveform. Even with a number of samples that divides evenly, it seems to stretch a bit: after a few minutes the lag is 10 to 20 frames.

veltman commented on May 27, 2024

Tried this with raw PCM data, skipping waveform entirely. It will take some testing for mono/stereo, etc., but it looks promising. There's still a little bit of drift depending on the rounding and the sample rate, but much less. At 20 fps, waveform drifts by about 1.5s after 5 minutes, whereas this drifts by about 0.1s, and we might be able to reduce that a bit further with smart frame math.
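
For anyone wanting to reproduce the raw PCM approach, here is a minimal sketch. It assumes signed 16-bit little-endian mono output from FFmpeg; the thread does not say which sample format was actually used, and `ffmpeg_pcm_command` and `decode_s16le` are hypothetical helper names.

```python
import struct


def ffmpeg_pcm_command(path, sample_rate=44100, channels=1):
    """Build an FFmpeg argument list that writes raw s16le PCM to stdout."""
    return [
        "ffmpeg", "-i", path,
        "-f", "s16le", "-acodec", "pcm_s16le",
        "-ar", str(sample_rate), "-ac", str(channels),
        "-",
    ]


def decode_s16le(raw):
    """Convert raw s16le bytes into floats in the range -1.0 to 1.0."""
    count = len(raw) // 2
    # Ignore a trailing odd byte, if any, so unpack always sees whole samples.
    values = struct.unpack("<%dh" % count, raw[: count * 2])
    return [value / 32768.0 for value in values]
```

The command list can be handed to `subprocess.run(..., stdout=subprocess.PIPE)` and the captured bytes passed to `decode_s16le`. Stereo input would need de-interleaving first, which this sketch skips.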

markusvoelter commented on May 27, 2024

0.1s after a few minutes would certainly be perfectly ok, that's much better than what we have now. I don't think that further optimization is required beyond that.

veltman commented on May 27, 2024

The pcm branch takes a crack at this by removing the waveform library entirely and replacing it with a bespoke PCM data sampler that segments samples piped from FFmpeg. The alignment can still be off by up to 1 frame total, since it doesn't try to reallocate slop evenly across frames, but lag shouldn't accumulate over time.
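
One possible way to keep lag from accumulating is to derive each frame's sample range from the frame index rather than from a running per-frame count. The sketch below is an illustration under that assumption, not necessarily how the pcm branch does it; `frame_slices` and `frame_peaks` are hypothetical names.

```python
def frame_slices(total_samples, frame_count):
    """Return (start, end) sample indices for each frame, computed from the frame index."""
    return [
        (i * total_samples // frame_count, (i + 1) * total_samples // frame_count)
        for i in range(frame_count)
    ]


def frame_peaks(samples, frame_count):
    """Return the (min, max) sample value for each frame's slice."""
    peaks = []
    for start, end in frame_slices(len(samples), frame_count):
        chunk = samples[start:end]
        # An empty slice can occur when there are more frames than samples.
        peaks.append((min(chunk), max(chunk)) if chunk else (0.0, 0.0))
    return peaks
```

Because every boundary is recomputed from the totals, rounding error stays under one sample per boundary instead of growing with the frame number.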

A few notes:

  • Removing the waveform dependency makes the installation a lot simpler and makes Windows native installation possible (last I checked, libgroove was the holdup). Need to test the updated installation in various environments.
  • The vertical positioning puts the zero baseline in the middle, where it currently is, but now that it's asymmetrical, that leaves some unused space. It might be preferable to always scale it so that the minimum value is at waveBottom and the maximum is at waveTop, even if that means the baseline will vary from video to video.
  • Waveform data is now being pulled as -1 to 1 instead of scaled to positive and mirrored, so the shape is somewhat different. It would be easy enough to rescale to match the old look but that may or may not be desirable.
  • For now, the bricks/equalizer patterns are using only the positive peak for each point, but it may make more sense to take the max of the two absolute values or something.
  • Also need to benchmark how long the new waveforming takes. It's probably still a small fraction of the frame rendering time, but it's somewhat slower than the old method. It would be possible to segment the samples into frames as they come in from the stream instead of at the end, which would be a lot more efficient.
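
To make the scaling and peak options in the notes above concrete, here is a small sketch. It assumes pixel coordinates where y grows downward, so waveTop is numerically smaller than waveBottom; the function names are hypothetical and this is one possible reading of the options, not the merged code.

```python
def scale_to_band(value, lowest, highest, wave_top, wave_bottom):
    """Map value so that lowest lands on wave_bottom and highest on wave_top."""
    if highest == lowest:
        # Flat signal: place it in the middle of the band.
        return (wave_top + wave_bottom) / 2
    t = (value - lowest) / (highest - lowest)
    return wave_bottom + t * (wave_top - wave_bottom)


def absolute_peak(negative_peak, positive_peak):
    """Take the larger magnitude of the two peaks, for bricks/equalizer style patterns."""
    return max(abs(negative_peak), abs(positive_peak))
```

With samples ranging from -0.5 to 1.0 and a band from y=100 to y=300, the zero baseline lands about two thirds of the way down the band rather than in the middle, which is the video-to-video variation mentioned above.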

veltman commented on May 27, 2024

Related to the vertical positioning above, we'll likely want to switch to a logarithmic scale to better fill the space and translate back into perceived loudness (which libgroove was handling before).
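
A logarithmic scale could look something like the sketch below, which converts amplitude to decibels and maps a fixed range onto 0 to 1. The -60 dB floor is an assumption chosen for illustration, and decibels are only a rough proxy for perceived loudness; this does not reproduce whatever libgroove was doing.

```python
import math


def log_scale(amplitude, floor_db=-60.0):
    """Map a linear amplitude in -1..1 to 0..1 on a decibel scale."""
    magnitude = abs(amplitude)
    if magnitude <= 0.0:
        return 0.0
    decibels = 20.0 * math.log10(magnitude)
    # Anything quieter than floor_db clamps to 0; full scale (0 dB) maps to 1.
    return min(1.0, max(0.0, (decibels - floor_db) / -floor_db))
```

On this scale an amplitude of 0.1 (-20 dB) sits about two thirds of the way up, so quiet passages fill far more of the available height than they would linearly.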

veltman commented on May 27, 2024

This has been merged to master.
