A few reports of audio getting gradually out of sync with the waveform in a long video

Audio sync issues on long videos (audiogram issue, closed)

nypublicradio commented on May 27, 2024

Audio sync issues on long videos

from audiogram.

Comments (7)

veltman commented on May 27, 2024

After testing out some contrived audio + frames with FFmpeg, the sync seems fine even for very long audio, which leads me to believe the problem is with waveform or the math of splitting up the samples. I'm not sure how it picks samples when the number of samples to pull isn't an even factor of the total number of samples, which could explain some of the drift.

Going to try extracting the raw PCM data and manually getting the samples to see whether that fixes the alignment.


veltman commented on May 27, 2024

Another test confirms the drift is coming from waveform: even with a number of samples that divides evenly, the output seems to stretch a bit, and after a few minutes the lag is 10 to 20 frames.


veltman commented on May 27, 2024

Tried this with raw PCM data, skipping waveform entirely. It will take some testing for mono/stereo, etc., but it looks promising. There's still a little bit of drift depending on the rounding and the sample rate, but much less. At 20 fps, waveform is drifting by about 1.5s after 5 minutes, whereas this drifts by about 0.1s, and we might be able to reduce that a bit further with smart frame math.
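To put those numbers in frame terms: at 20 fps, 1.5s of drift is 30 frames and 0.1s is 2 frames, over the 6000 frames in 5 minutes. A minimal sketch of that arithmetic follows; the function names are illustrative and not part of audiogram.

```python
def drift_in_frames(drift_seconds, fps):
    """Convert a drift measured in seconds into a number of video frames."""
    return drift_seconds * fps


def relative_stretch(drift_seconds, elapsed_seconds):
    """Express the drift as a fraction of the elapsed playback time."""
    return drift_seconds / elapsed_seconds


FPS = 20          # frame rate quoted in the comment above
ELAPSED = 5 * 60  # five minutes, in seconds

for label, drift in [("waveform", 1.5), ("raw PCM", 0.1)]:
    frames = drift_in_frames(drift, FPS)
    stretch = relative_stretch(drift, ELAPSED)
    print(f"{label}: {frames:.0f} frames late, {stretch:.3%} stretch")
```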


markusvoelter commented on May 27, 2024

0.1s after a few minutes would certainly be perfectly ok, that's much better than what we have now. I don't think that further optimization is required beyond that.


veltman commented on May 27, 2024

The pcm branch takes a crack at this by removing the waveform library entirely and replacing it with a bespoke PCM data sampler that segments samples piped from FFmpeg. The alignment can still be off by up to 1 frame total, since it doesn't try to reallocate slop evenly across frames, but lag shouldn't accumulate over time.
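As a rough illustration of why lag need not accumulate, here is a minimal sketch that derives each frame's sample range from the frame index instead of adding up a rounded chunk size. It is not the code in the pcm branch; the function names, the mono float samples, and the choice of min/max per frame are all assumptions made for the example.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def frame_start(i, sample_rate, fps):
    """Index of the first sample at or after the start of frame i (i / fps seconds)."""
    return ceil_div(i * sample_rate, fps)


def segment_peaks(samples, sample_rate, fps):
    """Return one (min, max) pair per video frame for mono samples in [-1, 1]."""
    total = len(samples)
    num_frames = ceil_div(total * fps, sample_rate)
    peaks = []
    for i in range(num_frames):
        # Boundaries come from the frame index, so rounding never compounds.
        start = frame_start(i, sample_rate, fps)
        end = min(total, frame_start(i + 1, sample_rate, fps))
        chunk = samples[start:end]
        peaks.append((min(chunk), max(chunk)) if chunk else (0.0, 0.0))
    return peaks


def truncated_start(i, sample_rate, fps):
    """Contrast case: a fixed, rounded-down chunk size, whose error grows with i."""
    return i * (sample_rate // fps)
```

With hypothetical values of 22050 Hz and 20 fps (1102.5 samples per frame), `frame_start(6000, 22050, 20)` lands exactly on the 300 second mark, while `truncated_start(6000, 22050, 20)` is 3000 samples behind, about 0.14s. These numbers only illustrate the mechanism; they are not a diagnosis of the drift reported above.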

A few notes:

Removing the waveform dependency makes the installation a lot simpler and makes Windows native installation possible (last I checked, libgroove was the holdup). Need to test the updated installation in various environments.
The vertical positioning puts the zero baseline in the middle, where it currently is, but now that the waveform is asymmetrical, that leaves some unused space. It might be preferable to always scale it so that the minimum value is at waveBottom and the maximum is at waveTop, even if that means the baseline will vary from video to video.
Waveform data is now being pulled as -1 to 1 instead of scaled to positive and mirrored, so the shape is somewhat different. It would be easy enough to rescale to match the old look but that may or may not be desirable.
For now, the bricks/equalizer patterns are using only the positive peak for each point, but it may make more sense to take the max of the two absolute values or something.
Also need to benchmark how long the new waveforming takes - it's probably still a small fraction compared to the frame rendering, but it's somewhat slower than the old method. It would potentially be possible to segment the samples into frames as they're coming into the stream instead of at the end, which would be a lot more efficient.
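The last note, segmenting samples into frames as they arrive, could look roughly like the sketch below. It is an illustration under assumptions, not audiogram's implementation: `StreamingSegmenter` is a hypothetical name, the input is taken to be mono float samples already decoded from the FFmpeg stream, and each frame keeps only its minimum and maximum.

```python
class StreamingSegmenter:
    """Accumulate a (min, max) pair per video frame while samples stream in."""

    def __init__(self, sample_rate, fps):
        self.sample_rate = sample_rate
        self.fps = fps
        self.seen = 0      # number of samples consumed so far
        self.frames = []   # completed (min, max) pairs, one per frame
        self._lo = None    # running minimum of the frame in progress
        self._hi = None    # running maximum of the frame in progress

    def _flush(self):
        """Close the frame in progress; a frame with no samples becomes (0.0, 0.0)."""
        if self._lo is None:
            self.frames.append((0.0, 0.0))
        else:
            self.frames.append((self._lo, self._hi))
        self._lo = self._hi = None

    def feed(self, chunk):
        """Consume one chunk of samples; chunk boundaries do not affect the result."""
        for value in chunk:
            # The frame index comes from the absolute sample position.
            frame = (self.seen * self.fps) // self.sample_rate
            while len(self.frames) < frame:
                self._flush()
            self._lo = value if self._lo is None else min(self._lo, value)
            self._hi = value if self._hi is None else max(self._hi, value)
            self.seen += 1

    def finish(self):
        """Close the final partial frame, if any, and return all frames."""
        if self._lo is not None:
            self._flush()
        return self.frames
```

Only two running values are held per frame, so memory grows with the number of frames, not the number of samples. Whether that is actually faster than segmenting at the end would still need the benchmark mentioned above.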


veltman commented on May 27, 2024

Related to the vertical positioning above, we'll likely want to switch to a logarithmic scale to better fill the space and translate back into perceived loudness (which libgroove was handling before).
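One possible shape for such a scale is a decibel mapping with a fixed floor, sketched below. The -60 dB floor, the function name, and the sign-preserving output are assumptions for illustration; decibels are only a rough proxy for perceived loudness, and this is not what libgroove did internally.

```python
import math


def log_scale(value, floor_db=-60.0):
    """Map a sample in [-1, 1] to [-1, 1] on a decibel scale, keeping its sign.

    Magnitudes at or below floor_db map to 0; full scale (0 dB) maps to 1.
    """
    magnitude = abs(value)
    if magnitude == 0:
        return 0.0
    db = 20 * math.log10(magnitude)       # 0 dB at full scale, negative below it
    scaled = 1 - db / floor_db            # linear in dB between the floor and 0 dB
    scaled = min(1.0, max(0.0, scaled))   # clamp values outside that range
    return math.copysign(scaled, value)
```

For example, a sample of 0.1 (-20 dB) maps to two thirds of full height, where a linear scale would draw it at one tenth, so quiet passages fill more of the available space.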


veltman commented on May 27, 2024

This has been merged to master.

