Comments (6)

hughrawlinson commented on June 23, 2024

Hi! Meyda operates on the buffers you give it, so if you load the audio from disk in your code and pass the appropriate buffers, that’s what Meyda will calculate the features on. Or are you looking for this feature in the CLI?

tmbops1991 commented on June 23, 2024

Thank you for your reply. It is very encouraging.

I am not using the CLI. My execution environment is a bit unusual: I am running Node.js inside Max 8 in Ableton Live, via Max for Live.

My goal is to do deep learning while DJing, so I am building an environment for music analysis, which is how I came across this library. For the deep-learning input, I want to compute MFCCs of the music and convert them into an image.

I would like to build this environment, but I am running into some problems:

  1. I am calling Meyda.extract(['mfcc'], buffer) using the node wav library you recommend, but music longer than 45 seconds takes too long to decode and the process never finishes.
  2. Decoding with AudioContext.decodeAudioData is not possible; it fails with an error: not power of two.
  3. Why is there no error when using node wav, but an error occurs with AudioContext.decodeAudioData that stops me from proceeding?

For 1, I think I can solve the problem by selecting specific time ranges within the music and decoding only those.
As for 2 and 3, I have spent hours searching this repository's issues trying to solve them, but can't. I have changed Meyda.bufferSize and tried many times, to no avail.
I think the discussion at this link applies to my current problem.

I would really appreciate any good advice you can give me.

I'm sorry if my English is not easy to read; it was translated with DeepL.

hughrawlinson commented on June 23, 2024

Sorry that the error you encountered wasn't clear enough! The issue is that the length of your buffer always needs to be a power of 2, so if you can cut your audio to a power-of-2 length, you will be able to use this method.

An important piece of context is that, for the purposes of audio analysis, you can add silence to the start and end of your buffer and you will end up with the same result. So if you have 45 seconds of audio at the CD-quality sample rate of 44,100 Hz, you have 45*44100 = 1,984,500 samples. The next power of 2 is 2^21 = 2,097,152, which means you can append 2^21 - 45*44100 = 112,652 zeros to the end of your signal, and then you will be able to extract audio features for this audio.
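
For illustration, here is a minimal sketch of that padding step in plain JavaScript (nextPowerOfTwo and padToPowerOfTwo are just example helper names, not part of Meyda's API):

function nextPowerOfTwo(n) {
  // Smallest power of 2 that is greater than or equal to n
  let p = 1;
  while (p < n) {
    p *= 2;
  }
  return p;
}

// samples is a Float32Array (or plain array) of audio samples
function padToPowerOfTwo(samples) {
  // Allocate a zero-filled buffer of the next power-of-two length
  // and copy the original samples into the start of it
  const padded = new Float32Array(nextPowerOfTwo(samples.length));
  padded.set(samples);
  return padded;
}

// e.g. 45 s at 44,100 Hz: 1,984,500 samples are padded out to 2,097,152 (2^21)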

Another approach would be to take your 45 seconds of audio and cut it into chunks, then extract the features from these chunks and take the average result as representative of the whole sound. For example, splitting the signal into chunks that are of length 2^16 would result in chunks that are roughly 1.5 seconds long. An additional benefit of this approach is that you will have information about how the audio features change over the course of the sound, which you can use in machine learning applications as a feature in and of itself.

I hope that helps, let me know if I can clarify further!

tmbops1991 commented on June 23, 2024

I would love to know how to split the audio into chunks, decode only the specified chunk, and pass it to Meyda. How do I chunk it after reading the file in with fs?

hughrawlinson commented on June 23, 2024

If you had an array that was 11 elements long (like an audio buffer with 11 samples), and you wanted to chunk that buffer up into arrays that are 3 elements long, you could do something like:

const myArray = [0,0,0,0,0,0,0,0,0,0,0];


// array is the buffer, and n is the size of the chunks
function chunk(array, n) {
  // Copy the array to avoid modifying the original
  let myArrayCopy = [...array];

  // Create a second array to store each of the chunks
  let chunks = [];
  // While there are still enough elements in the array to create a chunk of the right size...
  while (myArrayCopy.length >= n) {
    // Take the first n elements, remove them from the array, and push them as an array onto the chunks we'll return
    chunks.push(myArrayCopy.splice(0,n));
  }

  // Here you may be left with some remaining samples. You can decide whether to discard them, to add zeros
  // to the end of the array as I described in my previous comment, or to return an incomplete chunk - whatever is
  // appropriate for your code. Just make sure to only pass buffers to meyda of the correct length.

  return chunks;
}

That code would return an array as follows:

const chunkedArray = chunk(myArray, 3);
console.log(chunkedArray);
[
  [0,0,0],
  [0,0,0],
  [0,0,0],
]

So now you end up with multiple buffers, each representing a shorter part of the signal. To go back to my example above, if you have an audio recording loaded from disk that is 1,984,500 samples long, you can chunk that array of samples with chunk(buffer, Math.pow(2, 16)). This leaves you with an array of 30 new buffers, each 65,536 samples long (about 1.5 seconds of audio each), plus a remainder of 18,420 samples from the end of the original recording. You can either discard those remaining samples or append 47,116 zeros (approximately a second of silence) to pad that final chunk out to the same length as the others.

You can then run each of these buffers through Meyda, because the length of each is a power of two, and take the average of the resulting audio features to represent your original recording.
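
As a rough sketch of that last step (this assumes, based on the Meyda.extract(['mfcc'], buffer) call and the Meyda.bufferSize setting mentioned earlier in this thread, that extract returns an object with an mfcc array of coefficients; adjust to the actual API if it differs):

const Meyda = require('meyda');

// chunks is the array returned by chunk(buffer, Math.pow(2, 16))
function averageMfcc(chunks) {
  // Tell Meyda the buffer size it will be analysing
  // (Meyda.sampleRate may also need to match the recording's sample rate)
  Meyda.bufferSize = chunks[0].length;

  // Extract the MFCC vector for each chunk
  const mfccs = chunks.map(c => Meyda.extract(['mfcc'], c).mfcc);

  // Average the coefficients element-wise across all chunks
  const averaged = new Array(mfccs[0].length).fill(0);
  for (const mfcc of mfccs) {
    mfcc.forEach((value, i) => {
      averaged[i] += value / mfccs.length;
    });
  }
  return averaged;
}

The per-chunk mfccs array is also what you would keep if you want the feature trajectory over time rather than a single averaged vector.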

SUDDSDUDDS commented on June 23, 2024

Okay that's okay I'll let you know if I want it lol
