
:mega: Python library for audio augmentation

Home Page: https://superkogito.github.io/pydiogment/

License: BSD 3-Clause "New" or "Revised" License



πŸ”” Pydiogment


Pydiogment aims to simplify audio augmentation. Starting from a single mono audio file, it generates multiple augmented variants, e.g. sped-up, slowed-down, and pitch-shifted versions.

πŸ“₯ Installation

Dependencies

Pydiogment requires NumPy, SciPy, and FFmpeg:

On Linux

On Linux you can use the following commands to get the libraries:

  • Numpy: pip install numpy
  • Scipy: pip install scipy
  • FFmpeg: sudo apt install ffmpeg

On Windows

On Windows you can use the following commands and installation binaries:

  • Numpy: pip install numpy
  • Scipy: pip install scipy
  • FFmpeg: download a build from https://ffmpeg.org/download.html and add the folder containing ffmpeg.exe to your PATH

On MacOS

On macOS, use Homebrew to install the packages:

  • Numpy: brew install numpy --with-python3
  • Scipy: first install a Fortran compiler such as GFortran with brew install gfortran, then install SciPy with pip install scipy. For more information and guidelines, see https://github.com/scipy/scipy/blob/master/INSTALL.rst.txt#mac-os-x
  • FFmpeg: brew install ffmpeg

Installation

If you already have working installations of NumPy and SciPy, you can install Pydiogment using pip:

pip install pydiogment

To update an existing version of Pydiogment, use:

pip install -U pydiogment

πŸ’‘ How to use

  • Amplitude related augmentation

    • Apply a fade in and fade out effect

      from pydiogment.auga import fade_in_and_out
      
      test_file = "path/test.wav"
      fade_in_and_out(test_file)
    • Apply gain to file

      from pydiogment.auga import apply_gain
      
      test_file = "path/test.wav"
      apply_gain(test_file, -100)
      apply_gain(test_file, -50)
    • Add random Gaussian noise to a file based on a target SNR

      from pydiogment.auga import add_noise
      
      test_file = "path/test.wav"
      add_noise(test_file, 10)
  • Frequency related augmentation

    • Change file tone

      from pydiogment.augf import change_tone
      
      test_file = "path/test.wav"
      change_tone(test_file, 0.9)
      change_tone(test_file, 1.1)
  • Time related augmentation

    • Slow down / speed up a file

      from pydiogment.augt import slowdown, speed
      
      test_file = "path/test.wav"
      slowdown(test_file, 0.8)
      speed(test_file, 1.2)
    • Apply random cropping to the file

      from pydiogment.augt import random_cropping
      
      test_file = "path/test.wav"
      random_cropping(test_file, 1)
    • Shift the data along the time axis in a given direction

      from pydiogment.augt import shift_time
      
      test_file = "path/test.wav"
      shift_time(test_file, 1, "right")
      shift_time(test_file, 1, "left")
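The add_noise example above takes a target SNR in decibels. As a rough, generic sketch of that idea (not necessarily pydiogment's exact implementation), white noise can be scaled so the mix reaches the requested SNR:

```python
import numpy as np

def add_white_noise(signal, snr_db):
    """Mix white Gaussian noise into `signal` at the requested SNR (in dB)."""
    signal_power = np.mean(signal.astype(float) ** 2)
    noise = np.random.randn(len(signal))
    noise_power = np.mean(noise ** 2)
    # SNR(dB) = 10 * log10(P_signal / P_noise); solve for the noise gain
    gain = np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10.0)))
    return signal + gain * noise
```

Lower SNR values mean louder noise; 10 dB leaves the signal clearly dominant, while 0 dB gives signal and noise equal power.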
  • Audio files format

This library currently supports mono WAV files only.
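Since only mono WAV input is supported, stereo material has to be down-mixed first. A minimal sketch using SciPy (already a dependency); the `to_mono` helper name is ours, not part of pydiogment:

```python
import numpy as np
from scipy.io import wavfile

def to_mono(infile, outfile):
    """Down-mix a multi-channel WAV file to mono by averaging the channels."""
    rate, data = wavfile.read(infile)
    if data.ndim == 2:                      # shape: (samples, channels)
        data = data.mean(axis=1).astype(data.dtype)
    wavfile.write(outfile, rate, data)
```

FFmpeg can do the same from the command line with `ffmpeg -i input.wav -ac 1 mono.wav`.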

πŸ“‘ Documentation

Thorough documentation of the library is available at pydiogment.readthedocs.io.

πŸ‘· Contributing and bugs report

Contributions are welcome and encouraged. To learn more about how to contribute to Pydiogment, please refer to the Contributing guidelines.

To report bugs, request a feature, or ask for help, refer to the issues section. Before reporting a bug, please make sure it is not already addressed by an existing issue, and include your operating system type and version as well as the versions of the dependencies used.

πŸŽ‰ Acknowledgment and credits

pydiogment's People

Contributors

danielskatz, hmmalek, superkogito


pydiogment's Issues

[JOSS Review] unit tests and output files

reference review issue

While going through the "automated tests" section, I noticed a couple of things about the unit-test framework that could be improved.

  1. All tests seem to check only for the existence of the output file, which does not confirm correct behavior. For many of the tests it would be simple enough to detect, e.g., changes in length or gain. For others (e.g. IR) you might need to do regression testing against known correct outputs.
  2. Minor thing, but your tests don't clean up after themselves. If you're using something like Travis with build caching enabled, outputs from old runs could linger and lead to spurious passing tests (because the files exist from a previous run). Since you're using pytest already, it wouldn't be much work to use a tmp_path fixture to prevent this kind of behavior: https://docs.pytest.org/en/latest/tmpdir.html , or otherwise clean up old outputs after tests execute (pass or fail).
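A sketch of what addressing both points could look like, using a hypothetical stand-in for apply_gain (the real library's output naming differs):

```python
import numpy as np
from scipy.io import wavfile

def apply_gain_stub(infile, gain_db):
    """Hypothetical stand-in for pydiogment.auga.apply_gain.

    Writes '<name>_gained.wav' next to the input; pydiogment's actual
    output naming scheme differs.
    """
    rate, sig = wavfile.read(infile)
    out = (sig * 10 ** (gain_db / 20.0)).astype(sig.dtype)
    wavfile.write(infile.replace(".wav", "_gained.wav"), rate, out)

def test_gain_reduces_amplitude(tmp_path):
    # tmp_path is a fresh directory for every test run, so a leftover
    # output file from a cached previous run can never make this pass.
    infile = str(tmp_path / "test.wav")
    sig = (np.random.randn(8000) * 5000).astype(np.int16)
    wavfile.write(infile, 8000, sig)
    apply_gain_stub(infile, -20)
    _, out = wavfile.read(str(tmp_path / "test_gained.wav"))
    # behavioral assertion instead of a bare existence check
    assert np.abs(out.astype(int)).max() < np.abs(sig.astype(int)).max()
```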

Point (2) would be much easier if the API were extended to allow the user to specify a target output path (#11). I see that you use the output filename to encode the deformation parameters, and letting a user specify the path exactly might make that difficult. I have some thoughts about how you might be able to accomplish both things (i.e. via string interpolation), but that might be out of scope for the testing issue.
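One possible shape for that string-interpolation idea (the name and scheme below are hypothetical, not pydiogment API):

```python
import os

def build_output_path(infile, suffix, outdir=None):
    """Return '<stem>_<suffix><ext>' in outdir (default: next to the input).

    `suffix` encodes the deformation parameters, e.g. 'fade' or 'gain-20dB',
    so the parameter-carrying filenames are preserved while the destination
    directory stays user-controlled.
    """
    directory, name = os.path.split(infile)
    stem, ext = os.path.splitext(name)
    return os.path.join(outdir or directory, "%s_%s%s" % (stem, suffix, ext))
```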

[JOSS Review] Paper details, related work, experiments

review issue

I read through the paper, and have a few comments for improving it.

Related work

A few projects that aren't cited, but should be:

  • Mauch, Matthias, and Sebastian Ewert. "The audio degradation toolbox and its application to robustness evaluation." (2013). (And python port: https://github.com/sevagh/audio-degradation-toolbox )
  • SchlΓΌter, Jan, and Thomas Grill. "Exploring Data Augmentation for Improved Singing Voice Detection with Neural Networks." ISMIR. 2015.

More generally, it's not clear in the writeup how this project compares to the existing alternatives, functionality-wise. (Full disclosure, there's a bit of an awkward situation here as I'm the author of one of these alternative toolboxes, but I'll try to be objective 😁.) I know these papers are meant to be brief, but it's also important to properly establish context, and make clear to readers how this package differs from others.

(As an aside: I don't think the characterization of muda is entirely accurate: we use it for all sorts of things outside of music, notably environmental sound and bioacoustics.)

Experiment details

It's not clear in the writeup whether your experiment includes augmentation during testing, or only during training.

Paper review

Proposed modifications to the paper :

  • Page 1: Summary, line 6: "and most deformation..." -> "as most deformation..." (rather an explanation)
  • Page 1: Summary, line 11: "the scipy..." -> "the Scipy..."
  • Page 1: Amplitude based augmentation, Add Fade, line 1: "and a fade-out effects..." -> "and a fade-out effect..."
  • Page 2: Add Noise: move the entire equation onto a separate line.
  • Page 3, line 2: "a separating features..." -> "a separating feature..."
  • Page 3: Conclusion, line 2: "These strategies aims..." -> "These strategies aim..."

[JOSS review] supported audio formats

The README, docs, and docstrings don't specify which audio formats are supported. Please add this information to all three.

[JOSS review] Docstrings missing information for output parameters

The docstrings are missing information about what the augmentation functions actually return.

For example, the docs for fade_in_and_out say:

pydiogment.auga.fade_in_and_out(infile)
Add a fade in and out effect to the audio file.

Args:
infile (str) : input filename/path.

This doesn't explain where the processed audio goes (saved to disk? returned as numpy array? as some other data format?), and whether there are other output parameters. The same applies to the other augmentation functions.
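For example, a docstring along these lines would close the gap (the output behavior it describes must of course match the actual implementation):

```python
def fade_in_and_out(infile):
    """Add a fade-in and fade-out effect to the audio file.

    Args:
        infile (str): input filename/path.

    Returns:
        None. The processed audio is not returned; it is written to disk
        as a new WAV file (the naming scheme and destination directory
        should be stated here explicitly).
    """
```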

[JOSS review] installation instructions for macOS and Windows

The installation instructions in the README and the docs assume the user is on a Linux OS that supports apt install, which isn't the case for Windows and macOS.

Please add instructions for installing non-Python dependencies on macOS and Windows, unless this package only supports Linux, in which case that should be clearly stated in the README/docs.

I have an example of instructions for installing ffmpeg across all 3 OSs here:
https://github.com/justinsalamon/scaper/#non-python-dependencies

[JOSS review] fade_in_and_out doesn't work as expected

Just tried fade_in_and_out() and got unexpected behavior.

Here are my input/output files (the input is called input.wav): fade_bug.zip

The input is a 2-second, 16-bit, mono, WAV file. The output is heavily distorted and doesn't include the expected fade-in and fade-out behavior.

While on this function, also note that it would be convenient for the user to be able to choose the destination folder as well as the filename for the output, or at least a custom suffix, as opposed to the hard-coded output path and filename.
