harmonai-org / sample-generator

Tools to train a generative model on arbitrary audio samples

License: MIT License

Python 28.52% Jupyter Notebook 71.48%

sample-generator's Introduction

sample-generator

Tools to train a generative model on arbitrary audio samples

Dance Diffusion notebook:

Dance Diffusion fine-tune notebook:

Prerequisites

Dance Diffusion requires Python 3.7+

You can install the required packages by running `pip install .` from the root of the repo.
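
The version requirement above can be checked from Python before installing. This is a minimal sketch; the helper name `meets_minimum` is illustrative and not part of the repo.

```python
import sys

# Minimum interpreter version stated in the Prerequisites section.
MIN_PYTHON = (3, 7)


def meets_minimum(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```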

Todo

Add inference notebook
Add interpolations to notebook
Add fine-tune notebook
Add guidance to notebook

sample-generator's Issues

Error when trying to run "Imports and definitions"

I've got this error

`ModuleNotFoundError Traceback (most recent call last)
in <cell line: 12>()
10 import gc
11
---> 12 from diffusion import sampling
13 import torch
14 from torch import optim, nn

ModuleNotFoundError: No module named 'diffusion'`

I tried manually installing diffusion by running
`!pip install diffusion`

but then I get
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jsonschema 4.19.2 requires attrs>=22.2.0, but you have attrs 21.4.0 which is incompatible.
referencing 0.34.0 requires attrs>=22.2.0, but you have attrs 21.4.0 which is incompatible.
Successfully installed attrs-21.4.0 cbor2-5.6.2 diffusion-6.10.2 diffusion-core-0.0.65 stringcase-1.2.0 structlog-21.5.0
WARNING: The following packages were previously imported in this runtime:
[attr,attrs]
You must restart the runtime in order to use newly installed versions.

and together with it I get

`TypeError Traceback (most recent call last)
in <cell line: 3>()
1 #@title Imports and definitions
2 get_ipython().system('pip install diffusion')
----> 3 from prefigure.prefigure import get_all_args
4 from contextlib import contextmanager
5 from copy import deepcopy

26 frames
/usr/local/lib/python3.10/dist-packages/prefigure/__init__.py in <module>
1 from .prefigure import *
----> 2 from .ofc import *

/usr/local/lib/python3.10/dist-packages/prefigure/ofc.py in <module>
10 from prefigure import get_all_args, arg_eval
11 import configparser
---> 12 import gradio as gr
13 import gradio.blocks as gb
14 import itertools

/usr/local/lib/python3.10/dist-packages/gradio/__init__.py in <module>
1 import json
2
----> 3 import gradio._simple_templates
4 import gradio.image_utils
5 import gradio.processing_utils

/usr/local/lib/python3.10/dist-packages/gradio/_simple_templates/__init__.py in <module>
----> 1 from .simpledropdown import SimpleDropdown
2 from .simpleimage import SimpleImage
3 from .simpletextbox import SimpleTextbox
4
5 __all__ = ["SimpleDropdown", "SimpleTextbox", "SimpleImage"]

/usr/local/lib/python3.10/dist-packages/gradio/_simple_templates/simpledropdown.py in <module>
4 from typing import Any, Callable
5
----> 6 from gradio.components.base import FormComponent
7 from gradio.events import Events
8

/usr/local/lib/python3.10/dist-packages/gradio/components/__init__.py in <module>
1 from gradio.components.annotated_image import AnnotatedImage
2 from gradio.components.audio import Audio
----> 3 from gradio.components.bar_plot import BarPlot
4 from gradio.components.base import (
5 Component,

/usr/local/lib/python3.10/dist-packages/gradio/components/bar_plot.py in <module>
5 from typing import Any, Callable, Literal
6
----> 7 import altair as alt
8 import pandas as pd
9 from gradio_client.documentation import document

/usr/local/lib/python3.10/dist-packages/altair/__init__.py in <module>
2 __version__ = "4.2.2"
3
----> 4 from .vegalite import *
5 from . import examples
6

/usr/local/lib/python3.10/dist-packages/altair/vegalite/__init__.py in <module>
1 # flake8: noqa
----> 2 from .v4 import *

/usr/lib/python3.10/importlib/_bootstrap.py in _find_and_load(name, import_)

/usr/lib/python3.10/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)

/usr/lib/python3.10/importlib/_bootstrap.py in _load_unlocked(spec)

/usr/lib/python3.10/importlib/_bootstrap.py in _load_backward_compatible(spec)

/usr/local/lib/python3.10/dist-packages/google/colab/_import_hooks/_altair.py in load_module(self, fullname)
36 """Loads Altair normally and runs pre-initialization code."""
37 previously_loaded = fullname in sys.modules
---> 38 altair_module = imp.load_module(fullname, *self.module_info)
39
40 if not previously_loaded:

/usr/lib/python3.10/imp.py in load_module(name, file, filename, details)
243 return load_dynamic(name, filename, file)
244 elif type_ == PKG_DIRECTORY:
--> 245 return load_package(name, filename)
246 elif type_ == C_BUILTIN:
247 return init_builtin(name)

/usr/lib/python3.10/imp.py in load_package(name, path)
215 return _exec(spec, sys.modules[name])
216 else:
--> 217 return _load(spec)
218
219

/usr/local/lib/python3.10/dist-packages/altair/vegalite/v4/__init__.py in <module>
1 # flake8: noqa
----> 2 from .schema import *
3 from .api import *
4
5 from ...datasets import list_datasets, load_dataset

/usr/local/lib/python3.10/dist-packages/altair/vegalite/v4/schema/__init__.py in <module>
1 # flake8: noqa
----> 2 from .core import *
3 from .channels import *
4 SCHEMA_VERSION = 'v4.17.0'
5 SCHEMA_URL = 'https://vega.github.io/schema/vega-lite/v4.17.0.json'

/usr/local/lib/python3.10/dist-packages/altair/vegalite/v4/schema/core.py in <module>
2 # tools/generate_schema_wrapper.py. Do not modify directly.
3
----> 4 from altair.utils.schemapi import SchemaBase, Undefined, _subclasses
5
6 import pkgutil

/usr/local/lib/python3.10/dist-packages/altair/utils/__init__.py in <module>
----> 1 from .core import (
2 infer_vegalite_type,
3 infer_encoding_types,
4 sanitize_dataframe,
5 parse_shorthand,

/usr/local/lib/python3.10/dist-packages/altair/utils/core.py in <module>
11 import warnings
12
---> 13 import jsonschema
14 import pandas as pd
15 import numpy as np

/usr/local/lib/python3.10/dist-packages/jsonschema/__init__.py in <module>
11 import warnings
12
---> 13 from jsonschema._format import FormatChecker
14 from jsonschema._types import TypeChecker
15 from jsonschema.exceptions import SchemaError, ValidationError

/usr/local/lib/python3.10/dist-packages/jsonschema/_format.py in <module>
9 import warnings
10
---> 11 from jsonschema.exceptions import FormatError
12
13 _FormatCheckCallable = typing.Callable[[object], bool]

/usr/local/lib/python3.10/dist-packages/jsonschema/exceptions.py in <module>
13
14 from attrs import define
---> 15 from referencing.exceptions import Unresolvable as _Unresolvable
16
17 from jsonschema import _utils

/usr/local/lib/python3.10/dist-packages/referencing/__init__.py in <module>
3 """
4
----> 5 from referencing._core import Anchor, Registry, Resource, Specification
6
7 __all__ = ["Anchor", "Registry", "Resource", "Specification"]

/usr/local/lib/python3.10/dist-packages/referencing/_core.py in <module>
84
85 @frozen
---> 86 class Specification(Generic[D]):
87 """
88 A specification which defines referencing behavior.

/usr/local/lib/python3.10/dist-packages/referencing/_core.py in Specification()
110 [Specification[D], D],
111 Iterable[AnchorType[D]],
--> 112 ] = field(alias="anchors_in")
113
114 #: An opaque specification where resources have no subresources

TypeError: field() got an unexpected keyword argument 'alias'`
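
The two tracebacks above can be checked separately. The pip output shows that installing the PyPI package `diffusion` left `attrs` at 21.4.0, while pip reports that `jsonschema` and `referencing` require `attrs>=22.2.0`; the final `TypeError` about `alias` is consistent with that mismatch. The PyPI package may also not be the module the notebook expects for `from diffusion import sampling`. The sketch below only inspects the environment: it reports which file a module name resolves to and whether an installed distribution meets a minimum version. The helper names are illustrative, and `importlib.metadata` assumes Python 3.8 or newer.

```python
import importlib.metadata
import importlib.util


def module_origin(name):
    """Return where a top-level module would be imported from, or None if not importable."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    # Namespace packages have no single origin file.
    return spec.origin or "namespace package"


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Parse leading numeric components, e.g. '22.2.0' -> (22, 2, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def satisfies_minimum(installed, minimum):
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    print("diffusion resolves to:", module_origin("diffusion"))
    attrs_version = installed_version("attrs")
    print("attrs installed:", attrs_version)
    if attrs_version is not None:
        # 22.2.0 is the minimum quoted in the pip output above.
        print("attrs >= 22.2.0:", satisfies_minimum(attrs_version, "22.2.0"))
```

If `attrs` turns out to be older than 22.2.0, upgrading it and restarting the runtime (as the pip warning suggests) is a reasonable first thing to try.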

name 'args' is not defined

When trying to create the model, I can't seem to make it work. I get the following message: name 'args' is not defined
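
In a notebook, a `NameError` like this often means the cell that defines the name has not been run in the current runtime, for example after a restart. A minimal sketch of a guard that makes this explicit follows; `require_names` is a hypothetical helper, not something the notebook provides.

```python
def require_names(namespace, *names):
    """Raise a descriptive NameError if any of `names` is missing from `namespace`."""
    missing = [name for name in names if name not in namespace]
    if missing:
        raise NameError(
            "Not defined yet: " + ", ".join(missing)
            + ". Run the earlier notebook cells that define them, then retry."
        )
```

Calling `require_names(globals(), "args")` at the top of the model-creation cell would then fail with a message pointing back to the earlier cells.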

Epoch Demo Quality versus Normal Inference

Hi,

I have been fine-tuning my own model with your fine-tuning Colab notebook, and the demos I am hearing on wandb are of much higher quality than what I get when I generate audio from the same models through your 'Dance Diffusion' Colab. Do you know what parameters the previews are generated with, and how I can recreate them when generating audio from my models?

Where to find the model for the finetune notebook and can we use other models from huggingface spaces?

For the finetuning notebook:

I can't find where to download jmann-small-190k.ckpt from. Also, since jmann-small-190k.ckpt gives poor-quality outputs on Hugging Face Spaces, I was curious whether I could use jmann-large-580k directly with the same code (assuming it's the .bin file).

Thank you so much for releasing this research effort online!

VRAM requirements?

How much VRAM does it need for either training or just inference?
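
The thread gives no figures, but the memory available on a local GPU can be checked before trying. This is a minimal sketch assuming PyTorch is installed (the notebook imports `torch`); it returns `None` when PyTorch or a CUDA device is unavailable. The function names are illustrative.

```python
def total_vram_bytes(device_index=0):
    """Return the total memory of a CUDA device in bytes, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(device_index).total_memory


def format_gib(num_bytes):
    """Format a byte count as GiB with one decimal place."""
    return f"{num_bytes / 1024**3:.1f} GiB"


if __name__ == "__main__":
    vram = total_vram_bytes()
    print("No CUDA device found" if vram is None else f"Total VRAM: {format_gib(vram)}")
```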

Please help with (probably) a simple error

I've just started playing around with Dance Diffusion, and I'm having the time of my life!
I've trained a model and had no problems with the Fine Tuning Notebook. However, I've encountered an error with 'arg' on the main Dance Diffusion Notebook.

Please see bottom panel for the final error.

Please forgive my ignorance...I suspect it's a simple problem, but I'm not a Python coder. Thank you in advance for your help.

Error: Tensor type unknown to einops

This is a new one that I haven't seen before.

`RuntimeError Traceback (most recent call last)
in
42
43 print("Regenerated audio samples")
---> 44 plot_and_hear(rearrange(generated, 'b d n -> d (b n)'), args.sample_rate)
45
46 for ix, gen_sample in enumerate(generated):

3 frames
/usr/local/lib/python3.8/dist-packages/einops/_backends.py in get_backend(tensor)
50 return backend
51
---> 52 raise RuntimeError('Tensor type unknown to einops {}'.format(type(tensor)))
53
54

RuntimeError: Tensor type unknown to einops <class 'NoneType'>`
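
The message shows that einops received `None`, so `generated` was `None` when this cell ran. The pattern `'b d n -> d (b n)'` keeps the channel axis and concatenates the batch items along time. The sketch below illustrates that pattern on nested lists in pure Python and adds an explicit check for `None`; it is an illustration with a hypothetical function name, not the notebook's code.

```python
def concat_batch_per_channel(generated):
    """Nested-list equivalent of rearrange(generated, 'b d n -> d (b n)')."""
    if generated is None:
        raise ValueError(
            "generated is None; check that the sampling step returned audio before plotting."
        )
    if not generated:
        return []
    n_channels = len(generated[0])
    # For each channel, append the samples of every batch item in order.
    return [
        [sample for item in generated for sample in item[channel]]
        for channel in range(n_channels)
    ]
```

For a batch of two items with two channels each, `[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]`, the result is `[[1, 2, 5, 6], [3, 4, 7, 8]]`.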

Training from scratch

Hi, thanks for the excellent library! How would I train the model from scratch from a folder full of .wav files?
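
Before training on a folder of .wav files, it can help to confirm that the files are readable and share a sample rate. The sketch below is a pre-flight check using only the standard library; it is not the repo's training entry point, and the function name is illustrative. The `wave` module reads PCM WAV only, so other encodings are reported as unreadable.

```python
import wave
from collections import Counter
from pathlib import Path


def summarize_wav_folder(folder):
    """Return (readable_count, Counter of sample rates, list of unreadable paths)."""
    rates = Counter()
    unreadable = []
    count = 0
    # rglob also descends into subfolders; matching is case-sensitive on Linux.
    for path in sorted(Path(folder).rglob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wav_file:
                rates[wav_file.getframerate()] += 1
            count += 1
        except (wave.Error, EOFError):
            unreadable.append(path)
    return count, rates, unreadable


if __name__ == "__main__":
    count, rates, unreadable = summarize_wav_folder(".")
    print(f"{count} readable files, sample rates: {dict(rates)}")
    print(f"{len(unreadable)} unreadable files")
```

More than one key in the sample-rate counter suggests the dataset needs resampling or filtering first.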

Paper?

Is this code based on some paper?

Does it work?

I haven't seen people posting ways to generate music using AI, locally hosted that is.

Curious about issues encountered during training

Hi there

Cool project! Was just watching your interview with Weights & Biases.

I tried to do a similar thing with StyleGAN back in 2018 or so. Basically changed 2D to 1D everywhere and that was about it. Trained on raw waveforms of around 10,000 kick drum samples at, I believe, 44.1 kHz.

The results sounded pretty good, but I was always getting these high frequency artifacts. Sounded like a very light bitcrusher effect. I always thought it was some by-product of the convolution and upsampling layers. It seems like your results don't have this problem. I wonder if you encountered anything like this (or any other problems) and how you might have overcome them?

Would be great if you write a blog post or paper actually!

Cheers,
Liam
