
parlance-zz / g-diffuser-bot

278 stars · 9 watchers · 23 forks · 8.82 MB

Discord bot and Interface for Stable Diffusion

Home Page: https://www.g-diffuser.com

License: MIT License

Python 100.00%
discord-bot stable-diffusion diffusers ai-art artificial-intelligence generative-art image-generation img2img inpainting latent-diffusion

g-diffuser-bot's Introduction

Listed below is a collection of solo projects undertaken purely out of personal interest. I will be uploading some of these projects as a demonstration of my abilities to potential employers; at the time, however, I did not have the confidence that any of them were worth preserving. There may be other versions, or unfinished breaking changes, in what is published to my personal repository. If there is interest, I will look through my old files and try to find other versions to publish.

Pre-2004:

  • Simple real-time fluid simulation and software renderer (BlitzPlus)
  • Partially functional NES emulator: CPU, PPU, debugger, and basic memory mapper; able to display the Zelda title screen and run some homebrew demos, without sound (BlitzPlus, 6502 ASM) (https://github.com/parlance-zz/parlance-zz/tree/main/pnes)
  • 8kb "demo" real-time music synth with DSP, offline MIDI conversion tools (C++, OpenGL)
  • Multiplayer Xbox homebrew game (C++ with Xbox SDK / DirectX) (not my video; a user managed to get it running on the Xbox 360 through emulation: https://www.youtube.com/watch?v=YwC_8DD_GJg)
  • Unfinished clone of the Quake 3 game engine (before the source was public) on Xbox, with BSP and MD3 rendering and a native Xbox shader compiler for Q3 materials (C++, XDK / DirectX and shader assembly). At the end of development it was capable of loading and rendering any Q3 map with all Q3 shader features supported, rendering maps in 4x splitscreen with dozens of animated characters at 60 fps on the original Xbox. (https://github.com/parlance-zz/parlance-zz/tree/main/projectx)

Post-2004:

While working for my current employer:

  • Conversion tool from the Windows Server 2003 scheduled-tasks exported binary format to the Server 2008 XML format (C++)
  • Windows service for configurable wake-on-LAN proxy / broadcast (C++, WinPCAP)
  • CGI <-> Powershell interface for Microsoft IIS (Powershell)
  • Active Directory, Microsoft Exchange, SCCM, and OOB server management tools with web interface (Powershell)
  • Automated user / student change management tools (Powershell, XML, REST APIs)
  • Dynamically generated web phonebook, integrating Active Directory and Cisco Unified Communications (PHP, SQL, SOAP, Cisco AXL)
  • Key Module programming web app for Cisco Unified Communication environments (PHP, SOAP, AXL)
  • Powershell API and tools for Xerox Docushare (Powershell, SOAP)

Languages:

  • C++
  • C
  • Assembly
  • Python
  • C#
  • Powershell
  • PHP
  • SQL
  • Javascript
  • Java
  • Lua

Other Relevant Skills:

  • Math skills - Linear algebra, basic calculus, some complex analysis
  • Deep knowledge and experience with networking and common network protocols
  • Deep knowledge and experience with Windows client and server; OS internals and system administration
  • Project management and organization
  • Good written and oral communication skills

Caveats:

  • I do not have a computer science degree; I dropped out in second year for non-academic reasons
  • I do not have any other certifications related specifically to programming or coding; most of what I know is self-taught from an early age
  • Compared to other areas, my web dev experience (especially front end) is somewhat limited
  • At least for the time being I am not able to relocate; I can only accept remote positions

https://www.stablecabal.org


g-diffuser-bot's Issues

add output_filename=etc

Allow the user to specify output filenames, including the ability to insert values from arguments as part of the filename,
e.g. sample('prompt', output_filename=f"{prompt}.{seed}.{steps}")
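A minimal sketch of what this templating could look like (the helper name and sanitization rules are my assumptions, not the project's actual code):

```python
import re

def format_output_filename(template, **values):
    """Hypothetical helper: expand user-supplied placeholders such as
    "{prompt}.{seed}.{steps}" into a filesystem-safe filename."""
    name = template.format(**values)           # substitute argument values
    name = re.sub(r'[\\/:*?"<>|]', "_", name)  # strip characters invalid on Windows
    return name[:128]                          # keep the name a sane length

# example: format_output_filename("{prompt}.{seed}.{steps}",
#                                 prompt="a cat", seed=42, steps=50)
```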

Remote command server separation and clustering

Add “--remote” server option to command_server to allow accepting connections from non-localhost using a pre-shared secret token for auth.

In remote mode the out_attachments list will use URLs instead of local file paths

G_diffuser_bot.py should support this and download from those URLs in remote mode

You should be able to specify a list of nodes in your g_diffuser_bot_config.py and have the discord bot use all of those nodes, robustly distributing commands
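The pre-shared token check described above could be as simple as the sketch below (the constant name and config source are assumptions; only the constant-time comparison is the point):

```python
import hmac

# hypothetical value, assumed to come from g_diffuser_bot_config.py
PRE_SHARED_TOKEN = "change-me"

def authorize(request_token):
    """Compare the client's token against the shared secret in constant
    time, so remote connections can't probe the token byte by byte."""
    return hmac.compare_digest(request_token, PRE_SHARED_TOKEN)
```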

Better acknowledgement messages for discord bot

Alter ‘gimme a sec’ message to include acknowledgement of attached image (“Okay @lootsorrow, generating with unmasked image” or “generating with alpha masked image” or “generating with no image input”).

Notify user in response when they exceed param limits

Add bot command to show all default params and limits / ranges

feature idea: panoramic or multi-stage outpainting mode

    feature idea: panoramic or multi-stage outpainting mode

basically if you take a 512x512 image where one half is image and the right half is erased (like the ghibli shack pic you were using during testing) and outpaint a new half for it, it would be nice if there were an easy way (from inside the gallery viewer even?) to take that new half and place it in its own new image all the way on the left so that the right side is once again blank/erased, and outpaint again, eventually creating a panorama (could also be done vertically of course, or maybe even in multiple directions, but that would require more RAM?).
For double extra bonus credits, the gallery viewer should be able to take the original starting image and automatically append it to the left side of the outputs from the second stage (so the second stage would be generating a 512x512 and the gallery would be taking your original 256x512 chunk and glueing it to the side) so you can more easily/quickly see which new images best 'match' the shape or feel of the original.

Originally posted by @lootsorrow in #36 (comment)
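The shifting step described in the comment above could be sketched like this (a PIL-based assumption on my part, not anything the project implements):

```python
from PIL import Image

def shift_for_next_stage(outpainted, width=512, height=512):
    """Take the newly generated right half of an outpainted image, move it
    to the left of a fresh canvas, and leave the right half fully erased
    (alpha 0) so the next outpainting pass can extend the panorama."""
    canvas = Image.new("RGBA", (width, height), (0, 0, 0, 0))
    right_half = outpainted.crop((width // 2, 0, width, height))
    canvas.paste(right_half.convert("RGBA"), (0, 0))
    return canvas  # left half: previous output; right half: blank/erased
```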

eliminate txt2img pipeline

create an identity image/seed/noise for use with img2img/inpainting that functionally acts like txt2img, removing the need to load a separate txt2img pipeline
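One possible shape of the identity input (an assumption, not the project's actual code): a blank init image driven at strength 1.0, so the img2img pipeline discards it entirely and behaves like txt2img.

```python
from PIL import Image

def make_identity_init(width=512, height=512):
    """Blank mid-grey init image; with strength=1.0 an img2img pipeline
    replaces it completely, so the call behaves like txt2img."""
    return Image.new("RGB", (width, height), (127, 127, 127))

# hypothetical call shape, assuming a diffusers-style img2img pipeline:
# pipe(prompt, init_image=make_identity_init(), strength=1.0)
```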

[enhancement] RAM management

I noticed that 4 pipelines are loaded when the bot starts,
namely: diffuser, txt2img, img2img, img_inp.
Using the optimized mode, it took around 8.5 GB of memory to load the bot.

possible solution:

import gc
import torch

example_model = ExampleModel().cuda()

# drop the last reference, then force a collection pass
del example_model
gc.collect()

# cached blocks normally stay allocated until something takes their place
torch.cuda.empty_cache()

https://gist.github.com/ejmejm/1baeddbbe48f58dbced9c019c25ebf71

Command server robustness improvements

Have the command server check for cancellation after every sample to waste less time until diffusers pipes can actually be aborted

Prevent the command server from starting 2 commands simultaneously (as when discord bot is on multiple servers)

queue cmds when cmd server not ready in discord bot

auto restart cmd server if unresponsive

new admin cmd to restart cmd server

Cmd server should dynamically re-import g_diffuser_lib in DEBUG_MODE
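The first item above (checking for cancellation between samples) could look like this sketch; the event name and handler wiring are assumptions:

```python
import threading

# assumed to be set by the cancel/!stop handler
cancel_event = threading.Event()

def run_samples(num_samples, sample_fn):
    """Check for cancellation after every sample, so a long batch can be
    cut short even though an individual diffusers call can't be aborted."""
    results = []
    for i in range(num_samples):
        results.append(sample_fn(i))
        if cancel_event.is_set():
            break  # stop wasting time on the rest of the batch
    return results
```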

Quick restart for CLI

add a command to restart the CLI with optional parameter to change models in the process

Create custom gallery viewer for outputs

The new output system is finally complete.

Build a custom output gallery browser that slurps up all the json in the outputs path and can easily browse, sort, filter, organize / tag / save, and one-click delete images.
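The "slurp up all the json" part might look like the sketch below; the sidecar layout and the "time" field are assumptions about what the new output system writes:

```python
import json
from pathlib import Path

def sort_records(records):
    """Newest-first ordering; assumes each record carries a "time" field."""
    return sorted(records, key=lambda r: r.get("time", 0), reverse=True)

def load_gallery_index(outputs_path):
    """Hypothetical sketch: read every sidecar .json under the outputs path
    into a list of records the gallery can browse, sort, and filter."""
    records = []
    for meta_file in Path(outputs_path).glob("**/*.json"):
        with open(meta_file, "r", encoding="utf-8") as f:
            record = json.load(f)
        record["_meta_path"] = str(meta_file)  # remember the sidecar location
        records.append(record)
    return sort_records(records)
```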

start_interactive_cli.bat doesn't have working command history

pushd %0\..\
cmd /k "conda run -n g_diffuser --no-capture-output python g_diffuser_cli.py --interactive"

I've removed this file for now because it is extremely annoying to not have command history. The issue is due to a conda bug which can easily be reproduced (at least on Windows) by running:

conda run -n some_env --no-capture-output python

.. then trying to use the up arrow to browse command history.

add the ability to explicitly specify output folder when using CLI

outputfolder= : create a folder with the given name (if it doesn't exist) and place all outputs for the current batch into that folder. Allow for /?

Dynamic file name creation, where the user can specify which pieces of data are used to build the file name, e.g. filename= would create files named 00001_12_dd.mm.yy, 00002_12_dd.mm.yy, 00003_.., etc.
E.g. filename= → 00001_hh.mm.ss_dd.mm.yy
An optional -folder parameter would append the exact string value from filename= to the end of the output folder name.
E.g. outputfolder=Sexy Dogs
filename=
would create a file called “00001_9/19/22.png” in a folder called “Sexy Dogs”
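The folder creation plus sequential, timestamped numbering described above could be sketched as follows (function name, padding width, and timestamp format are assumptions):

```python
import os
from datetime import datetime

def next_output_path(output_folder, suffix=""):
    """Hypothetical sketch: create the batch output folder if needed and
    return the next zero-padded, timestamped .png filename inside it."""
    os.makedirs(output_folder, exist_ok=True)
    existing = [f for f in os.listdir(output_folder) if f.endswith(".png")]
    index = len(existing) + 1                         # simple sequential counter
    stamp = datetime.now().strftime("%H.%M.%S_%d.%m.%y")
    return os.path.join(output_folder, f"{index:05d}_{stamp}{suffix}.png")
```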

Important enhancements / bugfixes to outpainting

Is it possible to develop a latent space encoding / decoding for sparse non-linear data? (as opposed to dense linear data). If so, you could use diffusion models for things like text and tilemaps.

Try using the same techniques in _get_shaped_noise on the latent space representations of the src, noise, and masks, maybe try varying str or scale over steps for better annealing

Clean up DEBUG_MODE

cleanup debug printing, logging, exceptions, DEBUG_MODE

add global catch-all exception handler

move model args to sub-namespace in args

model_name
use_optimized
loaded_pipes
pipe_list

after this is done amend load_pipelines to save these globally, and amend get_samples to overwrite / re-attach these args to incoming args (with warning if any mismatch)
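A sketch of the sub-namespace move (not the project's actual code; the attribute list is taken from the issue above):

```python
from argparse import Namespace

MODEL_ARG_NAMES = ["model_name", "use_optimized", "loaded_pipes", "pipe_list"]

def move_model_args(args):
    """Move the model-related attributes into an args.model sub-namespace
    so load_pipelines/get_samples can save and re-attach them as a unit."""
    model = Namespace()
    for name in MODEL_ARG_NAMES:
        if hasattr(args, name):
            setattr(model, name, getattr(args, name))
            delattr(args, name)  # remove from the top-level namespace
    args.model = model
    return args
```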

Add new g-diffuser command "enhance"

Rescale the input image to a higher resolution and use inpainting with a constant mask of some opacity, effectively using SD for super-resolution. The same function could be aliased as a style transfer function, since it would do the same thing depending on opacity value and the prompt supplied.
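A rough sketch of preparing the "enhance" inputs as described (the helper, scale factor, and opacity value are assumptions; the inpainting call itself is omitted):

```python
from PIL import Image

def make_enhance_inputs(src, scale=2, opacity=96):
    """Upscale the source and build a constant-opacity mask so inpainting
    only partially re-imagines it, acting as super-resolution.
    opacity: 0 keeps the upscaled image as-is, 255 fully regenerates it."""
    big = src.resize((src.width * scale, src.height * scale), Image.LANCZOS)
    mask = Image.new("L", big.size, opacity)  # constant mask over the whole image
    return big, mask
```

Style transfer falls out of the same function: a higher opacity plus a different prompt re-imagines more of the image.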

Seeds are broken

Need to either submit a patch to diffusers or find a workaround

Fix prompt folder naming truncation

g_diffuser_lib.get_filename_from_prompt needs to detect if truncation is occurring and, if so, append a short hash of the entire prompt. This will prevent outputs from different prompts from going into the same folder without making the folder names excessively long.
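A sketch of the truncation-plus-hash idea (the length limit, sanitization, and hash width are assumptions, not the project's actual values):

```python
import hashlib
import re

MAX_FOLDER_LEN = 48  # assumed limit

def get_folder_from_prompt(prompt):
    """Sanitize the prompt for use as a folder name; if it must be
    truncated, append a short hash of the full prompt so distinct
    prompts still get distinct folders."""
    name = re.sub(r"[^\w\- ]", "", prompt).strip()
    if len(name) > MAX_FOLDER_LEN:
        digest = hashlib.sha1(prompt.encode("utf-8")).hexdigest()[:8]
        name = name[:MAX_FOLDER_LEN].rstrip() + "_" + digest
    return name
```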

Add fields to Command class

Command class should include a target_pipe field, and the command's status message should reflect whatever the target pipe is (important when mixed modalities come)

Also should have a used_pipe field filled out by the command server

Command class should have a used_args dictionary filled out by the command server with the complete list of all final used params, including any clipped params, adjusted resolution, un-changed default params, etc.
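The proposed fields could be sketched as a dataclass (field names from this issue; defaults and types are my assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class Command:
    """Sketch of the proposed Command fields."""
    target_pipe: str = ""   # pipe the command is aimed at (set by the caller)
    used_pipe: str = ""     # pipe actually used (filled by the command server)
    # complete final params: clipped params, adjusted resolution,
    # unchanged defaults, etc. (filled by the command server)
    used_args: dict = field(default_factory=dict)
```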

Change -x param in discord bot

Negative values should trigger repeat mode ad infinitum (or up to max repeat limit), until !stop is run.

Do not show repeated commands in the queue more than once.

Add new g-diffuser command "expand"

Augments the source image by automatically shifting / shrinking it into a larger frame with a generated mask, and uses in-painting to fill in the new area.

This command will be available in the CLI, discord bot, and http command server.
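Preparing the "expand" inputs might look like the sketch below (a centred placement with a hard-edged mask; the real command would presumably shift/shrink and soften the mask):

```python
from PIL import Image

def make_expand_inputs(src, expand=128):
    """Hypothetical sketch: place the source image centred in a larger
    frame and build a mask marking only the new border for in-painting."""
    big = Image.new("RGB", (src.width + 2 * expand, src.height + 2 * expand))
    big.paste(src, (expand, expand))
    mask = Image.new("L", big.size, 255)  # 255 = area to generate
    keep = Image.new("L", src.size, 0)    # 0 = keep the original pixels
    mask.paste(keep, (expand, expand))
    return big, mask
```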

Portable version?

Once installed on one PC, the root folder could be copied to another PC and it should work without the need to install anything.

OSError: It looks like the config file at 'J:/SD/models/stable-diffusion-v1-4.ckpt' is not a valid JSON file.

Clean installation; I got this error with the model I usually use with another repo.
Here's the full output:
Traceback (most recent call last):
  File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\configuration_utils.py", line 272, in get_config_dict
    config_dict = cls._dict_from_json_file(config_file)
  File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\configuration_utils.py", line 324, in _dict_from_json_file
    text = reader.read()
  File "J:\MINICONDA\envs\g_diffuser\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "J:\SD\g_diffuser_cli.py", line 206, in
    main()
  File "J:\SD\g_diffuser_cli.py", line 89, in main
    gdl.load_pipelines(args)
  File "J:\SD\g_diffuser_lib.py", line 639, in load_pipelines
    pipe = pipe_map[pipe_name].from_pretrained(
  File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\pipeline_utils.py", line 290, in from_pretrained
    config_dict = cls.get_config_dict(
  File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\configuration_utils.py", line 274, in get_config_dict
    raise EnvironmentError(f"It looks like the config file at '{config_file}' is not a valid JSON file.")
OSError: It looks like the config file at 'J:/SD/models/stable-diffusion-v1-4.ckpt' is not a valid JSON file.

music2music jam session

  1. Take a ~30 second backing track (or generate one with txt2music or whatever2music!)
  2. User listens to it through headphones while playing an instrument into a microphone
  3. Take the input from the microphone (Stream A) and combine it with the backing track (Stream B) into a single stream (Stream C)
  4. music2music with Stream C as the input, creating more of the same/similar backing track (Stream D)
  5. Either:
     a) if music2music can generate 1 second of music in under 1 second, just play Stream D live to the user's headphones
        i) continue feeding Stream D into step 3 and then step 4?
     b) if music2music cannot generate 1 second of music in under 1 second, append Stream D to the end of Stream A (The Song)
        i) play The Song to the audio output device while continuously generating new chunks (Stream E, F, G, etc.) and appending them to the end of The Song
  6. ???
  7. Virtual Live Improvisational Jam Session Band For People With No Friends!
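The slow path (5b) of the loop above can be sketched with plain lists standing in for audio buffers; `music2music` here is a stub for whatever generator would actually be used:

```python
def jam_session(backing_track, live_chunks, music2music):
    """Sketch of the 5b loop: mix each live chunk (Stream A) with the
    backing track (Stream B) into Stream C, generate a continuation
    (Stream D, E, F, ...), and append it to the growing Song."""
    song = list(backing_track)      # "The Song" starts as the backing track
    for chunk in live_chunks:       # microphone input, one chunk at a time
        mixed = [a + b for a, b in zip(chunk, backing_track)]  # Stream C
        song.extend(music2music(mixed))                        # append Stream D
    return song
```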
