
parlance-zz / g-diffuser-bot

278 stars · 9 watchers · 23 forks · 8.82 MB

Discord bot and Interface for Stable Diffusion

Home Page: https://www.g-diffuser.com

License: MIT License

Python 100.00%
discord-bot stable-diffusion diffusers ai-art artificial-intelligence generative-art image-generation img2img inpainting latent-diffusion

g-diffuser-bot's Introduction

Listed below is a collection of solo projects undertaken purely out of personal interest. I will be uploading some of these projects as a demonstration of my abilities to potential employers; at the time, however, I did not have the confidence that any of them were worth preserving. There may be other versions, or unfinished breaking changes, in what is published to my personal repository. If there is interest, I will look through my old files and try to find other versions to publish.

Pre-2004:

  • Simple real-time fluid simulation and software renderer (BlitzPlus)
  • Partially functional NES emulator: CPU, PPU, debugger, and basic memory mapper; able to display the Zelda title screen and run some homebrew demos, without sound (BlitzPlus, 6502 ASM) (https://github.com/parlance-zz/parlance-zz/tree/main/pnes)
  • 8kb "demo" real-time music synth with DSP, offline MIDI conversion tools (C++, OpenGL)
  • Multiplayer Xbox homebrew game (C++ with Xbox SDK / DirectX) (not my video; a user managed to get it running on the Xbox 360 through emulation: https://www.youtube.com/watch?v=YwC_8DD_GJg)
  • Unfinished clone of the Quake 3 game engine (before the source was public) on Xbox, with BSP and MD3 rendering and a native Xbox shader compiler for Q3 materials (C++, XDK / DirectX and shader assembly). At the end of development it was capable of loading and rendering any Q3 map with all Q3 shader features supported, rendering maps in 4x splitscreen with dozens of animated characters at 60 fps on the original Xbox. (https://github.com/parlance-zz/parlance-zz/tree/main/projectx)

Post-2004:

While working for my current employer:

  • Conversion tool from the Windows Server 2003 scheduled-tasks exported binary format to the Server 2008 XML format (C++)
  • Windows service for configurable wake-on-LAN proxy / broadcast (C++, WinPCAP)
  • CGI <-> Powershell interface for Microsoft IIS (Powershell)
  • Active Directory, Microsoft Exchange, SCCM, and OOB server management tools with web interface (Powershell)
  • Automated user / student change management tools (Powershell, XML, REST APIs)
  • Dynamically generated web phonebook, integrating Active Directory and Cisco Unified Communications (PHP, SQL, SOAP, Cisco AXL)
  • Key Module programming web app for Cisco Unified Communication environments (PHP, SOAP, AXL)
  • Powershell API and tools for Xerox Docushare (Powershell, SOAP)

Languages:

  • C++
  • C
  • Assembly
  • Python
  • C#
  • Powershell
  • PHP
  • SQL
  • Javascript
  • Java
  • Lua

Other Relevant Skills:

  • Math skills - Linear algebra, basic calculus, some complex analysis
  • Deep knowledge and experience with networking and common network protocols
  • Deep knowledge and experience with Windows client and server; OS internals and system administration
  • Project management and organization
  • Good written and oral communication skills

Caveats:

  • I do not have a computer science degree; I dropped out in second year for non-academic reasons
  • I do not have any other certifications related specifically to programming or coding; most of what I know is self-taught from an early age
  • Compared to other areas, my web dev experience (especially front end) is somewhat limited
  • At least for the time being I am not able to relocate; I can only accept remote positions

https://www.stablecabal.org


g-diffuser-bot's Issues

add output_filename=etc

Allow the user to specify output filenames, including the ability to insert values from arguments as part of the filename,
e.g. sample('prompt', output_filename=f"{prompt}.{seed}.{steps}")
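A minimal sketch of what this templating could look like (the helper name and sanitization rules are my assumptions, not the project's actual code):

```python
import re

def format_output_filename(template, **values):
    """Hypothetical helper: expand user-supplied placeholders such as
    "{prompt}.{seed}.{steps}" into a filesystem-safe filename."""
    name = template.format(**values)           # substitute argument values
    name = re.sub(r'[\\/:*?"<>|]', "_", name)  # strip characters invalid on Windows
    return name[:128]                          # keep the name a sane length

# example: format_output_filename("{prompt}.{seed}.{steps}",
#                                 prompt="a cat", seed=42, steps=50)
```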

Remote command server separation and clustering

Add “--remote” server option to command_server to allow accepting connections from non-localhost using a pre-shared secret token for auth.

In remote mode the out_attachments list will use URLs instead of local file paths

G_diffuser_bot.py should support this and download from those URLs in remote mode

You should be able to specify a list of nodes in your g_diffuser_bot_config.py and have the discord bot use all of those nodes, robustly distributing commands
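The pre-shared token check described above could be as simple as the sketch below (the constant name and config source are assumptions; only the constant-time comparison is the point):

```python
import hmac

# hypothetical value, assumed to come from g_diffuser_bot_config.py
PRE_SHARED_TOKEN = "change-me"

def authorize(request_token):
    """Compare the client's token against the shared secret in constant
    time, so remote connections can't probe the token byte by byte."""
    return hmac.compare_digest(request_token, PRE_SHARED_TOKEN)
```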

Better acknowledgement messages for discord bot

Alter ‘gimme a sec’ message to include acknowledgement of attached image (“Okay @lootsorrow, generating with unmasked image” or “generating with alpha masked image” or “generating with no image input”).

Notify user in response when they exceed param limits

Add bot command to show all default params and limits / ranges

feature idea: panoramic or multi-stage outpainting mode

    feature idea: panoramic or multi-stage outpainting mode

basically if you take a 512x512 image where one half is image and the right half is erased (like the ghibli shack pic you were using during testing) and outpaint a new half for it, it would be nice if there were an easy way (from inside the gallery viewer even?) to take that new half and place it in its own new image all the way on the left so that the right side is once again blank/erased, and outpaint again, eventually creating a panorama (could also be done vertically of course, or maybe even in multiple directions, but that would require more RAM?).
For double extra bonus credits, the gallery viewer should be able to take the original starting image and automatically append it to the left side of the outputs from the second stage (so the second stage would be generating a 512x512 and the gallery would be taking your original 256x512 chunk and glueing it to the side) so you can more easily/quickly see which new images best 'match' the shape or feel of the original.

Originally posted by @lootsorrow in #36 (comment)
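The shifting step described in the comment above could be sketched like this (a PIL-based assumption on my part, not anything the project implements):

```python
from PIL import Image

def shift_for_next_stage(outpainted, width=512, height=512):
    """Take the newly generated right half of an outpainted image, move it
    to the left of a fresh canvas, and leave the right half fully erased
    (alpha 0) so the next outpainting pass can extend the panorama."""
    canvas = Image.new("RGBA", (width, height), (0, 0, 0, 0))
    right_half = outpainted.crop((width // 2, 0, width, height))
    canvas.paste(right_half.convert("RGBA"), (0, 0))
    return canvas  # left half: previous output; right half: blank/erased
```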

eliminate txt2img pipeline

create an identity image/seed/noise for use with img2img/inpainting that functionally acts like txt2img, removing the need to load a separate txt2img pipeline
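One possible shape of the identity input (an assumption, not the project's actual code): a blank init image driven at strength 1.0, so the img2img pipeline discards it entirely and behaves like txt2img.

```python
from PIL import Image

def make_identity_init(width=512, height=512):
    """Blank mid-grey init image; with strength=1.0 an img2img pipeline
    replaces it completely, so the call behaves like txt2img."""
    return Image.new("RGB", (width, height), (127, 127, 127))

# hypothetical call shape, assuming a diffusers-style img2img pipeline:
# pipe(prompt, init_image=make_identity_init(), strength=1.0)
```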

[enhancement] RAM management

I noticed that 4 pipelines are loaded when the bot starts,
namely: diffuser, txt2img, img2img, img_inp.
Using the optimized mode, it took around 8.5 GB of memory to load the bot.

possible solution:

import gc
import torch

example_model = ExampleModel().cuda()

# drop the last reference, then force a collection pass
del example_model
gc.collect()

# cached blocks normally stay allocated until something takes their place
torch.cuda.empty_cache()

https://gist.github.com/ejmejm/1baeddbbe48f58dbced9c019c25ebf71

Command server robustness improvements

Have the command server check for cancellation after every sample to waste less time until diffusers pipes can actually be aborted

Prevent the command server from starting 2 commands simultaneously (as when discord bot is on multiple servers)

queue cmds when cmd server not ready in discord bot

auto restart cmd server if unresponsive

new admin cmd to restart cmd server

Cmd server should dynamically re-import g_diffuser_lib in DEBUG_MODE
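The first item above (checking for cancellation between samples) could look like this sketch; the event name and handler wiring are assumptions:

```python
import threading

# assumed to be set by the cancel/!stop handler
cancel_event = threading.Event()

def run_samples(num_samples, sample_fn):
    """Check for cancellation after every sample, so a long batch can be
    cut short even though an individual diffusers call can't be aborted."""
    results = []
    for i in range(num_samples):
        results.append(sample_fn(i))
        if cancel_event.is_set():
            break  # stop wasting time on the rest of the batch
    return results
```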

Quick restart for CLI

add a command to restart the CLI with optional parameter to change models in the process

Create custom gallery viewer for outputs

The new output system is finally complete.

Build a custom output gallery browser that slurps up all the json in the outputs path and can easily browse, sort, filter, organize / tag / save, and one-click delete images.
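The "slurp up all the json" part might look like the sketch below; the sidecar layout and the "time" field are assumptions about what the new output system writes:

```python
import json
from pathlib import Path

def sort_records(records):
    """Newest-first ordering; assumes each record carries a "time" field."""
    return sorted(records, key=lambda r: r.get("time", 0), reverse=True)

def load_gallery_index(outputs_path):
    """Hypothetical sketch: read every sidecar .json under the outputs path
    into a list of records the gallery can browse, sort, and filter."""
    records = []
    for meta_file in Path(outputs_path).glob("**/*.json"):
        with open(meta_file, "r", encoding="utf-8") as f:
            record = json.load(f)
        record["_meta_path"] = str(meta_file)  # remember the sidecar location
        records.append(record)
    return sort_records(records)
```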

start_interactive_cli.bat doesn't have working command history

pushd %0\..\
cmd /k "conda run -n g_diffuser --no-capture-output python g_diffuser_cli.py --interactive"

I've removed this file for now because it is extremely annoying to not have command history. The issue is due to a conda bug which can easily be reproduced (at least on Windows) by running:

conda run -n some_env --no-capture-output python

.. then trying to use the up arrow to browse command history.

add the ability to explicitly specify output folder when using CLI

outputfolder= : create a folder with the given name (if it doesn't exist) and place all outputs for the current batch into that folder. Allow for /?

Dynamic file name creation, where the user can specify which pieces of data are used to build the file name, e.g. filename= would create files named 00001_12_dd.mm.yy, 00002_12_dd.mm.yy, 00003_.., etc.
E.g. filename= → 00001_hh.mm.ss_dd.mm.yy
An optional -folder parameter would append the exact string value from filename= to the end of the output folder name.
E.g. outputfolder=Sexy Dogs
filename=
would create a file called “00001_9/19/22.png” in a folder called “Sexy Dogs”
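The folder creation plus sequential, timestamped numbering described above could be sketched as follows (function name, padding width, and timestamp format are assumptions):

```python
import os
from datetime import datetime

def next_output_path(output_folder, suffix=""):
    """Hypothetical sketch: create the batch output folder if needed and
    return the next zero-padded, timestamped .png filename inside it."""
    os.makedirs(output_folder, exist_ok=True)
    existing = [f for f in os.listdir(output_folder) if f.endswith(".png")]
    index = len(existing) + 1                         # simple sequential counter
    stamp = datetime.now().strftime("%H.%M.%S_%d.%m.%y")
    return os.path.join(output_folder, f"{index:05d}_{stamp}{suffix}.png")
```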

Important enhancements / bugfixes to outpainting

Is it possible to develop a latent space encoding / decoding for sparse non-linear data? (as opposed to dense linear data). If so, you could use diffusion models for things like text and tilemaps.

Try using the same techniques in _get_shaped_noise on the latent space representations of the src, noise, and masks, maybe try varying str or scale over steps for better annealing

Clean up DEBUG_MODE

cleanup debug printing, logging, exceptions, DEBUG_MODE

add global catch-all exception handler

move model args to sub-namespace in args

model_name
use_optimized
loaded_pipes
pipe_list

after this is done amend load_pipelines to save these globally, and amend get_samples to overwrite / re-attach these args to incoming args (with warning if any mismatch)
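A sketch of the sub-namespace move (not the project's actual code; the attribute list is taken from the issue above):

```python
from argparse import Namespace

MODEL_ARG_NAMES = ["model_name", "use_optimized", "loaded_pipes", "pipe_list"]

def move_model_args(args):
    """Move the model-related attributes into an args.model sub-namespace
    so load_pipelines/get_samples can save and re-attach them as a unit."""
    model = Namespace()
    for name in MODEL_ARG_NAMES:
        if hasattr(args, name):
            setattr(model, name, getattr(args, name))
            delattr(args, name)  # remove from the top-level namespace
    args.model = model
    return args
```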

Add new g-diffuser command "enhance"

Rescale the input image to a higher resolution and use inpainting with a constant mask of some opacity, effectively using SD for super-resolution. The same function could be aliased as a style transfer function, since it would do the same thing depending on opacity value and the prompt supplied.
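A rough sketch of preparing the "enhance" inputs as described (the helper, scale factor, and opacity value are assumptions; the inpainting call itself is omitted):

```python
from PIL import Image

def make_enhance_inputs(src, scale=2, opacity=96):
    """Upscale the source and build a constant-opacity mask so inpainting
    only partially re-imagines it, acting as super-resolution.
    opacity: 0 keeps the upscaled image as-is, 255 fully regenerates it."""
    big = src.resize((src.width * scale, src.height * scale), Image.LANCZOS)
    mask = Image.new("L", big.size, opacity)  # constant mask over the whole image
    return big, mask
```

Style transfer falls out of the same function: a higher opacity plus a different prompt re-imagines more of the image.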

Seeds are broken

Need to either submit a patch to diffusers or find a workaround

Fix prompt folder naming truncation

g_diffuser_lib.get_filename_from_prompt needs to detect if truncation is occurring and, if so, append a short hash of the entire prompt. This will prevent outputs from different prompts from going into the same folder without making the folder names excessively long.
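A sketch of the truncation-plus-hash idea (the length limit, sanitization, and hash width are assumptions, not the project's actual values):

```python
import hashlib
import re

MAX_FOLDER_LEN = 48  # assumed limit

def get_folder_from_prompt(prompt):
    """Sanitize the prompt for use as a folder name; if it must be
    truncated, append a short hash of the full prompt so distinct
    prompts still get distinct folders."""
    name = re.sub(r"[^\w\- ]", "", prompt).strip()
    if len(name) > MAX_FOLDER_LEN:
        digest = hashlib.sha1(prompt.encode("utf-8")).hexdigest()[:8]
        name = name[:MAX_FOLDER_LEN].rstrip() + "_" + digest
    return name
```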

Add fields to Command class

Command class should include a target_pipe field, and the command's status message should reflect whatever the target pipe is (important when mixed modalities come)

Also should have a used_pipe field filled out by the command server

Command class should have a used_args dictionary filled out by the command server with the complete list of all final used params, including any clipped params, adjusted resolution, un-changed default params, etc.
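The proposed fields could be sketched as a dataclass (field names from this issue; defaults and types are my assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class Command:
    """Sketch of the proposed Command fields."""
    target_pipe: str = ""   # pipe the command is aimed at (set by the caller)
    used_pipe: str = ""     # pipe actually used (filled by the command server)
    # complete final params: clipped params, adjusted resolution,
    # unchanged defaults, etc. (filled by the command server)
    used_args: dict = field(default_factory=dict)
```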

Change -x param in discord bot

Negative values should trigger repeat mode ad infinitum (or up to max repeat limit), until !stop is run.

Do not show repeated commands in the queue more than once.

Add new g-diffuser command "expand"

Augments the source image by automatically shifting / shrinking it into a larger frame with a generated mask, and uses in-painting to fill in the new area.

This command will be available in the CLI, discord bot, and http command server.
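Preparing the "expand" inputs might look like the sketch below (a centred placement with a hard-edged mask; the real command would presumably shift/shrink and soften the mask):

```python
from PIL import Image

def make_expand_inputs(src, expand=128):
    """Hypothetical sketch: place the source image centred in a larger
    frame and build a mask marking only the new border for in-painting."""
    big = Image.new("RGB", (src.width + 2 * expand, src.height + 2 * expand))
    big.paste(src, (expand, expand))
    mask = Image.new("L", big.size, 255)  # 255 = area to generate
    keep = Image.new("L", src.size, 0)    # 0 = keep the original pixels
    mask.paste(keep, (expand, expand))
    return big, mask
```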

Portable version?

Once installed on one PC, the root folder could be copied to another PC and it should work without the need to install anything.

OSError: It looks like the config file at 'J:/SD/models/stable-diffusion-v1-4.ckpt' is not a valid JSON file.

Clean installation; I got this error with the model I usually use with another repo.
Here's the full output:
Traceback (most recent call last):
  File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\configuration_utils.py", line 272, in get_config_dict
    config_dict = cls._dict_from_json_file(config_file)
  File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\configuration_utils.py", line 324, in _dict_from_json_file
    text = reader.read()
  File "J:\MINICONDA\envs\g_diffuser\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "J:\SD\g_diffuser_cli.py", line 206, in
    main()
  File "J:\SD\g_diffuser_cli.py", line 89, in main
    gdl.load_pipelines(args)
  File "J:\SD\g_diffuser_lib.py", line 639, in load_pipelines
    pipe = pipe_map[pipe_name].from_pretrained(
  File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\pipeline_utils.py", line 290, in from_pretrained
    config_dict = cls.get_config_dict(
  File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\configuration_utils.py", line 274, in get_config_dict
    raise EnvironmentError(f"It looks like the config file at '{config_file}' is not a valid JSON file.")
OSError: It looks like the config file at 'J:/SD/models/stable-diffusion-v1-4.ckpt' is not a valid JSON file.

music2music jam session

  1. Take a ~30 second backing track (or generate one with txt2music or whatever2music!)
  2. User listens to it through headphones while playing an instrument into a microphone
  3. Take the input from the microphone (Stream A) and combine it with the backing track (Stream B) into a single stream (Stream C)
  4. music2music with Stream C as the input, creating more of the same/similar backing track (Stream D)
  5. Either:
     a) if music2music can generate 1 second of music in under 1 second, just play Stream D live to the user's headphones
        i) continue feeding Stream D into step 3 and then step 4?
     b) if music2music cannot generate 1 second of music in under 1 second, append Stream D to the end of Stream A (The Song)
        i) play The Song to the audio output device while continuously generating new chunks (Stream E, F, G, etc.) and appending them to the end of The Song
  6. ???
  7. Virtual Live Improvisational Jam Session Band For People With No Friends!
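The slow path (5b) of the loop above can be sketched with plain lists standing in for audio buffers; `music2music` here is a stub for whatever generator would actually be used:

```python
def jam_session(backing_track, live_chunks, music2music):
    """Sketch of the 5b loop: mix each live chunk (Stream A) with the
    backing track (Stream B) into Stream C, generate a continuation
    (Stream D, E, F, ...), and append it to the growing Song."""
    song = list(backing_track)      # "The Song" starts as the backing track
    for chunk in live_chunks:       # microphone input, one chunk at a time
        mixed = [a + b for a, b in zip(chunk, backing_track)]  # Stream C
        song.extend(music2music(mixed))                        # append Stream D
    return song
```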
