
Mental Diffusion

Stable Diffusion command-line interface
For Debian-based Linux distributions
Powered by Diffusers

Version 0.7.7 alpha
Python 3.11 / 3.12
Torch 2.2.2 +cu121

Features
Installation
Help
Known Issues
FAQ

Recent changes

- Update config.json
- Remove true/false from boolean arguments (--arg is enough)
- Update websockets server and add node_module for node.js clients
- New curses terminal menu (supports arrow keys)
- Experimental client (webui) has been removed
- New python venv package installer
- Add support for python 3.12
- Update pytorch to 2.2.2+cu121
- Update all python dependencies
- Electron was removed in favor of a simple http address
- New --pipeline argument (txt2img, img2img, inpaint...)
- Fix latent preview for sdxl
- Simplified seed generation for maximum compatibility
- New progress callback
- New real-esrgan downloader with progress bar
- Bug fixes and general code cleanup

Features

  • Fast startup
  • Command-line interface
  • Websockets server
  • Curses terminal menu
  • SD 1.5, SDXL, SDXL-Turbo
  • VAE, TAESD, LoRA
  • Text-to-Image, Image-to-Image, Inpaint
  • Latent preview for SD/SDXL (bmp/webp)
  • Upscaler Real-ESRGAN x2/x4/x4anime
  • Read and write PNG with metadata
  • Optimized for low specs
  • CPU and GPU support
  • HTTP proxy support
  • JSON config file
  • Extras: ComfyUI Bridge for VS Code

Installation

  • Install Python 3.11 / 3.12
  • Install Python packages in a venv (see requirements.txt)
  • Terminal menu requirements: nano, node, xdg-open
git clone https://github.com/nimadez/mental-diffusion.git
source install.sh
nano src/config.json

Start headless

SD15: mdx.py -p "prompt" -c /sd15.safetensors -st 20 -g 7.5 -f img_{seed}
SDXL: mdx.py -p "prompt" -mode xl -c /sdxl.safetensors -w 1024 -h 1024 -st 30 -g 8.0 -f img_{seed}
Img2Img: mdx.py -p "prompt" -pipe img2img -i image.png -sr 0.5
Inpaint: mdx.py -p "prompt" -pipe inpaint -i image.png -m mask.png
Upscale: mdx.py -upx4 image.png
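
For scripted runs, the CLI can also be driven from Python. A minimal sketch, assuming the venv interpreter path used in this README and the repo's src/mdx.py (adjust both for your setup):

import os
import subprocess

PYTHON = os.path.expanduser("~/.venv/python3")  # venv interpreter, per this README
MDX = "src/mdx.py"

prompts = [
    "a watercolor lighthouse at dawn",
    "a cyberpunk alley in the rain",
]

for prompt in prompts:
    # {seed} in --filename is replaced by the actual seed at save time
    subprocess.run(
        [PYTHON, MDX, "-p", prompt, "-st", "20", "-g", "7.5", "-f", "img_{seed}"],
        check=True,
    )

Note that each invocation reloads the checkpoint; for repeats of a single configuration, --batch avoids the reload (see Help below).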

Start terminal menu

~/.venv/python3 src/mdx.py

Start server

~/.venv/python3 src/mdx.py -serv

Show preview

xdg-open http://localhost:port

Run node.js client

nano tests/ws-client.js && node tests/ws-client.js
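
A rough Python counterpart to the node client, using the websockets package, is sketched below; the payload is a placeholder, since the actual message schema is defined by the server (consult tests/ws-client.js for the real fields):

import asyncio
import json
import websockets  # pip install websockets

PORT = 8011  # placeholder; use the port from src/config.json

async def main():
    async with websockets.connect(f"ws://localhost:{PORT}") as ws:
        # placeholder payload; mirror the fields sent by tests/ws-client.js
        await ws.send(json.dumps({"prompt": "a red bicycle"}))
        async for message in ws:
            print(message)

asyncio.run(main())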
These models are downloaded as needed after launch:
openai/clip-vit-large-patch14 (hf-cache)
laion/CLIP-ViT-bigG-14-laion2B-39B-b160k (hf-cache)
madebyollin/taesd (hf-cache)
madebyollin/taesdxl (hf-cache)
RealESRGAN_x2plus.pth (optional, src/models/realesrgan)
RealESRGAN_x4plus.pth (optional, src/models/realesrgan)
RealESRGAN_x4plus_anime_6B.pth (optional, src/models/realesrgan)

Help

--help                     show this help message and exit

--server     -serv         start websockets server (port: config.json)
--upscaler   -upx4  str    /path-to-image.png, upscale image x4
--metadata   -meta  str    /path-to-image.png, extract metadata from PNG

--model      -mode  str    sd/xl, set checkpoint model type (def: config.json)
--pipeline   -pipe  str    txt2img/img2img/inpaint, define pipeline (def: txt2img)
--checkpoint -c     str    checkpoint .safetensors path (def: config.json)
--vae        -v     str    optional vae .safetensors path (def: null)
--lora       -l     str    optional lora .safetensors path (def: null)
--lorascale  -ls    float  0.0-1.0, lora scale (def: 1.0)
--scheduler  -sc    str    ddim, ddpm, lcm, pndm, euler_anc, euler, lms (def: config.json)
--prompt     -p     str    positive prompt text input (def: sample)
--negative   -n     str    negative prompt text input (def: empty)
--width      -w     int    width value must be divisible by 8 (def: config.json)
--height     -h     int    height value must be divisible by 8 (def: config.json)
--seed       -s     int    seed number, -1 to randomize (def: -1)
--steps      -st    int    steps from 1 to 100+ (def: 20)
--guidance   -g     float  0.0-20.0+, how closely linked to the prompt (def: 8.0)
--strength   -sr    float  0.0-1.0, how strongly the input image is transformed; 1.0 ignores it (def: 1.0)
--image      -i     str    PNG file path or base64 PNG (def: null)
--mask       -m     str    PNG file path or base64 PNG (def: null)
--base64     -64           do not save the image to a file, get base64 only
--filename   -f     str    filename prefix (no png extension)
--batch      -b     int    enter number of repeats to run in batch (def: 1)
--preview    -pv           enable latent preview (stepping is slower with preview enabled)
--upscale    -up    str    auto-upscale x2, x4, x4anime (def: null)

[http server requests]
curl http://localhost:port/preview --output preview.bmp
curl http://localhost:port/replay --output preview.webp
curl http://localhost:port/progress (return { step: int, timestep: int })
curl http://localhost:port/interrupt
curl http://localhost:port/config (return config.json)
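
These endpoints are easy to script. A small polling sketch using only the endpoints listed above (the port is a placeholder; use the value from src/config.json, and stop with Ctrl+C):

import json
import time
import urllib.request

PORT = 8011  # placeholder; use the port from src/config.json
BASE = f"http://localhost:{PORT}"

while True:
    with urllib.request.urlopen(f"{BASE}/progress") as r:
        data = json.loads(r.read())  # { "step": int, "timestep": int }
    print(f"step {data['step']}, timestep {data['timestep']}")
    time.sleep(1)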

* --server or --batch is recommended, because the checkpoint does not need to be reloaded between runs
* add "{seed}" to --filename; it is replaced by the actual seed at save time
* ~/ and $USER are accepted in file and directory paths

[config.json]
use_cpu = use cpu instead of gpu
http_proxy = bypass censorship (e.g. http://localhost:8118)
host = localhost, 192.168.1.10, 0.0.0.0
port = a valid port number
output = image output directory (e.g. ~/img_output)
onefile = whether to also keep the original image when auto-upscale is enabled
interrupt_save = whether to save the partial image after an interrupt
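
Putting those keys together, an illustrative config.json (values are examples only; per the Help text above, the file also stores defaults such as model type, checkpoint, scheduler, width and height, whose exact key names may differ):

{
    "use_cpu": false,
    "http_proxy": null,
    "host": "localhost",
    "port": 8011,
    "output": "~/img_output",
    "onefile": true,
    "interrupt_save": true
}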


Test Websockets Client

The experimental websockets client has been removed; see "tests/ws-client.js" for an example.

Test Latent Preview

Test LoRA + VAE


* Juggernaut Aftermath, TRCVAE, World of Origami

Test SDXL


* OpenDalleV1.1

Test SDXL-Turbo


* A cinematic shot of a baby racoon wearing an intricate italian priest robe.

Known Issues

:: Stepping is slower with preview enabled
Preview images use the BMP format, which has no compression.
Reminder: one way to speed up stepping is to set "pipe._guidance_scale" to 0.0 after ~40% of the steps.
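
With plain Diffusers, that trick looks roughly like the following (a sketch based on the Diffusers step-end callback API, not mdx.py's actual callback; "pipe" is an already-loaded pipeline):

# After 40% of the steps, drop classifier-free guidance to speed up stepping.
# prompt_embeds must be trimmed too, since CFG runs with a doubled batch.
def disable_cfg(pipe, step_index, timestep, callback_kwargs):
    if step_index == int(pipe.num_timesteps * 0.4):
        callback_kwargs["prompt_embeds"] = callback_kwargs["prompt_embeds"].chunk(2)[-1]
        pipe._guidance_scale = 0.0
    return callback_kwargs

image = pipe(
    "a red bicycle",
    callback_on_step_end=disable_cfg,
    callback_on_step_end_tensor_inputs=["prompt_embeds"],
).images[0]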

:: Interrupt operation does not work
You have to wait for the current step to finish;
the interrupt is applied at the end of each step.

:: The server does not start from the terminal menu
If you are running MD for the first time and the
huggingface cache is missing, start the server using:
~/.venv/python3 src/mdx.py -serv

:: CUDA out of memory error
The Linux nvidia-driver does not support system memory fallback.

FAQ

:: How to load SDXL with 3.5 GB VRAM?
SDXL can load with as little as 3.5 GB VRAM, but you need at least 16 GB of system RAM and a swap file/partition.

:: How to download HuggingFace models in a specific path?
Use symlink or export path:
ln -s /path-to-huggingface-cache ~/.cache/huggingface
export HF_HOME=/path-to-huggingface-cache

History

↑ Migration to linux environment
↑ Back to the roots (diffusers)
↑ Ported to VS Code
↑ Switch from Diffusers to ComfyUI
↑ Upgrade from sdkit to Diffusers
↑ Undiff renamed to Mental Diffusion
↑ Undiff started with "sdkit"
↑ Created for my personal use

"AI will take us back to the age of terminals."

License

Code released under the MIT license.


Issues

Python path broken under Linux

This line fails on Linux due to the path:

term.sendText(`${pathComfy}\\python_embeded\\python.exe -s ${pathComfy}/ComfyUI/main.py --enable-cors-header --preview-method auto ${args}`);

My ComfyUI install doesn't have a python_embeded folder. Ideally it should use the venv's Python. If the interpreter path could be specified via a setting, it would work for both Linux and Windows.

Issue with loading checkpoints list

I am using the ComfyUI Windows portable version (latest). The get_object_info endpoint returns bare checkpoint names without the checkpoints\\ prefix. As of now the checkpoints list is only loaded when that prefix is present, so for me the list is always empty.

BTW, love this interface. I think it's much better than ComfyBox and StableSwarmUI

Support for dynamic tags/inputs similar to StableSwarmUI

StableSwarmUI has a feature where you can prefix a node's title with SwarmUI: and it is automatically converted and made available as an input. I am assuming that the current tags feature works similarly, except that we need to edit the JSON workflow to tag the inputs. Would it be possible to do something similar to StableSwarmUI, where prefixing the node title with MD: or MentalDiffusion: automatically converts it to an input in the UI?

Eventually, though, we would need some sort of metadata support for each node (requiring changes in ComfyUI), and we could use that metadata to automatically recognise the inputs.
