jozsefszalma / intranet_image_generator

Generating images with diffusion models on a mobile device, with an intranet GPU box as backend

License: Other

Languages: Jupyter Notebook 24.30%, JavaScript 2.37%, TypeScript 21.97%, Ruby 7.29%, Java 22.51%, Objective-C 7.03%, Objective-C++ 3.48%, Python 11.05%
Topics: backend, fun, gpu-inference, huggingface-diffusers, image-generation, mobile-app, python, pytorch, react-native, rest-api, prompt-engineering, diffusion

Intranet Image Generator

I wanted to show my family what I do for a living, and what better way to make computer vision interesting than diffusion models?

I could have just shown them DALL-E 2, Midjourney, or one of the million mobile apps already built on SD. However, by building it myself I can run it for free and retain end-to-end control over all aspects, e.g. which model I use and the possibility of adding parental controls to the prompts.

So, I built:

  • a simple React Native mobile app as frontend, that takes a prompt as input and displays the generated images
  • a Python backend, with a Flask-based API and a diffusion model running inference on an RTX 3090 GPU, with plans to containerize using Docker

Work in progress!

How it works:

[Screenshot of the mobile client: scary dragon breathing fire]
[Screenshot of the mobile client: magical unicorn with wings]
[Screenshot of the mobile client: pirate ship at sea, black sails, stormy weather]

Set up:

  1. Environment variables on the backend (e.g. in a .env file)
  • HF_KEY: Your Hugging Face API key
  • IMG_DIR_WIN and IMG_DIR_DOCKER: Location to store the generated images
  • PROMPT_PREFIX and PROMPT_SUFFIX: Optional, if you want to prefix or suffix the prompt with anything (e.g. cartoonish, kid-friendly)
  • NEGATIVE_PROMPT: Optional, but should be used for parental controls (e.g. add "scary" to prevent convergence on scary images, the same with NSFW concepts, etc.)
  • MODEL_ID: Optional, the Hugging Face model ID; SD 2.1 is used if this is not defined
  2. Set a fixed LAN IP address on the machine running the backend and expose port 5000 to your intranet.

  3. Set the IP address of the backend in the mobile app under the kebab menu (look for ⋮ in the upper right corner).

  4. As of now, to get the mobile app running, you need to set up a React Native development environment, compile the app from source and load the .apk onto an Android device using developer mode.
    Here is a handy guide: https://reactnative.dev/docs/environment-setup?guide=native
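As a rough illustration of how the environment variables from step 1 might be consumed, here is a minimal Python sketch. It is not the project's actual backend code: `BackendConfig`, `load_config`, `generation_kwargs`, the `in_docker` flag, the comma used to join prefix and suffix, and the default model ID string are all assumptions made for this example. The `prompt` and `negative_prompt` keys are chosen to match the keyword arguments accepted by Stable Diffusion pipelines in Hugging Face diffusers.

```python
import os
from dataclasses import dataclass
from typing import Dict, Mapping, Optional

# Assumption: a Hugging Face ID for SD 2.1; the README itself only says "SD 2.1".
DEFAULT_MODEL_ID = "stabilityai/stable-diffusion-2-1"


@dataclass(frozen=True)
class BackendConfig:
    """Hypothetical container for the settings listed in step 1."""
    hf_key: str
    img_dir: str
    prompt_prefix: str
    prompt_suffix: str
    negative_prompt: Optional[str]
    model_id: str


def load_config(env: Optional[Mapping[str, str]] = None,
                in_docker: bool = False) -> BackendConfig:
    """Read the settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    # Assumption: one image directory variable is used per runtime.
    img_var = "IMG_DIR_DOCKER" if in_docker else "IMG_DIR_WIN"
    missing = [name for name in ("HF_KEY", img_var) if not env.get(name)]
    if missing:
        raise KeyError("missing required environment variables: " + ", ".join(missing))
    return BackendConfig(
        hf_key=env["HF_KEY"],
        img_dir=env[img_var],
        prompt_prefix=env.get("PROMPT_PREFIX", "").strip(),
        prompt_suffix=env.get("PROMPT_SUFFIX", "").strip(),
        negative_prompt=env.get("NEGATIVE_PROMPT") or None,
        model_id=env.get("MODEL_ID") or DEFAULT_MODEL_ID,
    )


def generation_kwargs(user_prompt: str, config: BackendConfig) -> Dict[str, str]:
    """Wrap the user's prompt with the optional prefix/suffix and attach the
    negative prompt, if one is configured."""
    parts = [config.prompt_prefix, user_prompt.strip(), config.prompt_suffix]
    kwargs = {"prompt": ", ".join(part for part in parts if part)}
    if config.negative_prompt:
        kwargs["negative_prompt"] = config.negative_prompt
    return kwargs
```

If the variables live in a .env file, a loader such as python-dotenv's `load_dotenv()` could populate `os.environ` before `load_config()` is called. Keeping the prompt wrapping in one small function means the parental-control settings are applied the same way to every request.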

Known issues and Disclaimers:

  • This is a hobby prototype that takes quite a bit of tech skills to get to work and is not production ready. You shouldn't use it if you don't understand the technology involved.
    Read the license terms, especially Section 5 – Disclaimer of Warranties and Limitation of Liability.
  • I couldn't test whether Docker works at all, as my NVIDIA drivers do not want to play with Docker under the Windows Subsystem for Linux (WSL)
  • The mobile app still has the default Android icon and is named "mobile_client"
  • Security is minimal: the backend makes no attempt to sanitize inputs or authenticate clients. It is only intended to be used behind a NAT router for demo purposes and is not ready to be exposed to the Internet.
  • I recommend setting up an extensive negative prompt as parental controls, in addition to using the Stability safety filter, and not letting kids play with diffusion models without adult supervision, as most of these models will produce age-inappropriate content with minimal effort and curiosity.
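
The "intranet only" caveat above could be backed by a small guard on the backend. The sketch below is an assumption for illustration and not part of the project: `is_intranet_client` is a hypothetical helper that uses Python's standard `ipaddress` module to accept only private or loopback addresses. It is no substitute for authentication or input sanitization.

```python
import ipaddress


def is_intranet_client(remote_addr: str) -> bool:
    """Return True only for private-range or loopback client addresses.

    Hypothetical helper: in a Flask handler the value would typically
    come from request.remote_addr.
    """
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Anything that does not parse as an IP address is rejected.
        return False
    return address.is_private or address.is_loopback
```

A request handler could call this first and refuse the request when it returns False, so that a misconfigured port forward does not silently serve clients outside the LAN.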

License:

Copyright 2023, Jozsef Szalma
Creative Commons Attribution-NonCommercial 4.0 International Public License
https://creativecommons.org/licenses/by-nc/4.0/legalcode
