This project is forked from laurauzcategui/fastapi_ml_stablediffusion.

Stable Diffusion Application using React with FastAPI

Author: Laura Uzcategui

Hello! I would like to share with you this small project: a Stable Diffusion app built with React and FastAPI 😄.

Let's dive in by introducing what the project is about and why I built it.

1. What is this project about?

In this project you can see how to build a basic React application that uses FastAPI as a backend to generate images with Stable Diffusion, via the Diffusers library from Hugging Face.

2. Why did I build it?

For the past couple of months I had been hearing a lot about Stable Diffusion, but I hadn't had the time to explore it myself. Building a React app was an opportunity to get to know it, and also to refresh my frontend skills, which were a bit out of date.

Additionally, I wanted to learn more about FastAPI. This project is built on top of the webinar given by the author of the library, Sebastián Ramírez.

3. Concepts

Before diving into the project, I'll give you an overview of the main concepts and technologies that have been used, and pointers to resources where you can learn more about it.

3.1. What is Stable Diffusion?

Stable Diffusion is an AI technique comprising a set of components that together perform image generation from text. What makes it so great is that it is openly available to everyone, unlike other models such as DALL·E.

In short, Stable Diffusion works as follows:

  • You write a text prompt describing the image you wish for.
  • The prompt is passed to the first component of the model, a text encoder, which generates token embedding vectors.
  • This representation is fed to a UNet together with a tensor of random noise; after a series of denoising steps it produces a processed image tensor.
  • The processed image tensor is then passed to an image decoder (the decoder of an autoencoder), which generates the final image.
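The steps above can be sketched as a pipeline of stages. Note that the function names, tensor shapes, and the "denoising" update below are illustrative stand-ins, not the real Diffusers internals:

```python
import numpy as np

rng = np.random.default_rng(42)

def encode_text(prompt: str) -> np.ndarray:
    """Stand-in text encoder: map each token to a 768-dim embedding."""
    tokens = prompt.split()
    return rng.normal(size=(len(tokens), 768))

def denoise(embeddings: np.ndarray, latents: np.ndarray, steps: int) -> np.ndarray:
    """Stand-in UNet loop: iteratively refine the noisy latent tensor."""
    for _ in range(steps):
        latents = latents - 0.1 * latents  # placeholder update, not a real scheduler
    return latents

def decode_image(latents: np.ndarray) -> np.ndarray:
    """Stand-in decoder: upscale 64x64 latents to a 512x512 RGB image."""
    image = np.repeat(np.repeat(latents, 8, axis=0), 8, axis=1)
    return np.clip(image, 0.0, 1.0)

embeddings = encode_text("a red racing car")
latents = rng.normal(size=(64, 64, 3))  # pure noise to start from
image = decode_image(denoise(embeddings, latents, steps=50))
print(image.shape)  # (512, 512, 3)
```

The real versions of these stages live inside the Diffusers pipeline; the point here is only the flow of data between them.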

3.1.1 Resources

3.2 React as Front-end

Perhaps you have already heard about React, in which case you can skip this section. If not, here is a short summary.

React is a JavaScript library for building UI. It was created by Facebook (aka Meta) around 2013.

As far as I could learn while working on this project, React works in a declarative way. One of the things that makes it cool is that you can declare components that have their own state and can later be reused across the application you are building.

Coming from a purely backend background, I can say it wasn't too difficult to pick up and apply to what I wanted to achieve: a simple UI that lets me write text and wait for the backend to generate an image.

3.2.1 Resources

3.3 FastAPI as backend

Here is where things became more interesting for me: the backend, and everything I discovered you can do with FastAPI.

As their website states:

FastAPI is a modern, fast (high-performance), web framework for building APIs with Python 3.7+ based on standard Python type hints.

What I like about FastAPI is the ability to create APIs quickly and without much hassle. I loved that I could define the routes and immediately try them out in the interactive API docs provided by Swagger UI.

In addition, you can define your data model as a class using Pydantic, inheriting from BaseModel.
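A minimal sketch of such a data model, assuming fields that match the generation parameters described later (the actual field names in the project's schemas.py may differ):

```python
from pydantic import BaseModel

# Illustrative request schema; the field names are assumptions,
# not necessarily those used in the project's schemas.py.
class GenerateRequest(BaseModel):
    prompt: str
    seed: int = 42
    guidance_scale: float = 7.5
    num_inference_steps: int = 50

req = GenerateRequest(prompt="A red racing car winning the formula-1")
print(req.seed)  # 42
```

Pydantic validates and coerces incoming JSON against these type hints, so an invalid request is rejected before your endpoint code runs.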

3.3.1 Resources

Project Structure

├── backend                          ----> Contains all the backend FastAPI work
│   ├── dev-requirements.txt
│   ├── main.py                      ----> Endpoint definitions
│   ├── requirements.txt
│   ├── run_uvicorn.py
│   ├── schemas.py                   ----> Define your data models here
│   └── services.py                  ----> Define all the heavy-load work here: in this
│                                          case, all the Hugging Face setup and the work
│                                          of generating the Stable Diffusion image
└── frontend
    ├── README.md
    ├── package-lock.json
    ├── package.json
    ├── public
    │   ├── index.html
    │   ├── manifest.json
    │   └── robots.txt
    └── src
        ├── App.jsx                  ----> Main App definition, where you embed your components
        ├── components
        │   ├── ErrorMessage.jsx
        │   ├── GenImage.jsx         ----> UI component definitions and the call to the
        │   │                              backend using the fetch API
        │   └── Header.jsx           ----> Minimal header definition
        └── index.js

Setup and run it :)

These are the steps to see it running :)

From your backend folder:

  1. You need a Hugging Face token. Check out how to create one here
  2. Once you have created your token, follow these steps:
cd backend 
touch .env 
  3. Open the .env file and add your token there like this:
HF_TOKEN=MY_FANCY_TOKEN

WARNING: Make sure you never commit your keys or token files :) Add this file to your .gitignore

  4. Create your virtual environment, activate it, and install the dependencies:
python -m venv venv 
source venv/bin/activate
pip install -r requirements.txt
  5. Start up your backend:
uvicorn main:app --port 8885
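A minimal sketch of how the backend can read the token at startup. The project may use a helper such as python-dotenv to load the .env file; plain os.environ is shown here, and the setdefault line is only a placeholder for the demo:

```python
import os

# Placeholder so the demo runs; in practice the value comes from backend/.env.
os.environ.setdefault("HF_TOKEN", "MY_FANCY_TOKEN")

hf_token = os.environ.get("HF_TOKEN")
if hf_token is None:
    raise RuntimeError("HF_TOKEN is not set; add it to backend/.env")
```

Failing fast like this at startup is nicer than a confusing authentication error on the first image request.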

From your frontend folder:

cd frontend
npm install 
npm start

How to Generate an Image?

Fill in the parameters as follows:

  • Prompt: Text to express the wish for your image.

    Example: A red racing car winning the formula-1

  • Seed: a number that seeds the random generator so that your output is deterministic.

  • Guidance scale: a float that, roughly speaking, pushes the generation to match the prompt more closely.

    Side note: if you are curious about this parameter, you can do a deep dive by reading the paper Classifier-free Diffusion Guidance

  • Number of Inference Steps: a number, usually between 50 and 100, indicating the amount of denoising steps. Keep in mind that the higher the number, the longer the inference (image generation) will take.
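The seed parameter deserves a quick illustration. The snippet below uses the standard-library random module as a stand-in for the model's noise generator: the same seed always yields the same starting "noise", which is why the generated image becomes reproducible:

```python
import random

def noise(seed: int, n: int = 4) -> list:
    """Draw n pseudo-random values from a deterministically seeded generator."""
    gen = random.Random(seed)
    return [gen.random() for _ in range(n)]

# Same seed -> same starting noise -> same final image.
print(noise(42) == noise(42))   # True
# Different seed -> different starting noise -> different image.
print(noise(42) == noise(123))  # False
```

The real pipeline seeds a tensor generator the same way, so re-running with identical parameters reproduces the image exactly.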

You should be able to see your frontend like this one below:

Demo example

Resources I've used to build the project

ToDo

  • Add an endpoint that returns a grid of images instead of one
  • Update the service backend with a new function to return multiple images.
  • Improve the UI by customizing Bulma with a different style.
