TTL text2img

TTL-text2img generates images from text descriptions with a diffusion model. Starting from noise, the model refines the image step by step, conditioned on the text, so that the result stays coherent with the prompt.
Demo on Hugging Face: TTL-text2img

Installation

To install and set up TTL-text2img:

git clone https://github.com/TranafLee/TTL-text2img.git
cd TTL-text2img/
# create a virtual environment to keep global install clean
python -m venv .venv 
source .venv/bin/activate
# install all necessary packages
pip install -r requirements.txt

Usage

Stage 1: Fine-tune the base model

The dataset folder must be located at exactly ./data. Images and text files must sit together in that folder, and each text file must share the base name of its image: the text for 001.jpg goes in 001.txt, and so on.
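Before starting a run, it can help to confirm that every image has a matching text file. The following is a minimal sketch of such a check, not part of the repository: the function name `find_unpaired` and the set of accepted image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: which extensions count as images; adjust to your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_unpaired(data_dir):
    """Return (images without a .txt, .txt files without an image) as sorted base names."""
    folder = Path(data_dir)
    image_stems = set()
    text_stems = set()
    for path in folder.iterdir():
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in IMAGE_EXTENSIONS:
            image_stems.add(path.stem)
        elif suffix == ".txt":
            text_stems.add(path.stem)
    # Set differences give the base names that appear on one side only.
    return sorted(image_stems - text_stems), sorted(text_stems - image_stems)
```

Calling `find_unpaired("./data")` returns two lists of base names; both are empty when the folder follows the naming rule above.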

# --train_upsample is a flag that takes no value (see Full Usage); leave it out to train the base model
python train.py \
  --data_dir './data' \
  --project_name 'base_tuning_wandb' \
  --batch_size 4 \
  --learning_rate 1e-04 \
  --side_x 64 \
  --side_y 64 \
  --resize_ratio 1.0 \
  --uncond_p 0.2 \
  --resume_ckpt 'ckpt_to_resume_from.pt' \
  --checkpoints_dir 'my_local_checkpoint_directory'

Stage 2: Fine-tune the super-resolution model

python train.py \
  --data_dir '/userdir/data/mscoco' \
  --train_upsample \
  --image_to_upsample './images/low_res_img.png' \
  --upscale_factor 4 \
  --side_x 64 \
  --side_y 64 \
  --uncond_p 0.0 \
  --resume_ckpt 'ckpt_to_resume_from.pt' \
  --checkpoints_dir 'my_local_checkpoint_directory'

Full Usage

usage: train.py [-h] 
                [--data_dir DATA_DIR] 
                [--batch_size BATCH_SIZE]
                [--learning_rate LEARNING_RATE]
                [--adam_weight_decay ADAM_WEIGHT_DECAY] 
                [--side_x SIDE_X]
                [--side_y SIDE_Y] 
                [--resize_ratio RESIZE_RATIO]
                [--uncond_p UNCOND_P] 
                [--train_upsample]
                [--resume_ckpt RESUME_CKPT]
                [--checkpoints_dir CHECKPOINTS_DIR] [--use_fp16]
                [--device DEVICE] 
                [--log_frequency LOG_FREQUENCY]
                [--freeze_transformer] 
                [--freeze_diffusion]
                [--project_name PROJECT_NAME] [--activation_checkpointing]
                [--use_captions] 
                [--epochs EPOCHS] 
                [--test_prompt TEST_PROMPT]
                [--test_batch_size TEST_BATCH_SIZE]
                [--test_guidance_scale TEST_GUIDANCE_SCALE] 
                [--use_webdataset]
                [--wds_image_key WDS_IMAGE_KEY]
                [--wds_caption_key WDS_CAPTION_KEY]
                [--wds_dataset_name WDS_DATASET_NAME] 
                [--seed SEED]
                [--cudnn_benchmark] 
                [--upscale_factor UPSCALE_FACTOR]

Reference

OpenAI/glide-text2im
OpenAI/guided-diffusion
