brade31919 / srgan-tensorflow

Tensorflow implementation of the SRGAN algorithm for single image super-resolution

License: MIT License

Python 96.06% Shell 3.94%
deep-learning tensorflow super-resolution generative-adversarial-network srgan vgg19 pretrained-models cnn tf-slim

srgan-tensorflow's Introduction

SRGAN-tensorflow

Introduction

This project is a TensorFlow implementation of the impressive work Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network.
The results are obtained following the same settings as the v5 edition of the paper on arXiv. However, due to limited resources, I trained my network on the RAISE dataset, which contains 8156 high-resolution images captured by good cameras. As the results below show, the performance is close to the results presented in the paper without using the ImageNet training set.
The results on BSD100, Set14, and Set5 will be reported later. The code is highly inspired by pix2pix-tensorflow.

Some results:

  • Comparison of some results from my implementation and the paper
[Image table with columns: Inputs, Our result, SRGAN result, Original]

Dependency

  • python2.7
  • tensorflow (tested on r1.0, r1.2)
  • Download and extract the pre-trained model from my google drive
  • Download the VGG19 weights from the TF-slim models
  • The code is tested on:
    • Ubuntu 14.04 LTS with CPU architecture x86_64 + Nvidia Titan X
    • Ubuntu 16.04 LTS with CPU architecture x86_64 + Nvidia 1080, 1080Ti or Titan X

Recommended

  • Ubuntu 16.04 with tensorflow GPU edition

Getting Started

Throughout this document, SRGAN-tensorflow_ROOT denotes the directory into which you cloned the repo. In the commands below, replace $SRGAN-tensorflow_ROOT with that path.

  • Run test using pre-trained model

# clone the repository from github
git clone https://github.com/brade31919/SRGAN-tensorflow.git
cd $SRGAN-tensorflow_ROOT/

# Download the pre-trained model from the google-drive
# Go to https://drive.google.com/a/gapp.nthu.edu.tw/uc?id=0BxRIhBA0x8lHNDJFVjJEQnZtcmc&export=download
# Download the pre-trained model to SRGAN-tensorflow/
tar xvf SRGAN_pre-trained.tar

# Run the test mode
sh test_SRGAN.sh

#The result can be viewed at $SRGAN-tensorflow_ROOT/result/images/

  • Run the inference using pre-trained model on your own image

cd $SRGAN-tensorflow_ROOT/

# Download the pre-trained model from the google-drive
# Go to https://drive.google.com/a/gapp.nthu.edu.tw/uc?id=0BxRIhBA0x8lHNDJFVjJEQnZtcmc&export=download
# Download the pre-trained model to SRGAN-tensorflow/
tar xvf SRGAN_pre-trained.tar

# Put your PNG images in your own directory
# For example
mkdir -p ./data/myImages
# put some images in it

Modify the path in inference_SRGAN.sh

#!/usr/bin/env bash
# Modify --input_dir_LR to point to your image directory
CUDA_VISIBLE_DEVICES=0 python main.py \
    --output_dir ./result/ \
    --summary_dir ./result/log/ \
    --mode inference \
    --is_training False \
    --task SRGAN \
    --input_dir_LR ./data/myImages/ \
    --num_resblock 16 \
    --perceptual_mode VGG54 \
    --pre_trained_model True \
    --checkpoint ./SRGAN_pre-trained/model-200000

# Run the inference mode
sh inference_SRGAN.sh

#The result can be viewed at $SRGAN-tensorflow_ROOT/result/images/

  • Run the training process

Data and checkpoint preparation

Running the training process is a little more complicated. Follow the steps below carefully!
Go to the project root directory. Download the VGG19 weights from the TF-slim models.

# make the directory to put the vgg19 pre-trained model
mkdir vgg19/
cd vgg19/
wget http://download.tensorflow.org/models/vgg_19_2016_08_28.tar.gz
tar xvf ./vgg_19_2016_08_28.tar.gz

Download the training dataset. The dataset contains the 8156 images from the RAISE dataset. I preprocessed all the TIFF images into PNG with 5x downscale to serve as the high-resolution images. The low-resolution images are obtained by 4x downscale of the high-resolution images.
Download the two files from the Google Drive links:
High-resolution images
Low-resolution images
Put the two .tar files in SRGAN/data/. Go to the project root (SRGAN/).
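
Because each low-resolution image is a 4x downscale of its high-resolution counterpart, a patch cut from the LR image corresponds to an HR patch that is four times larger on each side. The sketch below illustrates that pairing. The function name `paired_crop_boxes` is hypothetical, and it assumes that `--crop_size` refers to the LR patch size (so 24 pixels in LR map to 96 pixels in HR), which this README does not state explicitly.

```python
import random

SCALE = 4  # LR images are a 4x downscale of the HR images (from the README)


def paired_crop_boxes(lr_height, lr_width, crop_size=24, scale=SCALE, rng=random):
    """Pick a random LR crop and return the aligned (LR, HR) boxes.

    Each box is (top, left, height, width). The HR box covers the same
    scene region as the LR box, scaled by `scale`.
    """
    if crop_size > lr_height or crop_size > lr_width:
        raise ValueError("crop_size must fit inside the LR image")
    top = rng.randint(0, lr_height - crop_size)    # randint includes both ends
    left = rng.randint(0, lr_width - crop_size)
    lr_box = (top, left, crop_size, crop_size)
    hr_box = (top * scale, left * scale, crop_size * scale, crop_size * scale)
    return lr_box, hr_box


lr_box, hr_box = paired_crop_boxes(lr_height=120, lr_width=160)
print(lr_box, hr_box)  # the HR box is 96x96 whenever the LR crop is 24x24
```

Cropping both images with boxes derived from the same random offset keeps the LR input and the HR target aligned.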

Typically, we need to follow the training process in the paper:

  1. Train the SRResnet with 1000000 iterations
  2. [optional] Train the SRGAN with the weights from the generator of SRResnet for 500000 iterations using the MSE loss.
  3. Train the SRGAN with the weights from the generator and discriminator of SRGAN (MSE loss) for 200000 iterations using the VGG54 perceptual loss.
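
The three steps above form a chain in which each stage starts from the checkpoint written by the previous one. The sketch below lays that chain out as data. The helper names are hypothetical, the output directories are the ones used by the example scripts in this README, and the checkpoint naming `model-<iteration>` is an assumption inferred from paths such as `./SRGAN_pre-trained/model-200000`. It follows the list above literally, so check the `--checkpoint` value in your own script against it.

```python
import os

# (stage name, task, perceptual_mode, output_dir, max_iter), following the list above
STAGES = [
    ("SRResnet", "SRResnet", "MSE", "./experiment_SRResnet/", 1000000),
    ("SRGAN (MSE)", "SRGAN", "MSE", "./experiment_SRGAN_MSE/", 500000),
    ("SRGAN (VGG54)", "SRGAN", "VGG54", "./experiment_SRGAN_VGG54/", 200000),
]


def final_checkpoint(output_dir, max_iter):
    """Assumed name of a stage's last checkpoint: <output_dir>/model-<max_iter>."""
    return os.path.join(output_dir, "model-{}".format(max_iter))


def plan(stages):
    """Yield (name, checkpoint_to_load, checkpoint_produced) for each stage in order."""
    previous = None  # the first stage starts from scratch
    for name, _task, _mode, output_dir, max_iter in stages:
        produced = final_checkpoint(output_dir, max_iter)
        yield name, previous, produced
        previous = produced


for name, load_from, produced in plan(STAGES):
    print("{:14s} loads {!s:40s} writes {}".format(name, load_from, produced))
```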

Train SRResnet

Edit the train_SRResnet.sh

#!/usr/bin/env bash
# Set CUDA_VISIBLE_DEVICES correctly if you use a multi-GPU system.
# --output_dir / --summary_dir: where to put the checkpoints and logs. You can put them anywhere you like.
# --flip / --random_crop: online data augmentation methods.
# --input_dir_LR / --input_dir_HR: set the training data paths correctly.
# --perceptual_mode MSE: we use the MSE loss in SRResnet training.
# --queue_thread: CPU threads for the data provider. We suggest >4 to speed up the training.
CUDA_VISIBLE_DEVICES=0 python main.py \
    --output_dir ./experiment_SRResnet/ \
    --summary_dir ./experiment_SRResnet/log/ \
    --mode train \
    --is_training True \
    --task SRResnet \
    --batch_size 16 \
    --flip True \
    --random_crop True \
    --crop_size 24 \
    --input_dir_LR ./data/RAISE_LR/ \
    --input_dir_HR ./data/RAISE_HR/ \
    --num_resblock 16 \
    --name_queue_capacity 4096 \
    --image_queue_capacity 4096 \
    --perceptual_mode MSE \
    --queue_thread 12 \
    --ratio 0.001 \
    --learning_rate 0.0001 \
    --decay_step 500000 \
    --decay_rate 0.1 \
    --stair True \
    --beta 0.9 \
    --max_iter 1000000 \
    --save_freq 20000
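
To make the learning-rate flags concrete, here is a minimal sketch of the schedule they appear to describe. It assumes that `--learning_rate`, `--decay_step`, `--decay_rate`, and `--stair` follow the standard exponential-decay formula (the one implemented by TensorFlow's `tf.train.exponential_decay`). This README does not confirm that mapping, so treat the function as an illustration only.

```python
def decayed_learning_rate(step, learning_rate=0.0001, decay_step=500000,
                          decay_rate=0.1, stair=True):
    """Exponential decay: learning_rate * decay_rate ** (step / decay_step).

    With stair=True the exponent is floored, so the rate drops in discrete steps.
    """
    exponent = step // decay_step if stair else step / float(decay_step)
    return learning_rate * decay_rate ** exponent


for step in (0, 499999, 500000, 1000000):
    print(step, decayed_learning_rate(step))
```

Under these assumptions, the settings above keep the rate at 0.0001 for the first 500000 iterations and use a rate ten times smaller for the second half of the run.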

After checking the configuration, execute the script:

# Executing the script
cd $SRGAN-tensorflow_ROOT/
sh train_SRResnet.sh

Launch tensorboard to monitor the training process

# Launch the tensorboard
cd ./experiment_SRResnet/log/
tensorboard --logdir . 
# Now you can navigate to tensorboard in your browser

The training process in TensorBoard should look like this:

[Figures: PSNR, content loss]
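
For reference, PSNR is derived from the mean squared error between the output and the ground truth. The sketch below uses the standard definition on flat sequences of 8-bit pixel values. How the repository computes the value shown in TensorBoard (colour space, value range) is not described here, so the `max_value=255.0` default is an assumption.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / float(len(reference))
    if mse == 0:
        return float("inf")  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


hr = [52, 55, 61, 66, 70, 61, 64, 73]
sr = [54, 55, 60, 66, 69, 62, 64, 71]
print(round(psnr(hr, sr), 2))
```

A lower MSE gives a higher PSNR, which is why the PSNR curve rises as the content loss falls during MSE training.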

Train SRGAN with MSE loss

Edit the train_SRGAN.sh

#!/usr/bin/env bash
# --output_dir / --summary_dir: where to put the checkpoints and logs. You can put them anywhere you like.
# --perceptual_mode MSE: set the perceptual mode to MSE.
# --decay_step 250000: set the decay step to 250000.
# --max_iter 500000: set the max iteration to 500000.
# --pre_trained_model True: use the pre-trained model.
# --checkpoint: set the pre-trained model you want to load.
CUDA_VISIBLE_DEVICES=0 python main.py \
    --output_dir ./experiment_SRGAN_MSE/ \
    --summary_dir ./experiment_SRGAN_MSE/log/ \
    --mode train \
    --is_training True \
    --task SRGAN \
    --batch_size 16 \
    --flip True \
    --random_crop True \
    --crop_size 24 \
    --input_dir_LR ./data/RAISE_LR/ \
    --input_dir_HR ./data/RAISE_HR/ \
    --num_resblock 16 \
    --perceptual_mode MSE \
    --name_queue_capacity 4096 \
    --image_queue_capacity 4096 \
    --ratio 0.001 \
    --learning_rate 0.0001 \
    --decay_step 250000 \
    --decay_rate 0.1 \
    --stair True \
    --beta 0.9 \
    --max_iter 500000 \
    --queue_thread 10 \
    --vgg_scaling 0.0061 \
    --pre_trained_model True \
    --checkpoint ./experiment_SRGAN_MSE/model-500000

After checking the configuration, execute the script:

# Executing the script
cd $SRGAN-tensorflow_ROOT/
sh train_SRGAN.sh

Launch the tensorboard to monitor the training process

# Launch the tensorboard
cd ./experiment_SRGAN_MSE/log/
tensorboard --logdir . 
# Now you can navigate to tensorboard in your browser

The training process in TensorBoard should look like this:

[Figures: PSNR, content loss, adversarial loss, discriminator loss]
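
The four curves relate to each other roughly as follows. This sketch uses the loss formulation from the SRGAN paper on scalar discriminator outputs. It assumes that `--ratio 0.001` weights the adversarial term in the generator loss and that a small epsilon guards the logarithms. The function names and the `EPS` value are illustrative and not taken from the repository.

```python
import math

EPS = 1e-12  # keeps log() finite; the exact value is an assumption


def generator_loss(content_loss, d_fake, ratio=0.001):
    """content_loss + ratio * adversarial_loss, with adversarial_loss = -log D(G(x))."""
    adversarial_loss = -math.log(d_fake + EPS)
    return content_loss + ratio * adversarial_loss


def discriminator_loss(d_real, d_fake):
    """-(log D(y) + log(1 - D(G(x)))) for one real/fake pair of discriminator outputs."""
    return -(math.log(d_real + EPS) + math.log(1.0 - d_fake + EPS))


# d_real and d_fake are the discriminator's probabilities that an image is real
print(generator_loss(content_loss=0.02, d_fake=0.3))
print(discriminator_loss(d_real=0.8, d_fake=0.3))
```

With a ratio of 0.001 the content loss dominates the generator objective, so the adversarial term mostly shapes fine texture.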

Train SRGAN with VGG54 perceptual loss

Modify the train_SRGAN.sh

#!/usr/bin/env bash
# --output_dir / --summary_dir: where to put the checkpoints and logs. You can put them anywhere you like.
# --perceptual_mode VGG54: set the perceptual mode to VGG54.
# --decay_step 100000: set the decay step to 100000 to follow the settings in the paper.
# --max_iter 200000: set max_iter to 200000 to follow the settings in the paper.
# --checkpoint: load the weights of the model from the SRGAN with MSE loss.
CUDA_VISIBLE_DEVICES=0 python main.py \
    --output_dir ./experiment_SRGAN_VGG54/ \
    --summary_dir ./experiment_SRGAN_VGG54/log/ \
    --mode train \
    --is_training True \
    --task SRGAN \
    --batch_size 16 \
    --flip True \
    --random_crop True \
    --crop_size 24 \
    --input_dir_LR ./data/RAISE_LR/ \
    --input_dir_HR ./data/RAISE_HR/ \
    --num_resblock 16 \
    --perceptual_mode VGG54 \
    --name_queue_capacity 4096 \
    --image_queue_capacity 4096 \
    --ratio 0.001 \
    --learning_rate 0.0001 \
    --decay_step 100000 \
    --decay_rate 0.1 \
    --stair True \
    --beta 0.9 \
    --max_iter 200000 \
    --queue_thread 10 \
    --vgg_scaling 0.0061 \
    --pre_trained_model True \
    --checkpoint ./experiment_SRGAN_MSE/model-500000

After checking the configuration, execute the script:

# Executing the script
cd $SRGAN-tensorflow_ROOT/
sh train_SRGAN.sh

Launch tensorboard to monitor the training process

# Launch the tensorboard
cd ./experiment_SRGAN_VGG54/log/
tensorboard --logdir . 
# Now you can navigate to tensorboard in your browser

The training process in TensorBoard should look like this:

[Figures: PSNR, content loss, adversarial loss, discriminator loss]

More results on benchmarks

Coming soon!!!

Reference

srgan-tensorflow's People

Contributors

brade31919, geekan, omegacoleman, several27

srgan-tensorflow's Issues

Successfully ran test mode in MacOS, where to put Instructions?

Thank you for making this repo, it's really valuable.

Basically, I tried to use this repo on my MacBook Pro and, after a few bumps along the way,
I was able to run it successfully! (test mode, not training mode)

I ran this on macOS Mojave 10.14.2. This is a Mid 2014 MacBook Pro with a 2.6 GHz Intel Core i5 processor and 8 GB of 1600 MHz DDR3 memory, so it ran extremely slowly. I had to close all my running apps and wait about 10 - 20 minutes to enhance 4 images, and an hour to enhance my own image.

So my issue is that I think this can help someone, but I don't know where I should put this information.

Should I make a pull request to your Readme or should I just create a Medium article?


Here's what I did:

I know that this project is tested on Ubuntu but I wanted to try my luck anyway.

I created a conda environment (Miniconda) to install dependencies.

Instead of Python 2.7 I used the defaults of Anaconda which is:

Python 3.6.8 |Anaconda, Inc.| (default, Dec 29 2018, 19:04:46)

with TensorFlow version: '1.12.0'

I won't be able to install tensorflow-gpu (i.e. conda install -c anaconda tensorflow-gpu) because I am on OS X.

Essentially, this is what I did:

$ conda update conda
$ conda create -n enhance
$ source activate enhance
$ conda install tensorflow
$ conda install -c anaconda pillow
$ conda install nomkl
$ git clone https://github.com/brade31919/SRGAN-tensorflow.git

# Downloaded the pre-trained models from Google Drive to the root directory
# https://drive.google.com/uc?id=0BxRIhBA0x8lHNDJFVjJEQnZtcmc&export=download
$ tar xvf SRGAN_pre-trained.tar

I tried:

python main.py \
    --output_dir ./result/ \
    --summary_dir ./result/log/ \
    --mode inference \
    --is_training False \
    --task SRGAN \
    --batch_size 1 \
    --input_dir_LR ./data/test_LR/ \
    --input_dir_HR ./data/test_HR/ \
    --num_resblock 16 \
    --checkpoint ./SRGAN_pre-trained/model-200000 \
    --perceptual_mode VGG54 \
    --pre_trained_model True

I got:

***
 snip
 snip
***

End of configuration
Finish building the network
2019-01-20 15:01:31.548365: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-01-20 15:01:31.551312: I tensorflow/core/common_runtime/process_util.cc:69] Creating new thread pool with default inter op setting: 4. Tune using inter_op_parallelism_threads for best performance.
Loading weights from the pre-trained model
Evaluation starts!!
evaluate image img_001
evaluate image img_003
evaluate image img_002
evaluate image img_005

And the test is successful! Yay!

The graph couldn't be sorted in topological order

When training SRGAN_MSE, I got the error below before the training steps started. How should I correct this?
`
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Optimization starts!!!
2019-04-25 22:19:05.498392: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:704] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-04-25 22:19:05.573480: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:704] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-04-25 22:19:06.019904: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:704] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-04-25 22:19:06.074874: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:704] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-04-25 22:20:13.189917: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:704] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-04-25 22:20:13.261664: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:704] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-04-25 22:20:13.684844: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:704] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-04-25 22:20:13.736548: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:704] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.

`

Any training trick for SRGAN(VGG54)?

Following the training steps, both SRResNet and SRGAN(MSE) are easy to train, but SRGAN(VGG54) is very hard to train on my private dataset.

Is there any training trick here?

hi

could you tell me the PSNR of Set5 and Set14 (your SRResnet)!
thank you!

Can not run test or inference

Hi!

I tried both CPU and GPU for test_SRGAN.sh and inference_SRGAN.sh, but it freezes at the step where evaluation starts, in which the data would be fed to the generator to output the results. It took forever to run a single pass of the generator, and I had to kill it before it ran out of memory.

I am using the Amazon AWS Deep Learning AMI version 4 on a p2.xlarge instance. The TensorFlow version is 1.5.0. I tried both Python 2.7 and Python 3.6.

The memory usage of the process on my instance (61 GB total) keeps growing, and I cannot even complete a single pass for one test image (I used the test image in the repo).

Issue training SRGAN with perceptual loss

Hi, I got this issue while training SRGAN with perceptual loss

Traceback (most recent call last):
  File "main.py", line 231, in <module>
    Net = SRGAN(data.inputs, data.targets, FLAGS)
  File ".../SRGAN-tensorflow/lib/model.py", line 372, in SRGAN
    extracted_feature_gen = VGG19_slim(gen_output, FLAGS.perceptual_mode, reuse=False, scope=scope)
  File ".../SRGAN-tensorflow/lib/model.py", line 341, in VGG19_slim
    output = output[target_layer]
KeyError: 'vgg19_1/vgg_19/conv5/conv5_4'

Denoise task

Hi. I saw that the task parameter has a "denoise" option. The input and target in the denoise task have the same size. Do you still use the same generator for the denoise task? The generator for targets 4x larger than the input seems to have the 2x upscale layers, which are not suitable for the denoise task.

test time is too high

When testing, if I change the size of the input image, the test time on the GPU becomes very long. If I do not change the image size, the time from the second run onward is normal. The problem does not occur on the CPU. Is this a TensorFlow problem?

Can't resume training (The passed save_path is not a valid checkpoint: ./experiment_SRResnet/))

Hello.
Thanks for this great piece of software.
I have an issue.
I'm beginning the training with:
"python main.py --output_dir ./experiment_SRResnet/ --summary_dir ./experiment_SRResnet/log/ --mode train --is_training True --task SRResnet --batch_size 8 --flip True --random_crop True --crop_size 24 --input_dir_LR ./data/test_LR/ --input_dir_HR ./data/test_HR/ --num_resblock 16 --name_queue_capacity 6144 --image_queue_capacity 6144 --perceptual_mode MSE --queue_thread 16 --ratio 0.001 --learning_rate 0.0001 --decay_step 400000 --decay_rate 0.1 --stair False --beta 0.9 --max_iter 1000000 --save_freq 5000"

Training works great and TensorBoard updates well.

But when I want to resume with:
"python main.py --output_dir ./experiment_SRResnet/ --summary_dir ./experiment_SRResnet/log/ --mode train --is_training True --task SRResnet --batch_size 8 --flip True --random_crop True --crop_size 24 --input_dir_LR ./data/test_LR/ --input_dir_HR ./data/test_HR/ --num_resblock 16 --name_queue_capacity 6144 --image_queue_capacity 6144 --perceptual_mode MSE --queue_thread 16 --ratio 0.001 --learning_rate 0.0001 --decay_step 400000 --decay_rate 0.1 --stair False --beta 0.9 --max_iter 1000000 --save_freq 5000 --pre_trained_model false --checkpoint ./experiment_SRResnet/"
I get the following error: "ValueError: The passed save_path is not a valid checkpoint: ./experiment_SRResnet/"

Am I doing something wrong?

Thanks
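
One thing that stands out in the command above: `--checkpoint` is given a directory, whereas the README examples pass a checkpoint prefix such as `./SRGAN_pre-trained/model-200000`. The sketch below finds the newest prefix in an output directory using only the standard library. The function name is hypothetical, and the `model-<step>.index` file naming is an assumption about how the checkpoints are laid out on disk. TensorFlow's own `tf.train.latest_checkpoint(checkpoint_dir)` serves the same purpose.

```python
import os
import re


def latest_checkpoint_prefix(directory, basename="model"):
    """Return '<directory>/<basename>-<step>' for the highest step found, or None.

    Looks for files named '<basename>-<step>.index' (assumed checkpoint layout).
    """
    pattern = re.compile(r"^{}-(\d+)\.index$".format(re.escape(basename)))
    steps = []
    for filename in os.listdir(directory):
        match = pattern.match(filename)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return os.path.join(directory, "{}-{}".format(basename, max(steps)))


if os.path.isdir("./experiment_SRResnet/"):
    print(latest_checkpoint_prefix("./experiment_SRResnet/"))
```

The returned prefix, not the directory, is what would then be passed to `--checkpoint`.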

can't find variables in checkpoint

I want to add some units to the SRGAN task, and those units contain new variables in both the generator and the discriminator. When I train SRGAN from the pre-trained SRResnet model, which doesn't contain the new units, the error says the variables can't be found in the checkpoint. How can I handle this problem? Thanks.

Images not transforming correctly

For some reason I get odd results when I try to process images:
https://imgur.com/a/Q4wO3j9
Any ideas as to why? I use the pre-trained model with the TensorFlow CPU edition.

Edit: I have an Intel Movidius NCS at home. Can that be used for processing, or is it only used for training? I was wondering whether the CPU is causing the problem.

ParametricRelU after pixelshuffler

Hello,
In Generator, I am trying to understand why do we need parametric ReLU functions after each pixelshuffler layer? And what is the use of final Conv layer in generator? Can you please explain?

ValueError: too many values to unpack

Traceback (most recent call last):
  File "main.py", line 168, in <module>
    inference_data = inference_data_loader(FLAGS)
  File "/home/ubuntu/SRGAN-tensorflow/lib/model.py", line 210, in inference_data_loader
    image_LR = [preprocess_test(_) for _ in image_list_LR]
  File "/home/ubuntu/SRGAN-tensorflow/lib/model.py", line 202, in preprocess_test
    h, w = im.shape
ValueError: too many values to unpack

What's this error caused by, and how do I resolve it?
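
A possible explanation, offered as a guess and not a confirmed diagnosis: Python raises "too many values to unpack" when the right-hand side has more items than the two names on the left, so `im.shape` here probably has three entries (height, width, channels), for example an image with an alpha channel. The sketch below reproduces the failure with plain tuples. `height_width` is a hypothetical helper, not a function from the repository.

```python
def height_width(shape):
    """Return (height, width) from an image shape of rank 2 or 3."""
    if len(shape) not in (2, 3):
        raise ValueError("expected (h, w) or (h, w, channels), got {!r}".format(shape))
    return shape[0], shape[1]


gray_shape = (384, 512)       # two entries: `h, w = shape` works
rgba_shape = (384, 512, 4)    # three entries: `h, w = shape` raises ValueError

try:
    h, w = rgba_shape
except ValueError as error:
    print("plain unpacking failed:", error)

print(height_width(gray_shape), height_width(rgba_shape))
```

If this is the cause, converting the input images to plain RGB before running inference may avoid the error without touching the code.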

Reduce batch_size for inference

Hi @brade31919
Thanks for sharing your great work on SRGAN. Could you please explain how to use a smaller batch_size for inference? When I change it from 16 to 4, it seems to have no effect on memory usage, and the PC freezes once 8 GB of memory is full. This happens even though I only use a small 512x384 photo.

test_SRGAN.sh does not work

When running the test script I get the following error message:

Traceback (most recent call last):
  File "main.py", line 88, in <module>
    FLAGS.crop_size = None
[...]
AttributeError: Flag --crop_size must be specified.

RAM usage is really high.

Is there any way to limit the RAM usage of this program? I have only 4 GB of RAM; the program uses all of it, my PC freezes, and eventually it crashes before processing finishes.

Some png image didn't work well

First of all, thanks for your model!
I have a question: when I use my own PNG image, it doesn't work as well as the default images. I thought it might be related to the PNG format, since there are PNG-8 (indexed) and PNG-24 (RGB). I converted my image to both, but the result is still the same. Do you know how to convert a picture in another format into a usable PNG image?

Why do I need to provide the path to HR when testing the data?

Hi,

I might be missing something, but why do I need to provide the path to HR when testing the data (in test_SRGAN.sh as well as main.py)? I thought the use case for scoring is that, given a low-res picture, we can create a high-res one.

Thanks!

Why is this repo's result better than the SRGAN paper?

I tried this repo and found that the result of SRGAN is better than in the paper.
Not just better: it is abnormally good for a GAN algorithm.

Does this repo really have better performance than the paper?

Or is something missing from this repo?

What do you think, brade31919?

Data Normalization

Why do you normalize the low-resolution image to [0, 1] and the high-resolution image to [-1, 1]?

While feeding data into VGG network, should it be normalized to [0, 1]?
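
For readers unfamiliar with the two ranges mentioned in this question, the sketch below shows the usual mappings from 8-bit pixel values. The function names are hypothetical and the exact formulas used in the repository are not shown in this thread, so this only illustrates what "[0, 1]" and "[-1, 1]" normalization typically mean.

```python
def to_unit_range(pixel):
    """Map an 8-bit value in [0, 255] to [0, 1]."""
    return pixel / 255.0


def to_signed_range(pixel):
    """Map an 8-bit value in [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1.0


def from_signed_range(value):
    """Inverse of to_signed_range: map [-1, 1] back to [0, 255]."""
    return (value + 1.0) * 127.5


for pixel in (0, 128, 255):
    print(pixel, to_unit_range(pixel), to_signed_range(pixel))
```

Any network that receives one of these ranges has to be fed the same range at inference time, which is why the convention matters when passing images to a pre-trained VGG.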

KeyError: 'vgg19_1/vgg_19/conv5/conv5_4'

Traceback (most recent call last):
  File "main.py", line 231, in <module>
    Net = SRGAN(data.inputs, data.targets, FLAGS)
  File "/home/zhou/SRGAN-tensorflow/lib/model.py", line 372, in SRGAN
    extracted_feature_gen = VGG19_slim(gen_output, FLAGS.perceptual_mode, reuse=False, scope=scope)
  File "/home/zhou/SRGAN-tensorflow/lib/model.py", line 341, in VGG19_slim
    output = output[target_layer]
KeyError: 'vgg19_1/vgg_19/conv5/conv5_4'

segmentation fault

Hi, I copied my dataset to the ./data/train folder and then ran sh inference_SRGAN.sh, but the code did not work. It reported "Segmentation fault" and no other error. The details are below. Can you tell me what the problem may be? Thank you. @brade31919

/PycharmProjects/SRGAN-tensorflow$ sh inference_SRGAN.sh
/home/menghua/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
[Configurations]:
output_dir: <absl.flags._flag.Flag object at 0x7f0e9163d1d0>
summary_dir: <absl.flags._flag.Flag object at 0x7f0e50c65c18>
mode: <absl.flags._flag.Flag object at 0x7f0e50c65c88>
checkpoint: <absl.flags._flag.Flag object at 0x7f0e50c74048>
pre_trained_model: <absl.flags._flag.BooleanFlag object at 0x7f0e50c74128>
pre_trained_model_type: <absl.flags._flag.Flag object at 0x7f0e50c869b0>
is_training: <absl.flags._flag.BooleanFlag object at 0x7f0e50c86e48>
vgg_ckpt: <absl.flags._flag.Flag object at 0x7f0e50c22630>
task: <absl.flags._flag.Flag object at 0x7f0e50c226a0>
batch_size: <absl.flags._flag.Flag object at 0x7f0e50c22748>
input_dir_LR: <absl.flags._flag.Flag object at 0x7f0e50c227f0>
input_dir_HR: <absl.flags._flag.Flag object at 0x7f0e50c22860>
flip: <absl.flags._flag.BooleanFlag object at 0x7f0e50c22898>
random_crop: <absl.flags._flag.BooleanFlag object at 0x7f0e50c22908>
crop_size: <absl.flags._flag.Flag object at 0x7f0e50c229b0>
name_queue_capacity: <absl.flags._flag.Flag object at 0x7f0e50c22a58>
image_queue_capacity: <absl.flags._flag.Flag object at 0x7f0e50c22b00>
queue_thread: <absl.flags._flag.Flag object at 0x7f0e50c22ba8>
num_resblock: <absl.flags._flag.Flag object at 0x7f0e50c22c50>
perceptual_mode: <absl.flags._flag.Flag object at 0x7f0e50c22cf8>
EPS: <absl.flags._flag.Flag object at 0x7f0e50c22dd8>
ratio: <absl.flags._flag.Flag object at 0x7f0e50c22e80>
vgg_scaling: <absl.flags._flag.Flag object at 0x7f0e50c22f28>
learning_rate: <absl.flags._flag.Flag object at 0x7f0e50c22f98>
decay_step: <absl.flags._flag.Flag object at 0x7f0e50c2b048>
decay_rate: <absl.flags._flag.Flag object at 0x7f0e50c2b0b8>
stair: <absl.flags._flag.BooleanFlag object at 0x7f0e50c2b0f0>
beta: <absl.flags._flag.Flag object at 0x7f0e50c2b1d0>
max_epoch: <absl.flags._flag.Flag object at 0x7f0e50c2b278>
max_iter: <absl.flags._flag.Flag object at 0x7f0e50c2b2e8>
display_freq: <absl.flags._flag.Flag object at 0x7f0e50c2b358>
summary_freq: <absl.flags._flag.Flag object at 0x7f0e50c2b400>
save_freq: <absl.flags._flag.Flag object at 0x7f0e50c2b4a8>
End of configuration
Finish building the network
2018-09-04 16:08:21.109872: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-09-04 16:08:21.114524: I tensorflow/core/common_runtime/process_util.cc:69] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
Loading weights from the pre-trained model
Evaluation starts!!
Segmentation fault

Flag --crop_size must be specified

While running sh test_SRGAN.sh I encountered the error below.
I didn't change anything after downloading the package from GitHub. Did I miss anything?

[Configurations]:
perceptual_mode: VGG54
EPS: 0.000000
is_training: False
display_freq: 20
pre_trained_model: True
vgg_scaling: 0.006100
num_resblock: 16
crop_size: 24
save_freq: 10000
name_queue_capacity: 2048
ratio: 0.001000
stair: False
max_epoch: None
queue_thread: 10
vgg_ckpt: ./vgg19/vgg_19.ckpt
checkpoint: ./SRGAN_pre-trained/model-200000
decay_rate: 0.100000
output_dir: ./result/
summary_dir: ./result/log/
input_dir_HR: ./data/test_HR/
decay_step: 500000
learning_rate: 0.000100
max_iter: 1000000
input_dir_LR: ./data/test_LR/
batch_size: 16
beta: 0.900000
pre_trained_model_type: SRResnet
task: SRGAN
image_queue_capacity: 2048
flip: True
summary_freq: 100
mode: test
random_crop: True
End of configuration
Traceback (most recent call last):
  File "main.py", line 88, in <module>
    FLAGS.crop_size = None
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/flags.py", line 66, in __setattr__
    self._assert_required(name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/flags.py", line 74, in _assert_required
    raise AttributeError('Flag --%s must be specified.' % flag_name)
AttributeError: Flag --crop_size must be specified.

Error while training

Ok, say I want to train SRGAN, not SRResnet, the checkpoint here points to ./experiment_SRGAN_MSE/model-500000 but it doesn't exist! Is it created when we finished training the SRResnet?

I got this error : KeyError: 'vgg19_1/vgg_19/conv5/conv5_4'
How do I resolve it?

Can I get output_image without target_image?

Hi. I'm Heidi.
Can I get output_image without target_image?

In main.py, the results are:

results = sess.run(save_fetch, feed_dict={inputs_raw: input_im, targets_raw: target_im, path_LR: path_lr, path_HR: path_hr})

When I do not know targets_raw, can I still get the output image?

[ERROR] input_dir_LR not found -- possible user error

So I have managed to get the test script running without dependency errors which is awesome for my standards but instead I get a directory error.
This is the terminal output. The test_SRGAN.sh script is UNmodified and the data/input* folders are present. I also extracted the weighted model with :
~/SRGAN-tensorflow_ROOT$ tar xvf vgg_19_2016_08_28.tar.gz

My complete folder structure is: /home/user/castle/SRGAN-tensorflow_ROOT/ .

This is my terminal output:

(venv) root@toor: ~ /SRGAN-tensorflow_ROOT$ sh test_SRGAN.sh
[Configurations]:
perceptual_mode: VGG54
EPS: 0.000000
is_training: False
display_freq: 20
pre_trained_model: False
vgg_scaling: 0.006100
num_resblock: 16
crop_size: 24
save_freq: 10000
name_queue_capacity: 2048
ratio: 0.001000
stair: False
max_epoch: None
queue_thread: 10
vgg_ckpt: ./vgg19/vgg_19.ckpt
checkpoint: None
decay_rate: 0.100000
output_dir: ./result/
summary_dir: ./result/log/
input_dir_HR: None
decay_step: 500000
learning_rate: 0.000100
max_iter: 1000000
input_dir_LR: None
batch_size: 16
beta: 0.900000
pre_trained_model_type: SRResnet
task: SRGAN
image_queue_capacity: 2048
flip: True
summary_freq: 100
mode: test
random_crop: True
End of configuration
Traceback (most recent call last):
File "main.py", line 81, in <module>
raise ValueError('The checkpoint file is needed to performing the test.')
ValueError: The checkpoint file is needed to performing the test.
test_SRGAN.sh: 9: test_SRGAN.sh: --input_dir_LR: not found
(venv) root@toor:~/SRGAN-tensorflow_ROOT$


I tried to manually start it with python main.py ++options as stated in the .sh script. Did not work.
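
The line `test_SRGAN.sh: 9: test_SRGAN.sh: --input_dir_LR: not found` suggests that the shell treated `--input_dir_LR` as a separate command, which happens when a line-continuation backslash on an earlier line is not the last character on that line. That would also be consistent with `checkpoint: None` and `input_dir_LR: None` in the printed configuration. One plausible cause is trailing whitespace, a comment, or Windows line endings after the backslash. The checker below is a heuristic sketch for spotting such lines, with a hypothetical function name.

```python
import re

# A backslash only continues the line when it is the very last character.
# Trailing spaces, a carriage return, or a "# comment" after it break the chain.
_BROKEN = re.compile(r"\\[ \t\r]+$|\\[ \t]+#.*$")


def broken_continuations(script_text):
    """Return 1-based numbers of lines whose trailing backslash is not last."""
    broken = []
    for number, line in enumerate(script_text.split("\n"), start=1):
        if _BROKEN.search(line):
            broken.append(number)
    return broken


example = "python main.py \\\n    --mode test \\ \n    --input_dir_LR ./data/test_LR/\n"
print(broken_continuations(example))
```

To check a real script, read it in binary-safe fashion (for example `open(path, newline="").read()`) so that carriage returns are preserved for the check.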

Resume training not working?

So I set up an instance with 16 cores, 32 GB of RAM, and a Tesla P100. Is there any way to increase training speed by editing the shell scripts for SRResNet and SRGAN? Right now the performance is the same as on my GTX 1060, which is unbearably slow. So far I've tried increasing batch_size to 48 and both queue capacities to 16384. Am I doing it wrong? Only 3 GB of RAM is used.
Update: I've reduced batch_size to 16 and increased the queue capacities to 32768, and the speed increased 1.5x, but it is still not what I expect from a Tesla P100.

Issue with perceptual loss

Hi, I have a question about using the perceptual loss: what kind of input data does the function VGG19_slim accept?

elif FLAGS.perceptual_mode == 'VGG22':
    with tf.name_scope('vgg19_1') as scope:
        extracted_feature_gen = VGG19_slim(gen_output, FLAGS.perceptual_mode, reuse=False, scope=scope)
    with tf.name_scope('vgg19_2') as scope:
        extracted_feature_target = VGG19_slim(targets, FLAGS.perceptual_mode, reuse=True, scope=scope)

In this part of the code, should I normalize gen_output and targets to [-1, 1], [0, 1], or [0, 255]? The code only tells me to resize the input to 224x224, but it doesn't state the expected range of the input data, and how the input is normalized seems important.

Looking forward to your response, thanks a lot!

RAM Issue

I'm interested in this project, so I tried it on Windows, but when I started the program my RAM usage spiked to 95% and my laptop crashed. I thought it was a Windows problem, so I tried Ubuntu and got the same issue.
My hardware: GTX 1060, 8 GB RAM, i7-7700HQ
My software: CUDA 9.0, Python 3.6, cuDNN 7 (supported by TensorFlow), TensorFlow 1.5
Any idea what I'm doing wrong? Or perhaps I should upgrade my RAM (hopefully not, because I'm a poor student :)

Retraining SRGAN

I wanted to know whether it would be a good idea to first train the model on the original training dataset and then use those weights and the selected hyperparameters to train on my own dataset. Or is there a way to fine-tune only the final layers to fit my data?

How to make this work on Windows. Please.

I want to use this software but I can't make it run on Windows.

Do I need to prepare something to make it work? As a fallback, I could run it in Ubuntu on VirtualBox. Will that work? What do I need to prepare after I finish installing Ubuntu on the virtual machine?

I don't want to pay for LetsEnhance.io if something free is available.

[GENERAL] Resume training?

As I am new to this and want to try it out, I wonder whether it is possible to pause and resume the training process.
I have read that training models can take a very long time, and I am not sure I can keep the training process running for, say, days. Is it possible to Ctrl-C out of the process and resume it later?
This question is quite general, as I think it applies to almost all DL training.
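
The configuration dump further down this page lists checkpoint, pre_trained_model, and save_freq flags, which suggests the scripts write checkpoints periodically; how they are restored is not shown here. The general pattern behind pause and resume can be sketched as below. This is a toy loop with a JSON file standing in for a model checkpoint, and every name in it is hypothetical rather than taken from the repository.

```python
import json
import os

def save_checkpoint(path, step, state):
    """Write the step counter and state, replacing the old file atomically."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)

def load_checkpoint(path):
    """Return (step, state); start from step 0 if no checkpoint exists."""
    if not os.path.exists(path):
        return 0, {"loss": None}
    with open(path) as f:
        data = json.load(f)
    return data["step"], data["state"]

def train(path, max_iter, save_freq, stop_at=None):
    """Toy training loop; stop_at simulates pressing Ctrl-C at that step."""
    step, state = load_checkpoint(path)
    while step < max_iter:
        step += 1
        state["loss"] = 1.0 / step  # stand-in for one optimizer update
        if step % save_freq == 0:
            save_checkpoint(path, step, state)
        if stop_at is not None and step == stop_at:
            break
    return step
```

An interrupted run loses only the steps since the last save: stopping at step 37 with save_freq=10 resumes from step 30.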

2x resolution instead of 4x

Hi,

First of all, thanks.
I'm getting good results for the 4x upscale factor.
However, I wanted to know what changes I would have to make to the model to prepare a 2x upscale model.

I know that one can always downscale. However, I'm hoping to get better performance (in terms of accuracy and speed) if I change the model to target a 2x upscale factor.

Thanks,
Shubham
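
In the SRGAN paper the generator reaches 4x through two sub-pixel (pixel shuffle) stages of 2x each, so a 2x model would in principle keep a single stage and be trained on low-resolution inputs at half the high-resolution size. Whether this repository's generator is laid out the same way is an assumption to check in lib/model.py. The NumPy sketch below only illustrates the shape arithmetic; in a real network a convolution precedes each shuffle to expand the channel count, and the function names here are hypothetical.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange an (H, W, C*r*r) array into (H*r, W*r, C)."""
    h, w, c_in = x.shape
    assert c_in % (r * r) == 0, "channel count must be divisible by r*r"
    c = c_in // (r * r)
    x = x.reshape(h, w, r, r, c)    # split channels into an r x r block
    x = x.transpose(0, 2, 1, 3, 4)  # interleave block rows and columns
    return x.reshape(h * r, w * r, c)

def upsample(x, num_stages, r=2):
    """Apply num_stages shuffle stages; total scale is r ** num_stages."""
    for _ in range(num_stages):
        x = pixel_shuffle(x, r)
    return x

features = np.zeros((24, 24, 64), dtype=np.float32)
four_x = upsample(features, num_stages=2)  # spatial size 96 x 96
two_x = upsample(features, num_stages=1)   # spatial size 48 x 48
```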

Training SRResNet succeeds on test_HR/LR (only 4 images) but fails on RAISE_HR/LR (I use the Google Colab platform)

Using TensorFlow backend.
[Configurations]:
output_dir: <absl.flags._flag.Flag object at 0x7fe00a84d9e8>
summary_dir: <absl.flags._flag.Flag object at 0x7fdfaa9dee10>
mode: <absl.flags._flag.Flag object at 0x7fdfaa9dee80>
checkpoint: <absl.flags._flag.Flag object at 0x7fdfaa9ed208>
pre_trained_model: <absl.flags._flag.BooleanFlag object at 0x7fdfaa9ed240>
pre_trained_model_type: <absl.flags._flag.Flag object at 0x7fdfaa84db38>
is_training: <absl.flags._flag.BooleanFlag object at 0x7fdfaa84db70>
vgg_ckpt: <absl.flags._flag.Flag object at 0x7fdfaa8716a0>
task: <absl.flags._flag.Flag object at 0x7fdfaa871710>
batch_size: <absl.flags._flag.Flag object at 0x7fdfaa8717b8>
input_dir_LR: <absl.flags._flag.Flag object at 0x7fdfaa871860>
input_dir_HR: <absl.flags._flag.Flag object at 0x7fdfaa8718d0>
flip: <absl.flags._flag.BooleanFlag object at 0x7fdfaa871908>
random_crop: <absl.flags._flag.BooleanFlag object at 0x7fdfaa871978>
crop_size: <absl.flags._flag.Flag object at 0x7fdfaa871a20>
name_queue_capacity: <absl.flags._flag.Flag object at 0x7fdfaa871ac8>
image_queue_capacity: <absl.flags._flag.Flag object at 0x7fdfaa871b70>
queue_thread: <absl.flags._flag.Flag object at 0x7fdfaa871c18>
num_resblock: <absl.flags._flag.Flag object at 0x7fdfaa871cc0>
perceptual_mode: <absl.flags._flag.Flag object at 0x7fdfaa871d68>
EPS: <absl.flags._flag.Flag object at 0x7fdfaa871e48>
ratio: <absl.flags._flag.Flag object at 0x7fdfaa871ef0>
vgg_scaling: <absl.flags._flag.Flag object at 0x7fdfaa871f98>
learning_rate: <absl.flags._flag.Flag object at 0x7fdfaa87b048>
decay_step: <absl.flags._flag.Flag object at 0x7fdfaa87b0b8>
decay_rate: <absl.flags._flag.Flag object at 0x7fdfaa87b128>
stair: <absl.flags._flag.BooleanFlag object at 0x7fdfaa87b160>
beta: <absl.flags._flag.Flag object at 0x7fdfaa87b240>
max_epoch: <absl.flags._flag.Flag object at 0x7fdfaa87b2e8>
max_iter: <absl.flags._flag.Flag object at 0x7fdfaa87b358>
display_freq: <absl.flags._flag.Flag object at 0x7fdfaa87b3c8>
summary_freq: <absl.flags._flag.Flag object at 0x7fdfaa87b470>
save_freq: <absl.flags._flag.Flag object at 0x7fdfaa87b518>
End of configuration
WARNING:tensorflow:From /content/drive/SRGAN/lib/model.py:56: slice_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.from_tensor_slices(tuple(tensor_list)).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs). If shuffle=False, omit the .shuffle(...).
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py:372: range_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.range(limit).shuffle(limit).repeat(num_epochs). If shuffle=False, omit the .shuffle(...).
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py:318: input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.from_tensor_slices(input_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs). If shuffle=False, omit the .shuffle(...).
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py:188: limit_epochs (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.from_tensors(tensor).repeat(num_epochs).
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py:197: QueueRunner.__init__ (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the tf.data module.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py:197: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the tf.data module.
WARNING:tensorflow:From /content/drive/SRGAN/lib/model.py:60: WholeFileReader.__init__ (from tensorflow.python.ops.io_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.map(tf.read_file).
[Config] Use random crop
[Config] Use random flip
WARNING:tensorflow:From /content/drive/SRGAN/lib/model.py:141: shuffle_batch (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.shuffle(min_after_dequeue).batch(batch_size).
Data count = 8156
WARNING:tensorflow:From /content/drive/SRGAN/lib/model.py:527: get_or_create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.get_or_create_global_step
Finish building the network!!!
WARNING:tensorflow:From /content/drive/SRGAN/main.py:320: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
2018-10-31 01:35:06.517446: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-10-31 01:35:06.517995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2018-10-31 01:35:06.518060: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-10-31 01:35:07.491238: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-10-31 01:35:07.491311: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-10-31 01:35:07.491337: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-10-31 01:35:07.491694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10758 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
Optimization starts!!!
Optimization starts 1!!!
2018-10-31 01:35:24.728873: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2018-10-31 01:35:24.823117: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.
2018-10-31 01:35:25.398261: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2018-10-31 01:35:25.449970: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.
Optimization starts 1!!!
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: RandomShuffleQueue '_1_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 2, current size 0)
[[{{node shuffle_batch}} = QueueDequeueManyV2[component_types=[DT_STRING, DT_STRING, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/supervisor.py", line 994, in managed_session
yield sess
File "/content/drive/SRGAN/main.py", line 369, in <module>
results = sess.run(fetches)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: RandomShuffleQueue '_1_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 2, current size 0)
[[node shuffle_batch (defined at /content/drive/SRGAN/lib/model.py:141) = QueueDequeueManyV2[component_types=[DT_STRING, DT_STRING, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

Caused by op 'shuffle_batch', defined at:
File "/content/drive/SRGAN/main.py", line 236, in <module>
data = data_loader(FLAGS)
File "/content/drive/SRGAN/lib/model.py", line 141, in data_loader
min_after_dequeue=FLAGS.image_queue_capacity, num_threads=FLAGS.queue_thread)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 306, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py", line 1344, in shuffle_batch
name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py", line 871, in _shuffle_batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 478, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3487, in queue_dequeue_many_v2
component_types=component_types, timeout_ms=timeout_ms, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()

OutOfRangeError (see above for traceback): RandomShuffleQueue '_1_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 2, current size 0)
[[node shuffle_batch (defined at /content/drive/SRGAN/lib/model.py:141) = QueueDequeueManyV2[component_types=[DT_STRING, DT_STRING, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/content/drive/SRGAN/main.py", line 395, in <module>
print('Optimization done!!!!!!!!!!!!')
File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/supervisor.py", line 1004, in managed_session
self.stop(close_summary_writer=close_summary_writer)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/supervisor.py", line 832, in stop
ignore_live_threads=ignore_live_threads)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
raise value
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/queue_runner_impl.py", line 257, in _run
enqueue_callable()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1257, in _single_operation_run
self._call_tf_sessionrun(None, {}, [], target_list, None)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected image (JPEG, PNG, or GIF), got empty file
[[{{node load_image/DecodePng_1}} = DecodePngchannels=3, dtype=DT_UINT8, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
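
The final exception in the log above says DecodePng received an empty file, which suggests that at least one file in the RAISE input directories is zero bytes or not a valid PNG. A minimal sketch for locating such files before training follows. It checks only the 8-byte PNG signature, so it will not catch files that are truncated later on, and the function name is hypothetical.

```python
import os

# Every valid PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def find_bad_pngs(directory):
    """Return sorted names of .png files that are empty or lack the PNG signature."""
    bad = []
    for name in sorted(os.listdir(directory)):
        if not name.lower().endswith(".png"):
            continue
        path = os.path.join(directory, name)
        with open(path, "rb") as f:
            header = f.read(len(PNG_SIGNATURE))
        if header != PNG_SIGNATURE:
            bad.append(name)
    return bad
```

Running it on both the directory passed as input_dir_LR and the one passed as input_dir_HR would show whether any file needs to be copied again or removed together with its counterpart.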

CUDA_ERROR_OUT_OF_MEMORY Failed to create session

Hello!
I am trying to run the test_SRGAN.sh but I am getting an error.
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

I am running TensorFlow 1.4.1 in a virtualenv.

Any ideas where it's coming from and how to fix it?

Thanks !
Full Error

E tensorflow/core/common_runtime/direct_session.cc:170] Internal: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY; total memory reported: 8499691520
Traceback (most recent call last):
  File "main.py", line 20, in <module>
    sess = tf.Session(config = config)
  File "/home/cybrak/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1482, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/cybrak/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 622, in __init__
    self._session = tf_session.TF_NewDeprecatedSession(opts, status)
  File "/home/cybrak/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

results are super blurry

I am trying to use SRGAN to increase the resolution of some artwork. However, the results I got are super blurry. It basically seems to be doing the opposite of what I expected: instead of a higher-resolution image, I got an image where all nearby colors are mixed together (see this: https://i.imgur.com/1AZpoiZ.png)

My config is as follows:

CUDA_VISIBLE_DEVICES=0,1 python main.py \
    --output_dir ./experiment_SRResnet/ \
    --summary_dir ./experiment_SRResnet/log/ \
    --mode train \
    --is_training True \
    --task SRResnet \
    --batch_size 2 \
    --flip True \
    --random_crop True \
    --crop_size 24 \
    --input_dir_LR ./data/img_lowres/ \
    --input_dir_HR ./data/img_highres/ \
    --num_resblock 16 \
    --name_queue_capacity 4096 \
    --image_queue_capacity 4096 \
    --perceptual_mode MSE \
    --queue_thread 4 \
    --ratio 0.001 \
    --learning_rate 0.0001 \
    --decay_step 400000 \
    --decay_rate 0.1 \
    --stair False \
    --beta 0.9 \
    --max_iter 1000000 \
    --save_freq 20000

I am so upset after training the model for so long (1M iterations) and getting this :( What might have gone wrong?

PS: my images are PNG files. I created the low-res versions by resizing the high-res ones to 128 x height (the high-res ones are around 800 x height), and I have 3000 images in total.
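
One thing worth checking in the setup described above: a low-res width of 128 against a high-res width of around 800 is a ratio of roughly 6.25, while other reports on this page describe a 4x upscale (500x500 in, 2000x2000 out). Assuming the network expects each high-res image to be exactly 4x its low-res counterpart, a sketch for flagging mismatched pairs is shown below. The function name and the input format are hypothetical, and whether a mismatch explains the blur is not established here.

```python
def scale_mismatches(pairs, factor=4):
    """Return names whose high-res size is not exactly factor x the low-res size.

    pairs: iterable of (name, (lr_width, lr_height), (hr_width, hr_height)).
    """
    bad = []
    for name, (lr_w, lr_h), (hr_w, hr_h) in pairs:
        if hr_w != lr_w * factor or hr_h != lr_h * factor:
            bad.append(name)
    return bad

# Illustrative sizes only: the first pair is exactly 4x, the second is 6.25x.
example_pairs = [
    ("a.png", (128, 96), (512, 384)),
    ("b.png", (128, 96), (800, 600)),
]
mismatched = scale_mismatches(example_pairs)
```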

Max input resolution?

I have everything set up correctly (as far as I can tell). I've been able to test on a few of my own small images with pretty good results.
However, when I tried an image with a 700x700 resolution, my CPU usage went up, indicating that everything was working; then, after a minute or so, the usage dropped and nothing was output, with no errors reported.

I tried lots of things, but the only way I got the image to work was to shrink it down to 500x500; then everything worked fine and it was upscaled to 2000x2000.

So my question is: is there a maximum input image resolution? If so, is it an actual limitation of the program, or is there a variable I can change to allow for a larger input?

great work

hi @brade31919
I just want to thank you very much for the best SRGAN project I have ever tried. You are doing excellent upscaling with excellent hallucinated texture detail. I don't know how your project has fewer stars than some weaker SRGAN projects that have over 250 stars and produce much worse results than your implementation. There is a new website, https://letsenhance.io/pricing, that offers a paid online image-upscaling service and has been getting a lot of attention; it can turn an SD image into a detailed high-res image. I tried most of the single image super-resolution projects on GitHub, and none of them could match the results of letsenhance until I tried your SRGAN project, and WOW: your SRGAN can produce results equal to those of the letsenhance site. In some cases your SRGAN does better, especially when hallucinating texture detail for trees, grass, and people: letsenhance gives very bad and scary results on human faces, while your SRGAN gives good, natural-looking faces.
Now I always use your SRGAN to enhance my SD images into excellent high-res images.

Congratulations, and thank you for this great work.

Compatibility With Tensorflow 1.4?

@brade31919, I am having trouble installing this project properly.

I tried to force TensorFlow 1.0 and 1.2 via sudo pip install tensorflow-gpu==1.0.0 and sudo pip install tensorflow-gpu==1.2

This resulted in an error:

tensorflow No module named slim

I also tried using nvidia-docker with the official tensorflow image, but I couldn't get it to work with SRGAN-tensorflow.

Dealing with CUDA and cuDNN versions is not fun: the latest CUDA version is 9.0, but TensorFlow 1.4 only supports CUDA 8.0. I couldn't manage to resolve the CUDA/cuDNN version issues either.

License

Hi. I used your library and it works great. Could you please add a license to the repository?
