huangzehao / caffe-vdsr

A Caffe-based implementation of very deep convolution network for image super-resolution

License: MIT License

MATLAB 28.30% Makefile 0.97% TeX 9.70% Python 7.67% CSS 0.23% JavaScript 0.03% C++ 16.58% Cuda 22.10% C 1.91% Shell 2.18% Protocol Buffer 10.32% M 0.01%
caffe super-resolution

caffe-vdsr's People

Contributors

huangzehao

caffe-vdsr's Issues

Can't reach the performance of your implementation

Hi, first of all, thanks for sharing your VDSR code.
I followed your code recently. My training data comes from 291 images (with data augmentation, scale factor 2). I used your 'generate_train.m' to generate 148288 41*41 patches in total. The net and deploy prototxt files are the same as yours. Training ran for about 73 epochs.
However, my model can't reach your performance on Set5 (your model being VDSR_170000.mat). Here are my results:
VDSR_Official 37.55
YOUR_VDSR_170000 37.42
MY_VDSR_170000 37.14
There is an obvious gap between my result and yours.
Can you help me figure out what is wrong with my implementation? Thanks a lot.

About learning rate

Hi, I have a question about the learning rate. I notice that you set the learning rate to 0.0001 for weights and 0.00001 for biases. The training result with these two values is pretty good. I am curious how you found these two values. Did you just keep trying, or do you have a method for setting learning rates? Could you describe what you did to find appropriate learning rates? Thanks.

About testing data in Train/

Hi, I see generate_test.m in your Train directory, and I found that it loads the Set5 data.
Was that data used for validation during training? If so, I don't understand why, because normally we do not use the same data for validation and testing.
Thanks!

data augmentation

Hi. First, thanks for your source code. I wonder how you do data augmentation to get 300000 samples. Could you explain this a little? Did you use real-time data augmentation during training, or did you preprocess the images, for example with affine transforms? I ask because I am struggling with real-time data augmentation in Caffe.

Data Preprocessing

Hi, I have questions about the training data. I used your generate_train.m to produce the training data file from the 291-image training set, but I only get 54144 samples. In the issue "Multi-Scale Implementation" you say you get about 748000 samples. Did you preprocess the training set (for example, rotating the pictures by 90, 180, and 270 degrees)?

Thanks

Need help for DRCN model

Hi, my implementation of the DRCN model does not perform better than VDSR. Can you help me check the net.prototxt?
My email address is [email protected]. Can you send me yours?

question about test

Thank you for sharing. Something goes wrong when I run "Demo_SR_Conv.m" in the test folder. The errors are listed below. Is there anything I need to set up?
Error in VDSR_Matconvnet (line 15)
convfea = vl_nnconv(im_y,weight{1},bias{1},'Pad',1);

Error in Demo_SR_Conv (line 49)
im_h_y = VDSR_Matconvnet(im_l_y, model,up_scale,use_cascade);

Test code in pyCaffe/C++

Does anyone have test code that does not use MatConvNet? I tried the Python interface and could not get good results. I get good results using MatConvNet and the given test code, but not with the Python interface. It seems I have done something wrong in the data preprocessing. It would be great if someone has already done this.

How to decide when to lower the learning rate automatically

Hi zehao,
Excellent work! I have a question. I trained the model for 100000 iterations without lowering the learning rate. Is there an automatic way to decide when to lower it? Thank you!
Besides, the original paper states, 'Input patch size is now equal to the size of the receptive field and images are divided into sub-images with no overlap.' In the code, the sub-images appear to overlap: the stride is 14 while the sub-image size is 41. Does this have any influence?
Thank you!
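
The overlap question comes down to the stride used when cropping sub-images. A minimal sketch, assuming square 41-pixel patches and hypothetical helpers `patch_origins` and `count_patches` (neither is part of the repository), compares the stride of 14 mentioned above with a non-overlapping stride of 41:

```python
def patch_origins(length, patch=41, stride=14):
    """Top-left coordinates of patches that fit fully along one axis."""
    return list(range(0, length - patch + 1, stride))


def count_patches(height, width, patch=41, stride=14):
    """Number of patches cropped from a height x width image."""
    rows = len(patch_origins(height, patch, stride))
    cols = len(patch_origins(width, patch, stride))
    return rows * cols


if __name__ == "__main__":
    # Illustrative 300 x 300 image (the size is an assumption, not from the repo).
    print(count_patches(300, 300, stride=14))  # overlapping patches
    print(count_patches(300, 300, stride=41))  # non-overlapping patches
```

For a 300 x 300 image this gives 361 overlapping patches versus 49 non-overlapping ones, so the smaller stride mainly acts as a way to get more (correlated) training samples from the same images.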

loss nan

Hello, I retrained the network with my own training set, but the loss output is always NaN. What could be the reason?

data preparation with python

Hi, I tried preparing the data with the cv2 resize function instead of MATLAB imresize. The training loss is not bad, but the PSNR is far worse than in the paper. Any idea why? Is it only due to the difference between OpenCV and MATLAB?

About GPU load

Could I ask you a question?
I trained a network with Caffe and then used it to process an image sequence. When I run one task, the GPU load is 65%, but when I run two independent tasks at the same time, the GPU load is only 70%. Each task then runs at almost half its original speed, that is, each task takes twice as long. CPU utilization is not high and there is plenty of memory, and no other tasks are using the GPU. The GPU is not fully loaded, yet processing slows down. Do you know why?

question about training time

Hi Huang,
I am going to train the x4 model using your VDSR training code and the 91-image dataset. How long will it take?
It has been running for about 16 hours.

Test Function at Caffe

Hi,
I'm trying to write the test function without MATLAB (which I do not have), but I cannot get good results. I have some questions about the testing code:

  1. While testing, you divide the image by 255.0 (so the image is in the range 0-1). Why do you do that? I did not see any such normalization when you create the training data. I tried adding this to my code, but it does not work in my case.
  2. When I predict the Y channel, its values range from -40 to 260, so they do not fit in uint8. After converting to uint8, I get some white/black spots. I haven't found any code in the repo that handles this case. Do you handle it in any way?

Here is my Python code for Caffe:

import sys

import cv2
import numpy as np
import caffe


class SuperResolutionNet(object):
    '''Upscale an image by running its Y channel through the network.'''

    def __init__(self, config):
        caffe.set_mode_cpu()
        self.network = caffe.Net(config.deploy, config.extract, caffe.TEST)

    def predictImage(self, image, resize):
        # Target size after upscaling
        h, w, c = image.shape
        hs = int(np.ceil(h * resize))
        ws = int(np.ceil(w * resize))

        # Convert to YCrCb and upscale all channels to the target size.
        # cv2.resize defaults to bilinear, so bicubic is requested explicitly.
        img_color = cv2.cvtColor(image, cv2.COLOR_BGR2YCrCb)
        im_data = cv2.resize(img_color, (ws, hs), interpolation=cv2.INTER_CUBIC)
        img_Y = im_data[:, :, 0]

        # Run the Y channel through the network. The blob layout is
        # (batch, channel, height, width); the (hs, ws) array broadcasts into it.
        self.network.blobs['data'].reshape(1, 1, hs, ws)
        self.network.blobs['data'].data[...] = img_Y.astype(np.float32)
        out = self.network.forward()

        # Replace the Y channel with the predicted one.
        # Note: the input is not divided by 255 and the output is not clipped
        # before the uint8 cast (see questions 1 and 2 above).
        im_data[:, :, 0] = out['sum'][0, 0].astype(np.uint8)

        # Convert back to BGR
        return cv2.cvtColor(im_data, cv2.COLOR_YCrCb2BGR)


if __name__ == "__main__":
    config = lambda: None  # used as a simple attribute container
    config.deploy = 'train_model/VDSR_deploy.prototxt'
    config.extract = 'train_model/VDSR_Adam.caffemodel'
    config.resize = 3.0
    spNet = SuperResolutionNet(config)
    image = cv2.imread(sys.argv[1])
    image_small = cv2.resize(image, None, fx=1.0 / config.resize, fy=1.0 / config.resize)
    image_resized = spNet.predictImage(image_small, config.resize)
    cv2.imwrite("resized_net.jpg", image_resized)

    # Bicubic baseline for comparison
    image_cubic = cv2.resize(image_small, None, fx=config.resize, fy=config.resize,
                             interpolation=cv2.INTER_CUBIC)
    cv2.imwrite("resized_cubic.jpg", image_cubic)
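
Regarding the two questions in this issue, here is a minimal NumPy sketch of scaling the Y channel before the forward pass and clipping the prediction before the `uint8` cast, so that out-of-range values such as -40 or 260 saturate instead of producing stray white/black pixels. The helper names are hypothetical, and whether the network expects inputs in [0, 1] depends on how the training data was generated, which this issue leaves open:

```python
import numpy as np


def to_unit_range(y_uint8):
    """Scale an 8-bit luma channel to float32 in [0, 1]."""
    return y_uint8.astype(np.float32) / 255.0


def to_uint8(y_pred, scale=255.0):
    """Rescale a float prediction, round, clip to [0, 255], and cast to uint8."""
    return np.clip(np.rint(y_pred * scale), 0, 255).astype(np.uint8)


if __name__ == "__main__":
    # Values taken from the range reported above (-40 to 260), already in 0-255 units.
    pred = np.array([-40.0, 0.0, 127.6, 260.0], dtype=np.float32)
    print(to_uint8(pred, scale=1.0))
```

If the network was fed inputs in [0, 1], `to_uint8(pred)` with the default `scale=255.0` would be the matching inverse step; with unscaled inputs, `scale=1.0` applies only the clipping.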

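Hello! I would like to ask a question.

I used more than 600,000 images, split equally among scales 2/3/4. Each HDF5 file contains training samples for all 3 scales. Why can I never match the model provided by the author, no matter how I train? On the Set5 images the results generally differ by about 0.3 to 0.4. Could you help me analyze the cause?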

Some questions about DEM(>255)

Thanks for your work.
I used the code to test my DEM image (Y channel only, values 3000-3500). It did not work at first. After I deleted the '255' in the code it ran, but the PSNR is really bad.
I thought this was because the training set did not match, so I spent half a month training with several kinds of datasets, but I could not get a result better than bicubic; the result is just the same as bicubic. I suspected a mistake in my code, but I failed to find and fix it.
Could you help me? Thank you.
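
For data whose values lie far outside 0 to 255, such as the 3000 to 3500 range described here, one option is to map the data to [0, 1] using its own minimum and maximum and to invert the mapping after prediction. This is only a sketch under that assumption; the helper names are hypothetical, and it does not guarantee that a model trained on natural images transfers to DEM data:

```python
import numpy as np


def normalize_range(data, lo, hi):
    """Map values from [lo, hi] to [0, 1]."""
    return (np.asarray(data, dtype=np.float64) - lo) / (hi - lo)


def denormalize_range(unit, lo, hi):
    """Map values from [0, 1] back to [lo, hi]."""
    return np.asarray(unit, dtype=np.float64) * (hi - lo) + lo


if __name__ == "__main__":
    dem = np.array([3000.0, 3250.0, 3500.0])
    unit = normalize_range(dem, 3000.0, 3500.0)
    print(unit)
    print(denormalize_range(unit, 3000.0, 3500.0))
```

When computing PSNR on such data, the peak value should match the data range used (here 500 after subtracting the minimum, or 1 in normalized units) rather than 255.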

Thanks and some questions

Hi Huang, thanks for sharing! I have some questions.
1. The relation between the training images and the images I want to reconstruct. If I want to reconstruct low-resolution images of human faces, do you think my training data should consist only of human face images, or should it cover as many image types as possible?
2. The patch size in the file "generate_train". You chose a 41*41 patch size because most papers use the 91 or 291 image sets, and those training images are all about 300*300. If my training images are all about 1000*1000, should I make the patch size larger? If I do, would a larger patch size learn more or fewer features?
3. About multi-scale. In the VDSR paper the authors say "Scale augmentation during training is a key technique to equip a network with super-resolution machines of multiple scales." I could not understand that, because FSRCNN also uses augmentation. I think VDSR is multi-scale because of the files "generate_train" and "generate_test", which both have a "for scale = 2:4" loop. So what is the most important factor in achieving multi-scale reconstruction?
4. The main reason VDSR is better than SRCNN. I have tried many things, but I still don't know the key technique of VDSR. What do you think is the most important change that gives VDSR a better PSNR than SRCNN? Maybe it is the "sum" at the end of the net, which makes it learn only the difference between the low- and high-resolution images? Or the deeper 20-layer net? (I tried 50 layers, but the reconstruction was only very slightly better.)
Sorry for asking so many questions ^-^

A display bug in Demo_SR_Conv.m ?

Hi @huangzehao, first of all, thanks for your work. I followed your README.md step by step to reproduce the results, but I found that the figure titles and the image contents do not correspond.

Operating system: Linux 4.2.0-27-generic x86_64
MATLAB version: 8.6.0.267246 (R2015b)
Java version: Java 1.7.0_60-b19 with Oracle

Running Test/Demo_SR_Conv.m, I got:

[image: question]

The output is shown below. The PSNR of the convolutional super-resolution result is clearly better than that of bicubic interpolation, but this does not match the visual result.

iter:1
Elapsed time is 1.637649 seconds.
sr_psnr: 29.954547 dB
bi_psnr: 24.038857 dB

So I guess the figure command in MATLAB does not behave as we expected. To test this, I wrote test_figure.m:

 clc;
 clear all;
 close all;

a= imread('./images/Set5/baby_GT.bmp');
b= imread('./images/Set5/bird_GT.bmp');
c= imread('./images/Set5/butterfly_GT.bmp');

% Test/Demo_SR_Conv.m Line 57-59
% figure;imshow(uint8(im_b));title('Bicubic Interpolation');
% figure;imshow(uint8(im_h));title('SR Reconstruction');
% figure;imshow(uint8(im_gt));title('Origin');

% following your approach; the display is not right
figure;imshow(a);title('baby');
figure;imshow(b);title('bird');
figure;imshow(c);title('butterfly');

[image: test_figure]

Finally I found the right way:

% right display code
imshow(a);title('baby');
figure,imshow(b),title('bird');
figure,imshow(c),title('butterfly');

[image: figure_right]

If I got something wrong, please let me know.

Error in VDSR_Matconvnet (line 15)

Hi, when I try to test the VDSR model using MATLAB, I get the following errors:
iter:1
Attempt to execute SCRIPT vl_nnconv as a function:
/local/home/share/xujinchang/project/learn_pytorch/caffe-vdsr-master/Test/matconvnet/matlab/vl_nnconv.m

Error in VDSR_Matconvnet (line 15)
convfea = vl_nnconv(im_y,weight{1},bias{1},'Pad',1);

Error in Demo_SR_Conv (line 43)
im_h_y = VDSR_Matconvnet(im_l_y, model,up_scale,use_cascade);

Thanks!

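Some questions about the test results

Hello! I have a question. When I run Demo_SR_Conv.m, the output image looks abnormal: it turns into 8 vertical strips, and every other strip (4 in total) is black and white. I am using VDSR_Official.mat with upscale=3.
Is this a problem with the network, or is some part of the code not set up correctly? Thanks!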

Train a model

Hello,
I used the training prototxt to train a model for scale factor 2 myself with Caffe. I didn't change anything, but the trained model shows low PSNR: the average PSNR on Set5 is just 36.19. Could you help me?

About the image normalize

Hi, can I ask whether you normalized your images (divided by 255) when training your model?
I didn't see that in any script under the Train directory, but I found it in Test/Demo_SR_Conv.m.
Thanks!

Inference

Hi,

Thanks for sharing your code. What is the best way to use the trained model to get the upsampled version of a new image?

Thanks!

Should the pad be zero when training?

Hi,
I noticed that pad is set to 1 in the prototxt. However, I think this will introduce errors during training.
For a 41x41 patch, the result is affected by the surrounding 43x43 pixels (assuming all kernels are 3x3).

Would it be better for the training sample pairs to be (43x43 low-resolution image --> 41x41 high-resolution image), with pad = 0 for the first conv layer and pad = 1 for the rest?
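
The arithmetic behind this question can be sketched as follows, assuming a stack of stride-1 3x3 convolutions (the 20-layer depth is taken from the discussion in these issues; the helper names are hypothetical). Without padding each layer shrinks the feature map by 2 pixels, with pad 1 the size is preserved, and the receptive field grows by 2 per layer:

```python
def conv_output_size(size, kernel=3, pad=1, stride=1):
    """Spatial output size of one convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1


def receptive_field(num_layers, kernel=3):
    """Receptive field of a stack of stride-1 convolutions with equal kernels."""
    return num_layers * (kernel - 1) + 1


if __name__ == "__main__":
    print(conv_output_size(43, pad=0))  # unpadded first layer: 43 -> 41
    print(conv_output_size(41, pad=1))  # padded layer keeps 41
    print(receptive_field(1))           # one 3x3 layer sees 3 pixels per axis
    print(receptive_field(20))          # 20 layers see 41 pixels per axis
```

Under these assumptions, a 41x41 patch matches the receptive field of the full 20-layer stack, while the 43x43 figure in the question corresponds to the context seen by a single unpadded layer.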

how to make the output of vl_simplenn has the same Matrix dimensions

function impred = runVDSR(net, imlow, gpu)
net2.layers = net.layers(1:end-1);
res2 = vl_simplenn(net2, imlow, []);
impred = res2(end).x;
impred = imlow+impred;
if gpu, impred = gather(impred); end

When I run the code above, I get the error shown below:

run testVDSR.m
255 255
295 295

Error using +
Matrix dimensions must agree.

Error in runVDSR (line 8)
impred = imlow+impred;

Error in VDSR (line 47)
impred = runVDSR(net, imlowy, gpu);

Error in testVDSR (line 13)
VDSR(data, SF, 'VDSR.mat', outRoute);

Error in run (line 63)
evalin('caller', [script ';']);

Can anyone help me out? imlow is the 255*255 input image, but the output is 295*295. Is there anything wrong with my code?

Thank you !
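
The 40-pixel difference (295 - 255) is consistent with each convolution enlarging its input by 2 pixels. A hedged sketch, assuming 20 stride-1 3x3 layers and that the converted model ended up with a pad of 2 instead of 1 (both are assumptions; the actual cause may differ), reproduces the reported sizes:

```python
def stacked_output_size(size, num_layers=20, kernel=3, pad=1):
    """Output size after num_layers stride-1 convolutions with the same pad."""
    for _ in range(num_layers):
        size = size + 2 * pad - kernel + 1
    return size


if __name__ == "__main__":
    print(stacked_output_size(255, pad=1))  # size preserved
    print(stacked_output_size(255, pad=2))  # grows by 2 per layer
```

If this is the cause, inspecting the padding stored in each convolution layer of `net.layers` would be one way to confirm it.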

loss during training

I am curious about the loss during training: what is the typical loss at convergence in your work? (I think you use 256x256 images for training?)

Many thanks.

Jianyu

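About parameter extraction

Hello! After reading your code, I don't know how to save the trained model parameters. Could you upload your code for this? I would be very grateful. Thanks!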

random white pixels in superres image

Hi,

I have successfully trained a model and the output images look good, with one exception. For some reason I get a lot of stray white pixels in dark areas of the image. I have attached an example. Some images only have a few of these stray pixels, while others (like this one) have many. Any idea what might be causing this?

Thanks!
[image: badzebra]

After finishing train

Thanks for sharing your code.

I finished training with Caffe, but I don't know how to test my data afterwards.
SRCNN provides MATLAB code for extracting the parameters from a caffemodel. Do I need to write my own code like that? How do you extract the parameters from your caffemodel?

Multi-Scale Implementation

Hi, guys! The code for the multi-scale implementation and data augmentation has been updated.
The model I trained yields performance similar to the original paper!

Data Augmentation

Hi,

I would like to know how you increased the images from 91 to 1638.

Any info will be highly appreciated.

Regards,

question about the learning rate

Dear huangzehao, thank you for sharing your code. I have one question about the learning rate. I understand that during training the learning rate should be changed when the error plateaus. I divide the learning rate by 10 when the error plateaus, but it does not seem to help at all: the error does not decrease. Did you run into a similar problem during training?
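
One common heuristic for automating this, shown here as a sketch rather than as what the repository does, is to divide the learning rate by 10 once the validation loss has not improved for a fixed number of checks. The class name and the `patience` and `min_delta` parameters are hypothetical:

```python
class PlateauSchedule:
    """Lower the learning rate when the monitored loss stops improving."""

    def __init__(self, lr, factor=0.1, patience=3, min_delta=1e-4):
        self.lr = lr
        self.factor = factor        # multiplier applied on a plateau
        self.patience = patience    # checks without improvement to tolerate
        self.min_delta = min_delta  # minimum decrease that counts as improvement
        self.best = float("inf")
        self.stale = 0

    def step(self, loss):
        """Record one validation loss and return the (possibly lowered) rate."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.stale = 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                self.lr *= self.factor
                self.stale = 0
        return self.lr


if __name__ == "__main__":
    schedule = PlateauSchedule(lr=1e-4, patience=2)
    for loss in [0.5, 0.4, 0.4, 0.4, 0.3]:
        print(loss, schedule.step(loss))
```

In this example the rate drops from 1e-4 to about 1e-5 at the fourth value, after two checks without improvement. With Caffe this logic would have to live outside the solver, for example by stopping, editing the solver's base rate, and resuming from a snapshot.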

sr_psnr less than bicubic psnr

I ran your Test code directly (Demo_SR_Conv), but I get the following result:

iter:1
Elapsed time is 0.805680 seconds.
sr_psnr: 11.556457 dB
bi_psnr: 24.038857 dB

And the sr_reconstructed image looks like this:

[image: figure]

Any recommendations would be appreciated.
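
For reference when comparing sr_psnr and bi_psnr values, PSNR for 8-bit images is conventionally computed from the mean squared error as 10 * log10(255^2 / MSE). A minimal NumPy sketch follows; the function name is hypothetical, and the repository's MATLAB code may crop borders or differ in other details:

```python
import numpy as np


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)


if __name__ == "__main__":
    ref = np.zeros((4, 4), dtype=np.uint8)
    off_by_one = np.ones((4, 4), dtype=np.uint8)
    print(psnr(ref, off_by_one))
```

A super-resolved PSNR far below the bicubic one, as reported here, usually points to a scaling or data-layout mismatch rather than a weak model, although the issue does not contain enough information to say which.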

Questions about parameters

Hello, author. Thanks for your work.

I have some questions about the parameters. The input size is 41x41 when you generate the training set, so after training you get a model based on 41x41 inputs. Can this model work well when reconstructing an image of a very different size, like 2000x2000? I trained a model using your code. Everything is the same except for the downsampling method (I did not use imresize). I found that when the image is small, like 100x100, I get better PSNR and SSIM than bicubic interpolation. As the image size increases, bicubic interpolation becomes better than VDSR. Do you think this is because of the input size? My other question: in the file VDSR_net_deploy.prototxt, the last two input_dim values are set to 256. I don't understand where this value comes from.

I'm looking forward to your reply. Thanks.

loss during training

First of all, thanks for your work.
I have trained for nearly 300000 iterations with the code you provided. The training images have been augmented to 5820, and train_num is 748608.
The test loss then converges to about 0.4. Is that reasonable?

About the clip_gradient parameter

Hi Huang,
I set clip_gradient = 0.1. During training, it always prints messages like this:
"
I0111 17:21:27.510517 4417 sgd_solver.cpp:92] Gradient clipping: scaling down gradients (L2 norm 1.03171 > 0.1) by scale factor 0.0969264
"
from beginning to end (the numbers vary).
According to the original paper, the authors used an adjustable gradient clipping parameter. I think your code differs from the original paper in this respect. Will that cause any problems?
By the way, your code still achieves good performance.
Thanks for sharing your code with us.
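
The log line above matches norm-based clipping, where all gradients are multiplied by clip_gradient / (L2 norm) whenever the norm exceeds the threshold. The sketch below reproduces the logged scale factor. The second function illustrates the adjustable variant described in the VDSR paper, which clips gradients to [-theta/lr, theta/lr]; treating it as an element-wise clamp is an assumption made for illustration, and both function names are hypothetical:

```python
def clip_scale(l2_norm, clip_gradient):
    """Factor applied to all gradients under norm-based clipping."""
    if l2_norm > clip_gradient:
        return clip_gradient / l2_norm
    return 1.0


def adjustable_clip(grad, theta, lr):
    """Clamp one gradient value to [-theta / lr, theta / lr]."""
    bound = theta / lr
    return max(-bound, min(bound, grad))


if __name__ == "__main__":
    # Values from the log line: L2 norm 1.03171, threshold 0.1
    print(clip_scale(1.03171, 0.1))
    # Illustrative values only: the bound widens as the learning rate shrinks
    print(adjustable_clip(5.0, theta=1.0, lr=0.5))
    print(adjustable_clip(5.0, theta=1.0, lr=0.1))
```

The practical difference is that a fixed norm threshold keeps the update size capped at a constant, while the adjustable bound loosens as the learning rate is lowered.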

End-2-end learning or residual learning

Hi, thank you for sharing this VDSR code. I have a question about your implementation. It seems the authors of the paper achieved their performance by learning the residual (HR minus interpolated) image. However, your implementation appears to use only the ground-truth images during training, and you achieve similar results. Have you tried using the residual image at all?
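
The two formulations are closely related. If the network ends in an element-wise sum of its input and the output of the last convolution (the "sum" layer mentioned in other issues here), then a Euclidean loss against the ground-truth image equals the loss of the convolutional branch against the residual. A toy NumPy sketch with random arrays standing in for patches (all names are hypothetical):

```python
import numpy as np


def end_to_end_loss(lr_up, branch_out, hr):
    """MSE between (input + branch output) and the ground truth."""
    return np.mean(((lr_up + branch_out) - hr) ** 2)


def residual_loss(lr_up, branch_out, hr):
    """MSE between the branch output and the residual (hr - input)."""
    return np.mean((branch_out - (hr - lr_up)) ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hr = rng.random((41, 41))                         # ground-truth patch
    lr_up = hr + 0.05 * rng.standard_normal((41, 41))  # stand-in for the interpolated input
    branch_out = rng.standard_normal((41, 41))        # arbitrary branch output
    print(end_to_end_loss(lr_up, branch_out, hr))
    print(residual_loss(lr_up, branch_out, hr))
```

Both values agree up to floating-point rounding, so with the skip connection in place, training on ground-truth labels already makes the branch learn the residual implicitly.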

Is there anything else I need to do during training?

First of all, thanks for your work. I ran the code with Caffe and got an iter_200000 model, then tested it with your MATLAB code. But it does not seem to match the performance of your 170000-iteration model. Are there any training tricks I need to apply during training (like decreasing the learning rate)?
P.S. The test loss decreases quickly and gets stuck at an average of about 0.185 after 30000 iterations.

Questions about usage

Hi Zehao,

I don't quite understand the usage instructions.

Place the "Train" folder into "($Caffe_Dir)/examples/", and rename "Train" to "VDSR"
What is ($Caffe_Dir)? Is it caffe or caffe2? I am a bit confused about your directory structures in the usage section.

  1. To train VDSR, run ./build/tools/caffe train --solver examples/VDSR/VDSR_solver.prototxt
    What is ./build/tools/caffe? Is caffe an executable? Where is this directory?

I thought that by "based on Caffe" you meant Python code written with Caffe, but I didn't find any Python files.

Thanks for your clarification.

about the data sample

Hi Zehao,
First of all, thank you for sharing the code. I have 5000 training images at 1920x1080. I did not run data_aug; I think the data is enough. But when I run generate_train.m, the computer crashes. What should I do? I hope to hear from you.
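
A rough estimate suggests why this can exhaust memory. The sketch assumes 41x41 patches with a stride of 14 (the values mentioned in other issues here), three scales, float32 storage, and that all data and label patches are held in memory before being written. All of these are assumptions about the script, and the function names are hypothetical:

```python
def patches_per_image(height, width, patch=41, stride=14):
    """Number of patches cropped from one image at one scale."""
    rows = max(0, (height - patch) // stride + 1)
    cols = max(0, (width - patch) // stride + 1)
    return rows * cols


def estimated_gib(num_images, height, width, scales=3,
                  bytes_per_value=4, patch=41, stride=14):
    """Memory in GiB for data plus label patches across all images and scales."""
    count = num_images * scales * patches_per_image(height, width, patch, stride)
    return 2 * count * patch * patch * bytes_per_value / 2 ** 30


if __name__ == "__main__":
    print(patches_per_image(1080, 1920))
    print(estimated_gib(5000, 1080, 1920))
```

Under these assumptions each 1920x1080 image yields 10125 patches per scale, and the full set needs on the order of 1900 GiB, so options such as using fewer images, a larger stride, or writing the HDF5 file in chunks would be worth considering.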
