
pradeeplam / anime-sketch-coloring-with-swish-gated-residual-unet


Implementation of paper which uses a swish-gated residual U-net to color line-art anime drawings

Python 98.22% Shell 1.78%

anime-sketch-coloring-with-swish-gated-residual-unet's People

Contributors

alexanderkoumis, pizzorni, pradeeplam, zamlz


anime-sketch-coloring-with-swish-gated-residual-unet's Issues

Open questions for authors

  • What batch size was used?
  • In the up layers, how do the swish blocks upsample: unpooling or conv2d transpose? (conv2d transpose)

Broken requirements.txt

I tried to install the requirements as-is and got this error:
[image]

After dropping the version of opencv-python and some other packages, the build failed:
[image]

Ubuntu 20.04 LTS
pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)

Update readme

Readme needs to incorporate:

  • Result images
  • Instructions on how to run predictions on new images using the evaluate.py script

The constructed network is inconsistent with the network in the paper!

In the paper's network structure diagram, the feature maps span 6 resolution levels, but model.py has only 5. In model.py, the feature map with 512 channels has the smallest resolution; it is still obtained by downsampling, and it has no horizontal connection like the ones feeding the right branch.
In addition, the final output of the right branch in model.py does not receive the output of the left branch's horizontal swish layer.

As a result, the left and right branches of the network are asymmetric and their connections are misaligned.
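
The level-count mismatch described above can be made concrete with a small sketch. It only lists the spatial size at each resolution level when every level halves the previous one. The function name `resolution_levels` and the 256-pixel input size are assumptions chosen for illustration, not values taken from model.py or the paper.

```python
def resolution_levels(input_size, num_levels):
    """List the feature-map side length at each level, halving once per level.

    Hypothetical helper: level 0 is the input resolution, and each later
    level is assumed to be one 2x downsampling of the previous one.
    """
    return [input_size // (2 ** level) for level in range(num_levels)]


# Six levels (as in the paper's diagram, per the issue) vs. five levels
# (as reported for model.py), for an assumed 256x256 input.
six_levels = resolution_levels(256, 6)
five_levels = resolution_levels(256, 5)
```

With five levels the smallest map is twice as large per side as with six, so a decoder built for one layout has one skip connection with no matching partner in the other.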

Select the best training model through the verification phase

I have an idea: perhaps a small validation set should be built so that the model can be evaluated after each training epoch, instead of simply choosing the last epoch or the epoch with the best training convergence. This may prevent over-training (over-fitting), which results in inconsistent colors within a region of the generated picture. For example, with the final training epoch, a person's hair shows multiple colors, and even the left and right sides of the clothing are not colored consistently.
The dataset given in the paper has a few dozen numbered pictures at the end for the validation phase; it seems to be about 60 sketches.
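
A minimal sketch of the selection step proposed above, assuming a validation loss has already been computed after each epoch. The function name `select_best_epoch` and the loss values are hypothetical; the repository's training script may not record validation losses at all, and a pixel-wise loss may not fully capture color consistency.

```python
def select_best_epoch(val_losses):
    """Return (epoch_index, loss) for the epoch with the lowest validation loss.

    `val_losses` is a list with one validation loss per epoch, in order.
    Ties are resolved in favor of the earliest epoch.
    """
    if not val_losses:
        raise ValueError("val_losses must contain at least one epoch")
    best_epoch = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_epoch, val_losses[best_epoch]


# Illustrative (made-up) losses: validation loss improves, then rises
# again as the model starts to over-fit.
example_losses = [0.90, 0.60, 0.50, 0.55, 0.70]
best_epoch, best_loss = select_best_epoch(example_losses)
```

In a training loop, the checkpoint saved at `best_epoch` would be kept instead of the one from the final epoch.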

Diagrams in paper don't seem consistent.

The two diagrams below don't seem consistent. I mentioned this in the Slack, but I'm putting it here so the conversation is easier to find.
[image]
[image]
Essentially, the SGB marked as (c) doesn't seem to match the SGB in the first image (brown dashed box).

Remove temporary hack that rounds all images to 128x128

If an input image's dimensions are not evenly divisible by 32 (the network halves the resolution 5 times), the concatenations will most likely fail because the tensor dimensions do not match. The referenced line works around this by resizing all images to 128x128. A better solution would be to recreate the dataset so that all images share a size divisible by 32, such as 224 or 256.
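
As an alternative to resizing (not what the issue itself proposes), an image could be padded up to the next multiple of 32. The sketch below only computes the padding amounts; the helper names `round_up` and `pad_amounts` are hypothetical and do not exist in the repository.

```python
def round_up(n, multiple=32):
    """Return the smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def pad_amounts(height, width, multiple=32):
    """Return ((top, bottom), (left, right)) padding for the given size.

    After padding, both dimensions are divisible by `multiple`. Any odd
    remainder goes to the bottom/right side.
    """
    pad_h = round_up(height, multiple) - height
    pad_w = round_up(width, multiple) - width
    return (pad_h // 2, pad_h - pad_h // 2), (pad_w // 2, pad_w - pad_w // 2)


# A 100x97 image needs to grow to 128x128 to survive five halvings.
example_padding = pad_amounts(100, 97)
```

The returned pairs follow the per-axis `(before, after)` layout, so they could be passed to an array padding routine, and the same amounts can be cropped from the network output afterwards.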
