denoising-autoencoder's Introduction

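Denoising AutoEncoder: Image Denoising with Autoencoders

Spring 2023, College of Computer Science, Chongqing University: Deep Learning course project, Task 8

0. Problem Description

In neural networks, modeling image data calls for specialized architectures. The best known is the convolutional neural network (CNN or ConvNet), which can also serve as the backbone of a convolutional autoencoder. An autoencoder is a neural network made of two sub-networks, an encoder and a decoder, connected through a latent space.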

1. Autoencoder Models

1.1 DAE: Denoising Autoencoder

The simplest encoder-decoder structure, built on an MLP.
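As a rough illustration of the MLP-based structure, the sketch below builds an encoder and a decoder in PyTorch. The class name `SimpleDAE`, the layer layout, and the activations are assumptions made for illustration and are not taken from models/DAE.py; the sizes 784, 400, and 20 are the default input_dim, hidden_dim, and output_dim listed in section 3.2.

```python
import torch
from torch import nn


class SimpleDAE(nn.Module):
    """Hypothetical MLP denoising autoencoder (not the repository's DAE.py)."""

    def __init__(self, input_dim=784, hidden_dim=400, output_dim=20):
        super().__init__()
        # Encoder: flattened image -> hidden layer -> latent code z
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, output_dim),
        )
        # Decoder: latent code z -> hidden layer -> reconstructed image
        self.decoder = nn.Sequential(
            nn.Linear(output_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
            nn.Sigmoid(),  # keep pixel values in [0, 1]
        )

    def forward(self, x):
        # Flatten (batch, 1, 28, 28) to (batch, 784) before encoding
        z = self.encoder(x.flatten(start_dim=1))
        return self.decoder(z)


model = SimpleDAE()
noisy_batch = torch.rand(8, 1, 28, 28)  # stand-in for noisy MNIST images
reconstruction = model(noisy_batch)     # shape: (8, 784)
```

During training, the reconstruction would be compared against the clean image rather than the noisy input, which is what makes the autoencoder a denoising one.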

1.2 DCAE: Convolutional Autoencoder

Replaces the MLP in the DAE with a convolutional neural network.
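One possible way to swap the MLP for convolutions is sketched below in PyTorch. The class name `SimpleDCAE`, the channel counts, and the kernel settings are assumptions for illustration, not the contents of models/DCAE.py. Strided convolutions shrink a 28x28 image to 7x7 feature maps, and transposed convolutions restore the original size.

```python
import torch
from torch import nn


class SimpleDCAE(nn.Module):
    """Hypothetical convolutional autoencoder (not the repository's DCAE.py)."""

    def __init__(self):
        super().__init__()
        # Encoder: 1x28x28 -> 16x14x14 -> 32x7x7
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Decoder: 32x7x7 -> 16x14x14 -> 1x28x28
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.Sigmoid(),  # keep pixel values in [0, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


dcae = SimpleDCAE()
images = torch.rand(4, 1, 28, 28)  # stand-in for noisy MNIST images
latent = dcae.encoder(images)      # shape: (4, 32, 7, 7)
restored = dcae(images)            # shape: (4, 1, 28, 28)
```

Unlike the MLP version, this sketch keeps the image in its 2D layout throughout, so no flattening step is needed.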

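1.3 VAE: Variational Autoencoder

Unlike a plain autoencoder, which describes the latent space with deterministic values, a VAE describes the latent space probabilistically. On top of what the DAE does, it can also generate new samples.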
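To make the probabilistic description concrete, the sketch below shows two pieces that a VAE with a diagonal Gaussian latent typically uses: the reparameterization trick for sampling z, and the KL divergence to a standard normal prior. It is written in plain Python for readability. The function names and the example values of `mu` and `logvar` are hypothetical, and the repository's VAE.py may be organized differently.

```python
import math
import random


def reparameterize(mu, logvar, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


# Hypothetical encoder outputs for a 3-dimensional latent space
mu = [0.5, -1.0, 0.0]
logvar = [0.0, -0.5, 0.2]

z = reparameterize(mu, logvar, random.Random(0))  # one latent sample
kl = kl_to_standard_normal(mu, logvar)            # regularization term
```

The KL term is zero only when the encoder outputs exactly match the prior (mu = 0, logvar = 0). Generation then amounts to drawing z from N(0, 1) and passing it through the decoder.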

2. Code Layout

─── Denoising_AE/
    └── data/ # datasets
        ├── MNIST
    └── checkpoints/ # pretrained models
        └── DAE.pth
        └── DCAE.pth
        └── VAE.pth
    └── models/ # network models
        └── DAE.py
        └── DCAE.py
        └── VAE.py
    └── results/ # experiment results
        └── VAE_test/
            ├── origin.png
            ├── noisy.png
            ├── reconstructed.png
        └── DAE_test
        └── DCAE_test
    └── scripts/ # training and testing scripts
        └── VAE/
            ├── train.sh
            ├── test.sh
        └── DAE
        └── DCAE
    └── train.py # training code
    └── predict.py # testing code
    └── scripts.ipynb 

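3. Running the Code

3.1 Downloading the Dataset

If you already have the MNIST dataset, place it in the /data/ folder. If not, set download to True on line 38 of train.py, and the dataset will be downloaded automatically into the /data/ folder.

3.2 Model Training

If you are using Colab or Huawei Notebook, you can run training from the scripts.ipynb file.

Run from the /Denoising_ae/ directory:

sh scripts/[model]/train.sh

Hyperparameters can be adjusted in scripts/[model]/train.sh. The adjustable hyperparameters are: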

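  • batch_size: batch size - default 32

  • lr: learning rate - default 1e-3

  • num_epochs: number of training epochs - default 10

  • input_dim: input dimension for DAE/VAE - default 784 (MNIST images are 28*28)

  • hidden_dim: hidden layer dimension for DAE/VAE - default 400

  • output_dim: dimension of the latent space z for DAE/VAE - default 20

  • ae_model_type: model type - default DAE, options [DAE, DCAE, VAE]

  • log_dir: output directory for TensorBoard log files - default ./logs

  • model_save_path: directory where model checkpoints are saved - default ./checkpoints

  • device: GPU selection - default '0' (GPU 0)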
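The hyperparameters above could be exposed on the command line roughly as follows. This is a minimal sketch that assumes train.py parses its options with `argparse` and that each flag carries the same name as the hyperparameter; the helper `build_parser` and the argument types are hypothetical, and the actual script may differ.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the hyperparameters listed above."""
    parser = argparse.ArgumentParser(description="Train a denoising autoencoder")
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--num_epochs", type=int, default=10)
    parser.add_argument("--input_dim", type=int, default=784)   # 28 * 28 pixels
    parser.add_argument("--hidden_dim", type=int, default=400)
    parser.add_argument("--output_dim", type=int, default=20)   # latent z size
    parser.add_argument("--ae_model_type", choices=["DAE", "DCAE", "VAE"],
                        default="DAE")
    parser.add_argument("--log_dir", default="./logs")
    parser.add_argument("--model_save_path", default="./checkpoints")
    parser.add_argument("--device", default="0")                # GPU index as a string
    return parser


# No arguments: every hyperparameter takes its documented default
defaults = build_parser().parse_args([])

# Override the model type and learning rate, as a train.sh script might do
custom = build_parser().parse_args(["--ae_model_type", "VAE", "--lr", "5e-4"])
```

Restricting `ae_model_type` with `choices` makes a misspelled model name fail at startup instead of partway through training.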

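When the run finishes, the trained checkpoints are saved in the checkpoints folder. Training progress is logged to TensorBoard, with log files saved in ./logs. To view the training process, run: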

tensorboard --logdir=./logs --port 6006

When running on a server, first use ssh to forward the server's remote port to a local port:

ssh -L 6006:127.0.0.1:6006 username@my_server_ip

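3.3 Model Prediction

After training, or if you already have model checkpoints, you can test the model's denoising performance on the MNIST dataset (make sure the dataset has been downloaded and placed in the /data/ folder).

Run from the /Denoising_ae/ directory:

sh scripts/[model]/predict.sh

Hyperparameters for prediction can likewise be adjusted in scripts/[model]/predict.sh. The adjustable hyperparameters are: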

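  • batch_size: batch size - default 32

  • load_model_path: path to the model checkpoint

  • num_epochs: number of training epochs - default 10

  • input_dim: input dimension for DAE/VAE - default 784 (MNIST images are 28*28)

  • hidden_dim: hidden layer dimension for DAE/VAE - default 400

  • output_dim: dimension of the latent space z for DAE/VAE - default 20

  • ae_model_type: model type - default DAE, options [DAE, DCAE, VAE]

  • device: GPU selection - default '0' (GPU 0)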

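!! Note: input_dim, hidden_dim, and output_dim must match the values specified during training.

The outputs are origin.png, noisy.png, and reconstructed.png: the original MNIST images, the images with added noise, and the images denoised by the model, respectively.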
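The README does not say how the noisy images are produced. A common choice, assumed here purely for illustration, is additive Gaussian noise followed by clipping to the valid pixel range. The function name `add_gaussian_noise` and the `noise_factor` value of 0.3 are hypothetical.

```python
import random


def add_gaussian_noise(image, noise_factor=0.3, seed=None):
    """Return a noisy copy of `image` (rows of floats in [0, 1]).

    Assumption: additive Gaussian noise followed by clipping; the
    repository may corrupt images differently.
    """
    rng = random.Random(seed)
    return [
        [min(1.0, max(0.0, pixel + noise_factor * rng.gauss(0.0, 1.0)))
         for pixel in row]
        for row in image
    ]


# A 28x28 stand-in for an MNIST digit: black background with a white bar
origin = [[1.0 if 12 <= col <= 15 else 0.0 for col in range(28)]
          for _ in range(28)]
noisy = add_gaussian_noise(origin, seed=0)
```

In this picture, `origin` plays the role of origin.png and `noisy` the role of noisy.png; the model's output on `noisy` would correspond to reconstructed.png.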
