pha123661 / ntu-2022fall-dlcv

Deep Learning for Computer Vision 深度學習於電腦視覺 by Frank Wang 王鈺強

Home Page: http://vllab.ee.ntu.edu.tw/dlcv.html

Languages: Python 84.60%, Cuda 10.12%, C++ 4.33%, Shell 0.95%

Topics: computer-vision, deep-learning, gan, image-classification, image-generation, image-segmentation, adversarial-domain-adaptation, cnn, image-captioning, long-tailed-recognition, novel-view-synthesis, point-cloud-segmentation, self-supervised-learning, vision-language, zero-shot-classification

NTU-2022Fall-DLCV

Surpassed the strong baseline on all four assignments (final grade: 97.22/100).

⭐Please consider starring this project if you find my code useful.⭐

Outline

For more details, refer to the reports.

  • HW1 Spec: Report
    • Image classification ← Pretrained BEiT v1
      • Accuracy: 0.9360
    • Image segmentation ← Pretrained Deeplab v3
      • mIoU: 0.7438
  • HW2 Spec: Report
    • Face generation with GAN ← SNGAN (DCGAN with spectral normalization)
      • FID: 25.986
      • Face recognition accuracy: 0.9110
    • Digit generation with diffusion model ← Classifier-Free Diffusion Guidance (Ho et al.)
      • Digit classifier accuracy: 0.9990
    • Domain adversarial network on MNIST-M, SVHN, and USPS
      • M→S Accuracy: 0.4943
      • M→U Accuracy: 0.9025
  • HW3 Spec: Report
    • Zero-shot image classification with CLIP ← CLIP L/14
      • Accuracy: 0.8124
    • Image captioning with pretrained encoder ← Pretrained DeiT v3 as encoder
      • CIDEr score: 0.9413
      • CLIP score: 0.7310
    • Attention map visualization for image captioning
  • HW4 Spec: Report
    • 3D novel view synthesis ← DVGO (voxelized NeRF)
      • PSNR: 35.6029
      • SSIM: 0.9769
    • Self-supervised pretraining for image classification ← BYOL
      • Accuracy: 0.5985
      • Outperforms the supervised equivalent in both full fine-tuning and frozen backbone evaluation.
  • Final Project Spec -- Challenge 2: Poster
    • Long-tailed 3D point cloud semantic segmentation
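
Two of the metrics reported in the outline above, mIoU (HW1 segmentation) and PSNR (HW4 novel view synthesis), have compact textbook definitions. The block below is a minimal sketch of those definitions in plain Python, not the course's grading code: the function names `mean_iou` and `psnr`, the flat-list inputs, the choice to skip classes absent from both prediction and ground truth, and the default pixel range of 1.0 are all assumptions made for illustration. The official evaluation scripts may accumulate statistics differently (for example, over the whole dataset rather than per image).

```python
import math


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred and target are equal-length flat sequences of integer class ids.
    Classes that appear in neither sequence are skipped (an assumption;
    some evaluators count them differently).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else float("nan")


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB for flat sequences of pixel values.

    max_value is the largest possible pixel value (1.0 for normalized
    images, 255 for 8-bit images).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Toy inputs, made up for illustration only.
    print(mean_iou([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3))
    print(psnr([0.5, 0.5], [0.4, 0.6]))
```

On the toy inputs, the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is about 0.5; the PSNR example has a mean squared error of 0.01, which corresponds to roughly 20 dB. In practice these metrics would be computed on arrays or tensors rather than Python lists; the list-based form is used here only to keep the sketch dependency-free.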
