afondiel / computer-vision-challenge

This is a series of computer vision foundational projects that anyone diving into the field must tackle.

License: MIT License

Jupyter Notebook 99.97% Python 0.03%
computer-vision computer-vision-algorithms computer-vision-datasets computer-vision-opencv computer-vision-projects computer-vision-python computer-vision-tools computer-vision-hello-world computer-vision-challenge cv-challenge image-classification image-detection image-generation image-processing vision-models vision-transformer

Computer Vision Challenge 🏆

Overview

This is a collection of foundational projects for anyone diving into computer vision.

Explore some of computer vision's core concepts through hands-on projects in this fun challenge.

The project has 3 levels:

  • Level 0 - Zero (beginner): Getting Started with Basics
  • Level 1 - Apprentice (intermediate): Hands-on Computer Vision with Deep Learning
  • Level 2 - Hero (advanced): Vision LLMs: Image Generation (GANs, VAEs...), Synthesis & Captioning

Important

In L1 and L2, we primarily leverage pre-trained models to ensure accessibility for everyone. This also allows us to explore a wider range of vision recognition tasks using different types of models while focusing on the model's performance and outcome.

Basic Computer Vision Pipeline

graph LR
    A[Image Acquisition] ==> B[Image Processing]
    B ==> C[Feature Extraction]
    C ==> D[Output, Interpretation & Analysis]

    style A fill:#EEE,stroke:#333,stroke-width:4px
    style B fill:#F88,stroke:#333,stroke-width:4px
    style C fill:#4F4,stroke:#333,stroke-width:4px
    style D fill:#33F,stroke:#333,stroke-width:4px
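
The four stages in the diagram above can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions: the synthetic 8x8 image, the fixed threshold, and every function name here (`acquire_image`, `process_image`, `extract_features`, `interpret`) are hypothetical and do not come from the notebooks, which would normally use a library such as OpenCV.

```python
def acquire_image(width=8, height=8):
    """Image acquisition stand-in: a dark synthetic image with a bright 4x4 square."""
    image = [[20 for _ in range(width)] for _ in range(height)]
    for y in range(2, 6):
        for x in range(2, 6):
            image[y][x] = 220
    return image


def process_image(image, threshold=128):
    """Image processing: binarize the image with a fixed threshold (assumed value)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


def extract_features(mask):
    """Feature extraction: foreground area and bounding box (x_min, y_min, x_max, y_max)."""
    coords = [(x, y) for y, row in enumerate(mask) for x, value in enumerate(row) if value]
    if not coords:
        return {"area": 0, "bbox": None}
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return {"area": len(coords), "bbox": (min(xs), min(ys), max(xs), max(ys))}


def interpret(features, min_area=4):
    """Output and interpretation: a simple decision based on the extracted area."""
    return "object found" if features["area"] >= min_area else "no object"


def run_pipeline():
    """Chain the four stages in the same order as the diagram."""
    image = acquire_image()
    mask = process_image(image)
    features = extract_features(mask)
    return features, interpret(features)


if __name__ == "__main__":
    print(run_pipeline())
```

Each function maps to one box in the diagram, so swapping a stage (for example, reading a real image file instead of generating one) leaves the rest of the chain unchanged.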

Requirements

Install the dependency packages using either conda or pip:

Using conda:

  1. Create a new conda environment:
conda create --name cv-challenge
  2. Activate the newly created environment:
source activate cv-challenge  # For bash/zsh
conda activate cv-challenge  # For conda prompt/powershell
  3. Install dependencies from the requirements.txt file:
conda install --channel conda-forge --file requirements.txt

Using pip:

  1. Install dependencies from the requirements.txt file:
pip install -r requirements.txt
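
After installing, it can be useful to confirm that the listed packages are actually present in the active environment. The sketch below is one possible way to do that with the standard library only. The helper names `parse_requirement_names` and `find_missing` are hypothetical, the parsing is deliberately simplified (it ignores option lines and URL-style requirements), and the contents of this repository's requirements.txt are not assumed.

```python
import re
from importlib import metadata


def parse_requirement_names(text):
    """Extract distribution names from requirements.txt-style text (simplified)."""
    names = []
    for line in text.splitlines():
        # Drop inline comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and pip option lines such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        # The name is the leading run of characters before any version specifier.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def find_missing(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as handle:
        required = parse_requirement_names(handle.read())
    print("Missing packages:", find_missing(required))
```

Run it from the repository root so that the relative path to requirements.txt resolves.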

Hands-on Computer Vision Challenges!

Level 0 - Zero: Getting Started with Basics 💪

| Project | Description | Notebooks |
|---|---|---|
| [1] Getting Started with Images | Load an image, display it, and apply basic transformations. | Open notebook in Colab |
| [2] Basic Image Manipulation | Modify pixels; resize, flip, crop, and annotate images. | Open notebook in Colab |
| [3] Image Filtering & Restoration | Enhance or manipulate image features using filtering techniques. | Open notebook in Colab |
| [4] Image Enhancement | Enhance images using arithmetic and bitwise operations. | Open notebook in Colab |
| [5] Image Segmentation (Traditional) | Segment images into regions or pixels that belong to different classes or categories. | Open notebook in Colab |
| [6] Feature Extraction & Alignment | Learn how to extract features from images using descriptors based on the nature of the features. | Open notebook in Colab |
| [7] Optical Character Recognition (OCR) | Learn how to recognize text in images or documents using libraries such as Tesseract, Pytesseract, or EasyOCR. | Open notebook in Colab |
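
To give a feel for what the first Level 0 projects involve, here is a small dependency-free sketch of three basic operations: flipping, cropping, and a 3x3 mean (box) filter on a grayscale image stored as a list of rows. It is not taken from the notebooks. The function names are hypothetical, and the border handling (averaging only the neighbours that exist) is one assumed choice among several.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def crop(image, top, left, height, width):
    """Return the height x width region whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def box_blur(image):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    rows, cols = len(image), len(image[0])
    blurred = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(rows, y + 2))
                for i in range(max(0, x - 1), min(cols, x + 2))
            ]
            out_row.append(sum(window) / len(window))
        blurred.append(out_row)
    return blurred


if __name__ == "__main__":
    image = [
        [0, 0, 0],
        [0, 9, 0],
        [0, 0, 0],
    ]
    print(flip_horizontal([[1, 2, 3], [4, 5, 6]]))
    print(crop([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 1, 1, 2, 2))
    print(box_blur(image))
```

In practice an array library would do the same work far faster; the explicit loops are only there to make the pixel-level logic visible.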

Level 1 - Apprentice: Hands-on Computer Vision with Deep Learning 🔥

| Project | Description | Notebooks |
|---|---|---|
| [1] MNIST Handwritten Digit Recognition | Train a simple neural network to classify handwritten digits from the MNIST dataset. | Open notebook in Colab |
| [2] CIFAR-10 Image Classification | Utilize convolutional neural networks (CNNs) to classify images of different types of objects from the CIFAR-10 dataset. | Open notebook in Colab |
| [3] Object Detection with YOLOv5 | Implement YOLOv5, a real-time object detection algorithm, to detect objects in images and videos. | Open notebook in Colab |
| [4] Semantic Segmentation with DeepLabv3+ | Utilize DeepLabv3+, a semantic segmentation model, to segment images into different semantic categories. | Open notebook in Colab |
| [5] Facial Recognition with OpenFace | Explore facial recognition using OpenFace, a facial recognition library, to identify individuals in images. | Open notebook in Colab |
| [6] Object Tracking | Follow the movement of objects in a video sequence. | Open notebook in Colab |
| [7] Human Pose Estimation | Estimate the pose of a person in an image or a video using OpenCV and a pre-trained model. | Open notebook in Colab |

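
Because Level 1 relies on pre-trained models and focuses on their outcomes, a common first step when inspecting detection or tracking results is to compare predicted boxes against reference boxes with intersection over union (IoU). The sketch below is a generic illustration, not code from the notebooks. The function name `iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    # Degenerate boxes with zero total area are treated as non-overlapping.
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0, 0, 2, 2)
    reference = (1, 1, 3, 3)
    print(f"IoU: {iou(predicted, reference):.3f}")
```

A value of 1.0 means the boxes coincide and 0.0 means they do not overlap; which cutoff counts as a correct detection is a per-project choice.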
Level 2 - Hero: Vision LLMs: Image Generation (GANs, VAEs...), Synthesis & Captioning ⚡

| Project | Description | Notebooks |
|---|---|---|
| [1] Creative Image Generation with GANs | Generate novel images of different styles using GANs. | Open notebook in Colab |
| [2] Text-to-Image Synthesis with LLMs and Diffusion Models | Create realistic and creative images from text descriptions using LLMs and diffusion models. | Open notebook in Colab |
| [3] AI-Powered Image Restoration and Enhancement | Restore and enhance images using AI methods. | Open notebook in Colab |
| [4] Style Transfer with GANs and Image Processing | Transfer the artistic style of one image to another. | Open notebook in Colab |
| [5] AI-Driven Image Captioning and Storytelling | Generate comprehensive and creative captions and stories from images using LLMs. | Open notebook in Colab |
| [6] AI-Assisted Image Editing and Manipulation | Automate image editing and manipulation tasks using AI. | Open notebook in Colab |
| [7] AI-Powered Image Analysis and Classification | Analyze and classify images using AI models. | Open notebook in Colab |

Usage

Most projects are written as Jupyter notebooks; you can run them directly using Jupyter Notebook/Lab or Colab.

For projects with a main.py file, run the command below:

python main.py

Contributing

Help this project grow! Add new projects, improve existing ones and fix issues.

Please follow these steps to contribute:

  • Fork this repository and clone it to your local machine.
  • Create a new branch with a descriptive name for your contribution.
  • Add your code and files to the branch and commit your changes.
  • Push your branch to your forked repository and create a pull request to the main repository.
  • Wait for your pull request to be reviewed and merged.

LICENSE

This project is licensed under the MIT License.

Reference

Some of the projects in this repository are inspired by or based on the following sources:

"Vision is a picture of the future that produces passion." - Bill Hybels
