vc's Introduction

Visão Computacional (Computer Vision)

FEUP 2023 2ºS

⚠️ Under Maintenance

The data set won't be uploaded because of its size, but these notebooks are mostly trials of different OpenCV features, so any image will do.

# 📝 Lecture Status
1 Convolution wExercises 66.67%
1 HandsOnPython 91.84%
1 LinearAlgebraNumpy 54.17%
2 Lecture Images and Pixels 85.71%
3 Lecture Filters 10.53%
4 Lecture ConnectedComponentLabeling 16.67%
5 Lecture Features 100.00%
6 Lecture Outliers 66.67%
7 Lecture Obj detection 6.25%
10 PytorchIntro 44.00%
11 Pytorch ObjectDetector 82.14%
12 Pytorch SemanticSegmentation 43.48%
# Assignments Status
1 Q&A M&M
2 Estimation of apparent Motion

vc's People

Contributors

martinhofigueiredo, nmcnascimento, josepedrocruz

vc's Issues

Research

👀 Look into

  • (Local) Lucas-Kanade
  • (Global) Horn-Schunck
  • Visualizations
  • Source code

📃 Find the papers:

Optical flow

Optical flow is the apparent velocity (with x and y components, u and v) of the brightness pattern at each pixel; it is estimated from the spatial and temporal intensity gradients at that same point.
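
The link between the velocity and the intensity gradient is usually written as the brightness constancy constraint. A short derivation sketch, assuming small motion and that a moving point keeps its brightness (standard assumptions that these notes do not state explicitly); $I_x$, $I_y$, $I_t$ denote the partial derivatives of the image intensity $I$:

```latex
I(x + u\,\delta t,\; y + v\,\delta t,\; t + \delta t) = I(x, y, t)
\quad\Longrightarrow\quad
I_x\,u + I_y\,v + I_t = 0
```

This is one equation with two unknowns per pixel, which is why both methods below need an extra assumption.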

Lucas-Kanade

  • This approach is local because it assumes neighboring pixels move together
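
A minimal sketch of the local idea, assuming a single square window, simple finite-difference gradients and grayscale frames. The function names and the `half` window parameter are illustrative, not taken from the project code:

```python
import numpy as np

def image_gradients(frame1, frame2):
    """Spatial and temporal derivatives via simple finite differences."""
    f1 = frame1.astype(np.float64)
    f2 = frame2.astype(np.float64)
    # np.gradient returns the derivative along rows (y) first, then columns (x)
    Iy, Ix = np.gradient(f1)
    It = f2 - f1
    return Ix, Iy, It

def lucas_kanade_window(Ix, Iy, It, row, col, half=2):
    """Least-squares flow (u, v) shared by the window centred at (row, col)."""
    win = (slice(row - half, row + half + 1), slice(col - half, col + half + 1))
    # One constraint Ix*u + Iy*v = -It per pixel in the window
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow  # array([u, v])
```

The window must contain gradients in more than one direction; otherwise the least-squares system is rank deficient (the aperture problem).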

Horn-Schunck

  • The formulation of this method includes a "smoothing term", which helps propagate neighboring flow values through flat, texture-less areas
  • This helps the algorithm produce a dense flow field, but it blurs the boundaries between groups of pixels that move differently
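
A rough sketch of the global scheme as a Jacobi-style iteration, under a few simplifying assumptions: grayscale frames, finite-difference gradients, and a plain 4-neighbour mean instead of the weighted averaging kernel of the original paper. The `alpha` and `n_iter` defaults are illustrative values, not tuned ones:

```python
import numpy as np

def neighbour_average(f):
    """Mean of the four direct neighbours, replicating values at the border."""
    p = np.pad(f, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0

def horn_schunck(frame1, frame2, alpha=1.0, n_iter=200):
    """Dense flow (u, v); alpha weights the smoothing term."""
    f1 = frame1.astype(np.float64)
    f2 = frame2.astype(np.float64)
    Iy, Ix = np.gradient(f1)  # row (y) derivative first, then column (x)
    It = f2 - f1
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    for _ in range(n_iter):
        u_avg = neighbour_average(u)
        v_avg = neighbour_average(v)
        # Pull the smoothed flow back towards the brightness constancy constraint
        common = (Ix * u_avg + Iy * v_avg + It) / (alpha**2 + Ix**2 + Iy**2)
        u = u_avg - Ix * common
        v = v_avg - Iy * common
    return u, v
```

A larger `alpha` gives a smoother field and more blurring at motion boundaries; a smaller one follows the local data more closely.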

Multi-channel

This just means using all the channels of the image (RGB, or whatever color space the image uses) instead of a single grayscale channel.

Multi-resolution

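Downscaling the image helps the algorithm capture large motions, because a large displacement at full resolution becomes a small one at low resolution.
An image pyramid of iteratively smaller images is built; the flow is estimated at the coarsest level first and then refined at each finer level, which helps clear up boundaries.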
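
A small sketch of the pyramid idea, assuming grayscale images and using 2x2 block averaging as a crude stand-in for the usual Gaussian blur plus subsampling. The helper names are illustrative:

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2  # drop an odd last row/column
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def build_pyramid(img, levels=3):
    """pyramid[0] is full resolution, pyramid[-1] is the coarsest level."""
    pyramid = [img.astype(np.float64)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def upsample_flow(u, v):
    """Move a flow field to the next finer level: double the size and the vectors."""
    up = lambda f: 2.0 * np.repeat(np.repeat(f, 2, axis=0), 2, axis=1)
    return up(u), up(v)
```

In a coarse-to-fine scheme the upsampled flow would serve as the starting estimate at the finer level, where only a small residual motion remains to be found.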

Visualizations

There should be a Python function to do the required visualizations, and it should not be hard to write once we have implemented each algorithm.

The direction of the flow vector is mapped to the hue of each pixel, and the magnitude of the vector to the brightness of that pixel.
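
One possible version of such a function, written with NumPy only. The exact color convention (which direction maps to which hue, full saturation, normalising by the maximum magnitude) is an assumption here and may differ from the scheme shown in the lectures:

```python
import numpy as np

def flow_to_rgb(u, v):
    """Map flow direction to hue and flow magnitude to brightness (HSV value)."""
    mag = np.hypot(u, v)
    hue = (np.arctan2(v, u) + np.pi) / (2 * np.pi)     # direction scaled to [0, 1]
    val = mag / mag.max() if mag.max() > 0 else mag    # magnitude scaled to [0, 1]
    sat = np.ones_like(mag)
    # Standard HSV -> RGB conversion, vectorised over the image
    h6 = hue * 6.0
    i = np.floor(h6).astype(int) % 6
    f = h6 - np.floor(h6)
    p = val * (1 - sat)
    q = val * (1 - sat * f)
    t = val * (1 - sat * (1 - f))
    r = np.choose(i, [val, q, p, p, t, val])
    g = np.choose(i, [t, val, val, q, p, p])
    b = np.choose(i, [p, p, t, val, val, q])
    return (np.stack([r, g, b], axis=-1) * 255).astype(np.uint8)
```

Pixels with no motion come out black, and the fastest-moving pixels in the frame get full brightness.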

Implementations

Source code for these methods is scarce, but the OpenCV documentation covers some of it, particularly Lucas-Kanade.

Choose from proposals

Project 1 – Estimation of the apparent motion

Visual motion perception from a moving observer is the case most often encountered in real-life situations. It is a complex and challenging problem, but solving it can enable new applications.

  1. Implement two traditional optical flow techniques with multichannel and multiresolution with refinement approaches, namely: (Local) Lucas-Kanade and (Global) Horn-Schunck.

  2. Consider the following metrics to assess the quality of your implementation:

    • Average angular error (AAE) and standard deviation
    • Average end-point error (EPE) and standard deviation
    • Dataset for benchmarking: link
  3. Discuss the results, taking into consideration the following paper:

Andry et al. (2013), Revisiting Lucas-Kanade and Horn-Schunck, Journal of Computer Engineering and Informatics, Apr. 2013, Vol. 1 Iss. 2, PP. 23-29.

Results must be provided as a table.

  4. Consider the image sequences for this project and estimate the optical flow using these two techniques. Produce two videos per image sequence showing the magnitude and orientation of the flow using the color scheme presented in the lectures. Discuss the results obtained.

Image sequences for this project:
- Forest_15_3b_Videvo
- Forest_15_4_Videvo

Project 2 – Denoising a video sequence.
    The presence of noise in videos affects subsequent image processing phases, such as three dimensional reconstruction, registration, classification of objects, motion segmentation and
    analysis, tracking, identification and recognition of humans. Thus, denoising is an extremely
    important pre-processing phase that is used to improve the perceptual appearance of images;
    however, a trade-off between noise reduction and data preservation is important to enhance
    the characteristics of images that are relevant for high level algorithms

    1) Implement the robust bilateral and temporal filter (RBLT) for denoising a video sequence.
    Spatial and temporal components are incorporated into the filter formulation, which increases
    the filter's ability to remove strong noise components. Consider the Geman-McClure or the
    Charbonnier as error norms for M-Estimators.

    2) Consider the following evaluation metrics to assess the quality of your implementation:
        - SSIM
        - PSNR

    The original image (distortion-free or reference), must be compared to the distorted image,
    using these two evaluation metrics. The distorted image is obtained by corrupting the original
    image with a distinct noise configuration (Salt-Pepper and Gaussian Noise) and then, the image
    sample is filtered by each filter, individually. The level of noise that should be added to each
    original image is a standard deviation of 20 to 40 for Gaussian noise and 10 to 30% for Salt-Pepper noise. Results must be provided graphically.

    3) Discuss the results by taking into consideration the median and Gaussian filter. You can also
    consider the following paper: Andry et al. (2013), Enhancing dynamic videos for surveillance and
    robotic applications: The robust bilateral and temporal filter, Signal Processing: Image
    Communication, Elsevier, 2014.

    4) Consider the image sequences for this project and estimate the optical flow using these two
    techniques. Produce two videos per image sequence showing the magnitude and orientation of
    the flow using the color scheme presented in the lectures. Discuss the results obtained.
    Image sequences for this project:
        - mlky_6
        - 210329_06A_Bali_4k_004
        - Saint_Barthelemy_2
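
The evaluation procedure for Project 2 (corrupt the reference image, then compare) can be sketched as follows. This assumes 8-bit grayscale images; the function names are illustrative, and only PSNR is shown because SSIM is more involved and is better taken from an existing library:

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma (in grey levels)."""
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_pepper_noise(img, fraction, rng):
    """Set roughly `fraction` of the pixels to 0 (pepper) or 255 (salt), half each."""
    noisy = img.copy()
    r = rng.random(img.shape)
    noisy[r < fraction / 2] = 0
    noisy[r > 1 - fraction / 2] = 255
    return noisy

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    diff = reference.astype(np.float64) - distorted.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak**2 / mse)
```

A typical use would be `psnr(original, filtered)` for each filter and noise level, using a generator such as `np.random.default_rng(0)` so that the noise is reproducible.
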
Project 3 – Captcha decoding.

A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a commonly used feature in web applications to block non-human access. Its purpose is to prevent spam on websites, such as promotion spam, registration spam, and data scraping; bots are less likely to abuse websites that use CAPTCHAs. Many websites use CAPTCHAs to prevent bot raiding, and it works effectively. CAPTCHAs are designed so that humans can complete them while most bots cannot.

1) This project aims to develop a CNN able to decode CAPTCHA images considering 4 and 5 encoders. The CNN model needs to be designed, implemented and trained (no fine-tuning approaches should be applied);

2) Consider the following metrics:
    a. Train and test accuracy;
    b. Confusion matrix;
    c. Other evaluation methodologies (e.g., confusion matrix, histograms).

3) Discuss the results of your approach, in particular its limitations;

4) Consider the CAPTCHA dataset provided, which has 4 to 5 digits.
    a. The soft dataset is formed by simpler CAPTCHAs. Students must start the project with this dataset.
    b. The hard dataset is formed by CAPTCHAs with strange elements added, to make them more difficult to decode.

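
The accuracy and confusion-matrix metrics for Project 3 can be sketched independently of the CNN itself. This assumes predictions are already available as digit labels (for the per-character matrix) and as decoded strings (for whole-CAPTCHA accuracy); both function names are illustrative:

```python
import numpy as np

def confusion_matrix(true_labels, pred_labels, n_classes=10):
    """Count matrix with true classes as rows and predicted classes as columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when an index pair repeats
    np.add.at(cm, (np.asarray(true_labels), np.asarray(pred_labels)), 1)
    return cm

def captcha_accuracy(true_strings, pred_strings):
    """Fraction of CAPTCHAs whose whole string is decoded correctly."""
    return float(np.mean([t == p for t, p in zip(true_strings, pred_strings)]))
```

Whole-string accuracy is stricter than per-character accuracy, since a single wrong digit makes the CAPTCHA count as a miss.
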
Project 4 – Open Project

Students can develop a project in CV that is related to their MSc thesis. Teams
should therefore send a project proposal by 14 April 2023, covering the following topics:
- Motivation
- Objectives
- Problem statement (e.g., classification, regression, etc.)
- Dataset

Implement Benchmarking

To be filled in after researching what is needed to implement it

To do:

  • Implement it 😜
  • Average angular error (AAE) and standard deviation
  • Average end-point error (EPE) and standard deviation
  • Dataset for benchmarking: link
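
A possible starting point for the two metrics, assuming the estimated flow `(u, v)` and the ground-truth flow `(u_gt, v_gt)` are NumPy arrays of the same shape. The angular error below uses the common definition based on the 3D vectors (u, v, 1); whether the benchmark dataset expects exactly this convention still has to be checked:

```python
import numpy as np

def end_point_error(u, v, u_gt, v_gt):
    """Mean and standard deviation of the Euclidean distance between flow vectors."""
    epe = np.hypot(u - u_gt, v - v_gt)
    return epe.mean(), epe.std()

def angular_error(u, v, u_gt, v_gt):
    """Mean and standard deviation (in degrees) of the angle between (u, v, 1) and (u_gt, v_gt, 1)."""
    num = u * u_gt + v * v_gt + 1.0
    den = np.sqrt(u**2 + v**2 + 1.0) * np.sqrt(u_gt**2 + v_gt**2 + 1.0)
    # Clip to guard against values just outside [-1, 1] from rounding
    ae = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    return ae.mean(), ae.std()
```

Pixels without valid ground truth would need to be masked out before calling these functions.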

Setup Env

To do List:

  • Create Dockerfile for Gitpod and GitHub
  • PyTorch
  • OpenCV
  • Add dataset to GitHub 1fa9bc0
  • CI/CD for evaluation (❔)

Report Making

To write as little as possible, we will deliver the Jupyter notebook (I think).

To speed up the process, we will use the project to track what we do; hopefully it can then be copied and pasted into a new document.
