lxrd-aj / yolo_v1

You Only Look Once version 1

Home Page: https://araintelligence.com/blogs/deep-learning/object-detection/yolo_v1/

Languages: Python 98.79%, Shell 1.21%
Topics: object-detection, yolo, yolov1, yolov1-training, computer-vision

yolo_v1's Introduction

YOLOv1

A technical article and PyTorch implementation of the popular YOLO object detection algorithm by Joseph Redmon. The technical article, available online at AraIntelligence.com (see the Home Page link above), provides an in-depth explanation of the algorithm.

Sample Prediction

PIP Requirements

Development used Python 3.6.10. The full list of libraries is in requirements.txt; the most important ones are highlighted below:

  • numpy==1.18.1
  • Pillow==7.0.0
  • torch==1.5.0
  • torchvision==0.6.0
  • torchviz==0.0.1

Training

The model was trained on the Pascal VOC 2007+2012 dataset, which was partitioned into a training set of 13,170 images and a validation set of 8,333 images. Two helper scripts are available:

  • ./data/download_voc.sh: downloads the VOC dataset from Joseph Redmon's VOC mirror and partitions it into a training set and a validation set
  • python training.py: trains the model

The model achieves a relatively low training loss on the dataset. The validation loss is lower than the training loss due to the data augmentation applied to the train dataset during training.

Image Augmentation

The following augmentations are applied to the training images:

Color Jitter

Random Blur

Random Horizontal Flip

Random Vertical Flip
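
When an image is flipped, its bounding boxes have to be flipped with it, otherwise the labels no longer match the pixels. The sketch below illustrates that bookkeeping in plain Python. It assumes boxes are stored as (x_min, y_min, x_max, y_max) in pixel coordinates; the function names and the box format are hypothetical and are not taken from this repository, which may store boxes differently.

```python
import random


def hflip_box(box, width):
    """Mirror a (x_min, y_min, x_max, y_max) box around the vertical axis."""
    x_min, y_min, x_max, y_max = box
    # The old right edge becomes the new left edge, and vice versa.
    return (width - x_max, y_min, width - x_min, y_max)


def vflip_box(box, height):
    """Mirror a (x_min, y_min, x_max, y_max) box around the horizontal axis."""
    x_min, y_min, x_max, y_max = box
    # The old bottom edge becomes the new top edge, and vice versa.
    return (x_min, height - y_max, x_max, height - y_min)


def random_flip_boxes(boxes, width, height, p=0.5, rng=random):
    """Flip all boxes horizontally and/or vertically, each with probability p.

    Returns the new boxes plus two flags so the caller can apply
    the same flips to the image itself.
    """
    flip_h = rng.random() < p
    flip_v = rng.random() < p
    if flip_h:
        boxes = [hflip_box(b, width) for b in boxes]
    if flip_v:
        boxes = [vflip_box(b, height) for b in boxes]
    return boxes, flip_h, flip_v


# A box near the left edge of a 100x200 image moves to the right edge.
print(hflip_box((10, 20, 50, 80), width=100))   # (50, 20, 90, 80)
print(vflip_box((10, 20, 50, 80), height=200))  # (10, 120, 50, 180)
```

Returning the two flags keeps the box transform and the image transform in sync without the helper needing to know how the image is represented.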

Inference

Inference speed

Device                            Backbone   Input Size   Time (seconds)
2.9 GHz Dual-Core Intel Core i5   ResNet50   448x448      2
NVIDIA Tesla K80 GPU              ResNet50   448x448      -

Prediction

  • Image file on disk: python detect_image --image="./data/car_bike.png" --model="./model_checkpoints/90_epoch.pth"
  • Video file on disk: python detect_image --video="./data/IMG_4855.mov" --model="./model_checkpoints/90_epoch.pth" --output="./processed_IMG_4855.mov"

yolo_v1's People

Contributors

dependabot[bot], jenhaoyang, lxrd-aj

yolo_v1's Issues

crash

Hi Lxrd-AJ,
I ran your code, but just as an epoch is about to end, it fails with the error below.
My torch version is 1.6.0.

Traceback (most recent call last):
  File "xxxx/python-code/yolov1/training.py", line 157, in <module>
    predictions = model(images)
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "xxxx/python-code/yolov1/yolo_v1.py", line 70, in forward
    linearOutput = self.linear_layers(flattened)
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 131, in forward
    return F.batch_norm(
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2012, in batch_norm
    _verify_batch_size(input.size())
  File "xxxx/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 1995, in _verify_batch_size
    raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 12544])
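
The final ValueError comes from a batch-normalization layer that received a single sample (input size [1, 12544]) while the model was in training mode, where batch statistics cannot be computed from one value per channel. Since the failure appears at the end of an epoch, a likely cause is a short final batch, which happens whenever the dataset size is not a multiple of the batch size. Passing drop_last=True to torch.utils.data.DataLoader is one common way to skip that batch. The sketch below is plain Python with hypothetical helper names; the batch size of 13 is chosen only to illustrate a remainder of 1 and is not taken from the report, and the per-device split assumes the batch is divided the way torch.chunk divides a tensor.

```python
import math


def batch_sizes(num_samples, batch_size, drop_last=False):
    """Sizes of the batches produced in one epoch.

    drop_last mirrors the DataLoader option of the same name:
    when True, an incomplete final batch is discarded.
    """
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


def replica_chunks(batch, num_devices):
    """Per-device sample counts when a batch is split across devices.

    Assumption: chunks of ceil(batch / num_devices) are handed out
    until the batch is exhausted, as torch.chunk does.
    """
    chunk = math.ceil(batch / num_devices)
    sizes = []
    while batch > 0:
        sizes.append(min(chunk, batch))
        batch -= chunk
    return sizes


# 13,170 training images with a hypothetical batch size of 13.
print(batch_sizes(13170, 13)[-1])                  # 1  -> would trigger the error
print(batch_sizes(13170, 13, drop_last=True)[-1])  # 13 -> short batch is skipped
# Even a final batch of 3 leaves one device with a single sample on 2 GPUs.
print(replica_chunks(3, 2))                        # [2, 1]
```

If dropping samples is not acceptable, choosing a batch size that leaves a remainder larger than the number of devices avoids the same condition.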
