satojkovic / deeplogo2

A brand logo detection system by DETR

License: MIT License

Python 77.28% Jupyter Notebook 22.59% Dockerfile 0.13%
deep-learning object-detection logo-detection pytorch transformer detr fine-tuning transfer-learning

deeplogo2's Introduction

DeepLogo2

A brand logo detection system using DETR. (DeepLogo with Tensorflow Object Detection API is here)

Description

DETR is a Transformer-based object detection model published by Facebook AI in 2020. Pytorch training code and pretrained models are also available on Github.

DeepLogo2 provides a training and inference environment for creating brand logo detection models using DETR.

Detection results

(Six example detection images: example1 to example6.)

Dataset

DeepLogo2 uses the flickr logos 27 dataset, which contains 27 classes of brand logo images downloaded from Flickr. The brands included in the dataset are: Adidas, Apple, BMW, Citroen, Coca Cola, DHL, Fedex, Ferrari, Ford, Google, Heineken, HP, McDonalds, Mini, NBC, Nike, Pepsi, Porsche, Puma, Red Bull, Sprite, Starbucks, Intel, Texaco, Unicef, Vodafone and Yahoo.

To fine-tune DETR, the dataset is first converted to COCO format:

python preproc_annot.py
python flickr2coco.py --mode train --output_dir flickr_logos_27_dataset
python flickr2coco.py --mode test --output_dir flickr_logos_27_dataset
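
As a rough illustration of what such a conversion involves, the sketch below builds a COCO-style dictionary from bounding-box rows. It is not the code in flickr2coco.py: the function name `rows_to_coco`, the row layout (filename, class name, x1, y1, x2, y2) and the `image_sizes` mapping are all assumptions made for this example. Category ids are assigned from 0, so 27 classes would give a maximum id of 26.

```python
import json


def rows_to_coco(rows, image_sizes):
    """Convert (filename, class_name, x1, y1, x2, y2) rows to a COCO-style dict.

    `rows` and `image_sizes` (filename -> (width, height)) are hypothetical
    inputs for this sketch, not the format read by the real scripts.
    """
    class_names = sorted({row[1] for row in rows})
    cat_ids = {name: i for i, name in enumerate(class_names)}  # 0-based ids
    image_ids = {}
    images, annotations = [], []
    for filename, class_name, x1, y1, x2, y2 in rows:
        if filename not in image_ids:
            image_ids[filename] = len(image_ids)
            width, height = image_sizes[filename]
            images.append({"id": image_ids[filename], "file_name": filename,
                           "width": width, "height": height})
        w, h = x2 - x1, y2 - y1
        if w <= 0 or h <= 0:
            continue  # skip degenerate boxes
        annotations.append({
            "id": len(annotations),
            "image_id": image_ids[filename],
            "category_id": cat_ids[class_name],
            "bbox": [x1, y1, w, h],  # COCO boxes are [x, y, width, height]
            "area": w * h,
            "iscrowd": 0,
        })
    categories = [{"id": i, "name": name} for name, i in cat_ids.items()]
    return {"images": images, "annotations": annotations,
            "categories": categories}


# Made-up rows, only to show the shape of the output.
rows = [
    ("a.jpg", "Adidas", 10, 20, 110, 70),
    ("a.jpg", "Nike", 5, 5, 25, 15),
    ("b.jpg", "Adidas", 0, 0, 50, 40),
]
coco = rows_to_coco(rows, {"a.jpg": (500, 375), "b.jpg": (640, 480)})
print(json.dumps(coco["annotations"][0]))
```

The corner-to-size conversion of the box is the step that is easiest to get wrong: COCO stores the top-left corner plus width and height, not two corners.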

Fine-tuning DETR

DeepLogo2 incorporates the DETR repository as a subtree, with the following changes for fine-tuning on the flickr logos 27 dataset.

Note: For code modifications for fine-tuning DETR, please refer to woctezuma/detr_fine_tune.md

  • Add a custom dataset builder method (detr/datasets/flickr_logos_27.py, detr/datasets/__init__.py)

    # detr/datasets/flickr_logos_27.py
    from pathlib import Path

    # CocoDetection and make_coco_transforms are defined in DETR's datasets/coco.py
    from .coco import CocoDetection, make_coco_transforms


    def build(image_set, args):
        root = Path(args.coco_path)
        assert root.exists(), f'provided root path {root} does not exist'
        train_json = 'flickr_logos_27_train.json'
        test_json = 'flickr_logos_27_test.json'
        # Both splits share one image folder; the test annotations serve as "val".
        PATHS = {
            "train": (root / 'flickr_logos_27_dataset_images', root / train_json),
            "val": (root / 'flickr_logos_27_dataset_images', root / test_json),
        }

        img_folder, ann_file = PATHS[image_set]
        dataset = CocoDetection(img_folder, ann_file, transforms=make_coco_transforms(image_set), return_masks=args.masks)
        return dataset


    # detr/datasets/__init__.py
    def build_dataset(image_set, args):
        ...
        if args.dataset_file == 'flickr_logos_27':
            from .flickr_logos_27 import build as build_flickr_logos_27
            return build_flickr_logos_27(image_set, args)
        raise ValueError(f'dataset {args.dataset_file} not supported')
  • Modify the num_classes to match the flickr logos 27 dataset(detr/models/detr.py)

    def build(args):
        # the `num_classes` naming here is somewhat misleading.
        # it indeed corresponds to `max_obj_id + 1`, where max_obj_id
        # is the maximum id for a class in your dataset. For example,
        # COCO has a max_obj_id of 90, so we pass `num_classes` to be 91.
        # As another example, for a dataset that has a single class with id 1,
        # you should pass `num_classes` to be 2 (max_obj_id + 1).
        # For more details on this, check the following discussion
        # https://github.com/facebookresearch/detr/issues/108#issuecomment-650269223
        num_classes = 20 if args.dataset_file != 'coco' else 91
        if args.dataset_file == "coco_panoptic":
            # for panoptic, we just add a num_classes that is large enough to hold
            # max_obj_id + 1, but the exact value doesn't really matter
            num_classes = 250
        if args.dataset_file == 'flickr_logos_27':
            num_classes = 27  # max_obj_id: 26
        ...
  • Delete the classification head and load the state dict (delete_head_and_save.py, detr/main.py)

    The following script downloads the pretrained weights, deletes the classification head, and saves the result as a new file.

    python delete_head_and_save.py

    Load the state dict in main.py:

    model_without_ddp.load_state_dict(checkpoint['model'], strict=False)

    Reference: facebookresearch/detr#9 (comment)
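
The head-deletion step can be sketched as follows. This is a minimal illustration rather than the contents of delete_head_and_save.py: it assumes the classification head's parameters are stored under keys beginning with `class_embed.`, and the helper name `strip_class_head` is made up for this example. Plain Python lists stand in for tensors so the snippet runs without PyTorch.

```python
def strip_class_head(checkpoint, head_prefix="class_embed."):
    """Return a copy of `checkpoint` whose "model" dict lacks the class head.

    `head_prefix` is an assumption about how the head's parameters are named.
    """
    model_state = {
        key: value
        for key, value in checkpoint["model"].items()
        if not key.startswith(head_prefix)
    }
    return {**checkpoint, "model": model_state}


# Toy stand-in for a real checkpoint; real values would be tensors.
checkpoint = {
    "model": {
        "class_embed.weight": [[0.0]],
        "class_embed.bias": [0.0],
        "bbox_embed.layers.0.weight": [[1.0]],
        "query_embed.weight": [[2.0]],
    }
}
stripped = strip_class_head(checkpoint)
print(sorted(stripped["model"]))
```

Removing the head matters because `load_state_dict(..., strict=False)` tolerates missing keys but still raises an error when a key is present with a mismatched shape. With the head removed, the new classification layer keeps its fresh initialization and is trained from scratch during fine-tuning.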

Training

To fine-tune DETR on the flickr logos 27 dataset:

python detr/main.py \
  --dataset_file "flickr_logos_27" \
  --coco_path "flickr_logos_27_dataset" \
  --output_dir "outputs" \
  --resume "detr-r50_no-class-head.pth" \
  --epochs 100

It takes about 3 hours and 15 minutes with Google Colab Pro to run 100 epochs.

The DETR fine-tuning procedure can be run end to end with Train_DeepLogo2_by_detr.ipynb.

deeplogo2's People

Contributors

satojkovic

deeplogo2's Issues

Problem with Pytorch Distributed Training

Attempting to train a model using the Jupyter notebook:

!python3 detr/main.py \
  --dataset_file "flickr_logos_27" \
  --coco_path "flickr_logos_27_dataset" \
  --output_dir "outputs" \
  --resume "detr-r50_no-class-head.pth" \
  --epochs 100

Results in the following error:

| distributed init (rank 0): env://
Traceback (most recent call last):
  File "/blue/egn4951/manuel.cortes/data/testing_grounds/DeepLogo2/detr/main.py", line 250, in <module>
    main(args)
  File "/blue/egn4951/manuel.cortes/data/testing_grounds/DeepLogo2/detr/main.py", line 108, in main
    utils.init_distributed_mode(args)
  File "/blue/egn4951/manuel.cortes/data/testing_grounds/DeepLogo2/detr/util/misc.py", line 427, in init_distributed_mode
    torch.distributed.init_process_group(backend=args.dist_backend, init_method=args.dist_url,
  File "/home/manuel.cortes/.local/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py", line 595, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/home/manuel.cortes/.local/lib/python3.9/site-packages/torch/distributed/rendezvous.py", line 255, in _env_rendezvous_handler
    master_addr = _get_env_or_raise("MASTER_ADDR")
  File "/home/manuel.cortes/.local/lib/python3.9/site-packages/torch/distributed/rendezvous.py", line 232, in _get_env_or_raise
    raise _env_error(env_var)
ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable MASTER_ADDR expected, but not set

These are all the arguments being passed:

Namespace(lr=0.0001, lr_backbone=1e-05, batch_size=2, weight_decay=0.0001, epochs=100, lr_drop=200, clip_max_norm=0.1, frozen_weights=None, backbone='resnet50', dilation=False, position_embedding='sine', enc_layers=6, dec_layers=6, dim_feedforward=2048, hidden_dim=256, dropout=0.1, nheads=8, num_queries=100, pre_norm=False, masks=False, aux_loss=True, set_cost_class=1, set_cost_bbox=5, set_cost_giou=2, mask_loss_coef=1, dice_loss_coef=1, bbox_loss_coef=5, giou_loss_coef=2, eos_coef=0.1, dataset_file='flickr_logos_27', coco_path='flickr_logos_27_dataset', coco_panoptic_path=None, remove_difficult=False, output_dir='outputs', device='cuda', seed=42, resume='detr-r50_no-class-head.pth', start_epoch=0, eval=False, num_workers=2, world_size=1, dist_url='env://')

The problem seems to be that MASTER_ADDR is not set correctly, if at all. Any way to find a fix for this?

Thank you,
Manuel
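
One way to narrow this down is to check, before launching training, which of the variables used by the `env://` rendezvous are present in the environment. The snippet below is only a diagnostic sketch: the helper name `missing_rendezvous_vars` is made up for this example, and `RANK` and `WORLD_SIZE` are included even though they can alternatively be passed as arguments to `init_process_group`.

```python
import os

# Variables the env:// rendezvous can read from the environment.
REQUIRED_VARS = ("MASTER_ADDR", "MASTER_PORT", "RANK", "WORLD_SIZE")


def missing_rendezvous_vars(env=None):
    """Return the names in REQUIRED_VARS that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Hypothetical environment, for illustration only.
example_env = {"RANK": "0", "WORLD_SIZE": "1"}
print(missing_rendezvous_vars(example_env))  # ['MASTER_ADDR', 'MASTER_PORT']
```

Calling `missing_rendezvous_vars()` with no argument inspects the real process environment, which shows whether a job scheduler or notebook kernel has set only some of these variables.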

Creating Confusion matrix

Where in the code can we gain access to the predictions and target values for each image such that a confusion matrix may be created?
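
For reference, once predictions and targets are available per image, a detection confusion matrix is usually built by matching boxes on IoU. The sketch below does not say where those values live in the DETR code. It assumes they have already been extracted as (label, box) pairs with boxes in absolute [x1, y1, x2, y2] form, uses simple greedy matching, and all names in it are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def update_confusion(matrix, preds, targets, iou_thresh=0.5):
    """Accumulate one image into `matrix` (rows: target, columns: prediction).

    The last row/column index stands for "background" (no box).
    """
    background = len(matrix) - 1
    used = set()
    for t_label, t_box in targets:
        best, best_iou = None, iou_thresh
        for i, (_, p_box) in enumerate(preds):
            if i in used:
                continue
            score = iou(t_box, p_box)
            if score >= best_iou:
                best, best_iou = i, score
        if best is None:
            matrix[t_label][background] += 1  # target was missed
        else:
            used.add(best)
            matrix[t_label][preds[best][0]] += 1
    for i, (p_label, _) in enumerate(preds):
        if i not in used:
            matrix[background][p_label] += 1  # unmatched prediction


num_classes = 2
matrix = [[0] * (num_classes + 1) for _ in range(num_classes + 1)]
targets = [(0, [0, 0, 10, 10]), (1, [20, 20, 30, 30])]
preds = [(0, [1, 1, 10, 10]), (0, [50, 50, 60, 60])]
update_confusion(matrix, preds, targets)
print(matrix)  # [[1, 0, 0], [0, 0, 1], [1, 0, 0]]
```

Here the first target is matched correctly, the second is missed, and the stray prediction is counted as a false positive against background.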

Missing Files

@satojkovic Can you please provide the link to the following files?
train_img_npy = '/content/drive/MyDrive/DeepLogo2/train_data/train_images_np.npy'
gt_boxes_npy = '/content/drive/MyDrive/DeepLogo2/train_data/gt_boxes.npy'
gt_class_ids_npy = '/content/drive/MyDrive/DeepLogo2/train_data/gt_class_ids.npy'

Pick up training where it left off

Running the fine-tuning for 100 epochs seems to only marginally train the model. However, running the command again leads to the model starting the fine-tuning from the beginning, rather than the 100th epoch. Is there a way to make it so that I may 'continue' the finetuning?

Files Missing

flickr_logos_27_dataset_training_set_annotation_cropped.txt not found

Number of Samples

Hey, if I want to add custom logos, how many examples/pictures of each logo are needed to achieve reasonable performance?
And is it possible to add detectable logos without retraining the whole model?

Thanks for the great project!
