satojkovic / deeplogo2

A brand logo detection system by DETR

License: MIT License

Python 77.28% Jupyter Notebook 22.59% Dockerfile 0.13%

deep-learning object-detection logo-detection pytorch transformer detr fine-tuning transfer-learning

DeepLogo2

A brand logo detection system using DETR. (The earlier DeepLogo, built on the TensorFlow Object Detection API, is a separate repository.)

Description

DETR is a Transformer-based object detection model published by Facebook AI in 2020. PyTorch training code and pretrained models are also available on GitHub.

DeepLogo2 provides a training and inference environment for creating brand logo detection models using DETR.

Detection results

Dataset

DeepLogo2 uses the flickr logos 27 dataset. The flickr logos 27 dataset contains 27 classes of brand logo images downloaded from Flickr. The brands included in the dataset are: Adidas, Apple, BMW, Citroen, Coca Cola, DHL, Fedex, Ferrari, Ford, Google, Heineken, HP, McDonalds, Mini, Nbc, Nike, Pepsi, Porsche, Puma, Red Bull, Sprite, Starbucks, Intel, Texaco, Unisef, Vodafone and Yahoo.

To fine-tune DETR, the dataset is converted to COCO format.

```
python preproc_annot.py
python flickr2coco.py --mode train --output_dir flickr_logos_27_dataset
python flickr2coco.py --mode test --output_dir flickr_logos_27_dataset
```
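
For orientation, the conversion into COCO format can be sketched roughly as follows. This is not the code in flickr2coco.py: the function name `to_coco` and the assumed input layout (one whitespace-separated line per box: file name, class name, subset id, x1, y1, x2, y2, with no spaces inside class names) are illustrative assumptions. Image width and height are left out because they require reading the image files.

```python
def to_coco(lines):
    """Convert 'file class subset x1 y1 x2 y2' lines into a COCO-style dict.

    Assumed input layout (hypothetical): one bounding box per line,
    whitespace-separated, class names without spaces.
    """
    images, categories, annotations = {}, {}, []
    for line in lines:
        fname, cls, _subset, x1, y1, x2, y2 = line.split()
        x1, y1, x2, y2 = map(int, (x1, y1, x2, y2))
        # Ids are assigned in order of first appearance, starting at 0.
        image_id = images.setdefault(fname, len(images))
        category_id = categories.setdefault(cls, len(categories))
        w, h = x2 - x1, y2 - y1
        annotations.append({
            "id": len(annotations),
            "image_id": image_id,
            "category_id": category_id,
            "bbox": [x1, y1, w, h],  # COCO boxes are [x, y, width, height]
            "area": w * h,
            "iscrowd": 0,
        })
    return {
        "images": [{"id": i, "file_name": f} for f, i in images.items()],
        "annotations": annotations,
        "categories": [{"id": i, "name": n} for n, i in categories.items()],
    }
```

With category ids starting at 0, a 27-class dataset ends up with a maximum category id of 26.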

Fine-tuning DETR

DeepLogo2 incorporates the DETR repository as a subtree, with the following changes for fine-tuning on the flickr logos 27 dataset.

Note: For code modifications for fine-tuning DETR, please refer to woctezuma/detr_fine_tune.md

Add a custom dataset builder method (detr/datasets/flickr_logos_27.py, detr/datasets/__init__.py)

```
# detr/datasets/flickr_logos_27.py
from pathlib import Path

# Assumed import: both names are defined in the DETR subtree's datasets/coco.py.
from .coco import CocoDetection, make_coco_transforms


def build(image_set, args):
    root = Path(args.coco_path)
    assert root.exists(), f'provided root path {root} does not exist'
    train_json = 'flickr_logos_27_train.json'
    test_json = 'flickr_logos_27_test.json'
    # Both splits share one image folder; only the annotation file differs.
    PATHS = {
        "train": (root / 'flickr_logos_27_dataset_images', root / train_json),
        "val": (root / 'flickr_logos_27_dataset_images', root / test_json),
    }

    img_folder, ann_file = PATHS[image_set]
    dataset = CocoDetection(img_folder, ann_file, transforms=make_coco_transforms(image_set), return_masks=args.masks)
    return dataset
```

```
# detr/datasets/__init__.py
def build_dataset(image_set, args):
    ...
    if args.dataset_file == 'flickr_logos_27':
        from .flickr_logos_27 import build as build_flickr_logos_27
        return build_flickr_logos_27(image_set, args)
    raise ValueError(f'dataset {args.dataset_file} not supported')
```

Modify num_classes to match the flickr logos 27 dataset (detr/models/detr.py)

```
def build(args):
    # the `num_classes` naming here is somewhat misleading.
    # it indeed corresponds to `max_obj_id + 1`, where max_obj_id
    # is the maximum id for a class in your dataset. For example,
    # COCO has a max_obj_id of 90, so we pass `num_classes` to be 91.
    # As another example, for a dataset that has a single class with id 1,
    # you should pass `num_classes` to be 2 (max_obj_id + 1).
    # For more details on this, check the following discussion
    # https://github.com/facebookresearch/detr/issues/108#issuecomment-650269223
    num_classes = 20 if args.dataset_file != 'coco' else 91
    if args.dataset_file == "coco_panoptic":
        # for panoptic, we just add a num_classes that is large enough to hold
        # max_obj_id + 1, but the exact value doesn't really matter
        num_classes = 250
    if args.dataset_file == 'flickr_logos_27':
        num_classes = 27  # max_obj_id: 26
    ...
```
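
The `max_obj_id + 1` rule described in the comment above can be checked against a converted annotation file. The helper below is a minimal sketch; the name `num_classes_from_coco` is hypothetical and not part of the repository, and it assumes a COCO-style dict with a `categories` list.

```python
import json


def num_classes_from_coco(coco):
    """Return max category id + 1, the value DETR expects as num_classes."""
    return max(cat["id"] for cat in coco["categories"]) + 1


def num_classes_from_file(path):
    """Same check, reading a COCO annotation JSON file from disk."""
    with open(path) as f:
        return num_classes_from_coco(json.load(f))
```

For ids 0 to 26 this yields 27, which matches the value set above; for a single class with id 1 it yields 2.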

Delete the classification head and load the state dict (delete_head_and_save.py, detr/main.py)

Get the pretrained weights with the following script, which deletes the classification head and saves the result as a new file.
```
python delete_and_save.py
```
Load the state dict in main.py:
```
model_without_ddp.load_state_dict(checkpoint['model'], strict=False)
```
Reference: facebookresearch/detr#9 (comment)
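
The head-removal step can be sketched as below. This is an illustration, not the repository's script: the function name `strip_class_head` is hypothetical, and the sketch assumes the checkpoint is a dict with a `model` entry (as in the `checkpoint['model']` line above) whose classification layer keys start with `class_embed.`. It uses plain dicts so it runs without PyTorch.

```python
def strip_class_head(checkpoint):
    """Return a copy of the checkpoint without the classification head.

    Assumes the head's parameters are stored under keys starting with
    'class_embed.' (for example 'class_embed.weight', 'class_embed.bias').
    """
    model = {
        key: value
        for key, value in checkpoint["model"].items()
        if not key.startswith("class_embed.")
    }
    # Keep every other top-level entry of the checkpoint untouched.
    return {**checkpoint, "model": model}


# With PyTorch installed, the surrounding steps would be roughly:
#   checkpoint = torch.load(pretrained_path, map_location="cpu")
#   torch.save(strip_class_head(checkpoint), "detr-r50_no-class-head.pth")
```

Because the head keys are missing from the saved file, loading with `strict=False` lets the freshly initialised 27-class head stay in place while all other weights are restored.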

Training

To fine-tune DETR on the flickr logos 27 dataset:

```
python detr/main.py \
  --dataset_file "flickr_logos_27" \
  --coco_path "flickr_logos_27_dataset" \
  --output_dir "outputs" \
  --resume "detr-r50_no-class-head.pth" \
  --epochs 100
```

It takes about 3 hours and 15 minutes with Google Colab Pro to run 100 epochs.

The DETR fine-tuning can be checked by running Train_DeepLogo2_by_detr.ipynb.

deeplogo2's Issues

Problem with PyTorch Distributed Training

Attempting to train a model using the Jupyter notebook:

```
!python3 detr/main.py \
  --dataset_file "flickr_logos_27" \
  --coco_path "flickr_logos_27_dataset" \
  --output_dir "outputs" \
  --resume "detr-r50_no-class-head.pth" \
  --epochs 100
```

Results in the following error:

```
| distributed init (rank 0): env://
Traceback (most recent call last):
  File "/blue/egn4951/manuel.cortes/data/testing_grounds/DeepLogo2/detr/main.py", line 250, in <module>
    main(args)
  File "/blue/egn4951/manuel.cortes/data/testing_grounds/DeepLogo2/detr/main.py", line 108, in main
    utils.init_distributed_mode(args)
  File "/blue/egn4951/manuel.cortes/data/testing_grounds/DeepLogo2/detr/util/misc.py", line 427, in init_distributed_mode
    torch.distributed.init_process_group(backend=args.dist_backend, init_method=args.dist_url,
  File "/home/manuel.cortes/.local/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py", line 595, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/home/manuel.cortes/.local/lib/python3.9/site-packages/torch/distributed/rendezvous.py", line 255, in _env_rendezvous_handler
    master_addr = _get_env_or_raise("MASTER_ADDR")
  File "/home/manuel.cortes/.local/lib/python3.9/site-packages/torch/distributed/rendezvous.py", line 232, in _get_env_or_raise
    raise _env_error(env_var)
ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable MASTER_ADDR expected, but not set
```

These are all the arguments being passed:

Namespace(lr=0.0001, lr_backbone=1e-05, batch_size=2, weight_decay=0.0001, epochs=100, lr_drop=200, clip_max_norm=0.1, frozen_weights=None, backbone='resnet50', dilation=False, position_embedding='sine', enc_layers=6, dec_layers=6, dim_feedforward=2048, hidden_dim=256, dropout=0.1, nheads=8, num_queries=100, pre_norm=False, masks=False, aux_loss=True, set_cost_class=1, set_cost_bbox=5, set_cost_giou=2, mask_loss_coef=1, dice_loss_coef=1, bbox_loss_coef=5, giou_loss_coef=2, eos_coef=0.1, dataset_file='flickr_logos_27', coco_path='flickr_logos_27_dataset', coco_panoptic_path=None, remove_difficult=False, output_dir='outputs', device='cuda', seed=42, resume='detr-r50_no-class-head.pth', start_epoch=0, eval=False, num_workers=2, world_size=1, dist_url='env://')

The problem seems to be that MASTER_ADDR is not set correctly, if at all. Any way to find a fix for this?

Thank you,
Manuel
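
One possible workaround, offered as a sketch rather than a confirmed fix: the error message says the `env://` rendezvous expects `MASTER_ADDR`, so for a single-node run the missing variables can be given local defaults before `detr/main.py` is launched. The helper name `with_rendezvous_defaults`, the loopback address, and port 29500 are assumptions chosen for illustration; whether the script should enter distributed mode at all on this machine is a separate question.

```python
import os


def with_rendezvous_defaults(env, addr="127.0.0.1", port="29500"):
    """Return a copy of env with MASTER_ADDR / MASTER_PORT filled in if missing."""
    patched = dict(env)
    patched.setdefault("MASTER_ADDR", addr)  # existing values are kept
    patched.setdefault("MASTER_PORT", port)
    return patched


# In a notebook cell, run this before the `!python3 detr/main.py ...` line
# so that the child process inherits the variables.
os.environ.update(with_rendezvous_defaults(os.environ))
```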

Creating Confusion matrix

Where in the code can we gain access to the predictions and target values for each image such that a confusion matrix may be created?
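
Independently of where the predictions are collected in the repository, the accumulation step for a detection confusion matrix might look like the sketch below. It assumes predictions and targets are already available per image as lists of `(box, class_id)` pairs with boxes in absolute `(x1, y1, x2, y2)` form, and that low-score predictions were filtered out beforehand. The function names, the greedy matching, and the 0.5 IoU threshold are illustrative choices, not taken from the project.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def update_confusion(matrix, targets, preds, iou_thr=0.5):
    """Add one image's results to matrix (rows: true class, columns: predicted).

    The last row/column stands for "background": a missed target lands in the
    last column, an unmatched prediction in the last row.
    """
    background = len(matrix) - 1
    used = set()
    for t_box, t_cls in targets:
        best, best_iou = None, iou_thr
        for i, (p_box, _) in enumerate(preds):
            overlap = iou(t_box, p_box)
            if i not in used and overlap >= best_iou:
                best, best_iou = i, overlap
        if best is None:
            matrix[t_cls][background] += 1  # target was not detected
        else:
            used.add(best)
            matrix[t_cls][preds[best][1]] += 1
    for i, (_, p_cls) in enumerate(preds):
        if i not in used:
            matrix[background][p_cls] += 1  # prediction with no matching target
    return matrix
```

For the 27 logo classes the matrix would be 28 by 28, created once as `[[0] * 28 for _ in range(28)]` and updated image by image.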

Paper References

Have you written a paper on this topic?

Missing Files

@satojkovic Can you please provide the link to the following files?
train_img_npy = '/content/drive/MyDrive/DeepLogo2/train_data/train_images_np.npy'
gt_boxes_npy = '/content/drive/MyDrive/DeepLogo2/train_data/gt_boxes.npy'
gt_class_ids_npy = '/content/drive/MyDrive/DeepLogo2/train_data/gt_class_ids.npy'

Pick up training where it left off

Running the fine-tuning for 100 epochs seems to only marginally train the model. However, running the command again leads to the model starting the fine-tuning from the beginning, rather than the 100th epoch. Is there a way to make it so that I may 'continue' the finetuning?
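
A sketch of one possible approach, under two assumptions that should be checked against detr/main.py: that training writes a `checkpoint.pth` into `--output_dir`, and that `--epochs` is the total epoch count rather than an increment (so continuing after 100 epochs needs a value above 100). The helper name `resume_args` is hypothetical; it only builds the argument list.

```python
from pathlib import Path


def resume_args(output_dir, base_checkpoint, total_epochs):
    """Build main.py arguments that resume from the latest checkpoint if one exists."""
    latest = Path(output_dir) / "checkpoint.pth"  # assumed checkpoint file name
    resume = latest if latest.exists() else Path(base_checkpoint)
    return [
        "--resume", str(resume),
        "--epochs", str(total_epochs),
        "--output_dir", str(output_dir),
    ]
```

For example, `resume_args("outputs", "detr-r50_no-class-head.pth", 200)` would point `--resume` at `outputs/checkpoint.pth` once a first run has produced it, and fall back to the head-less pretrained weights otherwise.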