
Comments (23)

pppppM commented on August 27, 2024

Sounds great!
Are you interested in making a PR? We can discuss further.

twmht commented on August 27, 2024

@HIT-cwh

In fact, I have implemented my own AutoSlim. It's quite different from mmrazor, and its memory usage is much more efficient than mmrazor's.

I use grad clipping on the gradients in object detection. Without distillation the results are satisfactory, but when applying distillation such as CWD, the results are bad. You may try grad clipping if you have NaN at the beginning of training.

By the way, most anytime networks (like BigNAS) do not explain how they use distillation in object detection. I am exploring this and I am looking forward to your experiments on this.
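
For reference, a minimal sketch of the gradient-clipping setting mentioned above, as it would appear in a plain mmdet-style config; the max_norm and norm_type values are illustrative placeholders, not the commenter's actual settings.

# Illustrative only: enable gradient clipping via mmcv's OptimizerHook
# in a standard mmdet config (values are placeholders).
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))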

HIT-cwh commented on August 27, 2024

@HIT-cwh

In fact, I have implemented my own AutoSlim. It's quite different from mmrazor, and its memory usage is much more efficient than mmrazor's.

I use grad clipping on the gradients in object detection. Without distillation the results are satisfactory, but when applying distillation such as CWD, the results are bad. You may try grad clipping if you have NaN at the beginning of training.

By the way, most anytime networks (like BigNAS) do not explain how they use distillation in object detection. I am exploring this and I am looking forward to your experiments on this.

I would appreciate it if you could share how you save memory in your implementation, and we will improve our code based on that.

pppppM commented on August 27, 2024

Could you upload pruner config?

twmht commented on August 27, 2024

It's the same as https://github.com/open-mmlab/mmrazor/blob/master/configs/pruning/autoslim/autoslim_mbv2_supernet_8xb256_in1k.py#L41, except that the model is changed to https://github.com/open-mmlab/mmdetection/blob/master/configs/atss/atss_r50_fpn_1x_coco.py

pppppM commented on August 27, 2024

I'm very sorry for the inconvenience.
There is a bug in the pruner's trace mechanism: the shareable head only traces its first parent module (FPN 0), and the other parent modules (FPN 1, FPN 2, ...) are not traced.
I will fix it as soon as possible.
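
A minimal sketch (illustrative only, not mmrazor code) of why the head has several parent modules: the same shared conv consumes every FPN level, so a channel tracer has to link its input channels to FPN 0, FPN 1, FPN 2, ..., not just the first one.

import torch
import torch.nn as nn

class TinySharedHead(nn.Module):
    """Toy stand-in for a detection head shared across pyramid levels."""

    def __init__(self, in_channels=8, num_classes=4):
        super().__init__()
        self.cls_conv = nn.Conv2d(in_channels, num_classes, 3, padding=1)

    def forward(self, feats):
        # The single cls_conv is applied to each pyramid level in turn,
        # so every FPN output is a parent of this conv's input channels.
        return [self.cls_conv(f) for f in feats]

feats = [torch.randn(1, 8, s, s) for s in (32, 16, 8)]  # FPN 0 / 1 / 2 outputs
outs = TinySharedHead()(feats)
print([o.shape for o in outs])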

twmht commented on August 27, 2024

@pppppM

This is what I was concerned about.

By the way, I think it's better to let users configure a whole block (such as the neck and bbox_head) as a group sharing one mask, since these blocks are often complicated and the parsers are hard to modify to handle such cases.

I have done this by passing a prebuilt channel space (in txt format) to my reimplemented AutoSlim.

It is hard for a parser to handle every network architecture; the same limitation exists in nni (https://nni.readthedocs.io/en/stable/Compression/ModelSpeedup.html#limitations).

The channel space can be generated by nni or mmrazor and saved to a text file, which users can then edit if the channel dependencies were not built correctly.
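
A rough sketch of what such a user-editable channel-space file and loader could look like; the file format and the helper name below are hypothetical, made up for this discussion, and are not part of mmrazor or nni.

# channel_space.txt (hypothetical format): one group per line;
# all modules in a group share a single channel mask, e.g.
#   backbone.layer1.0.conv1 backbone.layer1.0.bn1
#   neck.fpn_convs.0.conv neck.fpn_convs.1.conv bbox_head.cls_convs.0.conv

def load_channel_space(path):
    """Parse the text file into a list of module-name groups."""
    groups = []
    with open(path) as f:
        for line in f:
            names = line.split()
            if names:
                groups.append(names)
    return groups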

What is your opinion?

twmht commented on August 27, 2024

@pppppM

Sure.

pppppM commented on August 27, 2024

Before our open-source release, most popular models could be handled correctly, such as ResNet, MobileNet, RetinaNet, YOLOX, etc. Probably something went wrong when we refactored the code.
We do need some configurable mechanism to handle models that cannot be traced correctly.
I'm very excited to develop this feature with you. Looking forward to your PR.

HIT-cwh commented on August 27, 2024

Hi! This bug has been fixed in pr#126.

Bing1002 commented on August 27, 2024

It's the same as https://github.com/open-mmlab/mmrazor/blob/master/configs/pruning/autoslim/autoslim_mbv2_supernet_8xb256_in1k.py#L41, except that the model is changed to https://github.com/open-mmlab/mmdetection/blob/master/configs/atss/atss_r50_fpn_1x_coco.py

Hi, can you please upload the prune config file? I used the approach you referred to but still got errors. Did you successfully run AutoSlim on an object detection task? Thanks.

twmht commented on August 27, 2024

@Bing1002

I have not tried the latest mmrazor. Did you try it?

Bing1002 commented on August 27, 2024

HIT-cwh commented on August 27, 2024

I'm very sorry for the inconvenience.
Pruning models with GroupNorm is not supported at present, and GroupNorm is the default normalization in ATSSHead. We will fix it as soon as possible.
Models such as RetinaNet and YOLOX can be pruned with our code. The following config can be used:

# Imports assumed for mmrazor 0.x with mmcv 1.x; adjust to the installed versions.
from mmcv import ConfigDict
from mmrazor.models import build_algorithm

model = dict(
    type='mmdet.RetinaNet',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_input',
        num_outs=5),
    bbox_head=dict(
        type='RetinaHead',
        num_classes=80,
        in_channels=256,
        stacked_convs=4,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            octave_base_scale=4,
            scales_per_octave=3,
            ratios=[0.5, 1.0, 2.0],
            strides=[8, 16, 32, 64, 128]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
    # model training and testing settings
    train_cfg=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.4,
            min_pos_iou=0,
            ignore_iof_thr=-1),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        nms_pre=1000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms', iou_threshold=0.5),
        max_per_img=100))

algorithm_cfg = ConfigDict(
    type='AutoSlim',
    architecture=dict(type='MMDetArchitecture', model=model),
    pruner=dict(
        type='RatioPruner',
        ratios=(2 / 12, 3 / 12, 4 / 12, 5 / 12, 6 / 12, 7 / 12, 8 / 12, 9 / 12,
                10 / 12, 11 / 12, 1.0)),
    retraining=False,
    bn_training_mode=True,
    input_shape=None)

algorithm = build_algorithm(algorithm_cfg)
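
As a quick sanity check after building, one might sample and apply a random subnet; this assumes the mmrazor 0.x pruner interface (sample_subnet / set_subnet), so verify the names against the installed version.

subnet = algorithm.pruner.sample_subnet()  # randomly sample channel ratios
algorithm.pruner.set_subnet(subnet)        # apply the corresponding channel masks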

Bing1002 commented on August 27, 2024

Hi, thanks for your reply. I tried this config but it still failed.

Here is the config:

model = dict(
    type='mmdet.RetinaNet',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_input',
        num_outs=5),
    bbox_head=dict(
        type='RetinaHead',
        num_classes=80,
        in_channels=256,
        stacked_convs=4,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            octave_base_scale=4,
            scales_per_octave=3,
            ratios=[0.5, 1.0, 2.0],
            strides=[8, 16, 32, 64, 128]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
    # model training and testing settings
    train_cfg=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.4,
            min_pos_iou=0,
            ignore_iof_thr=-1),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        nms_pre=1000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms', iou_threshold=0.5),
        max_per_img=100))

dataset_type = 'CocoDataset'
data_root = '/mnt/data/coco_demo/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=2,
    train=dict(
        type='CocoDataset',
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
            dict(type='RandomFlip', flip_ratio=0.5),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ]),
    val=dict(
        type='CocoDataset',
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='CocoDataset',
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
evaluation = dict(interval=1, metric='bbox')
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
work_dir = './work_dirs/retinanet_r50_fpn_1x_coco'
auto_resume = False
gpu_ids = range(0, 1)


algorithm = dict(
    type='AutoSlim',
    architecture=dict(type='MMDetArchitecture', model=model),
    pruner=dict(
        type='RatioPruner',
        ratios=(2 / 12, 3 / 12, 4 / 12, 5 / 12, 6 / 12, 7 / 12, 8 / 12, 9 / 12,
                10 / 12, 11 / 12, 1.0)),
    retraining=False,
    bn_training_mode=True,
    input_shape=None)

And the error is the same as before:

2022-04-16 09:49:55,736 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
2022-04-16 09:49:55,736 - mmdet - INFO - Checkpoints will be saved to /home/local/york.lan/bing.zha/code/mmrazor/work_dirs/retinanet_r50_fpn_1x_coco by HardDiskBackend.
Traceback (most recent call last):
  File "/home/local/york.lan/bing.zha/code/mmrazor/tools/mmdet/train_mmdet.py", line 210, in <module>
    main()
  File "/home/local/york.lan/bing.zha/code/mmrazor/tools/mmdet/train_mmdet.py", line 199, in main
    train_mmdet_model(
  File "/home/local/york.lan/bing.zha/code/mmrazor/mmrazor/apis/mmdet/train.py", line 206, in train_mmdet_model
    runner.run(data_loader, cfg.workflow)
  File "/home/local/york.lan/bing.zha/code/mmcv_1.4.6/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/local/york.lan/bing.zha/code/mmcv_1.4.6/mmcv/runner/epoch_based_runner.py", line 51, in train
    self.call_hook('after_train_iter')
  File "/home/local/york.lan/bing.zha/code/mmcv_1.4.6/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/local/york.lan/bing.zha/code/mmcv_1.4.6/mmcv/runner/hooks/optimizer.py", line 56, in after_train_iter
    runner.outputs['loss'].backward()
  File "/home/local/york.lan/bing.zha/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/_tensor.py", line 255, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/local/york.lan/bing.zha/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/autograd/__init__.py", line 147, in backward
    Variable._execution_engine.run_backward(
RuntimeError: Trying to backward through the graph a second time (or directly access saved variables after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved variables after calling backward.

Looking forward to your response. Thanks.

twmht commented on August 27, 2024

You may try to set optimizer_config to None.

Bing1002 commented on August 27, 2024

You may try to set optimizer_config to None.

After changing that, I can now run the pruning. Could you please explain why that setting matters?

twmht commented on August 27, 2024

AutoSlim calls optimizer.step() itself in its train step, not through the mmcv hook. Setting optimizer_config to None means the mmcv OptimizerHook is not registered, so backward and step are not triggered a second time.
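
As a minimal, mmrazor-independent illustration of that failure mode: calling backward() twice on the same graph reproduces the exact RuntimeError from the traceback above.

import torch

x = torch.randn(3, requires_grad=True)
loss = (x ** 2).sum()
loss.backward()      # first backward, as done inside the algorithm's train step
try:
    loss.backward()  # second backward, as a registered OptimizerHook would trigger
except RuntimeError as err:
    print(err)       # "Trying to backward through the graph a second time ..."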

Bing1002 commented on August 27, 2024

Thanks. But now it seems the returned loss becomes NaN.

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 5000/5000, 15.2 task/s, elapsed: 329s, ETA:     0s2022-04-16 14:02:58,786 - mmdet - INFO - Evaluating bbox...
Loading and preparing results...
2022-04-16 14:02:58,787 - mmdet - ERROR - The testing results of the whole dataset is empty.
2022-04-16 14:02:58,816 - mmdet - INFO - Exp name: autoslim_retinanet.py
2022-04-16 14:02:58,841 - mmdet - INFO - Epoch(val) [4][5000]
2022-04-16 14:04:45,274 - mmdet - INFO - Epoch [5][50/1239]     lr: 1.000e-02, eta: 5:31:29, time: 2.128, data_time: 0.051, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:06:28,567 - mmdet - INFO - Epoch [5][100/1239]    lr: 1.000e-02, eta: 5:29:53, time: 2.066, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:08:13,967 - mmdet - INFO - Epoch [5][150/1239]    lr: 1.000e-02, eta: 5:28:21, time: 2.108, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:09:58,547 - mmdet - INFO - Epoch [5][200/1239]    lr: 1.000e-02, eta: 5:26:47, time: 2.092, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:11:42,629 - mmdet - INFO - Epoch [5][250/1239]    lr: 1.000e-02, eta: 5:25:11, time: 2.082, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:13:25,492 - mmdet - INFO - Epoch [5][300/1239]    lr: 1.000e-02, eta: 5:23:34, time: 2.057, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:15:09,866 - mmdet - INFO - Epoch [5][350/1239]    lr: 1.000e-02, eta: 5:21:59, time: 2.087, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan

Do you have any idea about it? Thanks a lot!

HIT-cwh commented on August 27, 2024

We have not verified whether AutoSlim works on object detection. Maybe you can try pruning MobileNet V2 first to check whether the problem is in the code or in AutoSlim itself.

HIT-cwh commented on August 27, 2024

Thanks. But now it seems the returned loss becomes NaN.

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 5000/5000, 15.2 task/s, elapsed: 329s, ETA:     0s2022-04-16 14:02:58,786 - mmdet - INFO - Evaluating bbox...
Loading and preparing results...
2022-04-16 14:02:58,787 - mmdet - ERROR - The testing results of the whole dataset is empty.
2022-04-16 14:02:58,816 - mmdet - INFO - Exp name: autoslim_retinanet.py
2022-04-16 14:02:58,841 - mmdet - INFO - Epoch(val) [4][5000]
2022-04-16 14:04:45,274 - mmdet - INFO - Epoch [5][50/1239]     lr: 1.000e-02, eta: 5:31:29, time: 2.128, data_time: 0.051, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:06:28,567 - mmdet - INFO - Epoch [5][100/1239]    lr: 1.000e-02, eta: 5:29:53, time: 2.066, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:08:13,967 - mmdet - INFO - Epoch [5][150/1239]    lr: 1.000e-02, eta: 5:28:21, time: 2.108, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:09:58,547 - mmdet - INFO - Epoch [5][200/1239]    lr: 1.000e-02, eta: 5:26:47, time: 2.092, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:11:42,629 - mmdet - INFO - Epoch [5][250/1239]    lr: 1.000e-02, eta: 5:25:11, time: 2.082, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:13:25,492 - mmdet - INFO - Epoch [5][300/1239]    lr: 1.000e-02, eta: 5:23:34, time: 2.057, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan
2022-04-16 14:15:09,866 - mmdet - INFO - Epoch [5][350/1239]    lr: 1.000e-02, eta: 5:21:59, time: 2.087, data_time: 0.007, memory: 9424, max_model.loss_cls: nan, max_model.loss_bbox: nan, min_model.loss_cls: nan, min_model.loss_bbox: nan, prune_model1.loss_cls: nan, prune_model1.loss_bbox: nan, prune_model2.loss_cls: nan, prune_model2.loss_bbox: nan, loss: nan

Do you have any idea about it? Thanks a lot!

Do you detach the teacher's output in the loss function, such as here?
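
For readers following the thread, a generic sketch of what detaching the teacher's output means in a distillation loss; this is only an illustration, not mmrazor's actual CWD implementation.

import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, tau=1.0):
    # Detach so no gradient flows back through the teacher branch.
    teacher = teacher_logits.detach()
    return F.kl_div(
        F.log_softmax(student_logits / tau, dim=1),
        F.softmax(teacher / tau, dim=1),
        reduction='batchmean') * tau * tau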

twmht commented on August 27, 2024

@HIT-cwh

He did not use distillation.

HIT-cwh commented on August 27, 2024

@HIT-cwh

He did not use distillation.

My bad.
Due to a lack of manpower, progress on transferring AutoSlim to other tasks has not been very satisfactory, and I'm very sorry for the inconvenience.
We are reproducing BigNAS; if it goes well, we will release a BigNAS example on semantic segmentation.
