mmaaz60 / edgenext

Stars: 334 · Watchers: 6 · Forks: 37 · Size: 8.21 MB

[CADL'22, ECCVW] Official repository of paper titled "EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications".

License: MIT License

Python 100.00%
classification cnn edge-computing hybrid-model mobile-application transformers


edgenext's Issues

edgenext_small is faster than edgenext_x_small?

When I tested the FPS of edgenext_x_small and edgenext_small on an RTX 2060 (notebook), I found that edgenext_small is faster than edgenext_x_small. Why is that?

The results are below. Each row shows the average fps, time_mean, and time_std of 100 inferences, and the last row ("result_average") is the average of the 10 rows above.

# command: python get_fps.py --model edgenext_small --finetune weights/edgenext_small.pth
{'fps': 112.7, 'time_mean': 8.9, 'time_std': 0.4}
{'fps': 115.1, 'time_mean': 8.7, 'time_std': 0.4}
{'fps': 115.6, 'time_mean': 8.6, 'time_std': 0.3}
{'fps': 115.2, 'time_mean': 8.7, 'time_std': 0.3}
{'fps': 114.9, 'time_mean': 8.7, 'time_std': 0.3}
{'fps': 113.1, 'time_mean': 8.8, 'time_std': 0.5}
{'fps': 111.7, 'time_mean': 9.0, 'time_std': 0.4}
{'fps': 114.9, 'time_mean': 8.7, 'time_std': 0.4}
{'fps': 114.4, 'time_mean': 8.7, 'time_std': 0.5}
{'fps': 117.4, 'time_mean': 8.5, 'time_std': 0.3}
result_average:
{'fps': 114.5, 'time_mean': 8.7, 'time_std': 0.4}

# command: python get_fps.py --model edgenext_x_small --finetune weights/edgenext_x_small.pth
{'fps': 108.8, 'time_mean': 9.2, 'time_std': 0.5}
{'fps': 112.7, 'time_mean': 8.9, 'time_std': 0.5}
{'fps': 114.6, 'time_mean': 8.7, 'time_std': 0.4}
{'fps': 114.3, 'time_mean': 8.7, 'time_std': 0.4}
{'fps': 111.8, 'time_mean': 8.9, 'time_std': 0.6}
{'fps': 110.9, 'time_mean': 9.0, 'time_std': 0.5}
{'fps': 111.2, 'time_mean': 9.0, 'time_std': 0.5}
{'fps': 109.8, 'time_mean': 9.1, 'time_std': 0.5}
{'fps': 113.9, 'time_mean': 8.8, 'time_std': 0.5}
{'fps': 96.0, 'time_mean': 10.4, 'time_std': 2.9}
result_average:
{'fps': 110.4, 'time_mean': 9.1, 'time_std': 0.7}

Here is the code I used (I just added a get_fps function to main.py, so names like time, np, torch, utils, create_model, argparse, Path, and get_args_parser are already imported there):

def get_fps(args, repetitions=120, num_warmup=20, infer_epoch=10):
    utils.init_distributed_mode(args)
    print(args)
    device = torch.device(args.device)

    # Eval/USI_eval configurations
    if args.eval:
        if args.usi_eval:
            args.crop_pct = 0.95
            model_state_dict_name = 'state_dict'
        else:
            model_state_dict_name = 'model_ema'
    else:
        model_state_dict_name = 'model'

    # load model
    model = create_model(
        args.model,
        pretrained=False,
        num_classes=args.nb_classes,
        drop_path_rate=args.drop_path,
        layer_scale_init_value=args.layer_scale_init_value,
        head_init_scale=1.0,
        input_res=args.input_size,
        classifier_dropout=args.classifier_dropout,
    )
    if args.finetune:
        checkpoint = torch.load(args.finetune, map_location="cpu")
        state_dict = checkpoint[model_state_dict_name]
        utils.load_state_dict(model, state_dict)

    from mmcv.cnn import fuse_conv_bn
    model = fuse_conv_bn(model)
    model.to(device)
    model.eval()

    # enable cuDNN autotuning: benchmark mode picks the fastest kernels for a fixed input size
    torch.backends.cudnn.benchmark = True

    # dummy input: a single 3x256x256 image
    data = torch.randn(1, 3, 256, 256, dtype=torch.float).to(device)

    # test fps
    result_average = {'fps': 0, 'time_mean': 0, 'time_std': 0}
    for _ in range(infer_epoch):
        result = {}
        infer_time = []

        for i in range(repetitions):
            torch.cuda.synchronize()
            start_time = time.perf_counter()

            # infer
            with torch.no_grad():
                model(data)

            torch.cuda.synchronize()
            elapsed = (time.perf_counter() - start_time)

            if i >= num_warmup:
                infer_time.append(elapsed)

        result['fps'] = (repetitions - num_warmup) / sum(infer_time)
        result['time_mean'] = np.mean(infer_time) * 1000
        result['time_std'] = np.std(infer_time) * 1000

        result_average['fps'] += result['fps']
        result_average['time_mean'] += result['time_mean']
        result_average['time_std'] += result['time_std']

        for key, value in result.items():
            result[key] = round(value, 1)

        print(result)

    for key, value in result_average.items():
        result_average[key] = round(value / infer_epoch, 1)

    print("result_average:")
    print(result_average)

if __name__ == '__main__':
    parser = argparse.ArgumentParser('EdgeNeXt training and evaluation script', parents=[get_args_parser()])
    args = parser.parse_args()
    if args.output_dir:
        Path(args.output_dir).mkdir(parents=True, exist_ok=True)
    # main(args)
    get_fps(args)
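As a side note, host-side perf_counter timing bracketed by torch.cuda.synchronize() is sensitive to CPU jitter at latencies around 9 ms; CUDA events measure on-device time directly. A minimal sketch of that alternative (time_with_cuda_events is a hypothetical helper, not part of the repository):

import torch

@torch.no_grad()
def time_with_cuda_events(model, data, repetitions=120, num_warmup=20):
    # Measure per-inference GPU latency in milliseconds using CUDA events.
    starter = torch.cuda.Event(enable_timing=True)
    ender = torch.cuda.Event(enable_timing=True)
    timings = []
    for i in range(repetitions):
        starter.record()
        model(data)
        ender.record()
        torch.cuda.synchronize()  # wait for the GPU before reading the timer
        if i >= num_warmup:
            timings.append(starter.elapsed_time(ender))  # milliseconds
    return sum(timings) / len(timings)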

Code for Det/Seg

Thanks for your great work! Could you provide the source code for the downstream detection and segmentation tasks? That would be very helpful, thanks!

Faster Inference INT8

@mmaaz60 Can you please provide code for converting this model to INT8? It converts to ONNX successfully and I can run inference with ONNX Runtime. However, is there any way to decrease the inference time further, e.g. INT8 quantization or pruning? I'm using edgenext_xx_small_bn_hs. Thanks a lot! Love your work!
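For reference, ONNX Runtime ships a post-training dynamic quantizer that stores weights as INT8. A minimal sketch (the file paths are placeholders, and accuracy/latency should be validated per model):

from onnxruntime.quantization import quantize_dynamic, QuantType

# Post-training dynamic quantization: weights are stored as INT8,
# activations are quantized on the fly at inference time.
quantize_dynamic(
    "edgenext_xx_small_bn_hs.onnx",        # exported FP32 model (placeholder path)
    "edgenext_xx_small_bn_hs.int8.onnx",   # quantized output
    weight_type=QuantType.QInt8,
)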

Does it take long to train this model?

Hi, I'm using EdgeNeXt as the backbone for feature extraction in my image classification task, but I find it converges very slowly (loss 35 vs. loss 7 compared to EfficientNet-B0), so I'm not sure whether the model is a poor fit for my data or my configuration is wrong.
Can anyone share some experience training this type of model? Thanks :D
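One common cause of slow convergence is training the backbone from scratch; starting from ImageNet weights usually helps. A minimal sketch using timm, which includes EdgeNeXt variants in recent versions (the exact model name and input size here are assumptions):

import timm
import torch

# Load an ImageNet-pretrained EdgeNeXt backbone; num_classes=0 drops the
# classifier so the model returns a pooled feature vector.
backbone = timm.create_model("edgenext_small", pretrained=True, num_classes=0)
backbone.eval()

with torch.no_grad():
    feats = backbone(torch.randn(1, 3, 256, 256))
print(feats.shape)  # pooled features from the last stage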

ValueError: max() arg is an empty sequence

Traceback (most recent call last):
  File "D:/python/pycharm/EdgeNeXt-main/main.py", line 501, in <module>
    main(args)
  File "D:/python/pycharm/EdgeNeXt-main/main.py", line 370, in main
    print("Max WD = %.7f, Min WD = %.7f" % (max(wd_schedule_values), min(wd_schedule_values)))
ValueError: max() arg is an empty sequence

Hello, this is a problem I ran into while trying to run your code. Do you know why? Thank you!
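For what it's worth, this error means wd_schedule_values is empty, which typically happens when the number of training steps per epoch resolves to zero (e.g. an empty or mis-pointed --data_path, or a global batch size larger than the dataset). A minimal guard around the offending line (variable names taken from the traceback):

# An empty schedule usually means num_training_steps_per_epoch == 0,
# i.e. the data loader found no samples.
if len(wd_schedule_values) == 0:
    raise ValueError("Weight-decay schedule is empty; check --data_path, "
                     "--batch_size and that the dataset actually contains images.")
print("Max WD = %.7f, Min WD = %.7f" % (max(wd_schedule_values), min(wd_schedule_values)))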

Convert model to libtorch

Hello!
I would like to use the EdgeNeXt model in libtorch.
Errors appear with both methods I tried for converting the model to a .pt file:

1. Creating the .pt file with torch.jit.trace(model, input) and running it with libtorch:

  • Debug mode: an "abort() has been called" error occurs.

  • Release mode: the output differs from the original model.

    • Python (original) output shape: (1, 7)

    • C++ libtorch output shape: (1, 2)

2. Creating the .pt file with torch.jit.script(model) in Python raises:

forward(torch.torch.nn.modules.conv.___torch_mangle_253.Conv2d self, Tensor input) -> (Tensor):
Expected a value of type 'Tensor' for argument 'input' but instead found type 'Optional[Tensor]'.
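A shape mismatch between the Python and libtorch outputs often points to tracing a model that was not in eval mode, or to feeding the traced module an input that doesn't match the traced shape. A minimal tracing sketch for method 1 (model construction is assumed; the file name is a placeholder):

import torch

model.eval()  # freeze dropout / norm behaviour before tracing

example = torch.randn(1, 3, 256, 256)
with torch.no_grad():
    traced = torch.jit.trace(model, example)
    # Sanity-check the traced module against eager execution before exporting.
    assert torch.allclose(model(example), traced(example), atol=1e-5)

traced.save("edgenext_traced.pt")  # load from C++ via torch::jit::load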

Reproduced 79.1 for edgenext_small, a random error?

I retrained edgenext_small using the official code and command on 8×V100 GPUs, but I got 79.1, a minor gap from the paper's result of 79.4. Is this gap random variation, or could something have been left out? Thanks for the great work!

Jetson model for benchmarking

Hello,

thanks for providing the source code for your work! :)

I have just one simple question about the benchmark table here on GitHub: which Jetson model did you use for the comparison with the A100?

Thanks


Suggested network configs for student model for knowledge distillation

Hi! Thank you for sharing your amazing work!

I have tried EdgeNeXt for semantic segmentation and it works really well! However, I am targeting an even lighter model (<< 1 million params), so I have been trying knowledge distillation from a larger EdgeNeXt model to a student network (basically a smaller EdgeNeXt with the number of channels halved). It seems I was a bit too aggressive with dropping channels: performance is stuck at around 0.3 mIoU, while the teacher, EdgeNeXt-XXS, reaches as high as 0.93 mIoU on the same dataset. Would you have any suggestions on how to design the student EdgeNeXt model?

Any help will be super appreciated! :)
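For reference, a generic Hinton-style logit-distillation loss that can be applied per pixel to segmentation logits; this is a sketch of the standard technique, not the authors' recipe:

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # Logits of shape (N, C, H, W); soften distributions along the class dim.
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    # The T**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T ** 2)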

classification

I would like to try training edgenext_xx.
Can you tell me the parameter settings?
And can I see your training log?

Other task: EdgeNeXt for crowd counting

I used EdgeNeXt for crowd counting, but the results were not as good as with VGG. I tried different learning rates, but it still didn't help. Do you have any suggestions?

EdgeNeXt checkpoints at 224×224 resolution

Hi,

Could you also share the checkpoints of the EdgeNeXt models (EdgeNeXt-S specifically) trained on 224×224 images (those results were mentioned in the paper)?

I would like to compare the metrics with other SOTA models pretrained on 224×224 ImageNet-1K images.

About ablation

Table 7. Ablation on different components of EdgeNeXt and SDTA encoder. The results show the benefits of SDTA encoders and adaptive kernels in our design. Further, adaptive branching and positional encoding (PE) are also required in SDTA module.


In Table 7, I'm curious why the latency is the same as the base model when there is no adaptive branching. Looking forward to your reply~

about channels

Hi,
Can you please tell me the reason for choosing the channel widths [48, 96, 160, 304] for the four stages?

EdgeNeXt for semantic segmentation

Hi, thanks for your work.
Is there any plan to share your implementation of semantic segmentation on the PascalVOC dataset? Perhaps instructions on how to build the modules, or .pth checkpoints / ONNX models?
