mmaaz60 / edgenext
[CADL'22, ECCVW] Official repository of paper titled "EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications".
License: MIT License
When I test the FPS of edgenext_x_small and edgenext_small on an RTX 2060 (notebook), I find that edgenext_small is faster than edgenext_x_small. Why?
The results are below. Each row gives the average fps, time_mean, and time_std (times in milliseconds) over 100 inferences, and the last row ("result_average") is the average of the 10 rows above.
# command: python get_fps.py --model edgenext_small --finetune weights/edgenext_small.pth
{'fps': 112.7, 'time_mean': 8.9, 'time_std': 0.4}
{'fps': 115.1, 'time_mean': 8.7, 'time_std': 0.4}
{'fps': 115.6, 'time_mean': 8.6, 'time_std': 0.3}
{'fps': 115.2, 'time_mean': 8.7, 'time_std': 0.3}
{'fps': 114.9, 'time_mean': 8.7, 'time_std': 0.3}
{'fps': 113.1, 'time_mean': 8.8, 'time_std': 0.5}
{'fps': 111.7, 'time_mean': 9.0, 'time_std': 0.4}
{'fps': 114.9, 'time_mean': 8.7, 'time_std': 0.4}
{'fps': 114.4, 'time_mean': 8.7, 'time_std': 0.5}
{'fps': 117.4, 'time_mean': 8.5, 'time_std': 0.3}
result_average:
{'fps': 114.5, 'time_mean': 8.7, 'time_std': 0.4}
# command: python get_fps.py --model edgenext_x_small --finetune weights/edgenext_x_small.pth
{'fps': 108.8, 'time_mean': 9.2, 'time_std': 0.5}
{'fps': 112.7, 'time_mean': 8.9, 'time_std': 0.5}
{'fps': 114.6, 'time_mean': 8.7, 'time_std': 0.4}
{'fps': 114.3, 'time_mean': 8.7, 'time_std': 0.4}
{'fps': 111.8, 'time_mean': 8.9, 'time_std': 0.6}
{'fps': 110.9, 'time_mean': 9.0, 'time_std': 0.5}
{'fps': 111.2, 'time_mean': 9.0, 'time_std': 0.5}
{'fps': 109.8, 'time_mean': 9.1, 'time_std': 0.5}
{'fps': 113.9, 'time_mean': 8.8, 'time_std': 0.5}
{'fps': 96.0, 'time_mean': 10.4, 'time_std': 2.9}
result_average:
{'fps': 110.4, 'time_mean': 9.1, 'time_std': 0.7}
Here is the code I used (I just added a get_fps function to main.py):
import time          # used by get_fps; skip if main.py already imports it
import numpy as np   # used by get_fps; skip if main.py already imports it


def get_fps(args, repetitions=120, num_warmup=20, infer_epoch=10):
    utils.init_distributed_mode(args)
    print(args)
    device = torch.device(args.device)

    # Eval/USI_eval configurations
    if args.eval:
        if args.usi_eval:
            args.crop_pct = 0.95
            model_state_dict_name = 'state_dict'
        else:
            model_state_dict_name = 'model_ema'
    else:
        model_state_dict_name = 'model'

    # load model
    model = create_model(
        args.model,
        pretrained=False,
        num_classes=args.nb_classes,
        drop_path_rate=args.drop_path,
        layer_scale_init_value=args.layer_scale_init_value,
        head_init_scale=1.0,
        input_res=args.input_size,
        classifier_dropout=args.classifier_dropout,
    )
    if args.finetune:
        checkpoint = torch.load(args.finetune, map_location="cpu")
        state_dict = checkpoint[model_state_dict_name]
        utils.load_state_dict(model, state_dict)

    # fold BatchNorm layers into the preceding convolutions
    from mmcv.cnn import fuse_conv_bn
    model = fuse_conv_bn(model)
    model.to(device)
    model.eval()

    # open cudnn speed up
    torch.backends.cudnn.benchmark = True

    # init data: a single random 256x256 RGB image (batch size 1)
    data = torch.randn(1, 3, 256, 256, dtype=torch.float).to(device)

    # test fps
    result_average = {'fps': 0, 'time_mean': 0, 'time_std': 0}
    for _ in range(infer_epoch):
        result = {}
        infer_time = []
        for i in range(repetitions):
            # synchronize before and after so the timer covers the full GPU work
            torch.cuda.synchronize()
            start_time = time.perf_counter()
            # infer
            with torch.no_grad():
                model(data)
            torch.cuda.synchronize()
            elapsed = (time.perf_counter() - start_time)
            # the first num_warmup iterations are discarded as warm-up
            if i >= num_warmup:
                infer_time.append(elapsed)

        result['fps'] = (repetitions - num_warmup) / sum(infer_time)
        result['time_mean'] = np.mean(infer_time) * 1000  # milliseconds
        result['time_std'] = np.std(infer_time) * 1000    # milliseconds
        result_average['fps'] += result['fps']
        result_average['time_mean'] += result['time_mean']
        result_average['time_std'] += result['time_std']
        for key, value in result.items():
            result[key] = round(value, 1)
        print(result)

    for key, value in result_average.items():
        result_average[key] = round(value / infer_epoch, 1)
    print("result_average:")
    print(result_average)


if __name__ == '__main__':
    parser = argparse.ArgumentParser('EdgeNeXt training and evaluation script', parents=[get_args_parser()])
    args = parser.parse_args()
    if args.output_dir:
        Path(args.output_dir).mkdir(parents=True, exist_ok=True)
    # main(args)
    get_fps(args)
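The per-run FPS rows reported above can be sanity-checked with a few lines of standard-library Python. This is only a sketch for reading the posted numbers, not part of the repository: the helper name `summarize` is hypothetical, and the two lists are copied by hand from the rows in this post. It compares the mean with the median and with the mean after dropping the slowest run, since the last edgenext_x_small row (96.0 fps, time_std 2.9) looks like an outlier.

```python
from statistics import mean, median

# Per-run FPS values copied from the rows reported in this post.
fps_small = [112.7, 115.1, 115.6, 115.2, 114.9, 113.1, 111.7, 114.9, 114.4, 117.4]
fps_x_small = [108.8, 112.7, 114.6, 114.3, 111.8, 110.9, 111.2, 109.8, 113.9, 96.0]


def summarize(fps_runs):
    """Return the mean, the median, and the mean without the slowest run."""
    without_slowest = sorted(fps_runs)[1:]  # drop the single lowest FPS value
    return {
        "mean": round(mean(fps_runs), 1),
        "median": round(median(fps_runs), 1),
        "mean_without_slowest": round(mean(without_slowest), 1),
    }


for name, runs in (("edgenext_small", fps_small), ("edgenext_x_small", fps_x_small)):
    print(name, summarize(runs))
```

With these numbers, dropping the 96.0 fps run narrows the gap between the two models but does not close it, so the outlier alone does not explain the result. At batch size 1 and roughly 9 ms per image, the two models differ by less than half a millisecond.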
Thanks for your great work! Could you release the source code for the downstream detection and segmentation tasks? That would be very helpful. Thanks.
Hello.
I tested the model and the performance is very good.
I'd like to train it on a different dataset.
Can I train on grayscale images or on images of other sizes (64x64x1)?
@mmaaz60 Can you please provide the code for converting this model to int8? It converts to ONNX successfully and I can run inference with ONNX Runtime. However, is there any way to reduce the inference time further, such as int8 quantisation or pruning? I'm using edgenext_xx_small_bn_hs. Thanks a lot! Love your work!
Hi, I'm using EdgeNeXt as the feature-extraction backbone for my image classification task, but I find it very slow to converge (loss 35 vs. loss 7 with efficientnetb0). I'm not sure whether the model does not fit my data or my config is wrong.
Can anyone share some experience training this type of model? Thanks :D
Hi
I want to finetune EdgeNeXt_xx_small on my custom dataset (3 classes).
When I use the 'resume' argument, I get a size mismatch error for head.weight and head.bias.
How can I finetune EdgeNeXt_xx_small on my custom data?
Thanks :)
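A size mismatch on head.weight and head.bias is what one would expect when a checkpoint whose classifier was trained for a different number of classes is loaded into a 3-class model. One common workaround, sketched here under assumptions, is to drop the classifier entries from the checkpoint before loading it. The helper `drop_classifier_head` is hypothetical, the "head." prefix is taken from the error message in this post, and plain strings stand in for real tensors so the sketch runs without PyTorch.

```python
def drop_classifier_head(state_dict, head_prefix="head."):
    """Return a copy of state_dict without the entries under head_prefix.

    The remaining (backbone) weights keep their pretrained values; the
    classifier of the new model stays at its fresh initialization.
    """
    return {
        key: value
        for key, value in state_dict.items()
        if not key.startswith(head_prefix)
    }


# Stand-in checkpoint: the key names other than head.* are made up for
# illustration, and the values would be tensors in a real checkpoint.
pretrained = {
    "backbone.layer.weight": "pretrained tensor",
    "backbone.layer.bias": "pretrained tensor",
    "head.weight": "tensor sized for the original classes",
    "head.bias": "tensor sized for the original classes",
}

filtered = drop_classifier_head(pretrained)
print(sorted(filtered))
```

In a real PyTorch script, the filtered dictionary would then be loaded with `model.load_state_dict(filtered, strict=False)` so that the missing head entries are tolerated.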
Traceback (most recent call last):
  File "D:/python/pycharm/EdgeNeXt-main/main.py", line 501, in <module>
    main(args)
  File "D:/python/pycharm/EdgeNeXt-main/main.py", line 370, in main
    print("Max WD = %.7f, Min WD = %.7f" % (max(wd_schedule_values), min(wd_schedule_values)))
ValueError: max() arg is an empty sequence
Hello, I ran into this error while trying to run your code. Do you know why? Thank you!
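The traceback says only that wd_schedule_values is empty. One plausible cause, stated here as an assumption rather than a diagnosis, is that the number of training steps per epoch came out as zero, for example when the dataset is smaller than the effective batch size or the data path yields no images. The sketch below illustrates that failure mode with hypothetical helpers (`steps_per_epoch`, `constant_schedule`, `schedule_range`); it does not reproduce the repository's scheduler.

```python
def steps_per_epoch(num_samples, batch_size, update_freq=1, world_size=1):
    """Floor division: fewer samples than one effective batch gives 0 steps."""
    return num_samples // (batch_size * update_freq * world_size)


def constant_schedule(value, epochs, steps):
    """One schedule value per training step; empty when steps == 0."""
    return [value] * (epochs * steps)


def schedule_range(values):
    """Return (max, min) of a schedule, with a readable error if it is empty."""
    if not values:
        raise ValueError(
            "schedule is empty: check the dataset size against the batch size"
        )
    return max(values), min(values)


# 100 training images with an effective batch size of 256 give zero steps.
steps = steps_per_epoch(num_samples=100, batch_size=256)
schedule = constant_schedule(0.05, epochs=300, steps=steps)
print(steps, len(schedule))
```

Under this assumption, lowering the batch size or checking that the dataset path actually contains the training images would make the schedule non-empty.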
Hello!
I would like to use the EdgeNeXt model in libtorch.
Both methods I tried for converting the model to a .pt file produce errors:
1. Creating the .pt file with torch.jit.trace(model, input) and running it with libtorch:
   Debug mode: an "abort() has been called" error occurs.
   Release mode: the output differs from the original model.
   Python original output shape: (1, 7)
   C++ libtorch output shape: (1, 2)
2. Creating the .pt file with torch.jit.script(model) in Python:
##################################################
forward(torch.torch.nn.modules.conv.___torch_mangle_253.Conv2d self, Tensor input) -> (Tensor):
Expected a value of type 'Tensor' for argument 'input' but instead found type 'Optional[Tensor]'.
##################################################
Error Occurred
I retrained edgenext_small using the official code and command on 8 V100 GPUs, but I get 79.1, slightly below the paper's result of 79.4. Is this gap random variation, or might something be missing? Thanks for the great work.
Hi! Thank you for sharing your amazing work!
I have tried EdgeNeXt for semantic segmentation and it works really well! However, I am targeting an even lighter model (well under 1 million parameters), so I have been trying knowledge distillation from a larger EdgeNeXt model to a student network, which is basically a smaller EdgeNeXt model with the number of channels halved. It seems I was too aggressive in dropping channels: the student is stuck at around 0.3 mIoU, while the teacher EdgeNext_XXS reached as high as 0.93 mIoU on the same dataset. Would you have any suggestions on how to design the student EdgeNeXt model?
Any help will be super appreciated! :)
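The post does not say which distillation loss was used. For reference, a minimal standard-library sketch of the common temperature-scaled soft-target loss is given below, on the assumption that this is the kind of distillation meant. The function names and the default temperature of 4.0 are illustrative choices, and for segmentation the loss would be evaluated per pixel and averaged.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_norm for x in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    return temperature ** 2 * kl


# Toy logits for a single pixel with three classes.
teacher = [4.0, 1.0, -2.0]
student = [1.5, 1.0, 0.5]
print(distillation_loss(student, teacher))
print(distillation_loss(teacher, teacher))
```

The loss is zero when the student reproduces the teacher's logits and grows as the softened distributions diverge. It is usually combined with the ordinary supervised loss on the ground-truth labels.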
I would like to try training edgenext_xx.
Can you tell me the parameter settings?
And can I see your training log?
Hi,
Great work. Which ImageNet-1K dataset should I use? There are many versions:
ILSVRC 2017
ILSVRC 2016
ILSVRC 2015
ILSVRC 2014
ILSVRC 2013
ILSVRC 2012
ILSVRC 2011
ILSVRC 2010
I used EdgeNeXt for crowd counting, but the results were not as good as with VGG. I tried different learning rates, but that didn't help. Do you have any suggestions?
Hi,
Could you also share the checkpoints of the EdgeNeXt models (EdgeNeXt-S specifically) trained on 224x224 images (the results are mentioned in the paper)?
I would like to compare the metrics with other SOTA models that were pretrained on ImageNet-1K at 224x224 resolution.
Table 7. Ablation on different components of EdgeNeXt and SDTA encoder. The results show the benefits of SDTA encoders and adaptive kernels in our design. Further, adaptive branching and positional encoding (PE) are also required in SDTA module.
In Table 7, I'm curious why the latency is the same as the base model's when there is no adaptive branching. Looking forward to your reply.
Hi,
Can you please tell me the reason for your choice of channel counts, such as [48, 96, 160, 304], for the respective stages?
Thanks for your work. I wonder whether it performs well on mobile CPUs, such as ARM chips.
Hi, thanks for your work.
Is there any plan to share your implementation for semantic segmentation on the Pascal VOC dataset? Perhaps instructions on how to build the modules, or .pth checkpoints / ONNX models?