westlake-ai / moganet
[ICLR 2024] MogaNet: Efficient Multi-order Gated Aggregation Network
Home Page: https://arxiv.org/abs/2211.03295
License: Apache License 2.0
Does replacing SE with CA in any network give an improvement of 0.6?
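Is it necessary to freeze the parameters of the first few layers, or does training the whole network unfrozen work better? I ask because the code does not contain anything about freezing parameters.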
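The freezing option asked about above can be sketched as follows. This is a minimal illustration, not something taken from the MogaNet repository: the helper `freeze_by_prefix` and the parameter names in the demo are hypothetical, and stand-in objects replace real tensors so the sketch runs without PyTorch.

```python
from types import SimpleNamespace


def freeze_by_prefix(named_params, prefixes):
    """Set requires_grad=False on every parameter whose name starts with a prefix.

    `named_params` is any iterable of (name, parameter) pairs, for example
    what torch.nn.Module.named_parameters() yields. Returns the frozen names.
    """
    frozen = []
    for name, param in named_params:
        if name.startswith(tuple(prefixes)):
            param.requires_grad = False  # exclude this parameter from gradient computation
            frozen.append(name)
    return frozen


# Stand-in parameters so the sketch runs without PyTorch; the names are hypothetical.
demo_params = [
    ("patch_embed1.proj.weight", SimpleNamespace(requires_grad=True)),
    ("blocks1.0.attn.weight", SimpleNamespace(requires_grad=True)),
    ("blocks3.0.attn.weight", SimpleNamespace(requires_grad=True)),
    ("head.weight", SimpleNamespace(requires_grad=True)),
]
frozen_names = freeze_by_prefix(demo_params, ["patch_embed1", "blocks1"])
print(frozen_names)
```

With a real model, `model.named_parameters()` could be passed in place of `demo_params`. Whether freezing early stages helps depends on the downstream dataset; the thread does not record a recommendation from the authors.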
Hi! Thanks for releasing your code!
When I use moganet_base with pretrained=True, I get the following error:
RuntimeError: Error(s) in loading state_dict for MogaNet: Unexpected key(s) in state_dict: "head.weight", "head.bias"
Could you give me some advice?
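One possible workaround for the error above, assuming the model was built without a classification head while the checkpoint still contains one, is to drop the `head.*` entries before loading. The helper `drop_prefixed_keys` is hypothetical, and plain strings stand in for tensors so the sketch runs without PyTorch.

```python
def drop_prefixed_keys(state_dict, prefixes=("head.",)):
    """Return a copy of state_dict without keys that start with any prefix."""
    return {
        key: value
        for key, value in state_dict.items()
        if not key.startswith(tuple(prefixes))
    }


# Stand-in checkpoint; real values would be tensors. Key names other than
# "head.weight" and "head.bias" are hypothetical.
checkpoint = {
    "patch_embed1.proj.weight": "w0",
    "blocks1.0.norm.weight": "w1",
    "head.weight": "w2",
    "head.bias": "w3",
}
filtered = drop_prefixed_keys(checkpoint)
print(sorted(filtered))
```

The filtered dictionary could then be passed to `model.load_state_dict(filtered)`. Alternatively, `load_state_dict(..., strict=False)` in PyTorch ignores unexpected keys, at the cost of also silencing genuinely missing ones.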
I'd like to write inference code.
I'm trying to write inference code, but I keep getting this error: "EncoderDecoder: 'MogaNet_feat is not in the models registry'".
It does not run with the mmsegmentation demo code, so please help.
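A "not in the models registry" error of the kind quoted above generally means the code that registers the class never ran before the model was built. The toy registry below illustrates that mechanism only; `ToyRegistry` and `ToyBackbone` are hypothetical stand-ins and are not mmcv's `Registry` or the repository's `MogaNet_feat`.

```python
class ToyRegistry:
    """Minimal name -> class registry; a stand-in, not mmcv's Registry."""

    def __init__(self, name):
        self.name = name
        self._classes = {}

    def register(self, cls):
        # Used as a decorator, so it runs when the defining module is imported.
        self._classes[cls.__name__] = cls
        return cls

    def build(self, cfg):
        cfg = dict(cfg)  # copy so the caller's config is not modified
        type_name = cfg.pop("type")
        if type_name not in self._classes:
            raise KeyError(f"{type_name} is not in the {self.name} registry")
        return self._classes[type_name](**cfg)


MODELS = ToyRegistry("models")

# Before the defining code has run, the lookup fails much like the reported error.
try:
    MODELS.build({"type": "ToyBackbone", "depth": 2})
    error_before = None
except KeyError as exc:
    error_before = exc.args[0]


@MODELS.register
class ToyBackbone:  # hypothetical stand-in for a custom backbone
    def __init__(self, depth):
        self.depth = depth


# After registration, the same config builds successfully.
backbone = MODELS.build({"type": "ToyBackbone", "depth": 2})
print(error_before)
print(backbone.depth)
```

Applied to the question, this suggests checking that the module defining the backbone is imported by the inference script (or listed in the config's `custom_imports`, if the setup uses mmcv-style configs) before the model is built. The exact module path depends on the repository layout and is not given in the thread.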
Hi, thank you for your nice work.
Could you share your code for the distributions of the interaction strength, which, I believe, offer a new perspective on networks?
Why are the depths of xtiny (3,3,12,2) while the depths of the small version are (2,3,12,2)? Is the first-stage depth of 2 in the small version a special setting?
Congratulations on the ICLR24 acceptance.
I apologize if I missed it, but I was unable to find the cascade rcnn config file. Would it be possible to share it, or provide me with a link to its location?
Hi! I noticed that your paper has two subtraction operations, which confuse me. Can I simply think of them as decoupling?
Hi,
Is there code for training the baseline?
Best,
Jiahui.Li
Thank you for your great work!
As far as I know, models such as DeiT and ConvNeXt do not use "cooldown_epochs".
However, the code suggests that MogaNet was trained for 310 epochs rather than 300. Were all the accuracies in the paper posted on OpenReview obtained with 310 epochs?
Sorry to bother you. Hello, I have read your paper and found it very impressive. I have a small question: can I use the CA module you proposed to replace the FFN layer in ViT? Again, I apologize for my interruption and look forward to your reply and suggestions. Thank you!
I hope this helps! If you need any further assistance, feel free to ask.
In the paper, the authors wrote "we propose FD(·) to dynamically exclude trivial interactions" and "By re-weighting the trivial interaction component Y − GAP(Y), FD(·) also increase feature diversities".
What exactly are these "trivial interactions"? And why does taking Y − GAP(Y) increase feature diversity?
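As a small numeric sketch of the Y − GAP(Y) operation quoted above, the code below applies it to a single channel with made-up values. It shows only what the subtraction does mechanically: it removes the channel's spatial mean and keeps the position-specific variation. How this relates to "trivial interactions" is the authors' claim and is not explained in this thread; the function names are hypothetical.

```python
def global_average_pool(feature_map):
    """Mean over all spatial positions of one channel (given as a list of rows)."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def subtract_gap(feature_map):
    """Return Y - GAP(Y): each position minus the channel's spatial mean."""
    mean = global_average_pool(feature_map)
    return [[v - mean for v in row] for row in feature_map]


# One 2x2 channel with made-up values; its spatial mean is 4.0.
y = [[1.0, 3.0],
     [5.0, 7.0]]
residual = subtract_gap(y)
print(residual)                       # [[-3.0, -1.0], [1.0, 3.0]]
print(global_average_pool(residual))  # 0.0
```

The residual always has zero spatial mean, so any learned re-weighting of it changes how strongly a channel deviates from its average without shifting the average itself.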
Thanks for your significant paper. However, I ran into a problem when I ran the training command from the instructions:
File "/usr/local/lib/python3.10/dist-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
urllib.error.URLError: <urlopen error MaskRCNN: <urlopen error MogaNet_feat: <urlopen error [Errno 104] Connection reset by peer>>>
I appreciate your help!
Lines 264 to 333 in cd53ea0
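Hi! Thank you for your great work! The implementation of the MultiOrderGatedAggregation module does not match the paper: the figure in the paper has no shortcut, and FD uses GELU as its activation function. Which one should I follow?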
Thank you for your work on MogaNet. When will the pre-trained models be released? Thank you!
Thank you for the great work. I have a question about how to draw the figure of the multi-order interactions.
Could you share the code? Thank you in advance.