unites-lab / mc-smoe

[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"

Home Page: https://arxiv.org/abs/2310.01334
License: MIT License
Languages: Python 98.39%, Shell 1.61%
Topics: efficiency, merging, mixture-of-experts
Finetuning / Distillation for Mixtral? Thank you for sharing this work, and congratulations! Do you have plans to release the fine-tuning / distillation code for Mixtral? Really appreciate it!
Only MoE expert merging test Hello, thank you for sharing your work! I would like to ask: if I only perform expert merging, without compression or further training, how much does model performance decline? Are any relevant test results available? Thanks!
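For readers unfamiliar with what "expert merging" refers to here, the sketch below shows the simplest possible variant: collapsing several expert weight matrices into one by a usage-weighted average. This is only an illustration under assumed inputs (toy weight arrays and hypothetical router usage counts); the paper's actual method additionally uses the routing policy to decide which experts to group, so this is not the repository's implementation.

```python
import numpy as np

def merge_experts(expert_weights, usage_counts):
    """Merge expert weight matrices into a single matrix.

    expert_weights: list of same-shaped arrays (toy stand-ins for FFN weights)
    usage_counts: hypothetical counts of how often the router picked each expert
    """
    coeffs = np.asarray(usage_counts, dtype=float)
    coeffs /= coeffs.sum()  # normalize to a convex combination
    # Weighted sum of the expert weights, element-wise.
    return sum(c * w for c, w in zip(coeffs, expert_weights))

# Toy example: two 2x2 "experts", the first selected three times as often.
experts = [np.ones((2, 2)), 3 * np.ones((2, 2))]
merged = merge_experts(experts, usage_counts=[3, 1])
# merged is 0.75 * 1 + 0.25 * 3 = 1.5 everywhere
```

Merging alone shrinks the expert count but leaves each weight matrix full-size, which is why the paper pairs it with a subsequent compression step.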