Question: How does this model respond to pruning? As it is an adapter model, have you


potential avenues of size reduction. about llama-adapter HOT 2 OPEN

opengvlab commented on June 18, 2024

potential avenues of size reduction.

from llama-adapter.

Comments (2)

aojunzz commented on June 18, 2024

@Alignment-Lab-AI, thanks for your insightful questions.

Q1: Have you attempted reducing the precision, then training each layer on an adapter and swapping in the adapters on the needed layers during inference?

We have not tried reducing the precision before training. Could you share more insight into this method? If we train in low precision and then swap in the adapters, does it improve performance, or does it bring some other benefit?

Q2: What have you tried so far to sparsify it?

Currently, we don't sparsify the models. In our second version, we introduce a scale layer, which may serve as an important metric for removing unimportant neurons. You can also use other sparsification methods to reduce the model size.
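As a rough illustration of the idea that a learned scale could rank neurons for removal, here is a minimal sketch in plain Python. It is not code from llama-adapter: the function name `prune_by_scale`, the `keep_ratio` parameter, and the assumption that each neuron has exactly one scalar scale are all hypothetical, chosen only to show the ranking step.

```python
from typing import List, Tuple


def prune_by_scale(
    scales: List[float],
    weights: List[List[float]],
    keep_ratio: float,
) -> Tuple[List[int], List[List[float]]]:
    """Keep the neurons with the largest |scale| and drop the rest.

    Assumption (not from the thread): weights[i] is the weight row of
    neuron i, and scales[i] is the single learned scale applied to it.
    """
    if len(scales) != len(weights):
        raise ValueError("need one scale per neuron")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    n_keep = max(1, int(len(scales) * keep_ratio))
    # Rank neuron indices by scale magnitude, largest first.
    ranked = sorted(range(len(scales)), key=lambda i: abs(scales[i]), reverse=True)
    kept = sorted(ranked[:n_keep])
    # Fold each kept scale into its weight row so the scale layer can be dropped.
    pruned = [[w * scales[i] for w in weights[i]] for i in kept]
    return kept, pruned


scales = [0.9, 0.01, -0.5, 0.02]
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
kept, pruned = prune_by_scale(scales, weights, keep_ratio=0.5)
print(kept)  # [0, 2]
```

Whether scale magnitude is actually a reliable importance signal for this model would need to be checked by measuring accuracy after pruning.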


Alignment-Lab-AI commented on June 18, 2024

Sorry, I missed the notification! I explained the process poorly. I meant to ask whether you had attempted to quantize the full model and then, during inference, return adapters that had been trained more accurately to the important layers.

https://www.deepspeed.ai/tutorials/MoQ-tutorial/

However, this may honestly work better alone. Sorry for the out-of-scope line of questioning, haha. I was working on the outline for my next project, and it is always important to me to make them as small as possible so I don't have to pay for as many A100s!

I'm sure you are quite busy, but I was going to start my own project on a multimodal model inspired by this repository and a few others. Would it be appropriate to discuss it?
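To make the question above concrete, here is a minimal sketch of "quantize the full model, then add higher-precision adapters back on selected layers", in plain Python. It does not use the DeepSpeed MoQ API or any llama-adapter code. The helper names, the symmetric int8 scheme, and modeling an adapter as an additive float delta on a layer's weights are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-layer int8 quantization: w is approximated by q * scale."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid dividing by zero
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    return [v * scale for v in q]


def effective_weights(
    layer: str,
    q: List[int],
    scale: float,
    adapters: Dict[str, List[float]],
) -> List[float]:
    """Dequantize a layer; if it has an adapter, add the float delta on top."""
    base = dequantize(q, scale)
    delta = adapters.get(layer)
    if delta is None:
        return base
    return [b + d for b, d in zip(base, delta)]


# Hypothetical two-layer model; only "layer1" is treated as important.
model = {"layer0": [0.5, -1.0, 0.25], "layer1": [0.1, 0.2, -0.3]}
quantized = {name: quantize_int8(w) for name, w in model.items()}
adapters = {"layer1": [0.01, -0.02, 0.0]}  # full-precision deltas (assumed)

for name, (q, scale) in quantized.items():
    print(name, effective_weights(name, q, scale, adapters))
```

In a real setup the adapter deltas would be trained against the quantized base so that they compensate for the quantization error on the layers that matter most. Here they are fixed numbers.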

