Hi, I am reproducing the LLaMA-7B pruning experiment, but the perplexities after pruning to 70% sparsity with SparseGPT, Wanda, and OWL w. differ from the results in the paper, as shown in the table below.
The second column is my result; the third column is the paper's result.
I wonder whether I missed something, or whether the hyperparameters in this repo differ from the ones used in the paper?
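For what it's worth, small perplexity gaps can also come from the evaluation itself (context length, stride, and which token positions are scored), not only from the pruning hyperparameters. A minimal sketch of the usual computation, assuming per-token negative log-likelihoods are already available (the values below are placeholders, not results from the repo):

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Scoring the same text in chunks of different lengths changes which
# tokens get full left context, so the averaged NLL (and hence the
# perplexity) can shift slightly between implementations.
nlls = [2.1, 1.8, 2.4, 2.0]  # hypothetical per-token NLLs
print(perplexity(nlls))
```

Checking that both runs use the same sequence length and the same evaluation split is usually the first thing to rule out.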
I have a question about the Layerwise Outlier Distribution (LOD). Why are the values in the LOD larger than 1 in Figure 1? I would expect an outlier ratio to be smaller than 1.
Hey, I wanted to know whether the size of the model stays the same after pruning, or whether it is reduced.
I tried pruning the OPT-125M model, but the saved model is the same size as before, about 250 MB.
Thanks in advance.
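Regarding the question above: unstructured pruning only sets weights to zero; it does not change tensor shapes, so a dense checkpoint stays the same size on disk. A size reduction would require a sparse storage format or structured pruning. A minimal NumPy sketch of this effect (the 70% sparsity and matrix size here are just illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((512, 512)).astype(np.float32)

# Magnitude pruning: zero out the 70% of entries with smallest |w|.
k = int(0.7 * w.size)
threshold = np.sort(np.abs(w), axis=None)[k]
pruned = np.where(np.abs(w) < threshold, 0.0, w).astype(np.float32)

# Dense storage is unchanged: same shape, same dtype, same bytes.
print(w.nbytes == pruned.nbytes)  # True

# Only the nonzero values (plus their indices) would need storing in
# a sparse format, which is where a size saving could come from.
print(np.count_nonzero(pruned) / pruned.size)  # roughly 0.3
```

So seeing an unchanged 250 MB file after pruning OPT-125M is expected behavior for unstructured sparsity, unless the checkpoint is re-saved in a compressed or sparse format.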