
aiot-mlsys-lab / svd-llm


Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Home Page: https://arxiv.org/abs/2403.07378

License: Apache License 2.0

Languages: Python 99.58%, Shell 0.42%

Topics: efficient-model, generative-ai, large-language-models

svd-llm's People

Contributors

mi-zhang, tuidan

svd-llm's Issues

fail to apply on llama-13b

Hello, I am having trouble reproducing the results on llama-13b. The following error occurs on line 203, in the whitening function: "scaling_matrix_inv = torch.linalg.inv(scaling_diag_matrix) torch._C._LinAlgError: linalg.inv: The diagonal element 6940 is zero, the inversion could not be completed because the input matrix is singular".
How can I solve this problem? Thanks.

Incorrect Model Responses after compression

I tried to use the provided scripts to compress LLaMA 2 with a 0.2 compression ratio. The model evaluation script reports a perplexity of 7.2 on wikitext, but the model's responses are mostly incoherent. I am getting responses like:

Instruction: tell me about you==\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ selecting\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

whereas the original model gives decent responses.

Does the inference script or the tokenizer need any modification after model compression? Is there an inference script in the repository?

Thanks for your help.

still fail to apply on llama-13b

Hi, thank you for your reply, but I still get the same problem as before:
Traceback (most recent call last):
  File "/home/xxx/SVD-LLM/SVDLLM_new.py", line 193, in whitening
    scaling_matrix_inv = torch.linalg.inv(scaling_diag_matrix)
torch._C._LinAlgError: linalg.inv: The diagonal element 6940 is zero, the inversion could not be completed because the input matrix is singular.
My Python environment is built from requirements.txt, and I run the code on two 3090 GPUs.
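
For readers hitting the same error: the message means the matrix passed to `torch.linalg.inv` is singular, so it has no inverse. The sketch below is not taken from the SVD-LLM repository and is not a fix confirmed by the authors. It only illustrates one common numerical workaround, assumed here for illustration: retry the inversion after adding a small value to the diagonal. The helper name `invert_with_jitter` and the default `eps` are hypothetical.

```python
import torch


def invert_with_jitter(matrix: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Invert a square matrix, retrying with a small diagonal shift if it is singular.

    Hypothetical helper for illustration only; it is not part of SVD-LLM.
    """
    try:
        return torch.linalg.inv(matrix)
    except RuntimeError:
        # torch._C._LinAlgError is a subclass of RuntimeError, so the
        # "input matrix is singular" failure lands here.
        jitter = eps * torch.eye(
            matrix.shape[0], dtype=matrix.dtype, device=matrix.device
        )
        return torch.linalg.inv(matrix + jitter)


# Toy stand-in for a scaling matrix with an exactly zero diagonal entry.
singular = torch.diag(torch.tensor([2.0, 0.0, 4.0], dtype=torch.float64))
inverse = invert_with_jitter(singular)
print(torch.isfinite(inverse).all().item())
```

The shift changes the matrix slightly, so the result is only an approximate inverse, and the entry for the formerly zero element becomes very large (about `1 / eps`). Whether this preserves compression quality on llama-13b is not verified here.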

Request for Code Integration of SVD-LLM with GPTQ

Hello,

Firstly, I want to express my gratitude for the fascinating work you've been doing. It's been inspiring.

I've recently come across your paper where you describe the integration of SVD-LLM with GPTQ, and I'm eager to explore the implementation further.
Could you please share the code where you've integrated SVD-LLM with GPTQ as described in the paper?

Your assistance in providing access to this code would be appreciated. Thank you for your time and consideration.
