
Comments (7)

lantiga commented on August 24, 2024

This should help: #111

Just append `--dtype bfloat16` to the conversion arguments and it will keep half-precision tensors in memory during conversion.
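For example (the paths here are placeholders; point them at your own checkpoint directory):

```sh
python scripts/convert_checkpoint.py --output_dir checkpoints/lit-llama --ckpt_dir <path/to/llama/weights> --tokenizer_path <path/to/tokenizer.model> --model_size 13B --dtype bfloat16
```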

@psych0v0yager can you try this out on your system?


chrisociepa commented on August 24, 2024

Most likely it means you don't have enough RAM.


psych0v0yager commented on August 24, 2024


lantiga commented on August 24, 2024

@psych0v0yager I don't think you have enough RAM to hold the 13B model at 32-bit precision (you need about 48 GB).
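The estimate is just parameter count times bytes per parameter (the parameter count is approximate):

```python
# Rough memory estimate for holding the 13B model in memory.
params = 13e9                 # ~13 billion parameters (approximate)
print(params * 4 / 2**30)     # float32 = 4 bytes/param -> ~48.4 GiB
print(params * 2 / 2**30)     # bfloat16 = 2 bytes/param -> ~24.2 GiB
```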

As a check, try instantiating the 13B model:

```python
from lit_llama.model import LLaMA

model = LLaMA.from_name("13B")
```

BTW are you loading the original checkpoints? We could provide an option to load incrementally in bfloat16 to reduce the requirements.
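A minimal sketch of the idea, assuming the weights are sharded across PyTorch `.pth` files; the shard names here are hypothetical, and a real conversion would also have to merge tensor-parallel shards, which this glosses over:

```python
import torch

# Hypothetical shard names; the actual 13B checkpoint layout may differ.
shard_paths = ["consolidated.00.pth", "consolidated.01.pth"]

converted = {}
for path in shard_paths:
    shard = torch.load(path, map_location="cpu")  # load one shard at a time, on CPU
    for name, tensor in shard.items():
        # Cast to bfloat16 immediately so float32 copies never accumulate.
        converted[name] = tensor.to(torch.bfloat16)
    del shard  # release the float32 shard before touching the next one

torch.save(converted, "lit-llama-13b-bf16.pth")
```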


psych0v0yager commented on August 24, 2024

Thank you for the prompt replies.

I attempted the instantiation and the additional conversion argument, and received the following results.

```
Python 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from lit_llama.model import LLaMA
>>> model = LLaMA.from_name("13B")
Killed
```

```
$ python scripts/convert_checkpoint.py --output_dir checkpoints/lit-llama --ckpt_dir /dalai/llama/models --tokenizer_path /dalai/llama/models/tokenizer.model --model_size 13B --dtype bfloat16
 50%|█████████████████████████████████████████████████████████████████▌ | 1/2 [00:46<00:46, 46.04s/it]
Killed
```

Could it potentially be a VRAM issue? I only have 12 GB on my NVIDIA 3060. Furthermore, the model weights come from the dalai llama repo (https://github.com/cocktailpeanut/dalai), and I believe they are the full-precision weights.

Thanks again for the support


chrisociepa commented on August 24, 2024

Killed means that the program was killed by your OS. In my experience, this happens 99% of the time when you try to use more RAM and swap than you have. My advice: run htop (or top) and watch the memory consumption while the script is running to confirm the root cause. If you didn't have enough VRAM, you would see an OOM exception instead.
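If you'd rather check from Python, a quick sketch (assuming the psutil package is installed):

```python
import psutil

# Snapshot of system memory; run this while the conversion script is active.
vm = psutil.virtual_memory()
print(f"total: {vm.total / 2**30:.1f} GiB, available: {vm.available / 2**30:.1f} GiB")
```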


lantiga commented on August 24, 2024

Closing this one, feel free to reopen

