This project is a fork of kennethleungty/llama-2-open-source-llm-cpu-inference
A copy of the Llama 2 CPU inference project by Kenneth Leung, adjusted to add more features
Home Page: https://towardsdatascience.com/running-llama-2-on-cpu-inference-for-document-q-a-3d636037a3d8
License: MIT License
llama-2-cpu-inference's Issues
@Vlassie I have the following queries:
- Will it run on XLSX/XLS/CSV files?
- How can it be made to run on a GPU?
- Can it answer queries in, say, less than 25 seconds?
Right now the code concatenates two different prompts, one from main_st.py and one from prompts.py.
Only a single prompt should be sent to the model.
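One way to fix this is to assemble the final prompt in exactly one place and have the Streamlit side pass in only the raw inputs. A hedged sketch — the template wording and the `build_prompt` name are assumptions, not the project's actual code:

```python
# Illustrative fix: a single prompt template, defined once (e.g. in
# prompts.py), with main_st.py supplying only context and question.
QA_TEMPLATE = (
    "Use the following context to answer the question. "
    "If you don't know the answer, say so.\n\n"
    "Context: {context}\n"
    "Question: {question}\n"
    "Answer:"
)

def build_prompt(context: str, question: str) -> str:
    """The one place where the final prompt string is assembled."""
    return QA_TEMPLATE.format(context=context, question=question)

prompt = build_prompt("Llama 2 runs on CPU via GGML.", "Does it need a GPU?")
print(prompt.count("Question:"))  # → 1 (the question appears exactly once)
```

Because every call site goes through `build_prompt`, there is no second set of instructions to get prepended accidentally.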
Right now, the chatbot only remembers the context of the first conversation. After clearing the chat history, the context should reset to zero.
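The expected behaviour can be sketched as follows: if the conversation context is always rebuilt from the stored history rather than cached separately, clearing the history necessarily resets the context too. The `ChatSession` class and its method names below are illustrative assumptions, not the project's code:

```python
# Illustrative sketch of the desired reset semantics: context is derived
# from history on demand, so clearing history leaves no stale context.
class ChatSession:
    def __init__(self) -> None:
        self.history: list[tuple[str, str]] = []

    def record(self, question: str, answer: str) -> None:
        self.history.append((question, answer))

    def context(self) -> str:
        # Rebuilt from history every time; never cached in a second field.
        return "\n".join(f"Q: {q}\nA: {a}" for q, a in self.history)

    def clear(self) -> None:
        self.history.clear()

s = ChatSession()
s.record("What is Llama 2?", "An open-source LLM.")
s.clear()
print(repr(s.context()))  # → '' (no leftover context after clearing)
```

In a Streamlit app the same idea applies: the clear-history button should wipe the session-state entry the prompt context is built from, not just the messages shown on screen.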
The current README is still based on the original fork. A lot has changed since then, especially the instructions for using the program.