withcatai / catai

Run AI ✨ assistant locally! with simple API for Node.js 🚀

Home Page: https://withcatai.github.io/catai/

License: MIT License

HTML 0.48% JavaScript 10.83% CSS 6.14% Svelte 17.94% TypeScript 64.61%

chatgpt ai dalai openai chatbot chatui llama-cpp ai-assistant vicuna vicuna-installation-guide

catai's Introduction

CatAI

Run GGUF models on your computer with a chat ui.

Your own AI assistant runs locally on your computer.

Inspired by Node-Llama-Cpp, Llama.cpp

Installation & Use

Make sure you have Node.js (download current) installed.

npm install -g catai

catai install meta-llama-3-8b-q4_k_m
catai up

Features

Auto detect programming language 🧑‍💻
Click on user icon to show original message 💬
Real time text streaming ⏱️
Fast model downloads 🚀

CLI

Usage: catai [options] [command]

Options:
  -V, --version                    output the version number
  -h, --help                       display help for command

Commands:
  install|i [options] [models...]  Install any GGUF model
  models|ls [options]              List all available models
  use [model]                      Set model to use
  serve|up [options]               Open the chat website
  update                           Update server to the latest version
  active                           Show active model
  remove|rm [options] [models...]  Remove a model
  uninstall                        Uninstall server and delete all models
  node-llama-cpp|cpp [options]     Node llama.cpp CLI - recompile node-llama-cpp binaries
  help [command]                   display help for command

Install command

Usage: cli install|i [options] [models...]

Install any GGUF model

Arguments:
  models                Model name/url/path

Options:
  -t --tag [tag]        The name of the model in local directory
  -l --latest           Install the latest version of a model (may be unstable)
  -b --bind [bind]      The model binding method
  -bk --bind-key [key]  key/cookie that the binding requires
  -h, --help            display help for command

Cross-platform

You can use it on Windows, Linux and Mac.

This package uses node-llama-cpp which supports the following platforms:

darwin-x64
darwin-arm64
linux-x64
linux-arm64
linux-armv7l
linux-ppc64le
win32-x64-msvc

Good to know

All downloaded data is stored in the ~/catai folder by default.
The download is multi-threaded, so it may use a lot of bandwidth, but it will download faster!

Web API

There is also a simple API that you can use to ask the model questions.

const response = await fetch('http://127.0.0.1:3000/api/chat/prompt', {
    method: 'POST',
    body: JSON.stringify({
        prompt: 'Write me 100 words story'
    }),
    headers: {
        'Content-Type': 'application/json'
    }
});

const data = await response.text();

For more information, please read the API guide

Development API

You can also use the development API to interact with the model.

import {createChat, downloadModel, initCatAILlama, LlamaJsonSchemaGrammar} from "catai";

// skip downloading the model if you already have it
await downloadModel("meta-llama-3-8b-q4_k_m");

const llama = await initCatAILlama();
const chat = await createChat({
    model: "meta-llama-3-8b-q4_k_m"
});

const fullResponse = await chat.prompt("Give me array of random numbers (10 numbers)", {
    grammar: new LlamaJsonSchemaGrammar(llama, {
        type: "array",
        items: {
            type: "number",
            minimum: 0,
            maximum: 100
        },
    }),
    topP: 0.8,
    temperature: 0.8,
});

console.log(fullResponse); // [10, 2, 3, 4, 6, 9, 8, 1, 7, 5]

(For the full list of models, run catai models)
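
The grammar in the example above asks the model for a JSON array of numbers. Below is a minimal sketch of consuming that output, assuming chat.prompt resolves to the raw JSON text. The helper name parseNumberArray and its bounds arguments are hypothetical and not part of catai; the bounds are re-checked because it is not certain that the grammar itself enforces numeric ranges.

```javascript
// Hypothetical helper: parse grammar-constrained output and re-check the schema bounds.
function parseNumberArray(text, { minimum = 0, maximum = 100 } = {}) {
    const value = JSON.parse(text); // throws SyntaxError on malformed JSON
    if (!Array.isArray(value)) {
        throw new TypeError("Expected a JSON array");
    }
    for (const item of value) {
        if (typeof item !== "number" || item < minimum || item > maximum) {
            throw new RangeError(`Unexpected item: ${JSON.stringify(item)}`);
        }
    }
    return value;
}

// Sample text in the shape shown in the README comment.
const numbers = parseNumberArray("[10, 2, 3, 4, 6, 9, 8, 1, 7, 5]");
console.log(numbers.length); // 10
```

Validating after parsing keeps the calling code safe even if a different model or grammar is swapped in later.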

Node-llama-cpp@beta low level integration

You can use the model with node-llama-cpp@beta

CatAI enables you to easily manage the models and chat with them.

import {downloadModel, getModelPath, initCatAILlama, LlamaChatSession} from 'catai';

// download the model, skip if you already have the model
await downloadModel(
    "https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q2_K.gguf?download=true",
    "llama3"
);

// get the model path with catai
const modelPath = getModelPath("llama3");

const llama = await initCatAILlama();
const model = await llama.loadModel({
    modelPath
});

const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const a1 = await session.prompt("Hi there, how are you?");
console.log("AI: " + a1);

Configuration

You can edit the configuration via the web ui.

More information here

Contributing

Contributions are welcome!

Please read our contributing guide to get started.

License

This project uses Llama.cpp to run models on your computer. So any license applied to Llama.cpp is also applied to this project.

If you like this repo, star it ✨

catai's Issues

Alternate GUI won't load

Please refer to the troubleshooting before opening an issue. You might find the solution there.

Describe the bug
CatAI refuses to load custom GUIs. No error is tossed from the terminal where I am running it. Running catai serve --ui chatGPT as seen in commands.md loads the default UI. I have tried forcing CatAI into using the alternate GUI by changing some files, but nothing happened.

Screenshots
n/a

Desktop (please complete the following information):

OS: Linux Mint 21.1 Cinnamon, Linux 5.15.0-76-generic
Browser: Google Chrome
CatAI version 0.3.12
Node.js version v18.16.0
CPU: AMD Ryzen 5 5600H with Radeon Graphics
RAM: 30.7 GiB (512 MiB reserved to graphics chipset)

Model no longer supported - Launch error

The interface starts, and after entering the first request, it crashes

PS C:\Users\pomazan> catai  serve
$ cd C:\Users\pomazan\AppData\Roaming\npm\node_modules\catai
$ npm start -- --production true --ui catai

> [email protected] start
> node src/index.js --production true --ui catai

llama.cpp: loading model from C:\Users\pomazan\catai\models\Alpaca-13B
llama_model_load_internal: format     = ggjt v1 (pre #1405)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 1024
llama_model_load_internal: n_embd     = 5120
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 40
llama_model_load_internal: n_layer    = 40
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 13824
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 13B
error loading model: this format is no longer supported (see https://github.com/ggerganov/llama.cpp/pull/1305)
llama_init_from_file: failed to load model
Listening on http://127.0.0.1:3000
new connection
llama.cpp: loading model from C:\Users\pomazan\catai\models\Alpaca-13B
llama_model_load_internal: format     = ggjt v1 (pre #1405)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 1024
llama_model_load_internal: n_embd     = 5120
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 40
llama_model_load_internal: n_layer    = 40
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 13824
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 13B
error loading model: this format is no longer supported (see https://github.com/ggerganov/llama.cpp/pull/1305)
llama_init_from_file: failed to load model
    at file:///C:/Users/pomazan/AppData/Roaming/npm/node_modules/catai/scripts/cli.js:69:27
    exit code: 1
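
The README states that CatAI runs GGUF models, while the log above shows an older ggjt v1 file being rejected. One way to catch this before starting the server is to inspect the first four bytes of the model file, since GGUF files begin with the ASCII bytes "GGUF". This is only a sketch: the helper name hasGgufMagic is hypothetical, and CatAI itself may not perform such a check.

```javascript
// GGUF files begin with the four ASCII bytes "GGUF".
const GGUF_MAGIC = [0x47, 0x47, 0x55, 0x46];

// Hypothetical helper: `header` is a Buffer or Uint8Array holding the first bytes of a model file.
function hasGgufMagic(header) {
    return header.length >= GGUF_MAGIC.length &&
        GGUF_MAGIC.every((byte, index) => header[index] === byte);
}

// Synthetic headers stand in for real files here.
console.log(hasGgufMagic(new TextEncoder().encode("GGUF-rest-of-file"))); // true
console.log(hasGgufMagic(new TextEncoder().encode("abcd-rest-of-file"))); // false
```

In practice the header bytes could be read from the model file with Node's fs APIs. A file that fails the check would need to be replaced with a GGUF build of the model.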

Multiple files in a model?

https://huggingface.co/mosaicml/mpt-7b-chat/tree/main

Logo in docs returns two 404 errors

Describe the bug
The logo image in the docs fails to load with a 404 error.

Error: The connection lost, check the server status and refresh the page.

Describe the bug
Error: The connection lost, check the server status and refresh the page.

Screenshots

C:\Windows\System32>catai up
CatAI client on http://127.0.0.1:3000
New connection
Failed to load prebuilt binary for platform "win32" "x64". Error: Error: A dynamic link library (DLL) initialization routine failed.
\?\C:\Users\ppodl\AppData\Roaming\npm\node_modules\catai\node_modules\node-llama-cpp\llamaBins\win-x64\llama-addon.node at Module._extensions..node (node:internal/modules/cjs/loader:1327:18)
at Module.load (node:internal/modules/cjs/loader:1091:32)
at Module._load (node:internal/modules/cjs/loader:938:12)
at Module.require (node:internal/modules/cjs/loader:1115:19)
at require (node:internal/modules/helpers:130:18)
at loadBin (file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/utils/getBin.js:45:24)
at async file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/llamaEvaluator/LlamaBins.js:2:29 {
code: 'ERR_DLOPEN_FAILED'
}
Falling back to locally built binaries
file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/utils/compileLLamaCpp.js:93
throw new Error("Could not find Release or Debug directory");
^

Error: Could not find Release or Debug directory
at getCompiledResultDir (file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/utils/compileLLamaCpp.js:93:11)
at async getCompiledLlamaCppBinaryPath (file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/utils/compileLLamaCpp.js:80:35)
at async loadBin (file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/utils/getBin.js:57:24)
at async file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/llamaEvaluator/LlamaBins.js:2:29

Node.js v20.8.0

Can't set CATAI_OPEN_IN_BROWSER to false

The || true in this line:

https://github.com/ido-pluto/catai/blob/c53b1b2dc6af8bcbbaf53c9a0d4e45f1187b2caf/server/src/config.js#L4

prevents setting CATAI_OPEN_IN_BROWSER to anything other than true. Even when it is set to false, the expression falls back to true.
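
The behavior described above follows from how || works in JavaScript. The sketch below shows the two pitfalls and one possible fix. The helper name parseBooleanEnv and the list of accepted "false" spellings are assumptions for illustration, not the project's actual code, and the exact expression on the linked line is not reproduced here.

```javascript
// `||` falls through on every falsy left operand, so an explicit `false` is replaced by the default.
console.log(false || true); // true

// Environment variables are strings, and the string "false" is itself truthy.
console.log(Boolean("false")); // true

// Hypothetical helper: parse a boolean environment variable with an explicit default.
function parseBooleanEnv(value, defaultValue) {
    if (value === undefined || value.trim() === "") {
        return defaultValue; // unset or empty: use the default
    }
    return !["false", "0", "no", "off"].includes(value.trim().toLowerCase());
}

console.log(parseBooleanEnv("false", true));   // false
console.log(parseBooleanEnv(undefined, true)); // true
```

With a helper like this, the config line could pass process.env.CATAI_OPEN_IN_BROWSER and a default of true, so an explicit false is respected.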

Is there a way to specify the path where models are stored?

System_prompt

Hi, is there a way to customize the chat app using a system_prompt, like "You are a pirate and act like this, if the user says 'hello', you say 'that'..."?

How to pass computer resource information

Hello,
I am looking to see where/how cpu and/or gpu information is passed during server start but I am unable to find it.
Thank you

Windows NodeJS MODULE_NOT_FOUND issue.

Hi, I'm new to using node, so I'm not sure what's going on. I can't use catai because it says it doesn't find my node installation.

Steps:

I already had Node installed, so I installed catai. I opened cmd and ran catai list, and it failed.
catai models worked and listed all available models, and catai install Stable-Vicuna-13B downloaded the model, but it failed when it tried to use the model.
I uninstalled catai, removed the catai data directory, removed Node, installed nvm for Windows, installed the current Node.js, and reinstalled catai. I ran catai list again, and it still fails.

Note: I can go to C:\Users\sabsa\AppData\Roaming\nvm\v20.2.0\node_modules\catai\scripts manually, and run npm run list and it will say "No model downloaded", so it works if I run it manually.
Note 2: Knowing this, I downloaded Stable Vicuna-13B again and ran npm run use Stable-Vicuna-13B manually in the node_modules/catai/scripts directory, and it worked. I then ran npm start -- --production true --ui catai. It started a server, but then failed with Error: Missing field 'nGpuLayers', so I don't know whether that's happening because I didn't start CatAI the correct way, or because llama.cpp was updated and llama-node is out of date.

error message:

C:\Users\sabagithub>catai list
$ cd C:\Users\sabagithub\AppData\Roaming\npm\node_modules\catai
$ npm run list
node:net:426
throw errnoException(err, 'open');
^

Error: open EISDIR
at new Socket (node:net:426:13)
at createWritableStdioStream (node:internal/bootstrap/switches/is_main_thread:80:18)
at process.getStdout [as stdout] (node:internal/bootstrap/switches/is_main_thread:150:12)
at console.get (node:internal/console/constructor:209:42)
at console.value (node:internal/console/constructor:337:50)
at console.log (node:internal/console/constructor:376:61)
at runScript (node:internal/process/execution:94:7)
at evalScript (node:internal/process/execution:104:10)
at node:internal/main/eval_string:50:3 {
errno: -4068,
code: 'EISDIR',
syscall: 'open'
}

Node.js v20.2.0
node:internal/modules/cjs/loader:1073
throw err;
^

Error: Cannot find module 'C:\node_modules\npm\bin\npm-cli.js'
at Module._resolveFilename (node:internal/modules/cjs/loader:1070:15)
at Module._load (node:internal/modules/cjs/loader:923:27)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:83:12)
at node:internal/main/run_main_module:23:47 {
code: 'MODULE_NOT_FOUND',
requireStack: []
}

Node.js v20.2.0
Could not determine Node.js install directory
node:net:426
throw errnoException(err, 'open');
^

Error: open EISDIR
at new Socket (node:net:426:13)
at createWritableStdioStream (node:internal/bootstrap/switches/is_main_thread:80:18)
at process.getStdout [as stdout] (node:internal/bootstrap/switches/is_main_thread:150:12)
at console.get (node:internal/console/constructor:209:42)
at console.value (node:internal/console/constructor:337:50)
at console.log (node:internal/console/constructor:376:61)
at runScript (node:internal/process/execution:94:7)
at evalScript (node:internal/process/execution:104:10)
at node:internal/main/eval_string:50:3 {
errno: -4068,
code: 'EISDIR',
syscall: 'open'
}

Node.js v20.2.0
node:internal/modules/cjs/loader:1073
throw err;
^

Error: Cannot find module 'C:\node_modules\npm\bin\npm-cli.js'
at Module._resolveFilename (node:internal/modules/cjs/loader:1070:15)
at Module._load (node:internal/modules/cjs/loader:923:27)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:83:12)
at node:internal/main/run_main_module:23:47 {
code: 'MODULE_NOT_FOUND',
requireStack: []
}

Node.js v20.2.0
Could not determine Node.js install directory
at file:///C:/Users/sabagithub/AppData/Roaming/npm/node_modules/catai/scripts/cli.js:62:27
exit code: 1

Does not work on ubuntu 22.04 & node 20

Describe the bug
I followed the instructions. When catai up is run, it opens the page only once, and then the process exits.

Screenshots

$ catai up
CatAI client on http://127.0.0.1:3000
New connection
$ echo $?
0

Desktop (please complete the following information):

OS: Ubuntu
Browser: Chrome
CatAI version: 3.0.0
Node.js version: 20

Configuration website is down

Hi, I would like to configure CatAI.

https://withcatai.github.io/node-llama-cpp/types/LlamaModelOptions.html

But this website doesn't work.

Can you suggest a good configuration for the bot?

P.S.

I launched CatAI on an S8+: I just installed git and cmake, and CatAI runs normally on Termux :)

Sudden shutdown after +1000 tokens.

If I ask "Please write a summary of all the countries in the world in alphabetical order. Include in each summary the country's population and population density.", it will write about 1000 tokens, then it'll just shut down, and the UI will lose the connection.

I was using the Stable Vicuna model 13B on 16GB of ram.

If you don't experience this issue, then I think this can be closed, as it's probably just my system's limitation.

[Suggestion] User mode and developer mode (with more control)

Hi,
it would be good to have some kind of user mode and developer mode, which can be toggled with an environment variable.
So you have more parameters to choose from in developer mode and when you are ready, you ship it in user mode with a simple interface.

API response

It looks like the API streams the whole result to the server console before sending the output back as the response. Is there a way to return the results as soon as they're available?

Or if not, then to stream the results back from the API?
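
On the client side, a response body can be consumed chunk by chunk instead of with response.text(). The sketch below assumes the endpoint shown in the README and a runtime with the standard fetch, TextEncoder and TextDecoder globals (Node.js 18+ or a browser). Whether the server flushes partial output is not confirmed here, and the names createChunkDecoder and streamPrompt are hypothetical.

```javascript
// Hypothetical helper: returns a function that turns successive byte chunks into text.
function createChunkDecoder() {
    const decoder = new TextDecoder("utf-8");
    // `stream: true` keeps incomplete multi-byte sequences buffered until the next chunk.
    return (chunk) => decoder.decode(chunk, { stream: true });
}

// Hypothetical usage against the README endpoint; defined here but not called.
async function streamPrompt(prompt, onText) {
    const response = await fetch("http://127.0.0.1:3000/api/chat/prompt", {
        method: "POST",
        body: JSON.stringify({ prompt }),
        headers: { "Content-Type": "application/json" }
    });
    const decodeChunk = createChunkDecoder();
    const reader = response.body.getReader();
    for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        onText(decodeChunk(value)); // hand each decoded piece to the caller as it arrives
    }
}

// Local check that needs no server: "é" is two bytes in UTF-8 and is split across chunks.
const bytes = new TextEncoder().encode("café");
const decode = createChunkDecoder();
const text = decode(bytes.slice(0, 4)) + decode(bytes.slice(4));
console.log(text); // café
```

If the server only sends the body once generation has finished, the loop still works but delivers everything in one or a few chunks at the end.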

How to install and run catai from packages?

I'd like to install CatAI from the GitHub source rather than using the npm servers, because I want to make some modifications to the interface.

However, I'm having trouble doing so.
I've recently started studying programming.

How can I go about this?
I've tried downloading the package and installing it with "npm install" in /server/ folder, but I'm not sure how to run it after installation. :(

Outputs only so much text in its answer

The answer is cut off after a limited amount of text every time. How do I increase this limit? Better yet, are there settings somewhere?

Are any pygmalion models supported with catai?

<end> after sending any message in the web interface

Describe the bug
I have set up catai and downloaded a model. The web interface opens, and I see this in the server console:

new connection

but as soon as I type anything in the web interface, the circle starts spinning indefinitely and this pops up in the server log:

<end>

Desktop (please complete the following information):

OS: Windows 10
Browser: Chrome
CatAI version 0.3.12
Node.js version 18.16.0

PS i clearly have no idea what i'm doing, so the problem is likely on my side, but I don't know what to try

Unable to install `catai` onto an ARM development board

Describe the bug

The installation of the package fails part-way. Node does not find an entrypoint in the ~/catai/models directory, which suggests that something goes haywire in the installation machinery. I'm trying to test out the performance of the small 3B version of StableLM on an Odroid N2+ development board using CatAI.

-> % npm install -g catai
npm ERR! code 1
npm ERR! path /home/alarm/.nvm/versions/node/v20.6.0/lib/node_modules/catai
npm ERR! command failed
npm ERR! command sh -c node ./dist/cli/cli.js postinstall
npm ERR! CatAI Migrated to v0.3.13
npm ERR! node:internal/process/promises:289
npm ERR!             triggerUncaughtException(err, true /* fromPromise */);
npm ERR!             ^
npm ERR!
npm ERR! [Error: ENOENT: no such file or directory, scandir '/home/alarm/catai/models'] {
npm ERR!   errno: -2,
npm ERR!   code: 'ENOENT',
npm ERR!   syscall: 'scandir',
npm ERR!   path: '/home/alarm/catai/models'
npm ERR! }
npm ERR!
npm ERR! Node.js v20.6.0

The provided error message, CatAI Migrated to v0.3.13, is not of particular help to me.
Could you explain in more detail what this error message is about? 🙂
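
For context, ENOENT from the scandir syscall means a directory listing was requested for a path that does not exist, here /home/alarm/catai/models. The sketch below shows one defensive pattern, assuming the postinstall step lists the models directory before anything has created it. The actual CatAI code is not shown in this report, and the helper name listModels is hypothetical.

```javascript
// CommonJS require is used so the snippet runs as a plain Node.js script.
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: list a directory, creating it first if it is missing.
function listModels(modelsDir) {
    fs.mkdirSync(modelsDir, { recursive: true }); // no-op when the directory already exists
    return fs.readdirSync(modelsDir);
}

// Demonstrated on a throwaway directory rather than the real ~/catai/models.
const demoDir = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "catai-demo-")), "models");
console.log(listModels(demoDir)); // []
```

If this assumption holds, creating the ~/catai/models directory by hand before running the install may also work around the failure.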

Screenshots
N/A

Desktop (please complete the following information):

OS: Arch Linux ARM (Linux cinedroid 5.10.2-6-ARCH #1 SMP PREEMPT Mon Dec 28 21:22:54 AST 2020 aarch64 GNU/Linux)
Browser: N/A
CatAI version: Latest release version on NPM, I guess.
Node.js version: 18 and 20

CatAI doesn't send an answer

After fixing the connection problems, I would like to use this chat, but after running

catai up

I get

C:\Users\ppodl>catai up
CatAI client on http://127.0.0.1:3000
New connection
llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from C:\Users\ppodl\catai\models\vicuna-7b-16k-q4_k_s (version GGUF V2 (latest))
llama_model_loader: - tensor 0: token_embd.weight q4_K [ 4096, 32000, 1, 1 ]
llama_model_loader: - tensor 1: blk.0.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 2: blk.0.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 3: blk.0.attn_v.weight q5_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 4: blk.0.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 5: blk.0.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 6: blk.0.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 7: blk.0.ffn_down.weight q5_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 8: blk.0.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 9: blk.0.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 10: blk.1.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 11: blk.1.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 12: blk.1.attn_v.weight q5_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 13: blk.1.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 14: blk.1.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 15: blk.1.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 16: blk.1.ffn_down.weight q5_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 17: blk.1.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 18: blk.1.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 19: blk.2.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 20: blk.2.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 21: blk.2.attn_v.weight q5_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 22: blk.2.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 23: blk.2.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 24: blk.2.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 25: blk.2.ffn_down.weight q5_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 26: blk.2.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 27: blk.2.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 28: blk.3.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 29: blk.3.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 30: blk.3.attn_v.weight q5_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 31: blk.3.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 32: blk.3.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 33: blk.3.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 34: blk.3.ffn_down.weight q5_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 35: blk.3.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 36: blk.3.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 37: blk.4.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 38: blk.4.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 39: blk.4.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 40: blk.4.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 41: blk.4.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 42: blk.4.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 43: blk.4.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 44: blk.4.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 45: blk.4.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 46: blk.5.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 47: blk.5.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 48: blk.5.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 49: blk.5.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 50: blk.5.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 51: blk.5.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 52: blk.5.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 53: blk.5.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 54: blk.5.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 55: blk.6.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 56: blk.6.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 57: blk.6.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 58: blk.6.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 59: blk.6.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 60: blk.6.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 61: blk.6.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 62: blk.6.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 63: blk.6.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 64: blk.7.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 65: blk.7.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 66: blk.7.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 67: blk.7.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 68: blk.7.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 69: blk.7.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 70: blk.7.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 71: blk.7.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 72: blk.7.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 73: blk.8.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 74: blk.8.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 75: blk.8.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 76: blk.8.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 77: blk.8.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 78: blk.8.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 79: blk.8.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 80: blk.8.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 81: blk.8.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 82: blk.9.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 83: blk.9.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 84: blk.9.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 85: blk.9.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 86: blk.9.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 87: blk.9.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 88: blk.9.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 89: blk.9.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 90: blk.9.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 91: blk.10.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 92: blk.10.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 93: blk.10.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 94: blk.10.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 95: blk.10.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 96: blk.10.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 97: blk.10.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 98: blk.10.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 99: blk.10.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 100: blk.11.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 101: blk.11.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 102: blk.11.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 103: blk.11.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 104: blk.11.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 105: blk.11.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 106: blk.11.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 107: blk.11.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 108: blk.11.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 109: blk.12.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 110: blk.12.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 111: blk.12.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 112: blk.12.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 113: blk.12.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 114: blk.12.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 115: blk.12.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 116: blk.12.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 117: blk.12.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 118: blk.13.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 119: blk.13.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 120: blk.13.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 121: blk.13.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 122: blk.13.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 123: blk.13.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 124: blk.13.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 125: blk.13.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 126: blk.13.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 127: blk.14.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 128: blk.14.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 129: blk.14.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 130: blk.14.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 131: blk.14.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 132: blk.14.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 133: blk.14.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 134: blk.14.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 135: blk.14.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 136: blk.15.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 137: blk.15.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 138: blk.15.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 139: blk.15.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 140: blk.15.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 141: blk.15.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 142: blk.15.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]....
llama_model_loader: - tensor 143: blk.15.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 144: blk.15.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 145: blk.16.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 146: blk.16.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 147: blk.16.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 148: blk.16.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 149: blk.16.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 150: blk.16.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 151: blk.16.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 152: blk.16.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 153: blk.16.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 154: blk.17.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 155: blk.17.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 156: blk.17.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 157: blk.17.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 158: blk.17.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 159: blk.17.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 160: blk.17.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 161: blk.17.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 162: blk.17.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 163: blk.18.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 164: blk.18.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 165: blk.18.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 166: blk.18.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 167: blk.18.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 168: blk.18.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 169: blk.18.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 170: blk.18.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 171: blk.18.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 172: blk.19.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 173: blk.19.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 174: blk.19.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 175: blk.19.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 176: blk.19.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 177: blk.19.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 178: blk.19.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 179: blk.19.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 180: blk.19.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 181: blk.20.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 182: blk.20.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 183: blk.20.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 184: blk.20.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 185: blk.20.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 186: blk.20.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 187: blk.20.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 188: blk.20.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 189: blk.20.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 190: blk.21.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 191: blk.21.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 192: blk.21.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 193: blk.21.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 194: blk.21.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 195: blk.21.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 196: blk.21.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 197: blk.21.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 198: blk.21.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 199: blk.22.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 200: blk.22.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 201: blk.22.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 202: blk.22.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 203: blk.22.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 204: blk.22.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 205: blk.22.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 206: blk.22.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 207: blk.22.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 208: blk.23.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 209: blk.23.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 210: blk.23.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 211: blk.23.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 212: blk.23.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 213: blk.23.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 214: blk.23.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 215: blk.23.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 216: blk.23.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 217: blk.24.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 218: blk.24.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 219: blk.24.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 220: blk.24.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 221: blk.24.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 222: blk.24.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 223: blk.24.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 224: blk.24.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 225: blk.24.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 226: blk.25.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 227: blk.25.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 228: blk.25.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 229: blk.25.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 230: blk.25.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 231: blk.25.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 232: blk.25.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 233: blk.25.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 234: blk.25.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 235: blk.26.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 236: blk.26.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 237: blk.26.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 238: blk.26.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 239: blk.26.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 240: blk.26.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 241: blk.26.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 242: blk.26.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 243: blk.26.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 244: blk.27.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 245: blk.27.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 246: blk.27.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 247: blk.27.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 248: blk.27.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 249: blk.27.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]....

But after I send the first question I don't get an answer, and the loading animation loops forever.

Error: Missing field `nGpuLayers`

I'm trying to run the Wizard-Vicuna-13B-Uncensored model on a VM (16GB RAM), but I'm getting the error below:

Error: Missing field nGpuLayers
at LLamaCpp. (file:///usr/local/lib/node_modules/catai/node_modules/llama-node/dist/llm/llama-cpp.js:63:35)
at Generator.next ()
at file:///usr/local/lib/node_modules/catai/node_modules/llama-node/dist/llm/llama-cpp.js:33:61
at new Promise ()
at __async (file:///usr/local/lib/node_modules/catai/node_modules/llama-node/dist/llm/llama-cpp.js:17:10)
at LLamaCpp.load (file:///usr/local/lib/node_modules/catai/node_modules/llama-node/dist/llm/llama-cpp.js:61:12)
at LLM.load (/usr/local/lib/node_modules/catai/node_modules/llama-node/dist/index.cjs:52:21)
at #addNew (file:///usr/local/lib/node_modules/catai/src/alpaca-client/node-llama/process-pull.js:88:21)
at new NodeLlamaActivePull (file:///usr/local/lib/node_modules/catai/src/alpaca-client/node-llama/process-pull.js:19:38)
at file:///usr/local/lib/node_modules/catai/src/alpaca-client/node-llama/node-llama.js:8:48 {
code: 'InvalidArg'
}
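
The stack trace above suggests that the llama-node binding rejects a load configuration that has no `nGpuLayers` field. A minimal sketch of one possible workaround, assuming the configuration is a plain object and that `0` (no layers offloaded to the GPU) is an acceptable default. `withGpuLayersDefault` is a hypothetical helper, and the other field values are placeholders, not catai's actual settings:

```javascript
// Hypothetical helper: make sure a load config carries an explicit nGpuLayers value.
// Assumption: 0 means "keep every layer on the CPU".
function withGpuLayersDefault(config) {
  if (Number.isInteger(config.nGpuLayers) && config.nGpuLayers >= 0) {
    return { ...config }; // already valid, return a copy unchanged
  }
  return { ...config, nGpuLayers: 0 };
}

const loadConfig = withGpuLayersDefault({
  modelPath: "/path/to/model.bin", // placeholder path
  nCtx: 1024,
});
console.log(loadConfig.nGpuLayers);
```

Whether catai exposes this field in its own settings is not stated in the report, so treat this only as an illustration of where the missing value would go.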

Model Not Found

After Installing a model and running catai serve I get this error:

catai use 30B
$ cd /usr/lib/node_modules/catai
$ npm run use 30B

> [email protected] use
> zx scripts/use.js 30B

Model set to 30B
user@Machine:~/FastChat$ catai serve --ui chatGPT
$ cd /usr/lib/node_modules/catai
$ npm start production chatGPT

> [email protected] start
> node src/index.js production chatGPT

llama.cpp: loading model from /home/user/catai/models/30B
llama_model_load_internal: format     = 'ggml' (old version with low tokenizer quality and no mmap support)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 1024
llama_model_load_internal: n_embd     = 6656
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 52
llama_model_load_internal: n_layer    = 60
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 17920
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 30B
llama_model_load_internal: ggml ctx size = 19856856.30 KB
llama_model_load_internal: mem required  = 21695.46 MB (+ 6248.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size  = 3120.00 MB
file:///usr/lib/node_modules/catai/src/chat.js:7
    throw new Error('Model not found, try re-downloading the model');
          ^

Error: Model not found, try re-downloading the model
    at file:///usr/lib/node_modules/catai/src/chat.js:7:11

Node.js v20.0.0
llama.cpp: loading model from /home/user/catai/models/30B
llama_model_load_internal: format     = 'ggml' (old version with low tokenizer quality and no mmap support)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 1024
llama_model_load_internal: n_embd     = 6656
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 52
llama_model_load_internal: n_layer    = 60
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 17920
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 30B
llama_model_load_internal: ggml ctx size = 19856856.30 KB
llama_model_load_internal: mem required  = 21695.46 MB (+ 6248.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size  = 3120.00 MB
file:///usr/lib/node_modules/catai/src/chat.js:7
    throw new Error('Model not found, try re-downloading the model');
          ^

Error: Model not found, try re-downloading the model
    at file:///usr/lib/node_modules/catai/src/chat.js:7:11

Node.js v20.0.0
    at file:///usr/lib/node_modules/catai/scripts/cli.js:55:27
    exit code: 1

Strange, as it appears to find and load the model, then afterwards complain that it's not found.

This happens with the 7B model as well, though I am currently having trouble re-installing it.

Not working on Ubuntu

The command catai serve doesn't work.
Tested on two Ubuntu versions.

version Ubuntu 22.04.2 LTS:

catai serve                                         
$ cd /home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai
$ npm start

> [email protected] start
> node src/index.js

Listening on http://127.0.0.1:3000
/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:39
                throw error;
                ^

Error: Gtk-Message: 16:22:50.858: Not loading module "atk-bridge": The functionality is provided by GTK natively. Please try to not load it.

    at Socket.<anonymous> (/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:35:25)
    at Socket.emit (node:events:525:35)
    at endReadableNT (node:internal/streams/readable:1359:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

Node.js v18.12.1
file:///home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/zx/build/core.js:146
            let output = new ProcessOutput(code, signal, stdout, stderr, combined, message);
                         ^

ProcessOutput [Error]: /home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:39
                throw error;
                ^

Error: Gtk-Message: 16:22:50.858: Not loading module "atk-bridge": The functionality is provided by GTK natively. Please try to not load it.

    at Socket.<anonymous> (/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:35:25)
    at Socket.emit (node:events:525:35)
    at endReadableNT (node:internal/streams/readable:1359:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

Node.js v18.12.1
    at file:///home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/scripts/cli.js:34:27
    exit code: 1
    at ChildProcess.<anonymous> (file:///home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/zx/build/core.js:146:26)
    at ChildProcess.emit (node:events:513:28)
    at maybeClose (node:internal/child_process:1091:16)
    at ChildProcess._handle.onexit (node:internal/child_process:302:5)
    at Process.callbackTrampoline (node:internal/async_hooks:130:17) {
  _code: 1,
  _signal: null,
  _stdout: '\n' +
    '> [email protected] start\n' +
    '> node src/index.js\n' +
    '\n' +
    'Listening on http://127.0.0.1:3000\n',
  _stderr: '/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:39\n' +
    '                throw error;\n' +
    '                ^\n' +
    '\n' +
    'Error: Gtk-Message: 16:22:50.858: Not loading module "atk-bridge": The functionality is provided by GTK natively. Please try to not load it.\n' +
    '\n' +
    '    at Socket.<anonymous> (/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:35:25)\n' +
    '    at Socket.emit (node:events:525:35)\n' +
    '    at endReadableNT (node:internal/streams/readable:1359:12)\n' +
    '    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)\n' +
    '\n' +
    'Node.js v18.12.1\n',
  _combined: '\n' +
    '> [email protected] start\n' +
    '> node src/index.js\n' +
    '\n' +
    'Listening on http://127.0.0.1:3000\n' +
    '/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:39\n' +
    '                throw error;\n' +
    '                ^\n' +
    '\n' +
    'Error: Gtk-Message: 16:22:50.858: Not loading module "atk-bridge": The functionality is provided by GTK natively. Please try to not load it.\n' +
    '\n' +
    '    at Socket.<anonymous> (/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:35:25)\n' +
    '    at Socket.emit (node:events:525:35)\n' +
    '    at endReadableNT (node:internal/streams/readable:1359:12)\n' +
    '    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)\n' +
    '\n' +
    'Node.js v18.12.1\n'
}

Node.js v18.12.1

On Ubuntu 18.04.6 LTS it does start a server, but then it shows the following error:


/home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat)
/home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat)
/home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat)
/home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat)
Error: /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat) /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat) /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat) /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat) Thread unexpected closed!
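
These loader messages say that the prebuilt chat executable needs newer glibc and libstdc++ symbol versions than the system provides. As a rough sketch, the required versions can be pulled out of such output with a regular expression. `missingSymbolVersions` is a hypothetical helper, and the sample lines are abbreviated from the log above:

```javascript
// Collect the symbol versions that the dynamic loader reported as missing.
function missingSymbolVersions(loaderOutput) {
  const pattern = /version `([A-Za-z_]+_[0-9.]+)' not found/g;
  const found = new Set(); // a Set removes repeated reports of the same version
  for (const match of loaderOutput.matchAll(pattern)) {
    found.add(match[1]);
  }
  return [...found];
}

// Abbreviated sample lines (paths shortened for readability).
const sample = [
  "chat: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found",
  "chat: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found",
  "chat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found",
  "chat: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found",
].join("\n");

console.log(missingSymbolVersions(sample));
```

Comparing the highest reported version with the one installed on the machine shows whether a newer distribution release or a locally compiled binary is needed.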

[Feature Request] Stop button for long running executions

A "Stop" button to stop long running execution would be good.

Failed to load prebuilt binary for platform "linux" "arm64".

Hey,

After installing with npm install -g catai, the catai server fails on conversations, stating: Failed to load prebuilt binary for platform "linux" "arm64". Error: Error: /usr/local/lib/node_modules/catai/node_modules/node-llama-cpp/llamaBins/linux-arm64/llama-addon.node: cannot open shared object file: No such file or directory.

The file is actually there.

# ls -la  /usr/local/lib/node_modules/catai/node_modules/node-llama-cpp/llamaBins/linux-arm64/llama-addon.node
-rw-r--r-- 1 root root 1184376 Oct  6 10:32 /usr/local/lib/node_modules/catai/node_modules/node-llama-cpp/llamaBins/linux-arm64/llama-addon.node

Error on Win 11

C:\Users\micro\Downloads>catai serve --ui chatGPT
$ cd C:\Users\micro\AppData\Roaming\npm\node_modules\catai
$ npm start -- --production true --ui chatGPT

> [email protected] start
> node src/index.js --production true --ui chatGPT

fatal runtime error: Rust cannot catch foreign exceptions
fatal runtime error: Rust cannot catch foreign exceptions
    at file:///C:/Users/micro/AppData/Roaming/npm/node_modules/catai/scripts/cli.js:69:27
    exit code: 9

catai serve exit with error: 132 (on MacOS M1 pro)

Describe the bug

Install: catai install Vicuna-7B
Run 'catai serve'
Got error 132. The full details are below.

Screenshots

Got this:

catai $ catai serve
$ cd /usr/local/lib/node_modules/catai
$ npm start -- --production true --ui catai

> [email protected] start
> node src/index.js --production true --ui catai

/bin/bash: line 1: 49127 Illegal instruction: 4  npm start -- --production true --ui catai
/bin/bash: line 1: 49127 Illegal instruction: 4  npm start -- --production true --ui catai
    at file:///usr/local/lib/node_modules/catai/scripts/cli.js:69:27
    exit code: 132 (Illegal instruction)

Desktop

OS: MacOS 13.4 M1 pro
Browser: chrome
CatAI version: 0.3.12
Node.js version: v19.8.1
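
For context on the number itself: shells report a process killed by signal N as exit status 128 + N, so 132 corresponds to signal 4 (SIGILL, illegal instruction), which matches the message in the log. A small sketch of that decoding, where `describeExitCode` is a hypothetical helper and the table covers only a few common signals:

```javascript
// Signal numbers shared by Linux and macOS for a few common fatal signals.
const SIGNALS = { 4: "SIGILL", 6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV" };

// Shells report "killed by signal N" as exit status 128 + N.
function describeExitCode(code) {
  const signalNumber = code - 128;
  if (code > 128 && SIGNALS[signalNumber]) {
    return `${SIGNALS[signalNumber]} (signal ${signalNumber})`;
  }
  return `exit status ${code}`;
}

console.log(describeExitCode(132));
```

An illegal instruction usually means the binary was compiled for CPU features the machine (or the emulation layer it runs under) does not support, although the report alone does not confirm the cause.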

[Feature Request] headless mode

Usecase: running catai instances on cloud vms and accessing the UI over the network

Feedback: Model discovery and installation

Running install with an unrecognised model gives the following output:

➜  models catai install gpt4all                                                                                                                                                                  
$ cd /usr/local/lib/node_modules/catai                                                                                                                                                                     
Model unknown, we will download with template URL. You can also try one of thous:7B, 13B, 30B, Vicuna-7B, Vicuna-7B-Uncensored, Vicuna-13B, Stable-Vicuna-13B, Wizard-Vicuna-7B, Wizard-Vicuna-7B-Uncensored, Wizard-Vicuna-13B, OpenAssistant-30B

Outputting a list of available models is excellent but perhaps also worth adding a catai install --list or similar command?

Also observe that the output appears to make no sense ... thous:7B? And what are the 7B,13B,30B models? (edit) Ahhh, the original llama models, doh!

CatAI crashes when quitting a browser or when starting CatAI when a browser is already open.

Please refer to the troubleshooting before opening an issue. You might find the solution there.

Describe the bug
CatAI really doesn't like having the browser be open when you start it. If you attempt to start CatAI with the browser open, it prints this out to the terminal

..................../home/thesystemguy/.nvm/versions/node/v20.2.0/lib/node_modules/catai/node_modules/openurl/openurl.js:39
throw error;
^

Error: Gtk-Message: 00:49:11.085: Failed to load module "xapp-gtk3-module"
[2:2:0711/004911.245184:ERROR:nacl_fork_delegate_linux.cc(313)] Bad NaCl helper startup ack (0 bytes)
Gtk-Message: 00:49:11.272: Failed to load module "xapp-gtk3-module"

at Socket.<anonymous> (/home/thesystemguy/.nvm/versions/node/v20.2.0/lib/node_modules/catai/node_modules/openurl/openurl.js:35:25)
at Socket.emit (node:events:525:35)
at endReadableNT (node:internal/streams/readable:1359:12)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

Node.js v18.16.0
at file:///home/thesystemguy/.nvm/versions/node/v20.2.0/lib/node_modules/catai/scripts/cli.js:69:27
exit code: 1

Neither a reinstallation nor a reboot fixes this issue. It cropped up after Google Chrome broke, Firefox made itself the default browser, and I changed the settings.

Screenshots
n/a

Desktop (please complete the following information):
OS: Linux Mint 21.1 Cinnamon, Linux 5.15.0-76-generic
Browser: Google Chrome
CatAI version 0.3.12
Node.js version v18.16.0
CPU: AMD Ryzen 5 5600H with Radeon Graphics
RAM: 30.7 GiB (512 MiB reserved to graphics chipset)

[Feature Request] Allow configurable data directory, port, etc without editing package source

It would be nice to allow people to specify settings like data directory & port to use instead of hard-coded values without editing the package source.
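
One way this request could be met is to read settings from environment variables and fall back to defaults. The sketch below is only an illustration: the variable names `CATAI_PORT` and `CATAI_DATA_DIR`, the `loadSettings` helper, and the default directory name are all assumptions, while port 3000 is taken from the logs in the other reports.

```javascript
// Defaults used when no override is supplied (port 3000 as seen in the logs).
const DEFAULTS = { port: 3000, dataDir: "catai" };

// Hypothetical helper: build settings from an environment-like object.
function loadSettings(env) {
  const port = Number.parseInt(env.CATAI_PORT ?? "", 10);
  const portIsValid = Number.isInteger(port) && port > 0 && port < 65536;
  return {
    port: portIsValid ? port : DEFAULTS.port,
    dataDir: env.CATAI_DATA_DIR || DEFAULTS.dataDir,
  };
}

console.log(loadSettings({ CATAI_PORT: "8080" }));
```

In a real server the function would be called with `process.env`, so a user could set the variables in the shell before starting the server and leave the package source untouched.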

unrecognized tensor type 4 on vicuna 13b uncensored model

Describe the bug
I get this error trying to use the vicuna 13b uncensored model

llama.cpp: loading model from /Users/jvisker/catai/models/Vicuna-13B-Uncensored
error loading model: unrecognized tensor type 4

llama_init_from_file: failed to load model
Listening on http://127.0.0.1:3000
node:internal/process/promises:288
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[Error: Failed to initialize LLama context from file: /Users/jvisker/catai/models/Vicuna-13B-Uncensored]{
  code: 'GenericFailure'
}

Desktop (please complete the following information):

OS: macOS m1
Browser: Chrome
CatAI version: 0.3.10
Node.js version: 18.16.0

It works great on the 7B one

How to use with cuda?

Hi, I appreciate your work but I'm having a hard time understanding the specific actions required from me to run this UI with CUDA support on windows.
Do I understand correctly that I need to manually build node-llama-cpp with CUDA support and put it in node_modules?
It feels like a lot of pointless work, and I'm not sure how other people are doing it if it's not in the README... Did I miss something?
I've tried adding gpuLayers to the config, but it looks like it is still using only the CPU, so there must be some additional steps.

Made a Discord bot for your project, still in development tho

Repository

Catastrophic - CatAI is completely broken (Segmentation fault core dumped fatal error)

Spinning off into its own issue.

Describe the bug
CatAI after updating with the advice on issue #29 now no longer starts up at all, and tosses a segmentation fault. The errors produced when trying to start CatAI are the following:

[email protected] start
node src/index.js --production true --ui catai

Segmentation fault (core dumped)

Using catai update results in a different error, but the same outcome:

fatal runtime error: Rust cannot catch foreign exceptions
Aborted (core dumped)
fatal runtime error: Rust cannot catch foreign exceptions
Aborted (core dumped)
at file:///home/thesystemguy/.nvm/versions/node/v20.2.0/lib/node_modules/catai/scripts/cli.js:69:27
exit code: 134 (Process aborted)

Reinstallation does the same thing as using catai update. This is a showstopper problem.

Desktop (please complete the following information):
OS: Linux Mint 21.2 Cinnamon, Linux 5.15.0-76-generic
Browser: n/a (no start)
CatAI version 1.0.2 (as advised in issue #29)
Node.js version v18.16.0
CPU: AMD Ryzen 5 5600H with Radeon Graphics
RAM: 30.7 GiB (512 MiB reserved to graphics chipset)

Otherbrain HF open data integration

Otherbrain is a free human feedback dataset for open models.

Here's a link with more info: https://www.otherbrain.world/human-feedback

Would y'all be interested in adding 👍👎 to catai to help build the open data set? Happy to help if so.

For reference, here's what the flow looks like in FreeChat. I think we could do something similar in catai:

Can't Install model

I tried this on the 22nd and was able to install models but not get it to serve (it complains model not found).

With the latest version it doesn't appear to be installing models anymore.

catai install Vicuna-13B
$ cd /usr/lib/node_modules/catai

When I run install I just see a cd command echo'd out to the terminal and nothing else. Same thing if I try to run it from that directory.

Error on local setup

Hello 👋
While following development.md to run the server locally, I get the error below when starting the server. Would you have any advice on how to troubleshoot further?

Repro:

Clone repo
cd server
npm install
npm run install-model Vicuna-7B
npm start

Error:
zsh: segmentation fault npm start

Node- v18.16.0
System Version: macOS 13.3.1

client/catai runs just fine!
What's odd is that I run llama-node inference.js just fine on this Mac.
Installing globally and using catai serve also works without error.

[Documentation] README typo - 'catai server'

catai server --ui chatGPT should be catai serve --ui chatGPT

remote-catai does not work, should wait for ws.on('open')

The remote-catai example does not work:

progress.stdout.write(token); should be process.stdout.write(token);

The example sends the prompt before the WebSocket is open.
We should first modify remote-catai, adding the following to the _init() function:

        this._ws.on('open', () => {
            this.emit("open");
        });

then we should wait for the 'open' event to send the prompt

import { RemoteCatAI } from "catai";

const catai = new RemoteCatAI("ws://localhost:3000");

catai.on("open", async () => {
  console.log("Connected");
  const response = await catai.prompt("Write me 100 words story", (token) => {
    process.stdout.write(token);
  });

  console.log(`Total text length: ${response.length}`);
  catai.close();
});
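
An alternative to making every caller wait for the event is to queue outgoing messages until the socket opens. This is a different technique from the fix proposed above, shown only as a sketch: `OpenGate` is a hypothetical helper, not part of catai, and the socket is replaced by a plain array so the example runs on its own.

```javascript
// Hypothetical helper: holds outgoing messages until the socket reports "open".
class OpenGate {
  constructor(sendNow) {
    this.sendNow = sendNow; // function that actually writes to the socket
    this.isOpen = false;
    this.pending = [];
  }

  send(message) {
    if (this.isOpen) {
      this.sendNow(message);
    } else {
      this.pending.push(message); // too early, keep it for later
    }
  }

  markOpen() {
    this.isOpen = true;
    // Flush everything that was queued, in the original order.
    for (const message of this.pending.splice(0)) {
      this.sendNow(message);
    }
  }
}

const sent = []; // stands in for the real socket
const gate = new OpenGate((message) => sent.push(message));
gate.send("Write me 100 words story"); // queued, socket not open yet
gate.markOpen(); // would be called from ws.on("open", ...)
console.log(sent);
```

With this approach the prompt call can be issued immediately after construction, at the cost of holding messages in memory if the connection never opens.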

GPT4All-13B requires basic auth

The fetch URL for GPT4All-13B (and a few others) requires basic auth, so the model isn't downloadable from the models menu:
https://huggingface.co/Pi3141/alpaca-GPT4All-13BB-ggml/resolve/main/ggml-model-q4_0.bi

Also, is the end meant to be .bin like the others?

connectionError http://127.0.0.1:3000/assets/index-3673c735.js:39

Describe the bug

npm install -g catai

catai install vicuna-7b-16k-q4_k_s
catai up

Screenshots
If applicable, add screenshots to help explain your problem.

then typing "hello"

Desktop (please complete the following information):

OS: [e.g. windows, macOS, linux] Linux Mint
Browser [e.g. chrome, safari] Firefox
CatAI version [e.g. 0.3.10] (catai --version) 3.0.2
Node.js version [e.g 19] (node --version)
Which model are you trying to run? (catai active)
How many GB of RAM do you have available?
What CPU do you have?

Is this model compatible? (run catai ls for this info)

I ran "catai update" and thought it was working better (the GPTcat logo appeared), but it crashes too.

Fetch is not defined

Hello,

Thank you for your work.
I have some problems setting it up.

When I try to download a model I get this error:

I tried installing node-fetch@2 and node-fetch without success.

Do you have any idea what the problem might be?

I am on Ubuntu 22.04 LTS
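
The error text itself is missing from the report, so the following is only one possible cause: a "fetch is not defined" error in Node.js typically appears on runtimes that have no global fetch, which became available by default in Node.js 18. A quick check, where `hasGlobalFetch` is a hypothetical helper:

```javascript
// Global fetch is enabled by default starting with Node.js 18.
function hasGlobalFetch(nodeVersion) {
  const major = Number.parseInt(nodeVersion.replace(/^v/, ""), 10);
  return major >= 18;
}

// Report the running version, the expectation, and what the runtime really has.
console.log(
  process.version,
  hasGlobalFetch(process.version),
  typeof fetch === "function"
);
```

If the check reports an older version, upgrading Node.js is likely simpler than patching in node-fetch, because the package would have to be wired into every place the code calls fetch.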
