
helixml / helix

Multi-node production AI stack. Run the best of open source AI easily on your own servers. Create your own AI by fine-tuning open source models

Home Page: https://docs.helix.ml

License: Other

Shell 0.65% Go 56.36% TypeScript 39.71% HTML 0.13% Dockerfile 0.14% Python 2.68% Mako 0.05% Smarty 0.28%
golang llama llm mistral openai self-hosted codellama mixtral qwen sdxl

helix's Introduction


SaaS · Private Deployment · Docs · Discord

HelixML


Private GenAI platform. Deploy the best of open AI in your own data center or VPC and retain complete data security & control.

Includes support for fine-tuning models that is as easy as drag-and-drop.

Looking for a private GenAI platform? From language models to image models and more, Helix brings the best of open source AI to your business in an ergonomic, scalable way, while optimizing the tradeoff between GPU memory and latency.

Docker

git clone https://github.com/helixml/helix.git
cd helix

Create a .env file based on the example values, then edit it:

cp .env.example-prod .env

Ensure the Keycloak realm settings are up to date with your .env file:

./update-realm-settings.sh

To start the services:

docker-compose up -d

The dashboard will be available at http://localhost.

Attach GPU runners: see runners docs

License

Helix is licensed under a license similar to Docker Desktop's. You can run the source code (in this repo) for free for:

  • Personal Use: individuals or people personally experimenting
  • Educational Use: schools/universities
  • Small Business Use: companies with under $10M annual revenue and fewer than 250 employees

If you fall outside of these terms, please contact us to discuss purchasing a license for large commercial use. If you are an individual at a large company interested in experimenting with Helix, that's fine under Personal Use until you deploy to more than one GPU on company-owned or paid-for infrastructure.

You are not allowed to use our code to build a product that competes with us.

Contributions to the source code are welcome, and by contributing you confirm that your changes will fall under the same license.

Why these clauses in your license?

  • We generate revenue to support the development of Helix. We are an independent software company.
  • We don't want cloud providers to take our open source code and build a rebranded service on top of it.

If you would like to use some part of this code under a more permissive license, please get in touch.

helix's People

Contributors

bigadamknight, binocarlos, chocobar, lukemarsden, philwinder, rusenask


helix's Issues

fine tuning hangs

Why does this happen? Should we automatically restart jobs that haven't made progress within a timeout?
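A minimal watchdog sketch in Go; the Job interface with LastProgress and Restart hooks is an assumption for illustration, not the actual runner API:

package runner

import (
    "context"
    "time"
)

// Job stands in for the runner's job handle; LastProgress and Restart
// are hypothetical hooks, not existing Helix code.
type Job interface {
    LastProgress() time.Time
    Restart()
}

// watchJob restarts a job that has reported no progress within timeout.
func watchJob(ctx context.Context, job Job, timeout time.Duration) {
    ticker := time.NewTicker(10 * time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            if time.Since(job.LastProgress()) > timeout {
                job.Restart()
            }
        }
    }
}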

switch to isStale everywhere

The logic for whether a model instance is stale currently lives in three places (search for stale := and nonStale :=).

Move it to one place, as sketched below.
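One possible shape for the consolidated helper; the ModelInstance fields here are assumptions about the real type:

package scheduler

import "time"

// ModelInstance is a stand-in; the real field names may differ.
type ModelInstance struct {
    LastActivity time.Time
    TTL          time.Duration
}

// isStale is the single source of truth for staleness, replacing the
// three inline stale := / nonStale := computations.
func isStale(m *ModelInstance, now time.Time) bool {
    return now.Sub(m.LastActivity) > m.TTL
}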

Use huggingface tokenizer chat template for inference

In the LLM model Go code we build up a prompt that is a formatted string based on the chat template associated with the model.

We could instead store a generic json-ised version of the chat history in task.Prompt, like:

[{"role": "user", "content": "What's the capital of France'?"}, {"role": "assistant", "content": "It's Paris."}]

and then use the model's tokenizer to format the messages for us inside axolotl at inference time:

import json
from transformers import AutoTokenizer

messages = json.loads(json_messages)  # the JSON-ised chat history from task.Prompt
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoded_messages = tokenizer.apply_chat_template(messages, tokenize=False)

This will reduce the effort needed to add subsequent models with potentially different chat templates.
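For illustration, the Go side could marshal the history like this; the ChatMessage type and encodeChatHistory helper are hypothetical names, not existing Helix code:

package model

import "encoding/json"

// ChatMessage mirrors the role/content JSON shown above; the type
// name is an assumption.
type ChatMessage struct {
    Role    string `json:"role"`
    Content string `json:"content"`
}

// encodeChatHistory renders the chat history as the generic JSON that
// the Python side decodes and passes to apply_chat_template.
func encodeChatHistory(history []ChatMessage) (string, error) {
    b, err := json.Marshal(history)
    if err != nil {
        return "", err
    }
    return string(b), nil
}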

Old list "done"

Things we did whilst using the old "list"

  • when continuing a cloned session, the messages are missing
  • if there are no files - the "view files" button shows an error
  • "add new documents" button at bottom of text session (add more documents, dataprep new ones into jsonl qa-pairs, concatenate qa-pairs, retrain model)
  • retry button for errors
  • plugin sentry
  • share mode where original training data is not copied
  • auto-scroll broken
  • put the name of the session in topbar
  • rather than system as the username, put the name of the session
  • sessions are updating other sessions https://mlops-community.slack.com/archives/C0675EX9V2Q/p1702476943225859
  • add a restart button whilst doing a fine-tune so if things get stuck we can restart
    • possibly only show this if we've not seen any progress for > 30 seconds (fixed by throwing an error if the runner reports the job is still active)
  • dashboard not showing finetune interactions
  • performance of auto-save before login (image fine tune text is slow)
  • for session updates check we are on the same page
    • whilst we are on one page and another session is processing - it's updating the page we are on with the wrong session
  • react is rendering streaming updates to the sessions slowly
  • progress bars on text fine tuning
  • fork session (fork from an interaction)
  • add data after the model is trained
  • pdfs are broken in production
  • for HTML conversion, use Puppeteer to render the page into a PDF, then convert the PDF into plain text
  • reliable and fast, scale to 5 concurrent users (Luke)
    • Dockerize the runner & deploy some on vast.ai / runpod.io
  • finish and deploy dashboard
  • logged out state when trying to do things - show a message "please register"
  • fix bug with "create image" dropdown etc not working
  • fix bug with openAI responding with "GPT 4 Answer: Without providing a valid context, I am unable to generate 50 question and answer pairs as requested"
    • make it so user can see whole message from OpenAI
  • replace the thinking face with a spinning progress (small horizontal bouncing three dots)
  • there is a dashboard bug where runner model job history reverses itself
  • you lose keyboard focus when the chat box disables and re-enables
  • make the chatbox have keyboard focus the first time you load the page
  • pasting a long chunk of text into training text box makes the box go taller than the screen and you cannot scroll
  • create images says "chat with helix"; it should say "describe what you want to see in an image"
  • enforce min-width on left sidebar
  • the event cancel handler on drop downs is not letting you click the same mode
  • hide technical details behind a "technical details" button?
    • where it currently says "Session ...." - put the session title
    • put a link next to "View Files" called "Info" that will open a modal window with more session details
    • e.g. we put the text summary above in the modal along with the ID and other things we want to show
    • in the text box say "Chat with Helix" <- for txt models
    • in the text box say "Make images with Helix" <- for image models
  • edit session name (pencil icon to left of bin icon)
  • obvious buttons (on fine tuning)
    • in default starting state - make both buttons (add docs / text) - blue and outlined
    • in the default starting state - make the files button say "or choose files"
    • when you start typing in the box make the "Add Text" button pink and make the upload files not pink
    • once there are > 0 files - make the "choose more files" button outlined so the "upload docs" is the main button
  • performance on text fine tuning (add concurrency to openAI calls)
  • URL to fetch text for text fine tuning
  • homepage uncomment buttons
  • re-train, will add more interactions to add files to
  • we should keep previous LoRA files at the interaction level
  • we hoist lora_dir from the latest interaction to the session

new activity dot

show a dot next to sessions that are currently active or have new replies

scheduler not hitting spun up model

Quite often there's a model ready to serve and a new one gets spun up on the other node. Maybe the clocks are drifting between the machines, so the 2-second head start doesn't work? Or the Python processes aren't polling every 100ms?

the session page scrolls to the bottom randomly

There is a useMemo that is re-running (possibly triggered by Keycloak) and causing the "session has changed, scroll to the bottom" behaviour even when the session clearly has not changed. It's annoying because you are actively scrolling up and down, just reading, and then the page jumps to the bottom.

check URL type

Make it clear that URLs need to point to text content - for example, a YouTube URL will not work.

place in the queue indication

Show this if the wait is more than 5 seconds. We already have the "this is taking a while" window; this adds the place in the queue as well.

Model seems obsessed with more fine tuning of dataset

Having submitted a document (random doc, outline of a fictional story), then asking what a character should do in the story, I keep being met with "Character should continue fine-tuning the data to improve the accuracy of the model." This seemed to be an inescapable answer, no matter how I posed the question.

It also does not appear to learn from any further conversation I have after the dataset is submitted.

url box mime type detection

if you put a URL to a file in the URL box - detect the bloody MIME type so we don't split docs that are downloaded

the URL box should download files first
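A sketch of how the downloader could sniff the type, assuming we use Go's standard-library content sniffing rather than trusting the server's Content-Type header:

package dataprep

import (
    "io"
    "net/http"
)

// sniffURLContentType downloads the start of a URL and sniffs its MIME
// type from the first 512 bytes. Error handling is minimal; this is a
// sketch, not the real downloader.
func sniffURLContentType(url string) (string, error) {
    resp, err := http.Get(url)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()
    buf := make([]byte, 512)
    n, err := io.ReadFull(resp.Body, buf)
    if err != nil && err != io.ErrUnexpectedEOF && err != io.EOF {
        return "", err
    }
    return http.DetectContentType(buf[:n]), nil
}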

Multi GPU support

Support multiple GPUs on a single node. Initially we can work around this by running N runners with CUDA_VISIBLE_DEVICES passed through to the runner Python processes, as sketched below.
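A rough sketch of that workaround; the "helix-runner" binary name is illustrative, not the real one:

package runner

import (
    "fmt"
    "os"
    "os/exec"
)

// launchRunners starts one runner process per GPU, pinning each to a
// single device via CUDA_VISIBLE_DEVICES.
func launchRunners(numGPUs int) ([]*exec.Cmd, error) {
    var cmds []*exec.Cmd
    for i := 0; i < numGPUs; i++ {
        cmd := exec.Command("helix-runner")
        cmd.Env = append(os.Environ(), fmt.Sprintf("CUDA_VISIBLE_DEVICES=%d", i))
        if err := cmd.Start(); err != nil {
            return cmds, err
        }
        cmds = append(cmds, cmd)
    }
    return cmds, nil
}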

Jsonl input data

If the user uploads their own qapairs, skip the qapair generation phase
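A sketch of the detection step, with hypothetical question/answer field names since the real JSONL schema may differ:

package dataprep

import (
    "bufio"
    "encoding/json"
    "io"
)

// QAPair uses assumed field names for illustration.
type QAPair struct {
    Question string `json:"question"`
    Answer   string `json:"answer"`
}

// parseQAPairs reads user-supplied JSONL; if it parses cleanly, the
// qapair generation phase can be skipped entirely.
func parseQAPairs(r io.Reader) ([]QAPair, error) {
    var pairs []QAPair
    scanner := bufio.NewScanner(r)
    for scanner.Scan() {
        line := scanner.Bytes()
        if len(line) == 0 {
            continue // tolerate blank lines
        }
        var p QAPair
        if err := json.Unmarshal(line, &p); err != nil {
            return nil, err
        }
        pairs = append(pairs, p)
    }
    return pairs, scanner.Err()
}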

too few questions in small dataset

If you put a small bit of text like:

Bob lives at 6 Crow Terrace

It will generate a single question/answer pair and then axolotl complains there are too few questions in the training dataset.
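A guard like the following could fail early with a clear message instead; the minQAPairs threshold is an assumed value:

package dataprep

import "fmt"

// minQAPairs is an assumed threshold; axolotl errors out on tiny datasets.
const minQAPairs = 10

// checkDatasetSize fails with a user-facing message before the
// fine-tune crashes deep inside axolotl.
func checkDatasetSize(numPairs int) error {
    if numPairs < minQAPairs {
        return fmt.Errorf("only %d question/answer pairs were generated; at least %d are needed, please provide more source text", numPairs, minQAPairs)
    }
    return nil
}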

analyse all sessions in the database

for each one:

  • are there errors? if so, add issues to github. calculate which issues caused the most errors
  • is there a trained model with no interactions? if so add to chris's spreadsheet and ping him. also, #43
  • were they successful at doing anything?

overall: what % of sessions were successful and what were the biggest pain points? categorise the use cases

non-english language qapairs

Currently the qapair generation seems to translate non-English input data to English; however, we have users who want to do it all in, say, French.

When this works, get back to the French user on Crisp.

show API calls to replicate many actions

(e.g. text & image inference to start with)

basically show the curl equivalent of the UI action - i.e. make it clear that you can use the API for each of these actions
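A sketch of rendering that curl string; the Authorization header and parameters here are illustrative, not the actual Helix API:

package api

import "fmt"

// curlEquivalent renders the curl command matching a UI action so it
// can be displayed alongside the result.
func curlEquivalent(method, url, token, body string) string {
    return fmt.Sprintf("curl -X %s %q -H 'Authorization: Bearer %s' -d %q", method, url, token, body)
}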

url error reporting

detect when we did not manage to extract any text and tell the user that is the error
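A minimal check, assuming extraction returns the plain text as a string:

package dataprep

import (
    "errors"
    "strings"
)

// ErrNoText is surfaced to the user instead of a silent empty dataset.
var ErrNoText = errors.New("could not extract any text from the URL")

func checkExtractedText(text string) error {
    if strings.TrimSpace(text) == "" {
        return ErrNoText
    }
    return nil
}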
