lilypad-tech / lilypad

Run AI workloads easily in a decentralized GPU network. https://www.youtube.com/watch?v=zeG2F-JANjI

Home Page: https://lilypad.tech

License: Apache License 2.0

Go 56.34% Dockerfile 1.26% Solidity 14.69% Shell 1.70% TypeScript 24.18% Cuda 1.84%
ai blockchain compute crypto mistral-7b sdxl stable-diffusion wasm web3 depin

lilypad's Introduction

LIKE THIS PROJECT?

PLEASE STAR US AND HELP US GROW! <3

Lilypad 🍃

Lilypad enables users to easily run containerised AI workloads on a decentralized GPU network, where anyone can get paid to connect their compute nodes to the network and run container jobs. Users can easily run jobs such as Stable Diffusion XL and cutting-edge open-source LLMs on-chain, from the CLI, and via Lilypad AI Studio on the web.

Visit the Lilypad Docs site for a more comprehensive overview of getting up and running, including a Quick Start Guide.

Getting started running container jobs on Lilypad

Jobs (containers) can be run on Lilypad using the installable CLI, which is also available through the Go toolchain. After setting up the necessary prerequisites, the CLI enables users to run jobs as described below:

lilypad run cowsay:v0.0.4 -i Message="moo"

Watch the video

The current list of modules can be found in the following repositories:

Containerised job modules can be built and added to the available module list; for more details visit the building a job documentation. If you would like to contribute, open a pull request on this repository to add your link to the list above.

Getting started running a Node on Lilypad Network

As a distributed network, Lilypad also brings with it the ability to run a node and contribute GPU and compute capacity. See the documentation on running a node, which contains more detailed instructions and an overview for getting set up.

The Lilypad Community

Read our Blog

Join the Discord

Follow us on Twitter/X

Check out our videos on YouTube

lilypad's People

Contributors

alvin-reyes, apquinit, aquigorka, arsen3d, bgins, binocarlos, developerally, developersteve, eltociear, hollygrimm, hunjixin, lilypad-releases[bot], lukemarsden, narbs91, noryev, rhochmayr, richardbremner, taoshengshi, walkah, zorlin


lilypad's Issues

Error: No solver service specified - please use SERVICE_SOLVER or --service-solver

Using the Mac M1 binary provided by Hiro Hamada here: https://bacalhauproject.slack.com/archives/C055K39J9QW/p1696537153091329?thread_ts=1696438693.499459&cid=C055K39J9QW.

Running: lilypad run github.com/username/repo:tag -i Message=moo
Error: No solver service specified - please use SERVICE_SOLVER or --service-solver
export SERVICE_SOLVER="0x3C44CdDdB6a900fa2b585dd299e03d12FA4293BC" solves this.

Running lilypad run github.com/username/repo:tag -i Message=moo again:
Error: No mediators services specified - please use SERVICE_MEDIATORS or --service-mediators
export SERVICE_MEDIATORS="0x90F79bf6EB2c4f870365E785982E1f101E93b906"

Works fine.

Headless CLI

Request/suggestion from someone at the ETHGlobal hackathon is to add a --silent flag to the lilypad run command that suppresses all output except the returned results directory. This would allow the CLI to be built into a server-side/code workflow more seamlessly.

Lilypad module local template file referencing

As a module builder it would be handy to include a -f /directory/lilypad_module.json.tmpl during module build testing, referencing instead from a local development directory to speed up the module development process. This would remove the need to run lilypad jobs from tagged github deployments.

export private key bash history dangers

https://protos.com/brazilian-crypto-streamer-loses-60k-after-showing-private-keys-recovers-it/#:~:text=A%20Brazilian%20crypto%20streamer%20lost,but%20it%20was%20too%20late.

In the docs one is asked to export their Ethereum private key into a bash variable; this exposes it in bash history and thus reverse search.

If it's possible, I would change this setup to use Docker Compose with the env_file option, so one can create a .env file and run the Docker container that way.
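A sketch of that suggestion, with a hypothetical service name, image, and key name (the real compose file would differ):

```yaml
# docker-compose.yml — hypothetical service/image names for illustration
services:
  resource-provider:
    image: lilypad:latest
    env_file:
      - .env   # contains WEB3_PRIVATE_KEY=...; never typed into the shell
```

The key then never appears in `~/.bash_history`, and `.env` can be chmod 600 and gitignored.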

Local network usage of Arbitrum

Arbitrum

The current infra has a Docker instance for the chain; it exposes two ports, one for HTTP and one for WS.
Once the chain container starts we need to 1) fund the admin account and 2) boot (fund accounts, compile and deploy contracts).

Let's keep the same processes/api/flow and if possible the same addresses for accounts and contracts.

Fix error: module does not exist

This happens because the code at pkg/module/utils.go checks whether the directory exists, but not whether the actual code inside the repo is present (it also checks that .git exists).

Potential fix: check for the template file and, if it does not exist, trigger the process to clone the repo.

The current workaround (remove /tmp/lilypad/data/repos) works, because it removes the empty dir and the next execution clones the repo.

cc/ @noryev

Schedule jobs to run with Lilypad

We want to be ready to load up the Lilypad network with jobs that RPs can run when the incentivized testnet is launched. A simple way of doing this is using cron or systemd timers as a JC.

To start just get the scheduling running with any job (like the cowsay example). Later, we can aim to get it working with something more useful (like a data processing flow that uses LLMs).

"JC has already agreed"

(screenshot of the error message)

I was testing cancelling the spinner on SIGINT, which meant hitting ^C, up-arrow, and enter a bunch of times on lilypad run. After doing this 3-4 times, I started getting the error above. Once in this state, it seems never to recover. An issue in the smart contracts perhaps?

Once lilypad has pulled a module's git repo, it never updates it

Lilypad clones and uses the contents of git repos when using modules.

The code for this is here:
https://github.com/bacalhau-project/lilypad/blob/main/pkg/module/shortcuts/shortcuts.go
https://github.com/bacalhau-project/lilypad/blob/main/pkg/module/utils.go

The repos get cloned both in the CLI/client (job creator) and the server (resource provider).

Currently, if you're a module developer and you push more changes to a repo after any lilypad process (client or server) has cloned it, the client or server will look in the existing cached checkout for the new refs (e.g. tags), fail to find them, and produce errors like #9.

One solution would be to always pull the git repo before trying to find a ref in it (repo.ResolveRevision), but the problem with that is that it makes github.com (or wherever people are hosting their modules) into a single point of failure for our nice decentralized system. It would be better to only pull updates into the git repo if finding the ref fails. This way, if nodes have already run modules (which will normally be the case) and even if github is down, they'll still be able to proceed from their local cached state.

So:

Ideally write a test to cover this case. A fix without a test is acceptable though!

Add Automatic Pricing Update Feature to Solver Control Loop

Currently, the pricing for the onchain job manager is set manually by administrators, which is not ideal for maintaining a dynamic and efficient marketplace. To address this, we propose adding an automatic pricing update feature to the solver control loop. This will enable the solver to dynamically adjust the pricing based on the current market conditions and resource offers.

Proposed Changes

  • Implement a mechanism in the solver control loop that periodically queries the current pool of resource offers to determine the current market price.
  • Utilize this information to automatically update the pricing by calling the setRequiredDeposit function on the contract.
  • Repeat this process at regular intervals to ensure that the pricing remains up-to-date and responsive to changing market conditions.

Lilypad-ify four new modules for SDXL & Mistral-7B fine-tuning and inference

We want four new Lilypad modules:

  • sdxl-finetune
  • sdxl-inference
  • mistral-finetune
  • mistral-inference

Dockerfiles:

Docker images:

They should copy the formula of https://github.com/bacalhau-project/lilypad-module-lora-training and https://github.com/bacalhau-project/lilypad-module-lora-inference, which do the same for Stable Diffusion 1.5.

Here are the commands you need to run inside the containers:

sdxl-finetune

bind-mount /config.toml, /input and /output
config.toml should contain

# for sdxl fine tuning

[general]
enable_bucket = true                        # Whether to use Aspect Ratio Bucketing

[[datasets]]
resolution = 1024                           # Training resolution
batch_size = 4                              # Batch size

  [[datasets.subsets]]
  image_dir = '/input' # Specify the folder containing the training images
  caption_extension = '.txt'                # Caption file extension
  num_repeats = 10                          # Number of repetitions for training images
Then run:

accelerate launch --num_cpu_threads_per_process 1 sdxl_train_network.py \
	  --pretrained_model_name_or_path=./sdxl/sd_xl_base_1.0.safetensors \
  	--dataset_config=/config.toml \
  	--output_dir=./output \
  	--output_name=lora \
  	--save_model_as=safetensors \
  	--prior_loss_weight=1.0 \
  	--max_train_steps=400 \
  	--vae=madebyollin/sdxl-vae-fp16-fix \
  	--learning_rate=1e-4 \
  	--optimizer_type=AdamW8bit \
  	--xformers \
  	--mixed_precision=fp16 \
  	--cache_latents \
  	--gradient_checkpointing \
  	--save_every_n_epochs=1 \
  	--network_module=networks.lora

The input should be a folder of images with matching caption text files, e.g. foo.jpg should have a foo.txt containing its caption.
Based on https://github.com/kohya-ss/sd-scripts

sdxl-inference

Given an input lora in bind-mounted /input directory, inference is then just:

accelerate launch --num_cpu_threads_per_process 1 sdxl_minimal_inference.py \
	--ckpt_path=sdxl/sd_xl_base_1.0.safetensors \
	--lora_weights=/input/lora.safetensors \
	--prompt="cj hole for sale sign in front of a posh house with a tesla in winter with snow" \
	--output_dir=/output

mistral-finetune

accelerate launch -m axolotl.cli.train examples/mistral/qlora-instruct.yml

mistral-inference

accelerate launch -m axolotl.cli.inference examples/mistral/qlora-instruct.yml

Mac and ARM builds

We should ship builds for Mac and for ARM.

Go is good at cross-compiling binaries, so this might be as simple as:

GOOS=darwin GOARCH=arm64 go build -o build/Darwin-arm64/lilypad
GOOS=darwin GOARCH=amd64 go build -o build/Darwin-x86_64/lilypad
GOOS=linux GOARCH=arm64 go build -o build/Linux-arm64/lilypad
GOOS=linux GOARCH=amd64 go build -o build/Linux-x86_64/lilypad

Maybe even do Windows for extra points? lilypad.exe 😎

[core] Define release strategy

We want to define the release strategy. Currently, we have CI+CD pipelines on changes to main for devnet, with the intention of testing things there and then releasing to testnet (manually). We may want to take a look at Release please for the strategy.

We also want to add

  • semver to releases (that are in sync or work ok with go versioning)

cc @AquiGorka @bgins

Specs: core automations

  • How to update contracts?
  • How to deploy? Manual? Setting the pk in CI seems too risky
  • Does it make sense to have CI+CD for upgrading contracts?
  • Tests

[core] Add tracing to the lilypad job lifecycle (use open telemetry standard)

We would like to trace the events in the job lifecycle to determine where jobs fail or execute slowly. In a broad view, the job lifecycle starts with a job offer and resource offer that are sent to the solver. The solver matches the offers to create a deal. The resource provider executes the job, and sends the job creator a reference to the results.

We can break this task into a few pieces:

  • Job execution on the resource provider (#293)
  • Sending the job offer
  • Sending the resource offer
  • Delivering results
  • On-chain deal events
  • Likely more

Co-authored by @bgins

Specs: how to correctly match Jobs to GPU sizes

Current workaround: RAM hack (use RAM to determine job matching).

The goal of this task is to define how to do the matching correctly, so that when work gets started, whoever takes on the task has clarity on how to do it.

Escaped json values are awkward

We json encode the string inputs to modules (i.e. the -i in lilypad run cowsay:v0.0.1 -i Message=moo) here: https://github.com/bacalhau-project/lilypad/blob/main/pkg/module/utils.go#L191-L199

Users are not trusted by nodes. Nodes should be able to trust module authors. If user input can make arbitrary changes to the job spec, this breaks the trust relationship.

Hence we want to avoid users doing nefarious things like including newlines and quotes in the untrusted user input.

This means that values in the json template, like https://github.com/bacalhau-project/lilypad-module-cowsay/blob/main/lilypad_module.json.tmpl#L16

like {{.Message}}, don't need to be escaped by the user with double quotes, but instead get automatically quoted for them. (Because the json serialization of a string has double quotes around it).

However, this makes it very awkward to substitute a value into an existing string, which many module authors want to do. In fact, we want to do it ourselves e.g. here: https://github.com/bacalhau-project/lilypad-module-sdxl/blob/main/lilypad_module.json.tmpl#L24

It would be much nicer to write

                    "PROMPT={{if .Prompt}}{{.Prompt}}{{else}}question mark floating in space{{end}}",

instead of

                    {{if .PromptEnv}}{{.PromptEnv}}{{else}}"PROMPT=question mark floating in space"{{end}},

And then the user would be able to specify the prompt like lilypad run sdxl:v0.9-lilypad1 -i Prompt="hoo haw" instead of lilypad run sdxl:v0.9-lilypad1 -i PromptEnv="PROMPT=hoo haw" which is sad and annoying. But of course that would be insecure, because the user could write unquoted newlines and double quotes into PromptEnv and mess up the template.

To solve this, we need a way to securely substitute strings into other strings while still guaranteeing correct JSON escaping.

In text/template, which we use to parse the template you can define custom functions: https://pkg.go.dev/text/template#Template.Funcs

So,

  • define a custom function which allows, given an untrusted user input string and, say, a printf style template string, for the module author to write:
                    {{subst "PROMPT=%s" .Prompt}},

Or something like that, where subst would:

  • JSON decode .Prompt (we still want it encoded by default, for security reasons)
  • printf it into the first given argument
  • JSON encode the resulting string for security

[core] Specs: Market matching

Need to put in research here to figure out how to do something like this, find someone that's done this before and copy them (hopefully clone someone that's been audited)

  • Spec allowing prices to float for jobs instead of being fixed by the network.
  • Once the spec is ready, start development on floating prices

[infra] Unit of effort

Part of: #56

Enable different billing for different size jobs within an individual module (steps = 500 vs steps = 50 - maps one-to-one to execution time)

Keep in mind:

  • Abusable (setting infinite steps on all nodes)
  • Important to prevent abuse as this lets us know how much to pay for executing jobs
