
GenAI Quickstart for Games

This project provides a set of quickstarts aimed at accelerating GenAI integration and personalization within live game environments using Google Cloud products and open-source gaming solutions.

In today's gaming industry, providing a personalized and engaging experience for players is crucial. This project offers game developers a set of quickstart resources to help them integrate GenAI capabilities into their live game environments. By leveraging Google Cloud products and open-source gaming solutions, you can enhance player engagement, unlock new use cases with Generative AI, and create memorable gaming experiences.

NOTE: This is a rapidly evolving repo and is being adapted for a variety of use cases. If you would like to contribute or notice any bugs, please open an issue and/or feel free to submit a PR for review.

If you’re using this project, please ★Star this repository to show your interest!

Project Structure

  • terraform: Infrastructure deployment scripts based on Terraform
  • examples: Individual quickstarts that can be tested and deployed based on your use case
  • src: Core source code that is used as part of our quickstarts

Architecture

[Architecture diagram]

Prerequisites

Getting started

The following steps will walk you through the setup guide for the GenAI Quickstart: enabling the required Google Cloud APIs, creating the resources via Terraform, and deploying the Kubernetes manifests needed to run the project.

Note: These steps assume you already have a running project in Google Cloud into which you have IAM permissions to deploy resources.

1) Clone this git repository

git clone https://github.com/googleforgames/GenAI-quickstart.git

cd GenAI-quickstart

2) Set environment variables

Set your unique Project ID for Google Cloud

# To just use your current project
export PROJECT_ID=$(gcloud config list --format 'value(core.project)' 2>/dev/null)

# Otherwise set it to the project you wish to use.
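# For example, to set it explicitly (placeholder ID, replace with your own):
# export PROJECT_ID=your-unique-project-id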

Set default location for Google Cloud

export LOCATION=us-central1

To better follow along with this quickstart guide, set the CUR_DIR env variable:

export CUR_DIR=$(pwd)

3) Confirm user authentication to Google Cloud project

gcloud auth list
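The output lists your credentialed accounts, with the active one starred. Illustrative output only; your account will differ:

       Credentialed Accounts
ACTIVE  ACCOUNT
*       you@example.com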

Verify that your authentication works and that your PROJECT_ID is valid:

gcloud projects describe ${PROJECT_ID:?}

You should see your PROJECT_ID listed with an ACTIVE lifecycle state.
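Representative output (values will differ for your project); the field to check is lifecycleState:

createTime: '2024-01-01T00:00:00.000Z'
lifecycleState: ACTIVE
name: your-unique-project-id
projectId: your-unique-project-id
projectNumber: '123456789012'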

4) Enable Google Cloud APIs

gcloud services enable --project ${PROJECT_ID:?} \
  aiplatform.googleapis.com \
  artifactregistry.googleapis.com \
  cloudbuild.googleapis.com \
  cloudresourcemanager.googleapis.com \
  compute.googleapis.com \
  container.googleapis.com \
  containerfilesystem.googleapis.com \
  containerregistry.googleapis.com \
  iam.googleapis.com \
  servicecontrol.googleapis.com \
  spanner.googleapis.com
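As an optional sanity check (not part of the original guide), you can confirm the APIs were enabled by filtering the list of enabled services:

gcloud services list --enabled --project ${PROJECT_ID:?} | grep -E 'aiplatform|container|spanner'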

5) Deploy infrastructure with Terraform

cd ${CUR_DIR:?}/terraform

cat terraform.example.tfvars | sed -e "s:your-unique-project-id:${PROJECT_ID:?}:g" > terraform.tfvars

terraform init

terraform plan

terraform apply

The deployment of cloud resources can take between 5 and 10 minutes. For a detailed view of the resources deployed, see the README in the terraform directory.
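If you want to inspect what was created before moving on, the standard Terraform commands work here (optional):

# List every resource now under Terraform management
terraform state list

# Show any outputs declared by the configuration
terraform output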

6) Set up GKE credentials

After the cloud resources have been successfully deployed with Terraform, fetch the credentials of the newly created GKE cluster:

gcloud container clusters get-credentials genai-quickstart --region us-central1 --project ${PROJECT_ID:?}

Test your Kubernetes client credentials.

kubectl get nodes
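You should see the cluster's nodes reporting a Ready status. Illustrative output only; node names, count, and versions will vary:

NAME                                   STATUS   ROLES    AGE   VERSION
gke-genai-quickstart-pool-1-abcd1234   Ready    <none>   5m    v1.29.x-gke.x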

7) Deploy GenAI workloads on GKE

Switch to the genai directory and render common templates that use your unique project id.

# Find all files named *.template.yaml, replace `your-unique-project-id` with PROJECT_ID, and output to .yaml.
cd ${CUR_DIR:?}/genai && find common -type f -name "*.template.yaml" -exec \
  bash -c "template_path={}; sed \"s:your-unique-project-id:${PROJECT_ID:?}:g\" < \${template_path} > \${template_path/%.template.yaml/.yaml} " \;
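As an optional sanity check, verify that no rendered manifest still contains the placeholder (it should only remain in the *.template.yaml sources):

grep -rl "your-unique-project-id" common --include="*.yaml" --exclude="*.template.yaml"
# No output means every template rendered with your project ID.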

Build and run GenAI workloads with Skaffold

gcloud auth configure-docker ${LOCATION:?}-docker.pkg.dev

export SKAFFOLD_DEFAULT_REPO=${LOCATION:?}-docker.pkg.dev/${PROJECT_ID:?}/repo-genai-quickstart

cd ${CUR_DIR:?}/genai

# To run all apis and models (requires a GPU node for stable-diffusion)
skaffold run --build-concurrency=0

After workloads are deployed, you can swap to using GPU deployments instead:

# Scale up a 2xL4 Mixtral 8x7B Deployment:
kubectl scale -n genai deployment huggingface-tgi-mixtral-small --replicas=1

# Or scale up a 8xL4 Mixtral 8x7B Deployment:
kubectl scale -n genai deployment huggingface-tgi-mixtral-big --replicas=1

# Scale down CPU Deployment:
kubectl scale -n genai deployment huggingface-tgi-mistral-cpu --replicas=0

# Note that the `huggingface-tgi-api` Service matches all of the huggingface-tgi-*
# Deployments, so if you have multiple replicas running, it will load balance
# between them.
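To watch the swap happen, the usual kubectl status commands apply (optional):

# Watch pods come and go as the Deployments scale (Ctrl+C to stop)
kubectl get pods -n genai -w

# Or block until a specific rollout completes
kubectl rollout status -n genai deployment/huggingface-tgi-mixtral-small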

You can also run the individual backends in isolation:

# To run only stable-diffusion (requires a GPU node)
#skaffold run --module stable-diffusion-api-cfg,stable-diffusion-endpt-cfg

# To run only Vertex chat (Vertex AI is required)
#skaffold run --module vertex-chat-api-cfg
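Either way, you can list which Deployments are present and how many replicas each is running:

kubectl get deployments -n genai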

8) Tests

Access the API - You can test the application and all the APIs from here :)

The cluster creates an internal passthrough Network Load Balancer (ILB). To access the APIs, run:

kubectl port-forward svc/genai-api -n genai 8080:80

then in another window run:

export EXT_IP=localhost:8080
echo "Browse to http://${EXT_IP}/genai_docs to try out the GenAI APIs!"

and then navigate to the URL in your browser.

Test the API using curl:

curl -X 'POST' "http://${EXT_IP}/genai/text" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "Who are the founders of Google?"}'

Or test the API using the api-caller container inside the cluster:

# See available service endpoints. The `genai` endpoint wraps them all.
kubectl get svc -ngenai

# Start `api-caller` pod interactively
kubectl run -it -ngenai --rm --restart=Never api-caller --image=${SKAFFOLD_DEFAULT_REPO}/api-caller:latest

# Examples:

# See available example scripts
root@api-caller:/app# ls
embeddings.py  genai_api.py  huggingface_tgi.py  npc_chat_api.py  stable_diffusion_api.py  vertex_chat_api.py  vertex_code_api.py  vertex_gemini_api.py  vertex_image_api.py  vertex_text_api.py

# The genai_api script works for text prompts
root@api-caller:/app# python3 genai_api.py --endpoint=http://genai-api/genai/text --prompt "Describe a wombat"
INFO:root:Status Code: 200
INFO:root:Response:    "A wombat is a marsupial native to Australia. [...]"

# To try the Smart NPC, first reset the world data:
root@api-caller:/app# python3 npc_chat_api.py --endpoint http://genai-api/genai/npc_chat/reset_world_data --empty
INFO:root:Status Code: 200
INFO:root:Response:    {"status":"ok"}

# Then you can use the interactive chat:
root@api-caller:/app# python3 npc_chat_api.py --endpoint http://genai-api/genai/npc_chat --chat
>>> hey, how are you?
<<< I am doing my best here at the distribution center. It's a tough situation, but I am staying focused on helping those in need. How about you? How are you holding up?

# You can also interact with the services underneath, e.g.: Hugging Face TGI supports an interactive chat
root@api-caller:/app# python3 huggingface_tgi.py --endpoint=http://huggingface-tgi-api:8080/v1
>>> hello!
INFO:httpx:HTTP Request: POST http://huggingface-tgi-api:8080/v1/chat/completions "HTTP/1.1 200 OK"
<<<  Hello! How can I help you today? If you have any questions or need assistance with something, feel free to ask and I'll do my best to help. If you just want to chat, we can talk about pretty much anything. What's on your mind?
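For reference, huggingface_tgi.py talks to TGI's OpenAI-compatible chat completions endpoint. A roughly equivalent raw request is sketched below; this is an assumption based on TGI's /v1/chat/completions API, and the accepted payload may vary with your TGI version:

# Sketch: raw chat completion request against the TGI service (from inside the cluster)
root@api-caller:/app# curl -s http://huggingface-tgi-api:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "tgi", "messages": [{"role": "user", "content": "hello!"}]}'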

Project cleanup

Remove Kubernetes resources

In genai directory

cd ${CUR_DIR:?}/genai

skaffold delete

Remove infrastructure

In terraform directory

cd ${CUR_DIR:?}/terraform

terraform destroy

Troubleshooting

Not authenticated with Google Cloud project

If you are not running the above project in Google Cloud shell, make sure you are logged in and authenticated with your desired project:

gcloud auth application-default login

gcloud config set project ${PROJECT_ID:?}

and follow the authentication flow.
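To confirm that Application Default Credentials are in place afterwards, you can request a token (optional check):

gcloud auth application-default print-access-token > /dev/null && echo "ADC is configured"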


Contributing

The entire repo can be cloned and used as-is, or in many cases you may choose to fork this repo and keep the code base that is most useful and relevant for your use case. If you'd like to contribute, more info can be found within our contributing guide.

genai-quickstart's People

Contributors

dependabot[bot], ggiovanejr, gongmax, igooch, kalaiselvi84, mbychkowski, roberthbailey, smithpatrick12, thisisnotapril, zaratsian, zmerlynn


genai-quickstart's Issues

OS Environment Variables and LLM Calls Should Be Mocked for Tests

What happened?

When running pytest -v in a local virtualenv on the test_main.py files:

  • genai_api/src/test_main.py fails when running pytest because the OS environment variables are not in place.
  • api/vertex_*/src/test_main.py fail when running pytest because there is no mock for Google_Cloud_GenAI, so the tests try to connect to a real project and call an LLM in that project.
  • api/vertex_*/src/test_main.py do not have the correct expected_response, since the endpoints return a single text string (response.text) and not a JSON object ({'mocked_key': 'mocked_value'}).

What you expected to happen:

Running the unit tests should not require setting up environment variables or calls to an external service.

How to reproduce it (as minimally and precisely as possible):

For the genai_api test:

  1. Create and activate a python virtual environment
  2. Run pytest in (myvirtualenv) me@me:~/GenAI-quickstart/genai/api/genai_api/src$ pytest -v

For example on the vertex_chat_api test:

  1. Create and activate a python virtual environment
  2. Run pytest in (myvirtualenv) me@me:~/GenAI-quickstart/genai/api/vertex_chat_api/src$ pytest -v (you'll notice this fails with a 403 warning that the Vertex AI API has not been used in the project)
  3. Change project_id_response.text to the name of your Google Cloud Project.
    project_id = project_id_response.text if project_id_response.status_code == 200 else "Unavailable"
  4. Run pytest again in (myvirtualenv) me@me:~/GenAI-quickstart/genai/api/vertex_chat_api/src$ pytest -v. Now the test fails with AssertionError: assert 'test response' == {'mocked_key': 'mocked_value'}.

Anything else we need to know?:

The value test response comes from the actual call to the LLM. If you run the GenAI Quickstart cluster, navigate to http://${EXT_IP}/genai_docs (following the instructions in the main README), and enter the same payload prompt as the test ("prompt": "test prompt"), the response is test response.

Add Message History Parameter into the Vertex Chat API

As it stands right now, we don't have a way of passing the "messages" field with the chat history to the Vertex chat API.

def vertex_llm_chat(payload: Payload_Vertex_Chat):
    try:
        request_payload = {
            'prompt': payload.prompt,
            'context': payload.context,
            'max_output_tokens': payload.max_output_tokens,
            'temperature': payload.temperature,
            'top_p': payload.top_p,
            'top_k': payload.top_k,
        }
        response = model_vertex_llm_chat.call_llm(**request_payload)
        return response.text

The current workaround is to pass the message history through as context.

if 'chatHistory' in request_payload:
    context += f"\nHere the chat history for {request_payload['characterContext']} that can be used when answering questions:\n"
    seen_chat = []
    for chat in request_payload['chatHistory']:
        if chat not in seen_chat:
            if chat["sender"].upper() in ['USER', request_payload['characterName'].upper()]:
                context += f'{chat["sender"].upper()}: {chat["message"]}\n'
            seen_chat.append(chat)
payload = {
    'prompt': f'''{request_payload["message"]}''',
    'context': context,
}

Based on manual testing I've done with one Vertex Chat API endpoint "chatting" to another Vertex Chat API endpoint, this results in the chat quickly going into a loop.

The most straightforward way to fix this is to add "messages" as an optional field to the Vertex Chat API module, so that the message history does not need to be sent as part of the context. This is in keeping with the Vertex Text Chat schema.

In the future, the API should be updated to be endpoint-agnostic and use the OpenAI chat completion schema.

Additional Documentation on Which Endpoints Use Vertex vs. GKE

We got feedback from a user that it is not clear which endpoints call Vertex vs. which endpoints call an LLM on the GKE cluster, and how to switch between the two.

We should:

  1. Update documentation to make clear which endpoints call Vertex (/genai, /genai/chat, /genai/code, /genai/image, /genai/text, etc.)
  2. Update documentation with instructions on how to switch between running an LLM on GKE (the current default) and running on Vertex in the NPC chat (see the command sketch after this list):
    # GenAI provider - GKEGenAI or VertexAI. Note that switching GenAI implementations switches the
    # embedding model requiring a data regeneration using the /reset_world_data endpoint.
    genai = "GKEGenAI"
    # genai = "VertexAI"
