astronomicaly / realchar

This project forked from shaunwei/realchar

🎙️🤖Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation with AI everywhere (mobile, web and terminal) using LLM OpenAI GPT3.5/4, Anthropic Claude2, Chroma Vector DB, Whisper Speech2Text, ElevenLabs Text2Speech🎙️🤖

Home Page: https://RealChar.ai/

License: MIT License

Shell 0.16% JavaScript 37.57% Python 29.70% Kotlin 10.41% CSS 0.23% Swift 21.35% Mako 0.08% Dockerfile 0.49%

RealChar. - Your Realtime AI Character


RealChar-logo

🎙️🤖Create, customize and talk to your AI Character/Companion in realtime🎙️🤖

✨ Demo

Try our site at RealChar.ai

Not sure how to pronounce RealChar? Listen to this 👉 audio

Demo 1 - with Santa Claus!

Santa_tw.mp4

Demo 2 - with AI Elon about cage fight!

elon-edit-camera.mp4

Demo 3 - with AI Raiden about AI and "real" memory

raiden.mp4

Demo settings: Web, GPT4, ElevenLabs with voice clone, Chroma, Google Speech to Text

🎯 Key Features

  • Easy to use: No coding required to create your own AI character.
  • Customizable: You can customize your AI character's personality, background, and even voice.
  • Realtime: Talk to or message your AI character in realtime.
  • Multi-Platform: You can talk to your AI character on web, terminal and mobile (yes, we open-sourced our mobile app).
  • Most up-to-date AI: We use the most up-to-date AI technology to power your AI character, including OpenAI, Anthropic Claude 2, Chroma, Whisper, ElevenLabs, etc.
  • Modular: You can easily swap out different modules to customize your flow. Less opinionated, more flexible. Great project to start your AI Engineering journey.

🔬 Tech stack

RealChar-tech-stack

📚 Comparison with existing products

📀 Quick Start - Installation via Docker

  1. Create a new .env file

    cp .env.example .env

    Paste your API keys into the .env file. A single ReByte or OpenAI API key is enough to get started.

    You can also configure other API keys if you have them.

  2. Start the app with docker-compose.yaml

    docker compose up

    If you have issues with docker (especially on a non-Linux machine), please refer to https://docs.docker.com/get-docker/ (installation) and https://docs.docker.com/desktop/troubleshoot/overview/ (troubleshooting).

  3. Open http://localhost:3000 and enjoy the app!
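
Before running docker compose up, it can help to confirm that .env actually holds at least one LLM key. The following is a minimal sketch and not part of RealChar: the variable names REBYTE_API_KEY and OPENAI_API_KEY are assumptions, so check .env.example for the names the project really reads. The parser only handles plain KEY=VALUE lines and does not strip inline comments.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comment lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Assumed variable names; confirm them against .env.example.
LLM_KEYS = ("REBYTE_API_KEY", "OPENAI_API_KEY")


def has_llm_key(env: dict) -> bool:
    """Return True if at least one of the assumed LLM keys is non-empty."""
    return any(env.get(name) for name in LLM_KEYS)


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env found; run: cp .env.example .env")
    elif has_llm_key(parse_env(env_path.read_text())):
        print("Found an LLM API key in .env")
    else:
        print("Set one of:", ", ".join(LLM_KEYS))
```

Run it from the project root, next to the .env file.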

💿 Developers - Installation via Python

  • Step 1. Clone the repo

    git clone https://github.com/Shaunwei/RealChar.git && cd RealChar
  • Step 2. Install requirements

    Install portaudio and ffmpeg for audio

    # for mac
    brew install portaudio
    brew install ffmpeg
    # for ubuntu
    sudo apt update
    sudo apt install portaudio19-dev
    sudo apt install ffmpeg

    Note:

    • ffmpeg>=4.4 is needed to work with torchaudio>=2.1.0

    • Mac users may need to add ffmpeg library path to DYLD_LIBRARY_PATH for torchaudio to work:

      export DYLD_LIBRARY_PATH=/opt/homebrew/lib:$DYLD_LIBRARY_PATH

    Then install all python requirements

    pip install -r requirements.txt

    If you need a faster local speech to text, install whisperX

    pip install git+https://github.com/m-bain/whisperx.git
  • Step 3. Create an empty sqlite database if you have not done so before

    sqlite3 test.db "VACUUM;"
  • Step 4. Run db upgrade

    alembic upgrade head

    This ensures your database schema is up to date. Run it every time you pull the main branch.

  • Step 5. Setup .env:

    cp .env.example .env

    Update API keys and configs following the instructions in the .env file.

    Note that some features require a working login system. You can get your own OAuth2 login for free with Firebase if needed. To enable, set USE_AUTH to true and fill in the FIREBASE_CONFIG_PATH field. Also fill in Firebase configs in client/next-web/.env.

  • Step 6. Run backend server with cli.py or use uvicorn directly

    python cli.py run-uvicorn
    # or
    uvicorn realtime_ai_character.main:app
  • Step 7. Run frontend client:

    • web client:

      Create a .env file under client/next-web/

      cp client/next-web/.env.example client/next-web/.env

      Adjust .env according to the instructions in client/next-web/README.md.

      Start the frontend server:

      python cli.py next-web-dev
      # or
      cd client/next-web && npm run dev
      # or
      cd client/next-web && npm run build && npm run start

      After running these commands, a local development server will start, and your default web browser will open a new tab/window pointing to this server (usually http://localhost:3000).

    • (Optional) Terminal client:

      Run the following command in your terminal

      python client/cli.py
    • (Optional) mobile client:

      Open client/mobile/ios/rac/rac.xcodeproj/project.pbxproj in Xcode and run the app

  • Step 8. Select a character to talk to, then start talking. Use GPT4 for better conversation, and wear headphones for the best audio (this avoids echo).

Note: if you want to connect to a RealChar server remotely, SSL setup is required to establish the audio connection.
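
The note in Step 2 says ffmpeg>=4.4 is needed for torchaudio>=2.1.0. A small check like the one below can confirm the installed version before you debug audio problems. This is a sketch under one assumption: that the first line of ffmpeg -version starts with "ffmpeg version" followed by a dotted release number, which holds for release builds but not for every nightly or git build.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

MIN_FFMPEG = (4, 4)  # minimum taken from the note in Step 2


def parse_ffmpeg_version(banner: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the output of `ffmpeg -version`.

    Returns None when the banner has no dotted release number,
    which can happen with some nightly or git builds.
    """
    match = re.search(r"ffmpeg version n?(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def ffmpeg_is_new_enough(banner: str) -> bool:
    """True only when a version was parsed and it meets the minimum."""
    version = parse_ffmpeg_version(banner)
    return version is not None and version >= MIN_FFMPEG


if __name__ == "__main__":
    if shutil.which("ffmpeg") is None:
        print("ffmpeg not found on PATH")
    else:
        out = subprocess.run(
            ["ffmpeg", "-version"], capture_output=True, text=True, check=False
        ).stdout
        print("ok" if ffmpeg_is_new_enough(out) else "ffmpeg >= 4.4 not confirmed")
```

If the version cannot be parsed, the script reports "not confirmed" rather than guessing.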

👨‍🚀 API Keys and Configurations

1. LLMs

1.1 ReByte API Key

To get your ReByte API key, follow these steps:

  1. Go to the ReByte website and sign up for an account if you haven't already.
  2. Once you're logged in, go to Settings > API Keys.
  3. Generate a new API key by clicking on the "Generate" button.

1.2 (Optional) OpenAI API Token

This application uses the OpenAI API to access its language models. To use the OpenAI API, you will need to obtain an API token.

To get your OpenAI API token, follow these steps:

  1. Go to the OpenAI website and sign up for an account if you haven't already.
  2. Once you're logged in, navigate to the API keys page.
  3. Generate a new API key by clicking on the "Create API Key" button.

(Optional) To use the Azure OpenAI API instead, follow these steps:

  1. Set the API type in your .env file: OPENAI_API_TYPE=azure

If you want to use the earlier version 2023-03-15-preview:

OPENAI_API_VERSION=2023-03-15-preview

  2. Set the base URL for your Azure OpenAI resource. You can find this in the Azure portal under your Azure OpenAI resource.

OPENAI_API_BASE=https://your-base-url.openai.azure.com

  3. Set the OpenAI model deployment name for your Azure OpenAI resource.

OPENAI_API_MODEL_DEPLOYMENT_NAME=gpt-35-turbo-16k

  4. Set the OpenAIEmbeddings model deployment name for your Azure OpenAI resource.

OPENAI_API_EMBEDDING_DEPLOYMENT_NAME=text-embedding-ada-002
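
Since the Azure setup spans several variables, a quick completeness check can catch a forgotten one. The sketch below uses only the variable names listed above. Treating the base URL and both deployment names as required whenever OPENAI_API_TYPE is azure is this example's reading of the steps, not something the project is confirmed to enforce. OPENAI_API_VERSION is left out because the steps present it as optional.

```python
import os

# Variables the steps above ask you to set for Azure.
AZURE_REQUIRED = (
    "OPENAI_API_BASE",
    "OPENAI_API_MODEL_DEPLOYMENT_NAME",
    "OPENAI_API_EMBEDDING_DEPLOYMENT_NAME",
)


def missing_azure_settings(env: dict) -> list:
    """Return the Azure variables that are unset or empty.

    Returns an empty list when OPENAI_API_TYPE is not "azure",
    because the Azure-specific variables are then not needed.
    """
    if env.get("OPENAI_API_TYPE", "").lower() != "azure":
        return []
    return [name for name in AZURE_REQUIRED if not env.get(name)]


if __name__ == "__main__":
    missing = missing_azure_settings(dict(os.environ))
    print("Missing: " + ", ".join(missing) if missing else "Azure settings look complete")
```

The check reads the process environment, so export the values or load your .env before running it.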

1.3 (Optional) Anthropic(Claude 2) API Token

To get your Anthropic API token, follow these steps:

  1. Go to the Anthropic website and sign up for an account if you haven't already.
  2. Once you're logged in, navigate to the API keys page.
  3. Generate a new API key by clicking on the "Create Key" button.

1.4 (Optional) Anyscale API Token

To get your Anyscale API token, follow these steps:

  1. Go to the Anyscale website and sign up for an account if you haven't already.
  2. Once you're logged in, navigate to the Credentials page.
  3. Generate a new API key by clicking on the "Generate credential" button.

2. Speech to Text

We support faster-whisper and whisperX as local speech to text engines. Both work with CPU and NVIDIA GPU.

2.1 (Optional) Google Speech-to-Text API

To get your Google Cloud API credentials.json, follow these steps:

  1. Go to the GCP website and sign up for an account if you haven't already.
  2. Follow the guide to create a project and enable Speech to Text API
  3. Put google_credentials.json in the root folder of this project. For how to obtain the file, see Google's "Create and delete service account keys" guide.
  4. Change SPEECH_TO_TEXT_USE to use GOOGLE in your .env file
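
If Google Speech to Text fails to start, a common cause is a credentials file that is missing or is not a service account key. The sketch below checks for the fields a Google Cloud service account key file normally contains. It is an illustration only: it does not contact Google, and RealChar itself may load the file differently.

```python
import json
from pathlib import Path

# Fields normally present in a Google Cloud service account key file.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def credential_problems(data: dict) -> list:
    """Return human-readable problems; an empty list means it looks fine."""
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in data]
    # A present but different type means this is some other kind of credential.
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {data['type']}")
    return problems


if __name__ == "__main__":
    path = Path("google_credentials.json")  # project root, per the steps above
    if not path.exists():
        print(f"{path} not found in the current directory")
    else:
        problems = credential_problems(json.loads(path.read_text()))
        print("\n".join(problems) if problems else "looks like a service account key")
```

The same check applies to the Text to Speech setup, which uses the same google_credentials.json file.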

2.2 (Optional) OpenAI Whisper API

Uses the same key as the OpenAI API Token in section 1.2.

3. Text to Speech

Edge TTS is the default and is free to use.

3.1 (Optional) ElevenLabs API Key

  1. Create an ElevenLabs account

    Visit ElevenLabs to create an account. You'll need this to access the text to speech and voice cloning features.

  2. Get an API key from your Profile Settings.

3.2 (Optional) Google Text-to-Speech API

To get your Google Cloud API credentials.json, follow these steps:

  1. Go to the GCP website and sign up for an account if you haven't already.
  2. Follow the guide to create a project and enable Text to Speech API
  3. Put google_credentials.json in the root folder of this project. For how to obtain the file, see Google's "Create and delete service account keys" guide.

(Optional) 🔥 Create Your Own Characters

Create Characters Locally

see realtime_ai_character/character_catalog/README.md

Create Characters on ReByte.ai

see docs/rebyte_agent_clone_instructions.md

(Optional) ☎️ Twilio Integration

To use Twilio with RealChar, you need to set up a Twilio account. Then, fill in the following environment variables in your .env file:

TWILIO_ACCOUNT_SID=YOUR_TWILIO_ACCOUNT_SID
TWILIO_ACCESS_TOKEN=YOUR_TWILIO_ACCESS_TOKEN
DEFAULT_CALLOUT_NUMBER=YOUR_PHONE_NUMBER

You'll also need to install torch and torchaudio to use Twilio.

Now, you can receive phone calls from your characters by typing /call YOURNUMBER in the text box when chatting with your character.

Note: only US phone numbers and ElevenLabs voiced characters are supported at the moment.
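
Because only US numbers are supported, it may be worth checking a number's shape before using it with /call or DEFAULT_CALLOUT_NUMBER. The sketch below is hypothetical: the README does not say which number format RealChar expects, so the +1XXXXXXXXXX output form is an assumption. It is also a format check only, since country code 1 is shared with Canada and other countries in the same numbering plan.

```python
import re
from typing import Optional

# Variable names taken from the Twilio section above.
TWILIO_VARS = ("TWILIO_ACCOUNT_SID", "TWILIO_ACCESS_TOKEN", "DEFAULT_CALLOUT_NUMBER")


def missing_twilio_settings(env: dict) -> list:
    """Return the Twilio variables that are unset or empty."""
    return [name for name in TWILIO_VARS if not env.get(name)]


def normalize_us_number(raw: str) -> Optional[str]:
    """Return the number as +1XXXXXXXXXX, or None if it does not have
    10 digits (optionally preceded by country code 1)."""
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    return "+1" + digits


if __name__ == "__main__":
    print(normalize_us_number("(415) 555-0123"))
    print(normalize_us_number("+44 20 7946 0958"))
```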

🆕! Anyscale and LangSmith integration

Anyscale

You can now use Anyscale Endpoint to serve Llama-2 models in your RealChar easily! Simply register an account with Anyscale Endpoint. Once you get the API key, set this environment variable in your .env file:

ANYSCALE_ENDPOINT_API_KEY=<your API Key>

By default, we show the largest servable Llama-2 model (70B) in the Web UI. You can change the model name (meta-llama/Llama-2-70b-chat-hf) to other models, e.g. 13b or 7b versions.
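
The 13b and 7b variants follow the same naming pattern as the default model name. The helper below is a hypothetical illustration of that pattern and is not a function from RealChar. Where the model name is configured in the project is not covered here.

```python
# Sizes of the Llama-2 chat models mentioned above.
LLAMA2_CHAT_SIZES = ("7b", "13b", "70b")


def llama2_chat_model(size: str = "70b") -> str:
    """Build a Llama-2 chat model name such as meta-llama/Llama-2-70b-chat-hf."""
    size = size.lower()
    if size not in LLAMA2_CHAT_SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {LLAMA2_CHAT_SIZES}")
    return f"meta-llama/Llama-2-{size}-chat-hf"


if __name__ == "__main__":
    for size in LLAMA2_CHAT_SIZES:
        print(llama2_chat_model(size))
```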

LangSmith

If you have access to LangSmith, you can enable it by setting LANGCHAIN_TRACING_V2 to true and filling in the other environment variables:

LANGCHAIN_TRACING_V2=false # default off
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_API_KEY=YOUR_LANGCHAIN_API_KEY
LANGCHAIN_PROJECT=YOUR_LANGCHAIN_PROJECT

And it should work out of the box.
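
To make the on/off switch explicit, the four LangSmith settings can be generated rather than edited by hand. This is a small sketch, not project code: the function name is made up for illustration, and the endpoint is the one shown above.

```python
def langsmith_env_lines(api_key: str, project: str, enabled: bool = True) -> list:
    """Render the four LangSmith settings as .env lines."""
    return [
        f"LANGCHAIN_TRACING_V2={'true' if enabled else 'false'}",
        "LANGCHAIN_ENDPOINT=https://api.smith.langchain.com",
        f"LANGCHAIN_API_KEY={api_key}",
        f"LANGCHAIN_PROJECT={project}",
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own key and project name.
    print("\n".join(langsmith_env_lines("YOUR_LANGCHAIN_API_KEY", "YOUR_LANGCHAIN_PROJECT")))
```

Paste the printed lines into your .env file in place of the existing LangSmith entries.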


📍 Roadmap

  • Launch v0.0.4
  • Create a new character via web UI
  • Lower conversation latency
  • Support Twilio
  • Support ReByte
  • Persistent conversation*
  • Session management*
  • Support RAG*
  • Support Agents/GPTs*
  • Add additional TTS service*

* These features are powered by the ReByte platform.

🫶 Contribute to RealChar

Please check out our Contribution Guide!

💪 Contributors

🎲 Community
