License: Apache License 2.0


Call center claim AI phone bot

AI-powered call center solution with Azure and OpenAI GPT.


Overview

A French demo is available on YouTube. Feel free to watch it at 1.5× speed to get a quick overview of the project.

French demo

Main interactions shown in the demo:

  1. User calls the call center
  2. The bot answers and the conversation starts
  3. The bot stores conversation, claim and todo list in the database

Extract of the data stored during the call:

{
  "claim": {
    "incident_date_time": "2024-01-11T19:33:41",
    "incident_description": "The vehicle began to travel with a burning smell and the driver pulled over to the side of the freeway.",
    "policy_number": "B01371946",
    "policyholder_phone": "[number masked for the demo]",
    "policyholder_name": "Clémence Lesne",
    "vehicle_info": "Ford Fiesta 2003"
  },
  "reminders": [
    {
      "description": "Check that all the information in Clémence Lesne's file is correct and complete.",
      "due_date_time": "2024-01-18T16:00:00",
      "title": "Check Clémence file"
    }
  ]
}

Features

Note

This project is a proof of concept and is not intended for production use. It demonstrates how Azure Communication Services, Azure Cognitive Services and Azure OpenAI can be combined to build an automated call center solution.

  • Access the claim on a public website
  • Access the customer conversation history
  • Allow the user to change the language of the conversation
  • Bot can be called from a phone number
  • Bot uses multiple voice tones (e.g. happy, sad, neutral) to keep the conversation engaging
  • Company products (i.e. a lexicon) can be understood by the bot (e.g. the name of a specific insurance product)
  • Creates a to-do list of tasks on its own to complete the claim
  • Customizable prompts
  • Hands the call over to a human agent when needed
  • Filters out inappropriate content from the LLM, like profanity or competitor company names
  • Fine understanding of the customer request with GPT-4 Turbo
  • Follows a specific data schema for the claim
  • Has access to a documentation database (few-shot training / RAG)
  • Helps the user find the information needed to complete the claim
  • Lowers AI Search cost by using a Redis cache
  • Monitoring and tracing with Application Insights
  • Responses are streamed from the LLM to the user, to avoid long pauses
  • Sends an SMS report after the call
  • Resumes a conversation after a handover
  • Calls the user back when needed
  • Simulates an IVR workflow

User report after the call

A report is available at https://[your_domain]/report/[phone_number] (like http://localhost:8080/report/%2B133658471534). It shows the conversation history, claim data and reminders.
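Note that the `+` of an E.164 phone number must be percent-encoded in the URL (hence the `%2B` in the example). A minimal sketch of building the report URL, with a placeholder domain:

```python
from urllib.parse import quote

def report_url(domain: str, phone_number: str) -> str:
    """Build the per-caller report URL, percent-encoding the E.164 '+'."""
    return f"https://{domain}/report/{quote(phone_number, safe='')}"

print(report_url("example.com", "+33612345678"))
# https://example.com/report/%2B33612345678
```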

User report

High level architecture

---
title: System diagram (C4 model)
---
graph
  user(["User"])
  agent(["Agent"])

  api["Claim AI"]

  api -- Transfer to --> agent
  api -. Send voice .-> user
  user -- Call --> api

Component level architecture

---
title: Claim AI component diagram (C4 model)
---
graph LR
  agent(["Agent"])
  user(["User"])

  subgraph "Claim AI"
    ai_search[("RAG\n(AI Search)")]
    api["API"]
    communication_service_sms["SMS gateway\n(Communication Services)"]
    communication_service["Call gateway\n(Communication Services)"]
    content_safety["Moderation\n(Content Safety)"]
    db[("Conversations and claims\n(Cosmos DB or SQLite)")]
    event_grid[("Broker\n(Event Grid)")]
    gpt["GPT-4 Turbo\n(OpenAI)"]
    redis[("Cache\n(Redis)")]
    translation["Translation\n(Cognitive Services)"]
  end

  api -- Answer with text --> communication_service
  api -- Ask for translation --> translation
  api -- Few-shot training --> ai_search
  api -- Generate completion --> gpt
  api -- Get cached data --> redis
  api -- Save conversation --> db
  api -- Send SMS report --> communication_service_sms
  api -- Test for profanity --> content_safety
  api -- Transfer to agent --> communication_service
  api -. Watch .-> event_grid

  communication_service -- Notifies --> event_grid
  communication_service -- Transfer to --> agent
  communication_service -. Send voice .-> user

  communication_service_sms -- Send SMS --> user

  user -- Call --> communication_service

Sequence diagram

sequenceDiagram
    autonumber

    actor Customer
    participant PSTN
    participant Text to Speech
    participant Speech to Text
    actor Human agent
    participant Event Grid
    participant Communication Services
    participant Content Safety
    participant API
    participant Cosmos DB
    participant OpenAI GPT
    participant AI Search

    API->>Event Grid: Subscribe to events
    Customer->>PSTN: Initiate a call
    PSTN->>Communication Services: Forward call
    Communication Services->>Event Grid: New call event
    Event Grid->>API: Send event to event URL (HTTP webhook)
    activate API
    API->>Communication Services: Accept the call and give inbound URL
    deactivate API
    Communication Services->>Speech to Text: Transform speech to text

    Communication Services->>API: Send text to the inbound URL
    activate API
    alt First call
        API->>Communication Services: Send static SSML text
    else Callback
        API->>AI Search: Gather training data
        API->>OpenAI GPT: Ask for a completion
        OpenAI GPT-->>API: Answer (HTTP/2 SSE)
        loop Over buffer
            loop Over multiple tools
                alt Is this a claim data update?
                    API->>Content Safety: Ask for safety test
                    alt Is the text safe?
                        API->>Communication Services: Send dynamic SSML text
                    end
                    API->>Cosmos DB: Update claim data
                else Does the user want the human agent?
                    API->>Communication Services: Send static SSML text
                    API->>Communication Services: Transfer to a human
                    Communication Services->>Human agent: Call the phone number
                else Should we end the call?
                    API->>Communication Services: Send static SSML text
                    API->>Communication Services: End the call
                end
            end
            alt Is there a text?
                alt Is there enough text to make a sentence?
                    API->>Content Safety: Ask for safety test
                    alt Is the text safe?
                        API->>Communication Services: Send dynamic SSML text
                    end
                end
            end
        end
        API->>Cosmos DB: Persist conversation
    end
    deactivate API
    Communication Services->>PSTN: Send voice
    PSTN->>Customer: Forward voice

Remote deployment

A container image is available on GitHub Container Registry, at:

  • Latest version from a branch: ghcr.io/clemlesne/claim-ai-phone-bot:main
  • Specific tag: ghcr.io/clemlesne/claim-ai-phone-bot:0.1.0 (recommended)

Create a local config.yaml file (most of the fields are filled automatically by the deployment script):

# config.yaml
workflow:
  agent_phone_number: "+33612345678"
  bot_company: Contoso
  bot_name: Robert
  lang: {}

communication_service:
  phone_number: "+33612345678"

prompts:
  llm: {}
  tts: {}

Steps to deploy:

  1. Create a Communication Services resource and a Phone Number with inbound call capability; make sure the resource has a managed identity
  2. Create the local config.yaml file (like the example above)
  3. Connect to your Azure environment (e.g. az login)
  4. Run deployment with make deploy name=my-instance
  5. Wait for the deployment to finish (if it fails with a 'null' not found error, retry the command)
  6. Link the AI multi-service account named [my-instance]-communication to the Communication Services resource
  7. Create an AI Search index named trainings

Get the logs with make logs name=my-instance.

Local installation

Prerequisites

Place a file called config.yaml in the root of the project with the following content:

# config.yaml
monitoring:
  application_insights:
    connection_string: xxx

resources:
  public_url: "https://xxx.blob.core.windows.net/public"

workflow:
  agent_phone_number: "+33612345678"
  bot_company: Contoso
  bot_name: Robert

communication_service:
  access_key: xxx
  endpoint: https://xxx.france.communication.azure.com
  phone_number: "+33612345678"

cognitive_service:
  # Must be of type "AI services multi-service account"
  endpoint: https://xxx.cognitiveservices.azure.com

openai:
  api_key: xxx
  endpoint: https://xxx.openai.azure.com
  gpt_backup_context: 16385
  gpt_backup_deployment: gpt-35-turbo-1106
  gpt_backup_model: gpt-35-turbo-1106
  gpt_context: 128000
  gpt_deployment: gpt-4-1106-preview
  gpt_model: gpt-4-1106-preview

ai_search:
  access_key: xxx
  endpoint: https://xxx.search.windows.net
  index: trainings
  semantic_configuration: default

content_safety:
  access_key: xxx
  endpoint: https://xxx.cognitiveservices.azure.com

To use a Service Principal to authenticate to Azure, you can also add the following in a .env file:

AZURE_CLIENT_ID=xxx
AZURE_CLIENT_SECRET=xxx
AZURE_TENANT_ID=xxx

To override a specific configuration value, you can also use environment variables. For example, to override the openai.endpoint value, you can use the OPENAI__ENDPOINT variable:

OPENAI__ENDPOINT=https://xxx.openai.azure.com
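The double-underscore convention maps an environment variable onto a nested configuration key. A sketch of how such an override could be applied to a plain dict-based config (the project's actual loader may differ):

```python
import os

def apply_env_overrides(config: dict, environ=os.environ) -> dict:
    """Override nested config keys from variables like OPENAI__ENDPOINT."""
    for name, value in environ.items():
        if "__" not in name:
            continue
        node = config
        *parents, leaf = [part.lower() for part in name.split("__")]
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config

config = {"openai": {"endpoint": "https://old.openai.azure.com"}}
apply_env_overrides(config, {"OPENAI__ENDPOINT": "https://xxx.openai.azure.com"})
print(config["openai"]["endpoint"])  # https://xxx.openai.azure.com
```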

Then run:

# Install dependencies
make install

Also, a public file server is needed to host the audio files. Upload the files with make copy-resources name=myinstance (myinstance is the storage account name), or manually.

For reference, the resources folder contains:

Run

Finally, in two different terminals, run:

# Expose the local server to the internet
make tunnel
# Start the local API server
make dev

Advanced usage

Add my custom training data with AI Search

Training data is stored on AI Search to be retrieved by the bot, on demand.

Required index schema:

Field Name  Type                    Retrievable  Searchable  Dimensions  Vectorizer
id          Edm.String              Yes          No          -           -
content     Edm.String              Yes          Yes         -           -
source_uri  Edm.String              Yes          No          -           -
title       Edm.String              Yes          Yes         -           -
vectors     Collection(Edm.Single)  No           No          1536        OpenAI ADA
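A hedged sketch of one document matching this schema, built as a plain dict; the endpoint, key, title, content and embedding values are placeholders, and the upload call (commented) assumes the azure-search-documents SDK is installed and a live service exists:

```python
def make_training_doc(doc_id: str, title: str, content: str,
                      source_uri: str, vectors: list[float]) -> dict:
    """Build one document matching the 'trainings' index schema."""
    return {
        "id": doc_id,
        "content": content,
        "source_uri": source_uri,
        "title": title,
        "vectors": vectors,  # 1536-dim OpenAI ADA embedding of `content`
    }

doc = make_training_doc("doc-1", "Coverage policy",
                        "Water damage is covered when reported within 5 days.",
                        "https://example.com/policy.pdf", [0.0] * 1536)

# Upload with azure-search-documents (placeholder endpoint and key):
# from azure.core.credentials import AzureKeyCredential
# from azure.search.documents import SearchClient
# client = SearchClient("https://xxx.search.windows.net", "trainings",
#                       AzureKeyCredential("xxx"))
# client.upload_documents(documents=[doc])
```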

An example is available at examples/import-training.ipynb. It shows how to import training data from a dataset of PDF files.

Customize the prompts

Note that the prompt examples contain {xxx} placeholders. These placeholders are replaced by the bot with the corresponding data. For example, {bot_name} is internally replaced by the bot's name.

Be sure to write all the TTS prompts in English. This language is used as the pivot language for conversation translation.

# config.yaml
[...]

prompts:
  tts:
    hello_tpl: |
      Hello, I'm {bot_name}, from {bot_company}! I'm an IT support specialist.

      Here's how I work: when I'm working, you'll hear a little music; then, at the beep, it's your turn to speak. You can speak to me naturally, I'll understand.

      Examples:
      - "I've got a problem with my computer, it won't turn on".
      - "The external screen is flashing, I don't know why".

      What's your problem?
  llm:
    default_system_tpl: |
      Assistant is called {bot_name} and is in a call center for the company {bot_company} as an expert with 20 years of experience in IT service.

      # Context
      Today is {date}. Customer is calling from {phone_number}. Call center number is {bot_phone_number}.
    chat_system_tpl: |
      # Objective
      Assistant will provide internal IT support to employees. Assistant requires data from the employee to provide IT support. The assistant's role is not over until the issue is resolved or the request is fulfilled.

      # Rules
      - Answers in {default_lang}, even if the customer speaks another language
      - Cannot talk about any topic other than IT support
      - Is polite, helpful, and professional
      - Rephrase the employee's questions as statements and answer them
      - Use additional context to enhance the conversation with useful details
      - When the employee says a word and then spells out letters, this means that the word is written in the way the employee spelled it (e.g. "I work in Paris PARIS", "My name is John JOHN", "My email is Clemence CLEMENCE at gmail GMAIL dot com COM")
      - You work for {bot_company}, not someone else

      # Required employee data to be gathered by the assistant
      - Department
      - Description of the IT issue or request
      - Employee name
      - Location

      # General process to follow
      1. Gather information to know the employee's identity (e.g. name, department)
      2. Gather details about the IT issue or request to understand the situation (e.g. description, location)
      3. Provide initial troubleshooting steps or solutions
      4. Gather additional information if needed (e.g. error messages, screenshots)
      5. Be proactive and create reminders for follow-up or further assistance

      # Support status
      {claim}

      # Reminders
      {reminders}
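The placeholders are standard Python `str.format` fields; a minimal sketch of the substitution (the project's real rendering may pass more variables):

```python
# First line of the hello_tpl prompt, with its placeholders filled in:
hello_tpl = "Hello, I'm {bot_name}, from {bot_company}! I'm an IT support specialist."
print(hello_tpl.format(bot_name="Robert", bot_company="Contoso"))
# Hello, I'm Robert, from Contoso! I'm an IT support specialist.
```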

Customize the languages

The bot can be used in multiple languages and understands the language the user chooses.

See the list of supported languages for the Text-to-Speech service.

# config.yaml
[...]

workflow:
  lang:
    default_short_code: "fr-FR"
    availables:
      - pronunciations_en: ["French", "FR", "France"]
        short_code: "fr-FR"
        voice_name: "fr-FR-DeniseNeural"
      - pronunciations_en: ["Chinese", "ZH", "China"]
        short_code: "zh-CN"
        voice_name: "zh-CN-XiaoxiaoNeural"

Customize the moderation levels

Levels are defined for each Content Safety category. Scores range from 0 to 7; the higher the score, the stricter the moderation.

Moderation is applied on all bot data, including the web page and the conversation.

# config.yaml
[...]

content_safety:
  category_hate_score: 0
  category_self_harm_score: 0
  category_sexual_score: 5
  category_violence_score: 0
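Content Safety reports a severity from 0 to 7 per category. One plausible reading of these settings (an assumption, not the project's confirmed logic) is that a configured strictness s rejects any text whose detected severity exceeds 7 - s:

```python
# Hypothetical interpretation: configured strictness s (0-7) rejects text whose
# detected Content Safety severity for that category exceeds 7 - s.
THRESHOLDS = {"hate": 0, "self_harm": 0, "sexual": 5, "violence": 0}

def is_safe(detected: dict[str, int], strictness: dict[str, int] = THRESHOLDS) -> bool:
    return all(detected.get(cat, 0) <= 7 - s for cat, s in strictness.items())

print(is_safe({"sexual": 3}))  # False: severity 3 > 7 - 5
print(is_safe({"hate": 6}))    # True: severity 6 <= 7 - 0
```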

Customize the claim data schema

Customizing the data schema is not yet supported through the configuration file. However, you can customize it by modifying the application source code.

The data schema is defined in models/claim.py. All editable fields must be Optional (except the immutable fields).

# models/claim.py
from datetime import datetime
from typing import Optional

# Assumed imports; the project may define or source these types elsewhere
from pydantic import BaseModel, EmailStr
from pydantic_extra_types.phone_numbers import PhoneNumber

class ClaimModel(BaseModel):
    # Immutable fields
    # [...]
    # Editable fields
    # Editable fields
    additional_notes: Optional[str] = None
    device_info: Optional[str] = None
    error_messages: Optional[str] = None
    follow_up_required: Optional[bool] = None
    incident_date_time: Optional[datetime] = None
    issue_description: Optional[str] = None
    resolution_details: Optional[str] = None
    steps_taken: Optional[str] = None
    ticket_id: Optional[str] = None
    user_email: Optional[EmailStr] = None
    user_name: Optional[str] = None
    user_phone: Optional[PhoneNumber] = None

    # Depending on requirements, you might also include fields for:
    # - Software version
    # - Operating system
    # - Network details (if relevant to the issue)
    # - Any attachments like screenshots or log files (consider how to handle binary data)

    # Built-in functions
    # [...]
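A sketch of how tool-call parameters returned by the LLM could be validated against such a schema. To keep the example dependency-light it uses a simplified stand-in model with only three fields (the EmailStr and PhoneNumber types are omitted):

```python
from datetime import datetime
from typing import Optional

from pydantic import BaseModel

class SimpleClaim(BaseModel):
    """Simplified stand-in for models/claim.py (most fields omitted)."""
    incident_date_time: Optional[datetime] = None
    issue_description: Optional[str] = None
    user_name: Optional[str] = None

# Tool-call parameters as the LLM might return them:
params = {"incident_date_time": "2024-01-11T19:33:41",
          "issue_description": "Laptop will not boot.",
          "user_name": "Clémence Lesne"}
claim = SimpleClaim(**params)  # ISO string is coerced to datetime
print(claim.incident_date_time.year)  # 2024
```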


Contributors

clemlesne, dependabot[bot]


claim-ai-phone-bot's Issues

Ability to scale

Externalize the Event Grid subscription to an automation (e.g. the Makefile).

Ability to send images by chat

Add multichannel ability to the conversation. Allow the user to send an image to the bot, have it analyzed, and the result described by voice to the customer.

Generate the configuration automatically during an IaC deployment

As of today, when a new instance is deployed, the configuration file is taken from the local directory and sent directly to the application.

Issues are many:

  • It is difficult to know the URLs of the services before they are created
  • The process requires two deployments: first deploy (the app crashes because the config is not aligned with the infra), then update the config, and finally redeploy

Perspectives:

  • The Bicep deployment overrides some values from the config file; values from environment variables are prioritized
  • Bicep creates a JSON config autonomously

Web page generated for end user

For simplicity, this can be a static webpage generated from an endpoint of the API server. Example with Hugo.

Endpoint: /call/review/{call_id}
UI framework: Tailwind CSS

Allow end-user access the report

The report could be accessed from a URL known in advance.

Like:

  • /report/{claim_id}: easy to understand; the claim ID can be given by the bot itself
  • /report/{phone_number}: the end user knows the URL even without calling the bot, but history is lost every time

Or, create a history page in addition to the report:

  • /report/{phone_number}
  • /report/{phone_number}/{call_id}

Questions:

  • Is it useful to display the claim ID in the URL?
  • Do we use the phone number from the call (from Communication Service) or from the claim (reported by the user)?

Sometimes text is still truncated

Logs:

INFO:     10.240.4.232:0 - "POST /call/event/86494159-a2ed-484a-adfa-38a7a0ba4a1a HTTP/1.1" 204 No Content
INFO:main:Tool call new_or_updated_reminder with parameters {"description":"Le client, Martin Sliwka, a besoin d'\u00eatre relog\u00e9 temporairement car son appartement est inhabitable suite \u00e0 un d\u00e9g\u00e2t des eaux. V\u00e9rifier les options de relogement selon les termes de son contrat d'assurance habitation.","due_date_time":"2024-01-13T13:00:00Z","title":"Contacter client pour relogement","customer_response":"Je vais cr\u00e9er un rappel pour notre \u00e9quipe afin qu'elle vous contacte pour discuter des solutions de relogement disponibles selon les termes de votre contrat d'assurance habitation."}
INFO:main:Chat (86494159-a2ed-484a-adfa-38a7a0ba4a1a): content="Je comprends que votre appartement est actuellement inhabitable et que vous avez besoin d'un logement temporaire. Je vais créer un rappel pour notre équipe afin qu'elle vous contacte pour discuter des solutions de relogement disponibles selon les termes de votre contrat d'assurance habitation. Je vais créer un rappel pour notre équipe afin qu'elle vous contacte pour discuter des solutions de relogement disponibles selon les termes de votre contrat d'assurance habitation. " intent=<Indent.NEW_OR_UPDATED_REMINDER: 'new_or_updated_reminder'>
WARNING:main:Text is too long to be processed by TTS, truncating to 400 characters, fix this!
INFO:main:Chat (86494159-a2ed-484a-adfa-38a7a0ba4a1a): content="Le rappel est créé. Notre équipe vous contactera pour discuter des options de relogement temporaire. Y a-t-il autre chose que je puisse faire pour vous aujourd'hui ?" intent=<Indent.CONTINUE: 'continue'>
