azure / vector-search-ai-assistant
Microsoft official Build Modern AI Apps reference solutions and content. Demonstrates how to build Copilot applications that incorporate hero Azure services, including Azure OpenAI Service, Azure Container Apps (or AKS), and Azure Cosmos DB for NoSQL with vector search.

License: MIT License


vector-search-ai-assistant's Introduction

Build Your Own Copilot with Azure Cosmos DB

This solution demonstrates how to design and implement a RAG Pattern solution that incorporates Azure Cosmos DB with Azure OpenAI Service along with other key Azure services, to build a Generative AI solution with an AI assistant user interface.

The scenario for this solution is a consumer retail "Intelligent Agent" for a bike shop that sells bicycles, biking accessories, components, and clothing. The dataset in this solution is the Cosmic Works sample for Azure Cosmos DB, which is adapted from the Adventure Works 2017 dataset.

This solution demonstrates many concepts developers will encounter when building Generative-AI applications including:

  • Generating and storing vectors in real time on transactional data (see the sketch after this list).
  • Performing vector searches on data in a database.
  • Generating completions from a large language model.
  • Managing conversational context and chat history.
  • Token management for large language models.
  • Managing data models for defining what data gets vectorized for search.
  • Using Semantic Kernel SDK connectors and plug-ins.
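To make the first two concepts concrete, here is a minimal sketch, assuming the Azure.AI.OpenAI and Microsoft.Azure.Cosmos .NET packages; the deployment, database, and container names are hypothetical and this is not the solution's actual code:

// Minimal sketch: generate an embedding for a document and store the vector
// alongside the transactional data in Azure Cosmos DB for NoSQL.
// Assumes Azure.AI.OpenAI (1.0.0) and Microsoft.Azure.Cosmos; all names are hypothetical.
using Azure;
using Azure.AI.OpenAI;
using Microsoft.Azure.Cosmos;

var openAi = new OpenAIClient(
    new Uri("https://<account>.openai.azure.com/"),
    new AzureKeyCredential("<openai-key>"));

var cosmos = new CosmosClient("https://<account>.documents.azure.com:443/", "<cosmos-key>");
Container products = cosmos.GetContainer("vsai-database", "product"); // hypothetical names

// 1. Generate the embedding for the text that should be searchable.
string description = "A lightweight aluminum touring bike with 27 gears.";
var embeddingsOptions = new EmbeddingsOptions("embeddings", new[] { description }); // "embeddings" = deployment name
Embeddings embeddings = (await openAi.GetEmbeddingsAsync(embeddingsOptions)).Value;
float[] vector = embeddings.Data[0].Embedding.ToArray();

// 2. Upsert the document with its vector so it can be retrieved by vector search later.
await products.UpsertItemAsync(new
{
    id = Guid.NewGuid().ToString(),
    categoryId = "bikes",
    description,
    vector
});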

What is RAG?

RAG is an acronym for Retrieval Augmented Generation: retrieving additional data and providing it as context to a large language model, so that the model generates a response (completion) based not just on a user's question (prompt) but also on that context. The data can be any kind of text. However, there is a limit to how much text can be sent, because each model from OpenAI and other large language model providers can consume only a fixed number of tokens in a single request/response. This solution highlights this challenge and provides an example of how to address it.
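The sketch below illustrates the pattern under some assumptions: the Azure.AI.OpenAI SDK, a caller-supplied vector-search delegate, an assumed "completions" deployment name, and a crude four-characters-per-token estimate in place of the real tokenizer the solution uses.

// Illustrative RAG sketch (not the solution's actual code): retrieve context with a
// vector search, trim it to a rough token budget, and request a completion.
using System.Text;
using Azure.AI.OpenAI;

async Task<string> GetRagCompletionAsync(
    OpenAIClient openAi,
    Func<string, Task<IReadOnlyList<string>>> vectorSearchAsync, // e.g. a Cosmos DB vector query (assumed helper)
    string userPrompt)
{
    // 1. Retrieve documents related to the user's prompt.
    IReadOnlyList<string> documents = await vectorSearchAsync(userPrompt);

    // 2. Keep only as much context as fits the budget; ~4 characters per token is a
    //    rough estimate, whereas the solution counts tokens with a real tokenizer.
    const int contextTokenBudget = 3000;
    var context = new StringBuilder();
    foreach (string doc in documents)
    {
        if ((context.Length + doc.Length) / 4 > contextTokenBudget) break;
        context.AppendLine(doc);
    }

    // 3. Send the system prompt, the retrieved context, and the question to the model.
    var options = new ChatCompletionsOptions
    {
        DeploymentName = "completions", // assumed deployment name
        Messages =
        {
            new ChatRequestSystemMessage("You are an assistant for a retail bike shop. Answer only from the provided context."),
            new ChatRequestSystemMessage($"Context:\n{context}"),
            new ChatRequestUserMessage(userPrompt)
        }
    };

    ChatCompletions completions = (await openAi.GetChatCompletionsAsync(options)).Value;
    return completions.Choices[0].Message.Content;
}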

User Experience

The application frontend acts as an Intelligent Agent. The left-hand navigation contains individual chat sessions. Users type questions, the service queries the vectorized data, then sends the question and query results to Azure OpenAI Service to generate a completion, which is then displayed to the user. When the user types a second question, the chat session is summarized using a different Azure OpenAI completion and renamed to match the topic of that chat session. The chat session in the left-hand navigation displays all of the tokens consumed for that session. Each message in the chat also includes the token count consumed in generating it: the user tokens are the tokens used in the call to Azure OpenAI Service, and the assistant tokens are the ones used to generate the completion.
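As a hedged sketch of the session-naming and token-accounting behavior (assuming the Azure.AI.OpenAI SDK and a hypothetical "completions" deployment; the real solution drives this from its own prompts and services):

// Illustrative sketch only: name a chat session after its topic with a second, small
// completion, and surface the token counts that the UI displays per message.
using Azure.AI.OpenAI;

async Task<(string SessionName, int UserTokens, int AssistantTokens)> SummarizeSessionAsync(
    OpenAIClient openAi, string conversationText)
{
    var options = new ChatCompletionsOptions
    {
        DeploymentName = "completions", // assumed deployment name
        MaxTokens = 10,                 // a short topic name needs very few tokens
        Messages =
        {
            new ChatRequestSystemMessage("Summarize this conversation in two words."),
            new ChatRequestUserMessage(conversationText)
        }
    };

    ChatCompletions result = (await openAi.GetChatCompletionsAsync(options)).Value;
    CompletionsUsage usage = result.Usage; // PromptTokens (user) + CompletionTokens (assistant)
    return (result.Choices[0].Message.Content, usage.PromptTokens, usage.CompletionTokens);
}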

Solution Architecture

The solution architecture is represented by the architecture diagram included in the repository.

Getting Started

To deploy the solution, follow the Deployment steps below. Once deployed, use the links in the repository documentation to get familiar with and explore the solution.

Deployment

This solution deploys to either Azure Kubernetes Service (AKS) or Azure Container Apps (ACA). The deployment scripts are located in the aks and aca folders, respectively. To deploy the solution, run the following commands from the root of the repository:

AKS deployment

cd ./aks
azd up

After azd up completes for the AKS deployment, the script output will include the URL of the web application. You can click on this URL to open the web application in your browser. The URL is beneath the "Done: Deploying service web" message, and is the second endpoint (the Ingress endpoint of type LoadBalancer).

The terminal output after azd up completes shows the endpoint links.

If you closed the window and need to find the external IP address of the service, you can open the Azure portal, navigate to the resource group you deployed the solution to, and open the AKS service. In the AKS service, navigate to the Services and Ingress blade, and you will see the external IP address of the LoadBalancer service, named nginx:

The external IP address of the LoadBalancer service is shown in the Services and Ingress blade of the AKS service.
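If you prefer the command line and have kubectl configured against the cluster, you can also list the services and look for the external IP of the LoadBalancer service named nginx (the namespace and exact service name may differ from your deployment):

kubectl get svc --all-namespaces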

ACA deployment

cd ./aca
azd up

After azd up completes for the ACA deployment, you can locate the URL of the web application by navigating to the deployed resource group in the Azure portal. Click on the link to the new resource group in the output of the script to open the Azure portal.

The terminal output after azd up completes shows the resource group link.

In the resource group, you will see the ca-search-xxxx Azure Container Apps service.

The Search Azure Container App is highlighted in the resource group.

Select the service to open it, then select the Application Url to open the web application in your browser.

The Application Url is highlighted in the Search Azure Container App overview blade.
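If you prefer the command line, the Azure CLI (with the containerapp extension) should also return the application URL; substitute your own container app name and resource group, since the name includes your deployment's unique suffix:

az containerapp show --name ca-search-xxxx --resource-group <resource-group> --query properties.configuration.ingress.fqdn --output tsv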

Note

There are many options for deployment, including using an existing Azure OpenAI account and models. For deployment options and prerequisites, please see the How to Deploy page.

Before moving to the next section, be sure to validate that the deployment was successful. More information can be found on the How to Deploy page.

Clean-up

From a command prompt, navigate to the aks or aca folder, depending on which deployment type you used, and run the following command to delete the resources created by the deployment script:

AKS clean-up

cd ./aks
azd down --purge

ACA clean-up

cd ./aca
azd down --purge

Note

The --purge flag purges the resources that provide soft-delete functionality in Azure, including Azure Key Vault and Azure OpenAI. This flag is required to remove all resources.

Resources


vector-search-ai-assistant's Issues

Upgrade model GPT-3.5 '0301'

0301 is not available in as many regions as 0613 (e.g., Australia East), and it is on the chopping block.

"Version 0613 of gpt-35-turbo and gpt-35-turbo-16k will be retired no earlier than June 13, 2024. Version 0301 of gpt-35-turbo will be retired no earlier than July 5, 2024. See model updates for model upgrade behavior."

So plan for a later model, potentially 1106.

Reference: GPT-3.5 models

The documented appsettings.Development.json in the readme.md does not match what is expected with the second Cosmos DB account.

@markjbrown @joelhulen @ciprianjichici

In the ChatAPI project, create an appsettings.Development.json file with the following content (replace all <...> placeholders with the values from your deployment):

{
  "MSCosmosDBOpenAI": {
    "OpenAI": {
      "Endpoint": "https://<...>.openai.azure.com/",
      "Key": "<...>"
    },
    "CosmosDB": {
      "Endpoint": "https://<...>.documents.azure.com:443/",
      "Key": "<...>"
    },
    "DurableSystemPrompt": {
      "BlobStorageConnection": "<...>"
    },
    "BlobStorageMemorySource": {
      "ConfigBlobStorageConnection": "<...>"
    }
  }
}
This does not work, because you also have to add a section for the Cosmos DB vector store, so it should be 😂

{
  "MSCosmosDBOpenAI": {
    "OpenAI": {
      "Endpoint": "https://<...>.openai.azure.com/",
      "Key": "<...>"
    },
    "CosmosDB": {
      "Endpoint": "https://<...>.documents.azure.com:443/",
      "Key": "<...>"
    },
    "CosmosDBVectorStore": {
      "Endpoint": "https://<...>.documents.azure.com:443/",
      "Key": "<...>"
    },
    "DurableSystemPrompt": {
      "BlobStorageConnection": "<...>"
    },
    "BlobStorageMemorySource": {
      "ConfigBlobStorageConnection": "<...>"
    }
  }
}

Entering holding pattern to wait for proper backend API initialization loop

Entering holding pattern to wait for proper backend API initialization
Attempting to retrieve status from https://**.eastus.aksapp.io/api/status every 20 seconds with 50 retries

I've attempted to deploy making use of both deployment-standard.md and deployment-cloudshell.md, both resulting in the same outcome. Is this a bug or am I doing something wrong?

tokens used not always updating in left-nav

The value for total tokens used is not always updated in the left nav of the web app. It works when running in debug mode, and it does sometimes update in deployment, so it is not clear why this happens. I left the app over the weekend, then came back, refreshed, and it was working again. I suspect the event that triggers the refresh in Blazor is not getting fired or caught.

Not working in deployment (screenshot: left-nav-tokens).

Working in local debug (screenshot: left-nav-tokens-debug).

Template will not deploy

The following error occurs when this is deployed in US East.
I have deleted all Azure OpenAI resources and purged them, but I still get this error. I have found no information on the internet that helps resolve it.

Another operation is being performed on the parent resource '/subscriptions/xxx-xxx-xxx-xxx-xxx/resourceGroups/CosmosRAG1/providers/Microsoft.CognitiveServices/accounts/jeffmoore-openai'. Please try again later. (Code: RequestConflict)

Config values don't match for local debug

The config values for appsettings.Development.json do not match what the app expects. To see this, run locally and debug.

Then look at what the code expects on startup in ChatServiceWebApi. The result here is that Cognitive Search will not connect because the config values are empty.


I am deploying the Azure "Vector-Search-AI-Assistant" GitHub code using Azure OpenAI Service but facing multiple issues

Following the document in the GitHub repository, I am deploying using Azure Container Apps. I have faced the issues mentioned below.


1. This is the first issue I got. I tried to resolve it by replacing the / with -, but then got multiple new issues:
ERROR: error executing step command 'provision': deployment failed: error deploying infrastructure: deploying to subscription:

Deployment Error Details:
InvalidTemplateDeployment: The template deployment 'ChatServiceWebApi' is not valid according to the validation procedure. The tracking id is '8446901e-49ad-4ef5-aaac-b271e76d1dbe'. See inner errors for details.
ValidationForResourceFailed: Validation failed for a resource. Check 'Error.Details[0]' for more information.
ContainerAppInvalidSecretName: Secret name has an invalid value 'kv-ycutmbn4mmj2c/openai-apikey'. A value must consist of lower case alphanumeric characters, '-', and must start and end with an alphanumeric character. The length must not be more than 253 characters.

TraceID: 5102be5ff0e98bc8c5f1a345e548c48c

2. This issue comes up after resolving the first issue:

ERROR: error executing step command 'provision': deployment failed: error deploying infrastructure: deploying to subscription: Deployment Error Details: InvalidTemplate: Deployment template validation failed: 'The template resource 'kv-73sevvsljywq4-openai-apikey' for type 'Microsoft.KeyVault/vaults/secrets' at line '1' and column '776' has incorrect segment lengths. A nested resource type must have identical number of segments as its resource name. A root resource type must have segment length one greater than its resource name. Please see https://aka.ms/arm-syntax-resources for usage details.'.

We are stuck here, so please provide a solution.
Can you please help resolve the template validation error by verifying the code once again?

Exception while generating vector

After an error-free deployment to ACA, following the suggested post-deployment checks, I found no traces in Application Insights.
In the ca-chatservicew-* Container App logs I found many thousands of errors like the following one:

fail: BuildYourOwnCopilot.SemanticKernel.Plugins.Memory.VectorMemoryStore[0]
      Exception while generating vector for [2574856F-DC91-4D8E-B18D-2253AF2064D9 of type SalesOrder]: This model does not support specifying dimensions.
      Status: 400 (model_error)
      Content:
      {
        "error": {
          "message": "This model does not support specifying dimensions.",
          "type": "invalid_request_error",
          "param": null,
          "code": null
        }
      }

Examining the Cosmos DB account, there is no sales-order container and the *-vector-store containers have no records.

Any hint on what is wrong?

Error in collection execution: main-vector-store is never called

Hello, when running locally the system only goes to the cache-vector-store and never to the main-vector-store to retrieve the data. I have updated to the latest code and reconfigured, but nothing changed. The data is in the main-vector-store, but this collection is never called.

Deployment errors with ACA (AKS too) due to path error

Following the instructions for azd deployment with the ACA deployment option I get multiple errors.

ERROR: error executing step command 'provision': initializing provisioning manager: failed to compile bicep template: failed running bicep build: exit code: 1, stdout: , stderr: 
.... lots of warning ....
C:\repos\sandbox\Vector-Search-AI-Assistant\aca\infra\main.bicep(219,34) : Error BCP091: An error occurred reading file. Could not find a part of the path 'C:\repos\sandbox\Vector-Search-AI-Assistant\SystemPrompts\RetailAssistant\Default.txt'.
C:\repos\sandbox\Vector-Search-AI-Assistant\aca\infra\main.bicep(226,34) : Error BCP091: An error occurred reading file. Could not find a part of the path 'C:\repos\sandbox\Vector-Search-AI-Assistant\SystemPrompts\RetailAssistant\Limited.txt'.
C:\repos\sandbox\Vector-Search-AI-Assistant\aca\infra\main.bicep(233,34) : Error BCP091: An error occurred reading file. Could not find a part of the path 'C:\repos\sandbox\Vector-Search-AI-Assistant\SystemPrompts\Summarizer\TwoWords.txt'.
C:\repos\sandbox\Vector-Search-AI-Assistant\aca\infra\main.bicep(240,34) : Error BCP091: An error occurred reading file. Could not find a part of the path 'C:\repos\sandbox\Vector-Search-AI-Assistant\MemorySources\BlobMemorySourceConfig.json'.
C:\repos\sandbox\Vector-Search-AI-Assistant\aca\infra\main.bicep(247,34) : Error BCP091: An error occurred reading file. Could not find a part of the path 'C:\repos\sandbox\Vector-Search-AI-Assistant\MemorySources\return-policies.txt'.
C:\repos\sandbox\Vector-Search-AI-Assistant\aca\infra\main.bicep(254,34) : Error BCP091: An error occurred reading file. Could not find a part of the path 'C:\repos\sandbox\Vector-Search-AI-Assistant\MemorySources\shipping-policies.txt'.

All of these occur because the paths to the SK prompts are incorrect. The template currently has:

module storage './shared/storage.bicep' = {
  name: 'storage'
  params: {
    containers: [
      {
        name: 'system-prompt'
      }
      {
        name: 'memory-source'
      }
      {
        name: 'product-policy'
      }
    ]
    files: [
      {
        name: 'retailassistant-default-txt'
        file: 'Default.txt'
        path: 'RetailAssistant/Default.txt'
        content: loadTextContent('../../SystemPrompts/RetailAssistant/Default.txt')
        container: 'system-prompt'
      }

But all the files are in the data folder, so it should be:

...
    files: [
      {
        name: 'retailassistant-default-txt'
        file: 'Default.txt'
        path: 'RetailAssistant/Default.txt'
        content: loadTextContent('../../data/SystemPrompts/RetailAssistant/Default.txt')
        container: 'system-prompt'
      }

The same applies to the AKS Bicep file.
