
akshata29 / entaoai

824.0 25.0 247.0 520.27 MB

Chat and Ask on your own data. Accelerator to quickly upload your own enterprise data and use OpenAI services to chat to that uploaded data and ask questions

License: MIT License

Python 11.18% HTML 0.01% TypeScript 56.21% CSS 0.55% Bicep 0.64% TSQL 31.38% Dockerfile 0.02%
azure azureopenai chatgpt cognitive-search openai pinecone redis-search vector-store azure-functions azure-webapp

entaoai's Introduction

Chat with your enterprise data using LLM

This sample demonstrates a few approaches for creating ChatGPT-like experiences over your own data. It uses Azure OpenAI Service to access the ChatGPT models (gpt-35-turbo and gpt3), and a vector store (Pinecone, Redis and others) or Azure Cognitive Search for data indexing and retrieval.

The repo provides a way to upload your own data so it's ready to try end to end.

Updates

  • 3/30/2024 - Refactored to keep only the Chat, Chat Stream, QnA, Upload and Admin functionality. All other features will be moved to their own repos.
  • 3/10/2024 - Move the Prompt Flow version to entaoaipf
  • 3/9/2024 - Initial version of advanced RAG techniques and Multi-modal RAG pattern
  • 2/28/2024 - Removed the SEC analysis features; they have moved into their own repo at sec
  • 1/28/2024 - Removed the PitchBook features, as they have moved into their own repo at pib
  • 1/19/2024 - Updated the Python packages and OpenAI to > 1.0. Changes made to all Python APIs for the breaking changes introduced in OpenAI and langchain.
  • 10/12/2023 - Initial version of Autonomous PromptFlow. For now supporting the Pinecone indexes, but support for Cognitive Search and Redis will be updated soon.
  • 9/29/2023 - Added Evaluate PromptFlow. Once the Prompt Flow is created in Azure ML, it can be attached to your existing run to evaluate it against the following metrics:
    • Groundedness - The Q&A Groundedness evaluation flow evaluates Q&A Retrieval Augmented Generation systems by using Large Language Models (LLM) to measure the quality and safety of your responses. Using GPT-3.5 to assist with the measurements aims to achieve higher agreement with human evaluations than traditional mathematical measurements. gpt_groundedness (against context): Measures how grounded the model's predicted answers are against the context. Even if the LLM's responses are true, they are considered ungrounded if they cannot be verified against the context. Groundedness is scored on a scale of 1 to 5, with 1 being the worst and 5 being the best.
    • Ada Similarity - The Q&A ada_similarity evaluation flow evaluates Q&A Retrieval Augmented Generation systems in the same LLM-assisted way, using the ada similarity metric. ada_similarity: Measures the cosine similarity of ada embeddings of the model prediction and the ground truth. ada_similarity is a value in the range [0, 1].
    • Coherence - The Q&A Coherence evaluation flow evaluates Q&A Retrieval Augmented Generation systems in the same LLM-assisted way, using the Coherence metric. gpt_coherence: Measures the quality of all sentences in a model's predicted answer and how they fit together naturally. Coherence is scored on a scale of 1 to 5, with 1 being the worst and 5 being the best.
    • Similarity - The Q&A Similarity evaluation flow evaluates Q&A Retrieval Augmented Generation systems in the same LLM-assisted way, using the Similarity metric. gpt_similarity: Measures similarity between user-provided ground truth answers and the model predicted answer. Similarity is scored on a scale of 1 to 5, with 1 being the worst and 5 being the best.
    • F1 Score - The Q&A f1-score evaluation flow evaluates Q&A Retrieval Augmented Generation systems using an f1-score based on the word counts in the predicted answer and the ground truth. The metric uses the number of common tokens between the normalized versions of the ground truth and the predicted answer. F1-score: Computes the f1-score based on the tokens in the predicted answer and the ground truth. F1-score is a value in the range [0, 1].
  • 9/22/2023 - Added PromptFlow for SqlAsk. Ensure PFSQLASK_URL and PFSQLASK_KEY configuration values are added to deployed endpoint to enable the feature. Also make sure SynapseName, SynapsePool, SynapseUser and SynapsePassword configuration values are added to entaoai PromptFlow connection. Moved deleting the Session Capability for ChatGpt to Admin Page.
  • 9/20/2023 - Added configuration to allow end user to change the Search Type for Cognitive Search Vector Store index (Hybrid, Similarity/Vector and Hybrid Re-rank), based on the Best Practices we shared. QnA, Chat and Prompt Flow are modified. QnA and Chat are implementing the customized Vector store implementation of Langchain and Prompt Flow using the helper functions. Fixed the issue with QnA/Chat/PromptFlow not generating followup-questions.
  • 9/18/2023 - Refactored SQL NLP to not use Langchain Database Agent/Chain and instead use custom Prompts.
  • 9/15/2023 - Modified the azure search package to 11.4.0b9 and langchain to latest version. Added capability to perform evaluation on PromptFlow for both QnA and Chat. Bert PDF and Evaluation Data can be used to perform Batch and Evaluation in Prompt Flow. Sample Notebook showcasing the flow and E2E process is available. Bert Chat folder allows you to test E2E Prompt Flow, Batch Run and Evaluation in form of Notebook.
  • 9/3/2023 - Added API for Chat using the Prompt Flow. Allow end-user to select between Azure Functions as API (ApiType Configuration in Web App) or using Prompt Flow Managed endpoint.
  • 9/2/2023 - Added API for Question Answering using the Prompt Flow. Allow end-user to select between Azure Functions as API (ApiType Configuration in Web App) or using Prompt Flow Managed endpoint.
  • 8/31/2023 - Added example for LLMOps using Prompt Flow. The repo will be adding the flexibility to use the Prompt Flow Deployed Model as an alternative to current Azure Functions.
  • 8/20/2023 - Added support for the Markdown files (as zip file) and removed the chunk_size=1 from Azure OpenAiEmbedding
  • 8/11/2023 - Fixed the issue with Streaming Chat feature.
  • 8/10/2023 - Breaking Changes - Refactored all code to use the OpenAiEndPoint configuration value instead of OpenAiService. This supports the best practices outlined in Enterprise Logging via Azure API Management. If you use APIM, your OpenAiEndPoint will be the API Gateway URL and the OpenAiKey will be the Product/Unlimited key. If you do not use APIM, you don't need to change the key, but ensure OpenAiEndPoint is the fully qualified URL of your AOAI deployment. OpenAiService is no longer used. These changes broke the Chat on Stream feature, so it is disabled for now and will be re-enabled once tested and fixed.
  • 8/9/2023 - Added Function calling in the ChatGpt interface as a checkbox. The sample demonstrates the ability to call functions. Currently the Weather API, Stock API and Bing Search are supported. Function calling is in preview and supported only from "API Version" "2023-07-01-preview", so make sure you update your existing deployment to use that version. Details on calling Functions. For an existing deployment, add the WeatherEndPoint, WeatherHost, StockEndPoint, StockHost and RapidApiKey configuration to the Azure Function App.
  • 8/5/2023 - Added Chat Interface with "Stream" Option. This feature allows you to stream the conversation to the client. You will need to add OpenAiChat, OpenAiChat16k, OpenAiEmbedding, OpenAiEndPoint, OpenAiKey, OpenAiApiKey, OpenAiService, OpenAiVersion, PineconeEnv, PineconeIndex, PineconeKey, RedisAddress, RedisPassword, RedisPort property in Azure App Service (Webapp) to enable the feature for existing deployment.
  • 7/30/2023 - Removed unused Code - SummaryAndQa and Chat
  • 7/28/2023 - Started removing the Davinci model usage. For now removed the usage from all functionality except workshop. Refactored Summarization functionality based on the feedback to allow user to specify the prompt and pre-defined Topics to summarize it on.
  • 7/26/2023 - Remove OpenAI Playground from Developer Tools as advanced features of that are available in ChatGPT section.
  • 7/25/2023 - Add tab for the Chat capabilities to support ChatGpt capability directly from the model instead of "Chat on Data". You will need to add CHATGPT_URL property in Azure App Service (Webapp) to enable the feature outside of deploying the new Azure Function.
  • 7/23/2023 - Added the rest of the feature for PIB UI and initial version of generating the PowerPoint deck as the output. For new feature added ensure you add FMPKEY variable to webapp configuration.
  • 7/20/2023 - Added feature to talk to Pib Data (Sec Filings & Earning Call Transcript). Because new Azure function is deployed, ensure PIBCHAT_URL property is added to Azure WebApp with the URL for your deployed Azure Functions
  • 7/18/2023 - Refactored the PIB code to solve some of the performance issue and bug fixes.
  • 7/17/2023 - Removed GPT3 chat interface with retirement of "Davinci" models.
  • 7/16/2023 - Initial version of Pib UI (currently supporting 5 Steps - Company Profile, Call Transcripts, Press Releases, Sec Filings and Ratings/Recommendations). You will need access to a paid subscription (FMP, or modify the code based on what your enterprise has access to). To use it with FMP you will need to add FmpKey in Azure Functions. Because of a circular dependency you need to add SecDocPersistUrl and SecExtractionUrl manually in Azure Functions.
  • 7/14/2023 - Add support for GPT3.5 16K model and ability to chunk document > 4000 tokens with > 500 overlap. For the ChunkSize > 4000, it will default to 16K token for both QnA and Chat functionality. Added identity provider to the application and authentication for QnA and Chat interface. For GPT3.5 16k model, you will need to add OpenAiChat16k property in Azure Function app.
  • 7/13/2023 - Allow end user to select ChunkSize and ChunkOverlap Configuration. Initial version of overriding prompt template.
  • 7/11/2023 - Functional PIB CoPilot in the form of the notebook.
  • 7/8/2023 - Added the feature to Rename the session for ChatGPT. Also added the UI for the Evaluator Tool. This feature focuses on performing LLM-based evaluation on your document. It auto-generates the test dataset (with Questions and Answers), performs the grading on that document using different parameters and generates the evaluation results. It is built on Azure Durable Functions and is implemented using the Function Chaining pattern. To use the Evaluator feature on an existing deployment, you will need to add the BLOB_EVALUATOR_CONTAINER_NAME (ensure a container with the same name is created in the storage account) and RUNEVALUATION_URL (URL of the Durable function deployment) configuration in the Azure Web App. In the Azure function deployment add the AzureWebJobsFeatureFlags (value EnableWorkerIndexing) and OpenAiEvaluatorContainer settings.
  • 7/5/2023 - Added the feature to Delete the session. It relies on a CosmosDB capability that is in preview, which you will need to enable on the CosmosDB account in your subscription. Added a simple try/catch block so that the ChatGPT implementation continues to work if you have not enabled/deployed CosmosDB.
  • 7/4/2023 - Initial version of storing "Sessions" for GPT3.5/ChatGpt interface. Session and messages are stored/retrieved from CosmosDb. Make sure you have CosmosDb service provisioned or create a new one (for existing deployment). You will need to add CosmosEndpoint, CosmosKey, CosmosDatabase and CosmosContainer settings in both Azure Functions App and Web App.
  • 6/25/2023 - Notebook showcasing the evaluation of the answer quality in systematic way (auto generating questions and evaluation chain), supporting LLM QA settings (chunk size, overlap, embedding technique). Refer to Evaluator notebook for more information.
  • 6/18/2023 - Add the admin page supporting Knowledge base management.
  • 6/17/2023 - Added "Question List" button for the Ask a question feature to display the list of all the questions that are in the Knowledge base. The following three properties, SEARCHSERVICE, SEARCHKEY and KBINDEXNAME (default value of aoaikb), need to be added to Azure App Service to enable the "Question List" button feature.
  • 6/16/2023 - Add the feature to use Azure Cognitive Search as Vector store for storing the cached Knowledge base. Questions that are not in the KB are sent to the LLM model to find the answer via OAI; otherwise the answer is returned from the Cached Datastore. The new property KbIndexName needs to be added to the Azure Function app. Added the Notebook to test out the feature as part of the workshop. TODO : Add the feature to add the question to KB from the chat interface (and make it session based). A further feature to "regenerate" the answer from the LLM (instead of the cached answer) will be added soon.
  • 6/7/2023 - Add OpenAI Playground in Developer Tools and initial version of building the CoPilot (for now with Notebook, but eventually will be moved as CoPilot feature). Add the script, recording and example for Real-time Speech analytics use-case. More to be added soon.
  • 5/27/2023 - Add Workshop content in the form of the notebooks that can be leveraged to learn/execute the scenarios. You can find the notebooks in the Workshop folder. Details about workshop content is available here.
  • 5/26/2023 - Add Summarization feature to summarize the document either using stuff, mapreduce or refine summarization. To use this feature (on existing deployment) ensure you add the OpenAiSummaryContainer configuration to Function app and BLOB_SUMMARY_CONTAINER_NAME configuration to Azure App Service (Ensure that the value you enter is the same as the container name in Azure storage and that you have created the container). You also need to add PROCESSSUMMARY_URL configuration to Azure App Service (Ensure that the value you enter is the same as the Azure Function URL).
  • 5/24/2023 - Add feature to upload CSV files and CSV Agent to answer/chat questions on the tabular data. Smart Agent also supports answering questions on CSV data.
  • 5/22/2023 - Initial version of "Smart Agent" that gives you flexibility to talk to all documents uploaded in the solution. It also allows you to talk to the SQL Database Scenario. As more features are added, the agent will keep building upon them (for instance talk to CSV/Excel or Tabular data)
  • 5/21/2023 - Add Developer Tools section - Experimental code conversion and Prompt guru.
  • 5/17/2023 - Change the edgar source to Cognitive search vector store instead of Redis.
  • 5/15/2023 - Add the option to use "Cognitive Search" as Vector store for storing the index. Azure Cognitive Search offers pure vector search and hybrid retrieval – as well as a sophisticated re-ranking system powered by Bing in a single integrated solution. Sign-up. Support uploading WORD documents.
  • 5/10/2023 - Add the options on how document should be chunked. If you want to use the Form Recognizer, ensure the Form recognizer resource is created and the appropriate application settings FormRecognizerKey and FormRecognizerEndPoint are configured.
  • 5/07/2023 - Option available to select either Azure OpenAI or OpenAI. For OpenAI ensure you have OpenAiApiKey in Azure Functions settings. For Azure OpenAI you will need OpenAiKey, OpenAiService and OpenAiEndPoint Endpoint settings. You can also select that option for Chat/Question/SQL Nlp/Speech Analytics and other features (from developer settings page).
  • 5/03/2023 - Password required for Upload and introduced Admin page starting with Index Management
  • 4/30/2023 - Initial version of Task Agent Feature added. Autonomous Agents are agents that are designed to be more long running. You give them one or multiple long term goals, and they independently execute towards those goals. The applications combine tool usage and long term memory. Initial feature implements Baby AGI with execution tools
  • 4/29/2023 - AWS S3 Process Integration using S3, AWS Lambda Function and Azure Data Factory (automated deployment not available yet, scripts are available in /Deployment/aws folder)
  • 4/28/2023 - Fix Bugs, Citations & Follow-up questions across QA & Chat. Prompt bit more restrictive to limit responding from the document.
  • 4/25/2023 - Initial version of Power Virtual Agent
  • 4/21/2023 - Add SQL Query & SQL Data tab to SQL NLP and fix Citations & Follow-up questions for Chat & Ask features
  • 4/17/2023 - Real-time Speech Analytics and Speech to Text and Text to Speech for Chat & Ask Features. (You can configure Text to Speech feature from the Developer settings. You will need Azure Speech Services)
  • 4/13/2023 - Add new feature to support asking questions on multiple document using Vector QA Agent
  • 4/8/2023 - Ask your SQL - Using SQL Database Agent or Using SQL Database Chain
  • 3/29/2023 - Automated Deployment script
  • 3/23/2023 - Add Cognitive Search as option to store documents
  • 3/19/2023 - Add GPT3 Chat Implementation
  • 3/18/2023 - API to generate summary on documents & Sample QA
  • 3/17/2023
    • Support uploading Multiple documents
    • Bug fix - Redis Vectorstore Implementation
  • 3/16/2023 - Initial Release, Ask your Data and Chat with your Data
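The F1 Score entry in the 9/29/2023 update describes a token-overlap metric between the normalized predicted answer and the ground truth. The sketch below shows one common way to compute such a score. The normalization steps (lowercasing, dropping punctuation and English articles) and the function names are assumptions made for illustration; the repo's evaluation flow may normalize differently.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    These normalization steps are an assumption, not taken from the repo.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def f1_score(prediction: str, ground_truth: str) -> float:
    """Token-level F1 between a predicted answer and a ground-truth answer."""
    pred_tokens = normalize(prediction)
    truth_tokens = normalize(ground_truth)
    # If either side is empty, the score is 1.0 only when both are empty.
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    common = Counter(pred_tokens) & Counter(truth_tokens)
    num_common = sum(common.values())
    if num_common == 0:
        return 0.0
    precision = num_common / len(pred_tokens)
    recall = num_common / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

With this definition the score stays in the range [0, 1]: it is 0 when no tokens are shared and 1 when the normalized token multisets match.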

Test Website

Chat and Ask over your data

Features

List of Features

Architecture

Architecture

Azure Architecture

Azure Services

QA over your data with Cache

QA Cache

QA LLM Evaluation

QA LLM Evaluation

Getting Started

Get Started

Configuration

Application and Function App Configuration

Resources

Contributions

We are open to contributions, whether in the form of a new feature, an update to existing functionality or better documentation. Please create a pull request and we will review and merge it.

Note

Adapted from the repos at OpenAI-CogSearch, Call Center Analytics, Auto Evaluator and Edgar Crawler

entaoai's People

Contributors

akshata29, hcmarque, ishaan-jaff, melroy89, rahulbourai, vishnu6266


entaoai's Issues

Some AOAI models have been retired.

I tried to proceed with the deployment today, and it returned the following error message.

{"code":"InvalidTemplateDeployment","details":[{"code":"DeploymentModelNotSupported","message":"Creating account deployment is not supported by the model 'text-davinci-003'. This is usually because there are better models available for the similar functionality."}],"message":"The template deployment 'Microsoft.Template-20230710120955' is not valid according to the validation procedure. The tracking id is 'af2ae21d-44e5-4cf7-a186-32577951880f'. See inner errors for details."}

Checked through the following link and found that there was a Model change.
https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/models

I'd like to ask you to update the code accordingly.

No connection adapters were found for formrecognizer

Hi, I set up the Form Recognizer endpoint and key in local.settings.json, but I am getting the error below:

No connection adapters were found for '/formrecognizer/documentModels/prebuilt-layout:analyze?stringIndexType=unicodeCodePoint&api-version=2022-08-31'

I appreciate your support.
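The URL in the error starts directly with '/formrecognizer/...', with no scheme or host in front of it. The Python requests library reports "No connection adapters were found" when a URL does not start with http:// or https://, so one possible cause is that the endpoint setting was empty or not picked up at runtime. The sketch below shows a defensive check that could be applied before the request is sent. The function name build_analyze_url is hypothetical and not part of the repo; the path and query string are copied from the error message.

```python
from urllib.parse import urlparse


def build_analyze_url(endpoint: str, api_version: str = "2022-08-31") -> str:
    """Join a Form Recognizer endpoint with the analyze path from the error.

    Hypothetical helper: it fails early with a clear message when the
    endpoint is empty or lacks an http(s) scheme and host.
    """
    cleaned = endpoint.strip()
    parsed = urlparse(cleaned)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(
            f"Form Recognizer endpoint {endpoint!r} is not a full URL; "
            "expected something like https://<resource>.cognitiveservices.azure.com/"
        )
    return (
        f"{cleaned.rstrip('/')}/formrecognizer/documentModels/"
        f"prebuilt-layout:analyze?stringIndexType=unicodeCodePoint"
        f"&api-version={api_version}"
    )
```

If this check raises for the value read from FormRecognizerEndPoint, the setting itself is the thing to fix, rather than the request code.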

Other language support?

Hey, is there a way I can ask questions and get answers back in Korean, regardless of whether the document is in Korean or not?

Github action error

Hello @akshata29

Thanks for sharing your repo.

When I tested it, it showed the two errors below.

In "LLMOps with Promptflow"

Run pushd './Workshop'
~/work/entaoai/entaoai/Workshop ~/work/entaoai/entaoai
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/promptflow/azure/_restclient/flow_service_caller.py", line 447, in submit_bulk_run
    return self.caller.bulk_runs.submit_bulk_run(
  File "/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/promptflow/azure/_restclient/flow/operations/_bulk_runs_operations.py", line 402, in submit_bulk_run
    raise HttpResponseError(response=response, model=error)
azure.core.exceptions.HttpResponseError: (UserError) Flow Runtime operation failed with Forbidden: User with objectId(83668576-fa46-4741-95a2-14aa89fa3604) is not authorized to invoke this runtime as the runtime compute instance entaoai is assigned to 18e9cadb-9f0e-449d-997f-ca14e7edf65d.
Code: UserError
Message: Flow Runtime operation failed with Forbidden: User with objectId(83668576-fa46-4741-95a2-14aa89fa3604) is not authorized to invoke this runtime as the runtime compute instance entaoai is assigned to 18e9cadb-9f0e-449d-997f-ca14e7edf65d.


In "LLMOps Deploy with Promptflow"

Run pushd './Workshop'
~/work///Workshop ~/work//
ERROR: (LinkedInvalidPropertyId) Property id '' at path '' is invalid. Expect fully qualified resource Id that start with '/subscriptions/subscriptionId' or '/providers/resourceProviderNamespace/'.
Code: LinkedInvalidPropertyId
Message: Property id '' at path '' is invalid. Expect fully qualified resource Id that start with '/subscriptions/subscriptionId' or '/providers/resourceProviderNamespace/'.
Error: Process completed with exit code 1.

Sorry to bother you.

help test

Hello, can it be built with Docker? Is there a guide, please? Thank you

Encountered an error (InternalServerError) from host runtime.

There was an error during the deployment process.
What went wrong?

{"code":"DeploymentFailed","target":"/subscriptions/xxxx369f-3d6e-4c3f-87f4-83c1cca16cb5/resourceGroups/chatnanet/providers/Microsoft.Resources/deployments/Microsoft.Template-20230628194148","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-deployment-operations for usage details.","details":[{"target":"/subscriptions/xxxx369f-3d6e-4c3f-87f4-83c1cca16cb5/resourceGroups/chatnanet/providers/Microsoft.Resources/deployments/Microsoft.Template-20230628194148","message":"Encountered an error (InternalServerError) from host runtime."}]}

Drill down issue.
{"code":"BadRequest","message":"Encountered an error (InternalServerError) from host runtime.","details":[{"message":"Encountered an error (InternalServerError) from host runtime."},{"code":"BadRequest"},{}]}

{
  "status": "Failed",
  "error": {
    "code": "BadRequest",
    "message": "Encountered an error (InternalServerError) from host runtime.",
    "details": [
      {
        "message": "Encountered an error (InternalServerError) from host runtime."
      },
      {
        "code": "BadRequest"
      },
      {}
    ]
  }
}

image

Long Response Time / high latency

I am based in Seoul, South Korea. I know Azure OpenAI/OpenAI resources are only hosted in the US and Europe and not in Asia yet, but for people in US East/US Central: are you also experiencing high latency or long response times?

I moved the Azure resources that I could, plus Pinecone, to Korea Central, but I am wondering, @akshata29, what other things or ideas you can think of that I could do to speed things up on my end? Cost is not a factor. I just want chatpdf to run as fast as https://chatpdf.com/

Thank you!!

OPEN AI ISSUE

When I deploy the template on Azure, it gives the following error:
{"code":"InvalidTemplateDeployment","details":[{"code":"DeploymentModelNotSupported","message":"The model 'Format: OpenAI, Name: embedding, Version: 2, Source: ' of account deployment is not supported."}],"message":"The template deployment 'Microsoft.Template-20230809000923' is not valid according to the validation procedure. The tracking id is 'd3e1e7eb-221a-4e44-a8f1-5a7583a5f191'. See inner errors for details."}

Azure Blob Storage Upload not working.

Running locally and then uploading the document does not upload the file to Azure Blob storage.

I modified the following values in the "local.settings.json" file as guided.
"OpenAiDocStorName": "chatspkstor", // Storage name
"OpenAiDocStorKey": "AzureStorageAccount Access Key value", // Storage doc
"OpenAiDocContainer": "chatpdf", // Storage Container

I also modified the .env file.
I tried replacing the Connection String with the value generated by the Blob Shared Access Key policy, but it fails in the same way.

BLOB_CONNECTION_STRING="https://chatspkstor.blob.core.windows.net/" # Name of your Blob connection string where you will upload PDF to
BLOB_CONTAINER_NAME="chatpdf" # Name of your container to host uploaded files

===============================================================================
Traceback (most recent call last):
  File "c:\Users\niceysj\repo\chatgptspk\chatpdf\app\backend\app.py", line 234, in uploadBinaryFile
    blobServiceClient = BlobServiceClient.from_connection_string(url)
  File "C:\Users\niceysj\AppData\Local\Programs\Python\Python39\lib\site-packages\azure\storage\blob\_blob_service_client.py", line 181, in from_connection_string
    account_url, secondary, credential = parse_connection_str(conn_str, credential, 'blob')
  File "C:\Users\niceysj\AppData\Local\Programs\Python\Python39\lib\site-packages\azure\storage\blob\_shared\base_client.py", line 406, in parse_connection_str
    raise ValueError("Connection string missing required connection details.")
ValueError: Connection string missing required connection details.
127.0.0.1 - - [15/Apr/2023 15:21:55] "POST /uploadBinaryFile HTTP/1.1" 500 -
INFO:werkzeug:127.0.0.1 - - [15/Apr/2023 15:21:55] "POST /uploadBinaryFile HTTP/1.1" 500 -
ERROR:root:Exception in /processDoc
Traceback (most recent call last):
  File "c:\Users\niceysj\repo\chatgptspk\chatpdf\app\backend\app.py", line 163, in processDoc
    jsonDict = json.loads(resp.text)
  File "C:\Users\niceysj\AppData\Local\Programs\Python\Python39\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Users\niceysj\AppData\Local\Programs\Python\Python39\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\niceysj\AppData\Local\Programs\Python\Python39\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
127.0.0.1 - - [15/Apr/2023 15:21:58] "POST /processDoc HTTP/1.1" 500 -
INFO:werkzeug:127.0.0.1 - - [15/Apr/2023 15:21:58] "POST /processDoc HTTP/1.1" 500 -
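The traceback shows the value being passed to BlobServiceClient.from_connection_string, while the BLOB_CONNECTION_STRING in the .env file above is an account URL rather than a semicolon-separated connection string, which would be consistent with the ValueError. An account-key connection string normally has the form DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net. The sketch below builds such a string and applies a rough sanity check. Both helper names are hypothetical, and the check is a simplified heuristic, not the SDK's own validation.

```python
def build_connection_string(
    account_name: str,
    account_key: str,
    endpoint_suffix: str = "core.windows.net",
) -> str:
    """Assemble an account-key connection string for Azure Storage."""
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};"
        f"AccountKey={account_key};"
        f"EndpointSuffix={endpoint_suffix}"
    )


def looks_like_connection_string(value: str) -> bool:
    """Heuristic check (an assumption, not the SDK's rule set).

    Accepts either AccountName + AccountKey, or
    BlobEndpoint + SharedAccessSignature.
    """
    keys = set()
    for segment in value.strip().split(";"):
        # Split on the first '=' only, because keys and SAS tokens
        # contain '=' characters themselves.
        key, sep, _ = segment.partition("=")
        if sep:
            keys.add(key.strip())
    return {"AccountName", "AccountKey"} <= keys or {
        "BlobEndpoint",
        "SharedAccessSignature",
    } <= keys
```

Under this check, the URL from the .env file is rejected, while a string built from the storage account name and access key passes.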

The Frontend is not deployed automatically

Hi,
I have deployed the solutions with azure cli directly into Azure. Everything worked fine, but the frontend was not deployed. As described in the section on how to deploy it manually to Azure, is it necessary to build the React frontend and deploy it manually?
Cheers

Deploying with azd up is not creating full resources

Hi, thanks for sharing your project with us. I cloned the project and deployed it with the commands azd init and azd up, and I can access my app's main page. But I noticed that not all resources are deployed. For instance, if I open main.bicep and go to the end of the file, we see:
image

This means only the app service plan, app service, function app, cognitive search, two storage accounts and Azure OpenAI services are created. Is there any way that I could make sure all resources are fully deployed?

Best

web app display: Authentication Not Configured

After syncing the fork and rerunning the action workflows Deploy Python API and Deploy App, the website displays the following when accessed:
image

I found only 3 tabs have this issue: Chat, Ask a question and PIB.

Unable to access test website

I am unable to access the test website, the error is shown below:

Request Id: 78953c7f-0344-4f83-b341-6518c8143001
Correlation Id: be6d5db1-d5fb-4e2a-ab79-0b4057d8e053
Timestamp: 2023-08-08T22:24:35Z
Message: AADSTS50020: User account '[email protected]' from identity provider 'live.com' does not exist in tenant 'Contoso' and cannot access the application 'cbbe9727-b1ae-4fae-a6c7-656f3643721f'(dataaipdfchatauth) in that tenant. The account needs to be added as an external user in the tenant first. Sign out and sign in again with a different Azure Active Directory user account.

using "Deploy to Azure" encounter errors

  1. Error message: The capacity should be null for standard deployment.
    This seems to be caused by "capacity": 20 when deploying /chat, /davinci and /text-embedding-ada-002.
    Removing capacity makes it work.
image

  2. Error message: Another operation is being performed on the parent resource '/subscriptions/9dabf3b5-95af-489f-8423-ae43e4b1707c/resourceGroups/rg-aoai-chatpdf/providers/Microsoft.CognitiveServices/accounts/kevingptaoai'. Please try again later. (Code: RequestConflict)
    It seems the three models (chat, davinci and text-embedding-ada-002) cannot be deployed in parallel.
    Refer to this blog: https://techcommunity.microsoft.com/t5/azure-database-support-blog/add-wait-operation-to-arm-template-deployment/ba-p/2915342#:~:text=How%20can%20an%20ARM%20JSON%20template%20be%20forced,to%20set%20the%20dependencies%20of%20the%20steps%20correctly
    I can fix this error with something like:
{
  "type": "Microsoft.CognitiveServices/accounts/deployments",
  "apiVersion": "2022-12-01",
  "name": "[concat(variables('aoaiSvcName'), '/chat')]",
  "dependsOn": [
    "[resourceId('Microsoft.CognitiveServices/accounts', variables('aoaiSvcName'))]"
  ],
  "properties": {
    "model": { "format": "OpenAI", "name": "gpt-35-turbo", "version": "0301" },
    "scaleSettings": { "scaleType": "Standard" },
    "raiPolicyName": "Microsoft.Default"
  }
},
{
  "type": "Microsoft.Resources/deploymentScripts",
  "apiVersion": "2020-10-01",
  "kind": "AzurePowerShell",
  "name": "Wait5s",
  "location": "[parameters('location')]",
  "dependsOn": [
    "[resourceId('Microsoft.CognitiveServices/accounts', variables('aoaiSvcName'))]"
  ],
  "properties": {
    "azPowerShellVersion": "9.7",
    "scriptContent": "start-sleep -Seconds 5",
    "cleanupPreference": "Always",
    "retentionInterval": "PT1H"
  }
},
{
  "type": "Microsoft.CognitiveServices/accounts/deployments",
  "apiVersion": "2022-12-01",
  "name": "[concat(variables('aoaiSvcName'), '/davinci')]",
  "dependsOn": [
    "[resourceId('Microsoft.Resources/deploymentScripts','Wait5s')]"
  ],
  "properties": {
    "model": { "format": "OpenAI", "name": "text-davinci-003", "version": "1" },
    "scaleSettings": { "scaleType": "Standard" },
    "raiPolicyName": "Microsoft.Default"
  }
},
{
  "type": "Microsoft.Resources/deploymentScripts",
  "apiVersion": "2020-10-01",
  "kind": "AzurePowerShell",
  "name": "Wait10s",
  "location": "[parameters('location')]",
  "dependsOn": [
    "[resourceId('Microsoft.CognitiveServices/accounts', variables('aoaiSvcName'))]"
  ],
  "properties": {
    "azPowerShellVersion": "9.7",
    "scriptContent": "start-sleep -Seconds 10",
    "cleanupPreference": "Always",
    "retentionInterval": "PT1H"
  }
},
{
  "type": "Microsoft.CognitiveServices/accounts/deployments",
  "apiVersion": "2022-12-01",
  "name": "[concat(variables('aoaiSvcName'), '/text-embedding-ada-002')]",
  "dependsOn": [
    "[resourceId('Microsoft.Resources/deploymentScripts','Wait10s')]"
  ],
  "properties": {
    "model": { "format": "OpenAI", "name": "text-embedding-ada-002", "version": "2" },
    "scaleSettings": { "scaleType": "Standard" },
    "raiPolicyName": "Microsoft.Default"
  }
}

  1. Error message: Encountered an error (InternalServerError) from host runtime.
    I tried many times, but the deployment always ends with this error when deploying func/default.
    I cannot find any detailed error information and hope you can help with this issue.
  1. The redeploy button cannot re-run the deployment. Clicking it goes back to the page shown in the screenshot, and the deployment then reports many conflict errors.
  1. The Bing resource always reports an error.
    I tested creating the Bing resource on its own, and it reported: (ApiSetDisabledForCreation) It's not allowed to create new accounts with type 'Bing.Search.v7'.
    So I removed the Bing resource. Could you check and confirm whether it is still possible to create Bing resources, and what happens to this demo if the Bing resource is not created?

Thanks!
Kevin

Dead code in the project

Thanks for this effort! Is it okay to remove the dead code in the project so it is easier to navigate?

Thank you.

Circular dependency detected

I've done a few deploys with the automation scripts previously. However, all of a sudden I'm getting the following when attempting to run:

{"code":"InvalidTemplate","message":"Deployment template validation failed: 'Circular dependency detected on resource: '/subscriptions/MYSUBSCRIPTIONID/resourceGroups/rcshrchatdemo/providers/Microsoft.Web/sites/rcshrchatfunc'. Please see https://aka.ms/arm-syntax-resources for usage details.'."}

I've tried numerous Resource Group Name and Prefix combinations. I've ensured all my other deploys were cleanly deleted, and I have updated my fork with the latest changes from here.

Not sure what else to try.
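
One way to narrow down an error like this is to collect the dependsOn edges from the generated ARM template and search them for a loop. The sketch below only illustrates that idea: the resource names and the `depends_on` mapping are hypothetical and are not taken from this repo's template.

```python
from typing import Dict, List, Optional


def find_cycle(depends_on: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of names, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {name: WHITE for name in depends_on}
    path: List[str] = []

    def visit(name: str) -> Optional[List[str]]:
        state[name] = GREY
        path.append(name)
        for dep in depends_on.get(name, []):
            dep_state = state.get(dep, WHITE)
            if dep_state == GREY:
                # dep is already on the current path, so the loop is closed here.
                return path[path.index(dep):] + [dep]
            if dep_state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[name] = BLACK
        return None

    for name in list(depends_on):
        if state[name] == WHITE:
            found = visit(name)
            if found:
                return found
    return None


# Hypothetical example: two apps that reference each other's settings.
example = {
    "functionApp": ["storage", "backendApp"],
    "backendApp": ["functionApp"],
    "storage": [],
}
print(find_cycle(example))
```

If a loop shows up, it usually has to be broken by moving one of the references (for example an app setting that points at the other app) out of the resource definition, though where that applies in this template is an open question.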

unable to upload files

How can I upload my personal PDF files to test this? When I go to the upload feature, it asks me for a password.

When uploading multiple files only first file is indexed.

It appears that when uploading multiple files using Cognitive Services only the first file is indexed. All of the uploaded files successfully make it into the chatpdf container but the index only contains references to the first document.

Creative Answers when prompted

Do you know what I can enable to have the bot give creative answers? I tried changing the temperature, but if I ask a question like "Can you create a poem based on the summary of the knowledge base", it is not able to. I changed the prompt in chatpdf/api/Python/ChatGpt/__init__.py but didn't get any results back.

Do I have to use a different chain from Langchain? I want chatpdf to do something like what this is doing: https://www.chatpdf.com/

Thank you so much!

prompt templates chat tab is not using chat history

Hi, I ran into a problem when using the Chat tab to have a conversation with my document. After the first exchange, a follow-up question does not use the chat history. I think this may be because the override prompts section does not take the history into account (only {summaries} and {question} are used).
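
If the override prompt really exposes only {summaries} and {question}, one possible workaround is to add a slot for earlier turns and fill it before the prompt is sent. This is a minimal sketch using plain string formatting; `PROMPT_TEMPLATE`, `format_history`, and `build_prompt` are hypothetical names and not the repo's actual code.

```python
from typing import List, Tuple

# Hypothetical override prompt that adds a {chat_history} slot next to the
# {summaries} and {question} slots mentioned in the issue.
PROMPT_TEMPLATE = (
    "Use the sources and the conversation so far to answer the question.\n"
    "Conversation so far:\n{chat_history}\n"
    "Sources:\n{summaries}\n"
    "Question: {question}\n"
    "Answer:"
)


def format_history(turns: List[Tuple[str, str]], max_turns: int = 3) -> str:
    """Render the most recent (user, assistant) turns as plain text."""
    recent = turns[-max_turns:]
    return "\n".join(f"User: {q}\nAssistant: {a}" for q, a in recent)


def build_prompt(turns: List[Tuple[str, str]], summaries: str, question: str) -> str:
    """Fill the template, marking the history as empty on the first turn."""
    return PROMPT_TEMPLATE.format(
        chat_history=format_history(turns) or "(none)",
        summaries=summaries,
        question=question,
    )


history = [("What is the leave policy?", "20 days per year.")]
print(build_prompt(history, "doc1: leave policy text", "Does it carry over?"))
```

Capping the history at a few turns keeps the prompt within the model's context window; the limit of three is an arbitrary choice for the sketch.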

Clean Deploy To Azure

Using either the deploy script or the button in the documentation, I cannot get a clean deploy. One or two resources fail to deploy every time. It's not always the same resources, but typically it's one of the chat model deployments.

The typical error is 'Conflict' but occasionally 'Bad Request'

Is there a best practice for getting this to deploy with the script?
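
One approach that is often suggested for 'Conflict' errors on model deployments is to create the deployments under one Azure OpenAI account one at a time, by making each depend on the previous one. The sketch below generates such a chain of ARM resources. The variable name and deployment names are illustrative assumptions, and the `properties` section (model, scale settings) is left out for brevity.

```python
import json
from typing import Dict, List


def chain_deployments(account_var: str, names: List[str]) -> List[Dict]:
    """Build ARM deployment resources where each one waits for the previous."""
    # The first deployment only waits for the account itself.
    previous = (
        "[resourceId('Microsoft.CognitiveServices/accounts', "
        f"variables('{account_var}'))]"
    )
    resources = []
    for name in names:
        resources.append({
            "type": "Microsoft.CognitiveServices/accounts/deployments",
            "apiVersion": "2022-12-01",
            "name": f"[concat(variables('{account_var}'), '/{name}')]",
            "dependsOn": [previous],
            # "properties" (model, scaleSettings) omitted in this sketch.
        })
        # The next deployment waits for the one just added.
        previous = (
            "[resourceId('Microsoft.CognitiveServices/accounts/deployments', "
            f"variables('{account_var}'), '{name}')]"
        )
    return resources


chained = chain_deployments("aoaiSvcName", ["chat", "davinci", "text-embedding-ada-002"])
print(json.dumps(chained, indent=2))
```

Whether this removes every failure in this template is not confirmed here; it only removes the parallel creation of deployments on the same account.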

LiteLLM removal?

Hi @akshata29,

I see litellm got merged here #39

But I see it got removed here - 04c45e5.

What was missing for it to be useful? Any feedback here would be helpful.

Capacity error when Redeploying

Upon trying to redeploy after receiving the error mentioned in #26 I am now getting this validation error:

{"code":"InvalidTemplateDeployment","details":[{"code":"InsufficientQuota","message":"The specified capacity '120' of account deployment is bigger than available capacity '0' for UsageName 'Tokens Per Minute (thousands) - GPT-35-Turbo'."}],"message":"The template deployment 'Microsoft.Template-20230630090235' is not valid according to the validation procedure. The tracking id is '072a60c3-87f9-4f1e-95e3-f4a259595e09'. See inner errors for details."}

Any advice?
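
The error body already states both numbers: the template requests a capacity of 120, while 0 is available for that usage name. A small sketch for pulling those two values out of such a response is shown below; the function name `quota_shortfall` is hypothetical, and the body is the one quoted above, shortened to the fields the sketch reads.

```python
import json
import re
from typing import Optional, Tuple

ERROR_BODY = """{"code":"InvalidTemplateDeployment","details":[{"code":"InsufficientQuota","message":"The specified capacity '120' of account deployment is bigger than available capacity '0' for UsageName 'Tokens Per Minute (thousands) - GPT-35-Turbo'."}]}"""


def quota_shortfall(error_body: str) -> Optional[Tuple[int, int]]:
    """Return (requested, available) capacity from an InsufficientQuota error."""
    error = json.loads(error_body)
    for detail in error.get("details", []):
        if detail.get("code") != "InsufficientQuota":
            continue
        match = re.search(
            r"specified capacity '(\d+)'.*available capacity '(\d+)'",
            detail.get("message", ""),
        )
        if match:
            return int(match.group(1)), int(match.group(2))
    return None


print(quota_shortfall(ERROR_BODY))
```

An available capacity of 0 may mean that deployments left over from the earlier failed attempt still hold the quota, but that is a guess rather than something the error confirms.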

Module Not Found 'azure.ai'

I'm receiving the following error when processing documents. It seems to be related to Form Recognizer and occurs regardless of whether I use it or not.

Exception while executing function: Functions.DocGenerator <--- Result: Failure
Exception: ModuleNotFoundError: No module named 'azure.ai'. Please check the requirements.txt file for the missing module. For more info, please refer the troubleshooting guide: https://aka.ms/functions-modulenotfound
Stack:
  File "/azure-functions-host/workers/python/3.9/LINUX/X64/azure_functions_worker/dispatcher.py", line 374, in _handle__function_load_request
    func = loader.load_function(
  File "/azure-functions-host/workers/python/3.9/LINUX/X64/azure_functions_worker/utils/wrappers.py", line 48, in call
    raise extend_exception_message(e, message)
  File "/azure-functions-host/workers/python/3.9/LINUX/X64/azure_functions_worker/utils/wrappers.py", line 44, in call
    return func(*args, **kwargs)
  File "/azure-functions-host/workers/python/3.9/LINUX/X64/azure_functions_worker/loader.py", line 132, in load_function
    mod = importlib.import_module(fullmodname)
  File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/site/wwwroot/DocGenerator/__init__.py", line 28, in <module>
    from Utilities.formrecognizer import analyze_layout, chunk_paragraphs
  File "/home/site/wwwroot/Utilities/formrecognizer.py", line 1, in <module>
    from azure.ai.formrecognizer import DocumentAnalysisClient
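
The last frame of the traceback fails on `from azure.ai.formrecognizer import DocumentAnalysisClient`, which is provided by the `azure-ai-formrecognizer` package on PyPI, so it may be worth checking that the package is listed in requirements.txt and actually installed in the deployed environment. The sketch below is one way to check which modules can be located; `missing_modules` is a hypothetical helper, not part of the repo.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the dotted module names that cannot be located for import."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec raises when a parent package (for example 'azure')
            # is itself absent, so treat that as missing as well.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# The module name below comes from the traceback above.
print(missing_modules(["json", "azure.ai.formrecognizer"]))
```

Running this locally only describes the local environment; the Function App builds its own environment from requirements.txt at deploy time.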

Error deploying citing duplicate SpeechServices

Receiving this, even though I do not have any of these on my account:

{"code":"InvalidTemplateDeployment","details":[{"code":"CanNotCreateMultipleFreeAccounts","message":"Operation failed. Only one free account is allowed for account type 'SpeechServices'."}],"message":"The template deployment 'Microsoft.Template-20230630082430' is not valid according to the validation procedure. The tracking id is '33b77277-8d6f-4e4d-9f05-b37eb40b9d98'. See inner errors for details."}

Any thoughts on how to proceed?

Azure Function BackEnd endpoint access not working.

I installed the required programs on Windows 10 21H2 x64 and ran the source code.
Everything ran normally up to and including the azd up command, but accessing the backend endpoint created in Azure returns the following error:
"Not Found
The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again."

The azd up command itself completed without errors. The logs are as follows.

C:\Users\niceysj\repo\chatgptspk\chatpdf>azd up

Packaging services (azd package)

(✓) Done: Packaging service backend

  • Package Output: C:\Users\niceysj\AppData\Local\Temp\azddeploy2270936496.zip
    (✓) Done: Packaging service functionapp
  • Package Output: C:\Users\niceysj\AppData\Local\Temp\azddeploy1938513666.zip

Provisioning Azure resources (azd provision)
Provisioning Azure resources can take some time

You can view detailed progress in the Azure Portal:
https://portal.azure.com/#blade/HubsExtension/DeploymentDetailsBlade/overview/id/%2Fsubscriptions%2F3e5e369f-3d6e-4c3f-87f4-83c1cca16cb5%2Fproviders%2FMicrosoft.Resources%2Fdeployments%2Fchatgptspk

(✓) Done: Resource group: chatspk
(✓) Done: Application Insights: chatspkappisg
(✓) Done: App Service plan: chatspkasp
(✓) Done: Storage account: chatspkstor
(✓) Done: Storage account: chatspkfuncsa
(✓) Done: Function App: chatspkfunc
(✓) Done: Search service: chatspkazs
(✓) Done: Azure OpenAI: chatspkoai
(✓) Done: App Service: chatspkbackend

Executing predeploy hook => C:\Users\niceysj\AppData\Local\Temp\azd-predeploy-1027173797.ps1
npm WARN deprecated [email protected]: Package moved to @redux-devtools/extension.
npm WARN deprecated [email protected]: Please upgrade to version 7 or higher. Older versions may use Math.random() in certain circumstances, which is known to be problematic. See https://v8.dev/blog/math-random for details.
npm WARN deprecated [email protected]: core-js@<3.23.3 is no longer maintained and not recommended for usage due to the number of issues. Because of the V8 engine whims, feature detection in old core-js versions could cause a slowdown up to 100x even if nothing is polyfilled. Some versions have web compatibility issues. Please, upgrade your dependencies to the actual version of core-js.
npm WARN deprecated [email protected]: core-js@<3.23.3 is no longer maintained and not recommended for usage due to the number of issues. Because of the V8 engine whims, feature detection in old core-js versions could cause a slowdown up to 100x even if nothing is polyfilled. Some versions have web compatibility issues. Please, upgrade your dependencies to the actual version of core-js.
npm WARN deprecated [email protected]: core-js@<3.23.3 is no longer maintained and not recommended for usage due to the number of issues. Because of the V8 engine whims, feature detection in old core-js versions could cause a slowdown up to 100x even if nothing is polyfilled. Some versions have web compatibility issues. Please, upgrade your dependencies to the actual version of core-js.

added 589 packages, and audited 590 packages in 1m

59 packages are looking for funding
run npm fund for details

2 high severity vulnerabilities

To address all issues, run:
npm audit fix

Run npm audit for details.

[email protected] build
tsc && vite build --mode prod

vite v4.1.1 building for prod...
✓ 2703 modules transformed.
../backend/static/index.html 0.46 kB
../backend/static/assets/github-fab00c2d.svg 0.96 kB
../backend/static/assets/index-d593fb67.css 16.45 kB │ gzip: 2.90 kB
../backend/static/assets/index-4215e6c8.js 1,185.83 kB │ gzip: 348.14 kB │ map: 10,008.16 kB

(!) Some chunks are larger than 500 kBs after minification. Consider:

Deploying services (azd deploy)

(✓) Done: Deploying service backend

(✓) Done: Deploying service functionapp

SUCCESS: Your Azure app has been deployed!
You can view the resources created under the resource group chatspk in Azure Portal:
https://portal.azure.com/#@/resource/subscriptions/3e5e369f-3d6e-4c3f-87f4-83c1cca16cb5/resourceGroups/chatspk/overview

C:\Users\niceysj\repo\chatgptspk\chatpdf>npm audit fix

added 93 packages, and audited 94 packages in 3s

12 packages are looking for funding
run npm fund for details

found 0 vulnerabilities
