microsoft / kernel-memory

Index and query any data using LLM and natural language, tracking sources and showing citations.

Home Page: https://microsoft.github.io/kernel-memory

License: MIT License

Languages: C# 98.85%, Shell 0.96%, Batchfile 0.03%, Dockerfile 0.17%
Topics: indexing, llm, memory, rag, semantic-search

kernel-memory's Introduction

Kernel Memory


Kernel Memory (KM) is a multi-modal AI Service specialized in the efficient indexing of datasets through custom continuous data hybrid pipelines, with support for Retrieval Augmented Generation (RAG), synthetic memory, prompt engineering, and custom semantic memory processing.

KM includes a GPT Plugin, web clients, a .NET library for embedded applications, and is available as a Docker container.


Utilizing advanced embeddings and LLMs, the system enables Natural Language querying for obtaining answers from the indexed data, complete with citations and links to the original sources.


Designed for seamless integration as a Plugin with Semantic Kernel, Microsoft Copilot and ChatGPT, Kernel Memory enhances data-driven features in applications built for the most popular AI platforms.

Repository Guidance

This repository presents best practices and a reference architecture for memory in specific AI and LLM application scenarios. Please note that the provided code serves as a demonstration and is not an officially supported Microsoft offering.

Kernel Memory (KM) and Semantic Memory (SM)

Semantic Memory (SM) is a library for C#, Python, and Java that wraps direct calls to databases and supports vector search. It was developed as part of the Semantic Kernel (SK) project and serves as the first public iteration of long-term memory. The core library is maintained in three languages, while the list of supported storage engines (known as "connectors") varies across languages.

Kernel Memory (KM) is a service built on the feedback received and lessons learned from developing Semantic Kernel (SK) and Semantic Memory (SM). It provides several features that would otherwise have to be developed manually, such as storing files, extracting text from files, providing a framework to secure users' data, etc. The KM codebase is entirely in .NET, which eliminates the need to write and maintain features in multiple languages. As a service, KM can be used from any language, tool, or platform, e.g. browser extensions and ChatGPT assistants.

Here are a few notable differences:

| Feature | Semantic Memory | Kernel Memory |
|---|---|---|
| Data formats | Text only | Web pages, PDF, Images, Word, PowerPoint, Excel, Markdown, Text, JSON, more being added |
| Search | Cosine similarity | Cosine similarity, Hybrid search with filters, AND/OR conditions |
| Language support | C#, Python, Java | Any language, command line tools, browser extensions, low-code/no-code apps, chatbots, assistants, etc. |
| Storage engines | Azure AI Search, Chroma, DuckDB, Kusto, Milvus, MongoDB, Pinecone, Postgres, Qdrant, Redis, SQLite, Weaviate | Azure AI Search, Elasticsearch, MongoDB Atlas, Postgres, Qdrant, Redis, SQL Server, In memory KNN, On disk KNN. In progress: Azure Cosmos DB for MongoDB vCore, Chroma |

and features available only in Kernel Memory:

  • RAG (Retrieval Augmented Generation)
  • RAG sources lookup
  • Summarization
  • Security Filters (filter memory by users and groups)
  • Long running ingestion, large documents, with retry logic and durable queues
  • Custom tokenization
  • Document storage
  • OCR via Azure Document Intelligence
  • LLMs (Large Language Models) with dedicated tokenization
  • Cloud deployment
  • OpenAPI
  • Custom storage schema (partially implemented/work in progress)
  • Short Term Memory (partially implemented/work in progress)
  • Concurrent write to multiple vector DBs

Supported Data formats and Backends

Kernel Memory in serverless mode

Kernel Memory works and scales best when running as a service, allowing you to ingest thousands of documents and pieces of information without blocking your app.

However, you can also use Kernel Memory serverless, embedding the MemoryServerless class in your app.

Importing documents into your Kernel Memory can be as simple as this:

var memory = new KernelMemoryBuilder()
    .WithOpenAIDefaults(Env.Var("OPENAI_API_KEY"))
    .Build<MemoryServerless>();

// Import a file
await memory.ImportDocumentAsync("meeting-transcript.docx", tags: new() { { "user", "Blake" } });

// Import multiple files and apply multiple tags
await memory.ImportDocumentAsync(new Document("file001")
    .AddFile("business-plan.docx")
    .AddFile("project-timeline.pdf")
    .AddTag("user", "Blake")
    .AddTag("collection", "business")
    .AddTag("collection", "plans")
    .AddTag("fiscalYear", "2023"));

Asking questions:

var answer1 = await memory.AskAsync("How many people attended the meeting?");

var answer2 = await memory.AskAsync("what's the project timeline?", filter: new MemoryFilter().ByTag("user", "Blake"));

The code leverages the default document ingestion pipeline:

  1. Extract text: recognize the file format and extract the information
  2. Partition the text into small chunks, to optimize search
  3. Extract embeddings using an LLM embedding generator
  4. Save embeddings into a vector index such as Azure AI Search, Qdrant or other DBs.
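
The same pipeline can also be invoked explicitly via the optional steps parameter. A minimal sketch; the step names below are assumptions based on the default handlers and may differ across KM versions:

// Sketch only: step names are assumptions based on the default handlers
await memory.ImportDocumentAsync(
    new Document("doc001").AddFile("meeting-transcript.docx"),
    steps: new[] { "extract", "partition", "gen_embeddings", "save_records" });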

Documents are organized by users, safeguarding their private information. Furthermore, memories can be categorized and structured using tags, enabling efficient search and retrieval through faceted navigation.
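
Tag filters can also be combined. A minimal sketch, assuming chained ByTag calls on a single MemoryFilter act as an AND condition (an assumption about the filter API):

// Sketch: require both tags on the same memory records (AND semantics assumed)
var answer = await memory.AskAsync(
    "what's the project timeline?",
    filter: new MemoryFilter()
        .ByTag("user", "Blake")
        .ByTag("collection", "business"));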

Data lineage, citations

All memories and answers are fully correlated to the data provided. When producing an answer, Kernel Memory includes all the information needed to verify its accuracy:

await memory.ImportFileAsync("NASA-news.pdf");

var answer = await memory.AskAsync("Any news from NASA about Orion?");

Console.WriteLine(answer.Result + "\n");

foreach (var x in answer.RelevantSources)
{
    Console.WriteLine($"  * {x.SourceName} -- {x.Partitions.First().LastUpdate:D}");
}

Yes, there is news from NASA about the Orion spacecraft. NASA has invited the media to see a new test version of the Orion spacecraft and the hardware that will be used to recover the capsule and astronauts upon their return from space during the Artemis II mission. The event is scheduled to take place at Naval Base San Diego on Wednesday, August 2, at 11 a.m. PDT. Personnel from NASA, the U.S. Navy, and the U.S. Air Force will be available to speak with the media. Teams are currently conducting tests in the Pacific Ocean to demonstrate and evaluate the processes, procedures, and hardware for recovery operations for crewed Artemis missions. These tests will help prepare the team for Artemis II, which will be NASA's first crewed mission under the Artemis program. The Artemis II crew, consisting of NASA astronauts Reid Wiseman, Victor Glover, and Christina Koch, and Canadian Space Agency astronaut Jeremy Hansen, will participate in recovery testing at sea next year. For more information about the Artemis program, you can visit the NASA website.

  • NASA-news.pdf -- Tuesday, August 1, 2023

Using Kernel Memory Service

Depending on your scenarios, you might want to run all the code locally inside your process, or remotely through an asynchronous service.

If you're importing small files, only need C#, and can block the process during the import, local in-process execution can be fine, using the MemoryServerless approach seen above.

However, if you are in one of these scenarios:

  • I'd just like a web service to import data and send queries to get answers
  • My app is written in TypeScript, Java, Rust, or some other language
  • I want to define custom pipelines mixing multiple languages like Python, TypeScript, etc
  • I'm importing big documents that can require minutes to process, and I don't want to block the user interface
  • I need memory import to run independently, supporting failures and retry logic

then you can deploy Kernel Memory as a service, plugging in the default handlers or your custom Python/TypeScript/Java/etc. handlers, and leveraging the asynchronous non-blocking memory encoding process, sending documents and asking questions using the MemoryWebClient.

Here you can find a complete set of instructions on how to run the Kernel Memory service.

Quick test using the Docker image

If you want to give the service a quick test, use the following command to start the Kernel Memory Service using OpenAI:

docker run -e OPENAI_API_KEY="..." -it --rm -p 9001:9001 kernelmemory/service

If you prefer using custom settings and services such as Azure OpenAI, Azure Document Intelligence, etc., you should create an appsettings.Development.json file overriding the default values set in appsettings.json, or use the configuration wizard included:

cd service/Service
dotnet run setup

Then run this command to start the Docker image with the configuration just created:

on Windows:

docker run --volume .\appsettings.Development.json:/app/appsettings.Production.json -it --rm -p 9001:9001 kernelmemory/service

on macOS/Linux:

docker run --volume ./appsettings.Development.json:/app/appsettings.Production.json -it --rm -p 9001:9001 kernelmemory/service

To import files using the Kernel Memory web service, use MemoryWebClient:

#reference clients/WebClient/WebClient.csproj

var memory = new MemoryWebClient("http://127.0.0.1:9001"); // <== URL where the web service is running

// Import a file (default user)
await memory.ImportDocumentAsync("meeting-transcript.docx");

// Import a file specifying a Document ID, User and Tags
await memory.ImportDocumentAsync("business-plan.docx",
    new DocumentDetails("[email protected]", "file001")
        .AddTag("collection", "business")
        .AddTag("collection", "plans")
        .AddTag("fiscalYear", "2023"));

Getting answers via the web service

curl http://127.0.0.1:9001/ask -d'{"query":"Any news from NASA about Orion?"}' -H 'Content-Type: application/json'
{
  "Query": "Any news from NASA about Orion?",
  "Text": "Yes, there is news from NASA about the Orion spacecraft. NASA has invited the media to see a new test version of the Orion spacecraft and the hardware that will be used to recover the capsule and astronauts upon their return from space during the Artemis II mission. The event is scheduled to take place at Naval Base San Diego on August 2nd at 11 a.m. PDT. Personnel from NASA, the U.S. Navy, and the U.S. Air Force will be available to speak with the media. Teams are currently conducting tests in the Pacific Ocean to demonstrate and evaluate the processes, procedures, and hardware for recovery operations for crewed Artemis missions. These tests will help prepare the team for Artemis II, which will be NASA's first crewed mission under the Artemis program. The Artemis II crew, consisting of NASA astronauts Reid Wiseman, Victor Glover, and Christina Koch, and Canadian Space Agency astronaut Jeremy Hansen, will participate in recovery testing at sea next year. For more information about the Artemis program, you can visit the NASA website.",
  "RelevantSources": [
    {
      "Link": "...",
      "SourceContentType": "application/pdf",
      "SourceName": "file5-NASA-news.pdf",
      "Partitions": [
        {
          "Text": "Skip to main content\nJul 28, 2023\nMEDIA ADVISORY M23-095\nNASA Invites Media to See Recovery Craft for\nArtemis Moon Mission\n(/sites/default/files/thumbnails/image/ksc-20230725-ph-fmx01_0003orig.jpg)\nAboard the USS John P. Murtha, NASA and Department of Defense personnel practice recovery operations for Artemis II in July. A\ncrew module test article is used to help verify the recovery team will be ready to recovery the Artemis II crew and the Orion spacecraft.\nCredits: NASA/Frank Michaux\nMedia are invited to see the new test version of NASA’s Orion spacecraft and the hardware teams will use\nto recover the capsule and astronauts upon their return from space during the Artemis II\n(http://www.nasa.gov/artemis-ii) mission. The event will take place at 11 a.m. PDT on Wednesday, Aug. 2,\nat Naval Base San Diego.\nPersonnel involved in recovery operations from NASA, the U.S. Navy, and the U.S. Air Force will be\navailable to speak with media.\nU.S. media interested in attending must RSVP by 4 p.m., Monday, July 31, to the Naval Base San Diego\nPublic Affairs (mailto:[email protected]) or 619-556-7359.\nOrion Spacecraft (/exploration/systems/orion/index.html)\nNASA Invites Media to See Recovery Craft for Artemis Moon Miss... https://www.nasa.gov/press-release/nasa-invites-media-to-see-recov...\n1 of 3 7/28/23, 4:51 PMTeams are currently conducting the first in a series of tests in the Pacific Ocean to demonstrate and\nevaluate the processes, procedures, and hardware for recovery operations (https://www.nasa.gov\n/exploration/systems/ground/index.html) for crewed Artemis missions. The tests will help prepare the\nteam for Artemis II, NASA’s first crewed mission under Artemis that will send four astronauts in Orion\naround the Moon to checkout systems ahead of future lunar missions.\nThe Artemis II crew – NASA astronauts Reid Wiseman, Victor Glover, and Christina Koch, and CSA\n(Canadian Space Agency) astronaut Jeremy Hansen – will participate in recovery testing at sea next year.\nFor more information about Artemis, visit:\nhttps://www.nasa.gov/artemis (https://www.nasa.gov/artemis)\n-end-\nRachel Kraft\nHeadquarters, Washington\n202-358-1100\[email protected] (mailto:[email protected])\nMadison Tuttle\nKennedy Space Center, Florida\n321-298-5868\[email protected] (mailto:[email protected])\nLast Updated: Jul 28, 2023\nEditor: Claire O’Shea\nTags:  Artemis (/artemisprogram),Ground Systems (http://www.nasa.gov/exploration/systems/ground\n/index.html),Kennedy Space Center (/centers/kennedy/home/index.html),Moon to Mars (/topics/moon-to-\nmars/),Orion Spacecraft (/exploration/systems/orion/index.html)\nNASA Invites Media to See Recovery Craft for Artemis Moon Miss... https://www.nasa.gov/press-release/nasa-invites-media-to-see-recov...\n2 of 3 7/28/23, 4:51 PM",
          "Relevance": 0.8430657,
          "SizeInTokens": 863,
          "LastUpdate": "2023-08-01T08:15:02-07:00"
        }
      ]
    }
  ]
}

You can find a full example here.

Custom memory ingestion pipelines

On the other hand, if you need a custom data pipeline, you can also customize the steps, which will be handled by your custom business logic:

// Memory setup, e.g. how to calculate and where to store embeddings
var memoryBuilder = new KernelMemoryBuilder()
    .WithoutDefaultHandlers()
    .WithOpenAIDefaults(Env.Var("OPENAI_API_KEY"));

var memory = memoryBuilder.Build();

// Plug in custom .NET handlers
memory.Orchestrator.AddHandler<MyHandler1>("step1");
memory.Orchestrator.AddHandler<MyHandler2>("step2");
memory.Orchestrator.AddHandler<MyHandler3>("step3");

// Use the custom handlers with the memory object
await memory.ImportDocumentAsync(
    new Document("mytest001")
        .AddFile("file1.docx")
        .AddFile("file2.pdf"),
    steps: new[] { "step1", "step2", "step3" });
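
For context, a custom handler is a class registered by name with the orchestrator. A minimal sketch follows; the exact IPipelineStepHandler contract (return type, constructor arguments) is an assumption that may differ across KM versions:

// Minimal custom handler sketch; interface details are assumptions
public class MyHandler1 : IPipelineStepHandler
{
    public string StepName { get; }

    public MyHandler1(string stepName)
    {
        this.StepName = stepName;
    }

    public async Task<(bool success, DataPipeline updatedPipeline)> InvokeAsync(
        DataPipeline pipeline, CancellationToken cancellationToken = default)
    {
        // Inspect or transform the pipeline's files here, then report success
        // so the orchestrator can move on to the next step.
        await Task.CompletedTask;
        return (true, pipeline);
    }
}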

Web API specs

The API schema is available at http://127.0.0.1:9001/swagger/index.html when running the service locally with OpenAPI enabled.

Examples and Tools

Examples

  1. Collection of Jupyter notebooks with various scenarios
  2. Using Kernel Memory web service to upload documents and answer questions
  3. Using KM Plugin for Semantic Kernel
  4. Importing files and asking question without running the service (serverless mode)
  5. Processing files with custom logic (custom handlers) in serverless mode
  6. Processing files with custom logic (custom handlers) in asynchronous mode
  7. Upload files and ask questions from command line using curl
  8. Customizing RAG and summarization prompts
  9. Custom partitioning/text chunking options
  10. Using a custom embedding/vector generator
  11. Using custom LLMs
  12. Using LLama
  13. Summarizing documents, using synthetic memories
  14. Writing and using a custom ingestion handler
  15. Running a single asynchronous pipeline handler as a standalone service
  16. Test project using KM package from nuget.org
  17. Integrating Memory with ASP.NET applications and controllers
  18. Sample code showing how to extract text from files
  19. .NET configuration and logging
  20. Expanding chunks retrieving adjacent partitions
  21. Using local models via LM Studio

Tools

  1. .NET appsettings.json generator
  2. Curl script to upload files
  3. Curl script to ask questions
  4. Curl script to search documents
  5. Script to start Qdrant for development tasks
  6. Script to start Elasticsearch for development tasks
  7. Script to start MS SQL Server for development tasks
  8. Script to start Redis for development tasks
  9. Script to start RabbitMQ for development tasks
  10. Script to start MongoDB Atlas for development tasks

.NET packages

  • Microsoft.KernelMemory.WebClient: The web client library, which can be used to call a running instance of the Memory web service. .NET Standard 2.0 compatible.

    Nuget package Example code

  • Microsoft.KernelMemory.SemanticKernelPlugin: a Memory plugin for Semantic Kernel, replacing the original Semantic Memory available in SK. .NET Standard 2.0 compatible.

    Nuget package Example code

  • Microsoft.KernelMemory.Abstractions: The internal interfaces and models shared by all packages, used to extend KM to support third party services. .NET Standard 2.0 compatible.

    Nuget package

  • Microsoft.KernelMemory.MemoryDb.AzureAISearch: Memory storage using Azure AI Search.

    Nuget package

  • Microsoft.KernelMemory.MemoryDb.Postgres: Memory storage using PostgreSQL.

    Nuget package

  • Microsoft.KernelMemory.MemoryDb.Qdrant: Memory storage using Qdrant.

    Nuget package

  • Microsoft.KernelMemory.AI.AzureOpenAI: Integration with Azure OpenAI LLMs.

    Nuget package

  • Microsoft.KernelMemory.AI.LlamaSharp: Integration with LLama LLMs.

    Nuget package

  • Microsoft.KernelMemory.AI.OpenAI: Integration with OpenAI LLMs.

    Nuget package

  • Microsoft.KernelMemory.DataFormats.AzureAIDocIntel: Integration with Azure AI Document Intelligence.

    Nuget package

  • Microsoft.KernelMemory.Orchestration.AzureQueues: Ingestion and synthetic memory pipelines via Azure Queue Storage.

    Nuget package

  • Microsoft.KernelMemory.Orchestration.RabbitMQ: Ingestion and synthetic memory pipelines via RabbitMQ.

    Nuget package

  • Microsoft.KernelMemory.ContentStorage.AzureBlobs: Used to store content on Azure Storage Blobs.

    Nuget package

  • Microsoft.KernelMemory.Core: The core library, which can be used to build custom pipelines and handlers. It contains a serverless client to use memory in a synchronous way, without the web service. .NET 6+.

    Nuget package Example code

Packages for Python, Java and other languages

The Kernel Memory service offers a Web API out of the box, including OpenAPI/swagger documentation that you can leverage to test the API and create custom web clients. For instance, after starting the service locally, see http://127.0.0.1:9001/swagger/index.html.
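
For languages without a ready-made client, calling the Web API directly is straightforward. A sketch in C#, mirroring the curl example earlier on this page (endpoint and payload shape taken from that example):

using System.Text;

// Minimal hand-rolled client for the documented /ask endpoint
using var http = new HttpClient { BaseAddress = new Uri("http://127.0.0.1:9001") };
var payload = new StringContent(
    "{\"query\":\"Any news from NASA about Orion?\"}",
    Encoding.UTF8, "application/json");
var response = await http.PostAsync("/ask", payload);
Console.WriteLine(await response.Content.ReadAsStringAsync());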

A .NET Web Client and a Semantic Kernel plugin are available; see the NuGet packages above.

A Python package with a Web Client and Semantic Kernel plugin will soon be available. We also welcome PR contributions to support more languages.

Contributors

afederici75, alexibraimov, alkampfergit, amomra, anthonypuppo, cherchyk, crickman, dluc, DM-98, GraemeJones104, kbeaugrand, lecramr, luismanez, marcominerva, neel015, pascalberger, pawarsum12, qihangnet, slapointe, slorello89, TaoChenOSU, teresaqhoang, valkozaur, vicperdana, xbotter


kernel-memory's Issues

DiversitySampling with Memory for DocumentQA

Moved from semantic-kernel

For document QA, when using Memory and filtering solely by relevance_score, you end up with results that are mostly from similar sections of the same document. However, when a comprehensive answer considering the overall content is necessary, utilizing various parts from diverse documents rather than just including sections from a single document in the prompt can be beneficial.

After reading the provided link, I found it to be a good idea and propose integrating it as a feature used in conjunction with Memory.

This post introduces the concepts of DiversityRanker and LostInTheMiddleRanker. The concept of LostInTheMiddleRanker is closely related to the overall structure of prompts, so I didn't include it in this proposal for now, as it might be challenging to integrate solely with Memory.

Implementation code in Haystack is here

Unable to resolve type IOcrEngine

var memory = new MemoryClientBuilder()
    .WithOpenAIDefaults(apiKey)
    .WithCustomImageOcr(new TesseractEngineWrapper(new TesseractEngine(tesseractOptions!.FilePath, tesseractOptions!.Language, EngineMode.Default)))
    .WithQdrant($"{qdrantOptions.Host}:{qdrantOptions.Port}")
    .Build();

services.AddSingleton(memory);

public DocumentController(ILogger<DocumentController> logger,
    IOptions<DocumentOptions> documentOptions,
    IAuthInfo authInfo,
    IOcrEngine ocrEngine,
    DocumentTypeProvider documentTypeProvider)
{
    _logger = logger;
    _documentOptions = documentOptions.Value;
    _authInfo = authInfo;
    _ocrEngine = ocrEngine;
    _documentTypeProvider = documentTypeProvider;
}

When injecting IOcrEngine into the controller via DI, this exception is thrown:
Unable to resolve service for type 'Microsoft.SemanticMemory.DataFormats.Image.IOcrEngine' while attempting to activate 'HaoAI.Service.Controllers.DocumentController'

Generate embeddings using batch requests

Original issue: microsoft/semantic-kernel#2168 from @Nurgo

Hi,

I've been working with SemanticKernel and I appreciate all the hard work you've put into this amazing project.

I'm currently encountering a limitation with the ISemanticTextMemory.SaveInformationAsync method, which only processes one text segment at a time, causing many calls to OpenAI's APIs when indexing large documents.

Would it be possible to modify this function to allow batch processing in a single call? Since IEmbeddingGeneration.GenerateEmbeddingsAsync already supports batching, it might be a feasible and impactful improvement.

Thanks for considering this suggestion.

Setting a Filter with a MemoryFilter.MinRelevance value makes the Az Search request fail.

I don't know if this is something we can do in Az search, but I'd like to get documents with Score/Relevance greater than a given value.

The SearchAsync (and AskAsync) methods have a Filter parameter, and there is a MinRelevance property. However, this property is not used later in the code, which generates an Azure Search API request with an empty Filter parameter, so it fails. This snippet fails:

        var data = await _memory.SearchAsync(
            query: _question,
            index: Constants.AzureSearchIndexName,
            filter: new MemoryFilter { MinRelevance = 0.8f },
            limit: 10
        );

I'd like to have that filter by score because I'm seeing the following behaviour:
I have indexed just 3 documents; 2 of them are related to Domain Driven Design, and the 3rd one is the Wikipedia intro about Spider-Man. I then ask the AskAsync method the following question: "When should an Entity be considered an aggregate in DDD?"

In the relevant results, I'm getting Citations/Parts from the Spider-Man document. Obviously, they are sorted below the ones related to DDD, but since facts are added to the prompt depending on token size when it is composed, you can end up sending facts from the Spider-Man document. Being able to tell the search query "only return documents having score > 0.8" would avoid sending Spider-Man facts to the prompt.
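
Until MinRelevance is honored server-side, one possible workaround is filtering the returned citations client-side, since each partition already carries a Relevance score (visible in the /ask response example earlier on this page). A sketch:

// Client-side workaround: drop low-relevance citations before building the prompt
var data = await _memory.SearchAsync(query: _question, index: Constants.AzureSearchIndexName, limit: 10);
var relevant = data.Results
    .Where(citation => citation.Partitions.Any(p => p.Relevance >= 0.8f))
    .ToList();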

[Discussion] Why not rely on SK connectors?

Semantic Memory seems excellent for the projects I'm addressing and I think I'll consider going with it fairly soon.

On the other hand, I'm surprised to see that this solution doesn't rely on the Semantic Kernel connectors.
What's more, it doesn't provide an abstraction package enabling us to "easily" integrate other storage providers.
I understand that the functionality needed for this solution was not available, but perhaps an evolution would have been necessary?

At present, only Cognitive Search and Qdrant are supported, which is good enough to start with, but may slow down the scale-up of these solutions in the enterprise.

Are there any plans to allow configuring the memory service with your own storage provider?

Sharing an improvement: a High Customizable Text Extractor.

Hey Guys!

Below you will find an attached file that makes it easier to override the extraction method when customizing a new pipeline. Initially developed for personal use, I believe it might be beneficial for you as well. Here is an illustrative example:

var mbuilder = new MemoryClientBuilder();
var memory = mbuilder.Build();
var orchestrator = mbuilder.GetOrchestrator();

// Replacing the default MsWordDecoder
var textExtractor = new TextExtractionHandler("extraction", orchestrator);
textExtractor.AddExtractor(
    (pipeline, file, content, ctoken) => { 
        // return new MsWordDecoder().DocToText(content); 
        return new MyDecoder().DocToText(content);  
    },
    MimeTypes.MsWord
);

Best Regards,
Sandro Bihaiko.

TextExtractionHandler.cs.txt

Expose Partition "Link" to lookup document snippet

The existing Citation.Link refers to the entire logical document. We would also like to look up the specific citation partition text, not just the entire document.

Could Citation.Partition also expose a Link property?

"text-embedding-ada-003" Does not exist (yet)

It seems like some next-generation of ada was used for embedding, but afaik it doesn't exist yet publicly, so it causes errors to try and use. Likely references need updated to 002 instead at least until 003 is publicly released.

Expose loaded documents list

Hi,

It should be possible to list all documents that have been loaded into memory:

var memory = new MemoryClientBuilder()
    .WithOpenAIDefaults("ssdsd1234")
    .Build();

foreach (var doc in memory.GetLoadedDocuments())
{
    Console.WriteLine($" * {doc.SourceName} -- {doc.LastUpdate:D}");
}

IDEA: Support for Consumption container app; Docker

Currently the pipeline polls the queue (Azure storage queue) every second, hardcoded. This is not optimal when e.g. using Consumption Container Apps and Docker, since it always keeps replicas awake.

It would be preferable for this to be configurable, so the memorypipeline could go into sleep mode and be woken up by ACA scaling.
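
One possible shape for this is an adaptive polling loop that backs off while the queue is idle. A sketch; TryDispatchNextMessageAsync is a hypothetical helper, not an existing queue method:

// Hypothetical adaptive polling: back off up to 60s while idle so replicas
// can scale down, and reset to 1s as soon as a message is dispatched.
var delay = TimeSpan.FromSeconds(1);
while (!cancellationToken.IsCancellationRequested)
{
    bool dispatched = await TryDispatchNextMessageAsync(cancellationToken); // hypothetical
    delay = dispatched
        ? TimeSpan.FromSeconds(1)
        : TimeSpan.FromSeconds(Math.Min(delay.TotalSeconds * 2, 60));
    await Task.Delay(delay, cancellationToken);
}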

Consistently handle index not found error in all memory storage solutions

Currently, when searching in an index that doesn't exist, Azure Cognitive Search and Simple Storage will return an empty list, while Qdrant will throw an exception. This makes it difficult for clients to handle errors consistently. Introduce a new exception, IndexNotFoundException, that all memory solutions should throw when searching in an index that does not exist.
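
With such an exception, client code could handle the missing-index case uniformly across backends. A sketch; IndexNotFoundException is the proposed type, not an existing one:

try
{
    var results = await memory.SearchAsync("some query", index: "index-that-may-not-exist");
}
catch (IndexNotFoundException) // proposed exception, not yet in the codebase
{
    // Same handling regardless of the storage engine in use
}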

Unable to get web service running locally

I am trying to get the SM service running locally so I can refactor my project to use the v1 SK release, but I can't get it working. I've followed everything that I see in the readme, and I've ensured the env variable is set to "Development". Below is what happens when I try to run the service. I am getting an error about an empty string value for the parameter 'modelId'.

I searched "modelId" and saw no other issues raised. Is there some step missing in the readme that I wouldn't know to take?

PS **\semantic-memory\dotnet\Service> ./run
  Determining projects to restore...
  All projects are up-to-date for restore.
MSBuild version 17.7.3+8ec440e68 for .NET
  Determining projects to restore...
  All projects are up-to-date for restore.
  ClientLib -> **\semantic-memory\dotnet\ClientLib\bin\Debug\netstandard2.0\Microsoft.SemanticMemory.Client.dll
  CoreLib -> **\semantic-memory\dotnet\CoreLib\bin\Debug\net6.0\Microsoft.SemanticMemory.Core.dll
  InteractiveSetup -> **\semantic-memory\dotnet\InteractiveSetup\bin\Debug\net6.0\Microsoft.SemanticMemory.InteractiveS
  etup.dll
  Service -> **\semantic-memory\dotnet\Service\bin\Debug\net6.0\Microsoft.SemanticMemory.Service.dll

Build succeeded.
    0 Warning(s)
    0 Error(s)

Time Elapsed 00:00:01.44
Unhandled exception. System.ArgumentException: The value cannot be an empty string or composed entirely of whitespace. (Parameter 'modelId')
   at Microsoft.SemanticKernel.Diagnostics.Verify.ThrowArgumentWhiteSpaceException(String paramName)
   at Microsoft.SemanticKernel.Connectors.AI.OpenAI.AzureSdk.AzureOpenAIClientBase..ctor(String modelId, String endpoint, TokenCredential credential, HttpClient httpClient, ILoggerFactory loggerFactory)
   at Microsoft.SemanticKernel.Connectors.AI.OpenAI.TextEmbedding.AzureTextEmbeddingGeneration..ctor(String modelId, String endpoint, TokenCredential credential, HttpClient httpClient, ILoggerFactory loggerFactory)
   at Microsoft.SemanticMemory.DependencyInjection.<>c__DisplayClass0_0.<AddAzureOpenAIEmbeddingGeneration>b__0(IServiceProvider serviceProvider) in **\semantic-memory\dotnet\CoreLib\AI\AzureOpenAI\DependencyInjection.cs:line 38
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite callSite, TArgument argument)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite callSite, RuntimeResolverContext context)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.Resolve(ServiceCallSite callSite, ServiceProviderEngineScope scope)
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(Type serviceType)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(Type serviceType, ServiceProviderEngineScope serviceProviderEngineScope)
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetService[T](IServiceProvider provider)   at Microsoft.SemanticMemory.MemoryClientBuilder.FromConfiguration(SemanticMemoryConfig config, IConfiguration servicesConfiguration) in **\semantic-memory\dotnet\CoreLib\AppBuilders\MemoryClientBuilder.cs:line 304
   at Microsoft.SemanticMemory.MemoryClientBuilder.FromAppSettings(String settingsDirectory) in **\semantic-memory\dotnet\CoreLib\AppBuilders\MemoryClientBuilder.cs:line 218
   at Program.<Main>$(String[] args) in **\semantic-memory\dotnet\Service\Program.cs:line 42
PS **\semantic-memory\dotnet\Service>

Streaming AskAsync response

Hi,

We're using your library in our project. It got us up to speed pretty fast on these new AI topics, so thank you for that :).

I was wondering if there is an option to receive a streaming response from our semantic memory service. This would improve the user experience: receiving the first tokens of the response instead of waiting for the whole thing. OpenAI's ChatGPT works this way.

BR,
Dawid
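
For illustration, a streaming variant could look like the sketch below; AskStreamingAsync is a hypothetical API shape, not an existing KM method:

// Hypothetical streaming API: yield tokens as the LLM produces them
await foreach (var token in memory.AskStreamingAsync("Any news from NASA about Orion?"))
{
    Console.Write(token);
}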

SimpleQueues doesn't mark a message as being processed

This issue was discovered while resolving a bug.
First, the bug:
In https://github.com/microsoft/semantic-memory/blob/main/dotnet/CoreLib/Pipeline/Queue/DevTools/SimpleQueues.cs#L163, the _busy flag is not reset at the end, preventing the queue from dispatching messages.

The issue: a message will be put into the queue multiple times while it's being processed.
Error:

Unhandled exception. Unhandled exception. warn: Microsoft.SemanticMemory.Pipeline.DistributedPipelineOrchestrator[0]
      Unable to save pipeline status
      System.IO.IOException: The process cannot access the file 'C:\Users\taochen\Projects\semantic-memory\dotnet\Service\tmp-content-storage\default\bf75a9e4f76547809ffe8625d41e430c202309180155594949389\__pipeline_status.json' because it is being used by another process.
         at Microsoft.Win32.SafeHandles.SafeFileHandle.CreateFile(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options)
         at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
         at System.IO.File.OpenHandle(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
         at System.IO.File.WriteToFileAsync(String path, FileMode mode, String contents, Encoding encoding, CancellationToken cancellationToken)
         at Microsoft.SemanticMemory.ContentStorage.DevTools.SimpleFileStorage.WriteTextFileAsync(String index, String documentId, String fileName, String fileContent, CancellationToken cancellationToken) in C:\Users\taochen\Projects\semantic-memory\dotnet\CoreLib\ContentStorage\DevTools\SimpleFileStorage.cs:line 55
         at Microsoft.SemanticMemory.Pipeline.BaseOrchestrator.UpdatePipelineStatusAsync(DataPipeline pipeline, CancellationToken cancellationTokenSystem.IO.IOException: The process cannot access the file 'C:\Users\taochen\Projects\semantic-memory\dotnet\Service\tmp-content-storage\default\bf75a9e4f76547809ffe8625d41e430c202309180155594949389\content.txt.partition.0.txt' because it is being used by another process.
   at Microsoft.Win32.SafeHandles.SafeFileHandle.CreateFile(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
   at System.IO.Strategies.FileStreamHelpers.ChooseStrategyCore(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
   at System.IO.File.Create(String path)
   at Microsoft.SemanticMemory.ContentStorage.DevTools.SimpleFileStorage.WriteStreamAsync(String index, String documentId, String fileName, Stream contentStream, CancellationToken cancellationToken) in C:\Users\taochen\Projects\semantic-memory\dotnet\CoreLib\ContentStorage\DevTools\SimpleFileStorage.cs:line 70
   at Microsoft.SemanticMemory.Handlers.TextPartitioningHandler.InvokeAsync(DataPipeline pipeline, CancellationToken cancellationToken) in C:\Users\taochen\Projects\semantic-memory\dotnet\CoreLib\Handlers\TextPartitioningHandler.cs:line 115
   at Microsoft.SemanticMemory.Pipeline.DistributedPipelineOrchestrator.RunPipelineStepAsync(DataPipeline pipeline, IPipelineStepHandler handler, CancellationToken cancellationToken) in C:\Users\taochen\Projects\semantic-memory\dotnet\CoreLib\Pipeline\DistributedPipelineOrchestrator.cs:line 148
   at Microsoft.SemanticMemory.Pipeline.DistributedPipelineOrchestrator.<>c__DisplayClass3_0.<<AddHandlerAsync>b__0>d.MoveNext() in C:\Users\taochen\Projects\semantic-memory\dotnet\CoreLib\Pipeline\DistributedPipelineOrchestrator.cs:line 80
--- End of stack trace from previous location ---
   at Microsoft.SemanticMemory.Pipeline.Queue.DevTools.SimpleQueues.<>c__DisplayClass17_0.<<OnDequeue>b__0>d.MoveNext() in C:\Users\taochen\Projects\semantic-memory\dotnet\CoreLib\Pipeline\Queue\DevTools\SimpleQueues.cs:line 134
--- End of stack trace from previous location ---
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()

To reproduce:

  1. Run the service project with SimpleQueues
  2. Launch example 002.

Here https://github.com/microsoft/semantic-memory/blob/main/dotnet/CoreLib/Pipeline/Queue/DevTools/SimpleQueues.cs#L178, a message is read from a file and added to a set, which makes sure the queue will not hold duplicated messages. However, when a message is dispatched, it is not flagged as in progress. The next time the queue dispatches messages, the same message gets dispatched again: https://github.com/microsoft/semantic-memory/blob/main/dotnet/CoreLib/Pipeline/Queue/DevTools/SimpleQueues.cs#L207. This creates more than one task trying to hold on to the same resource, or leaves downstream steps unable to process some resources.
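
A possible fix is to track dispatched messages as in-flight and skip them until processing completes. A sketch; the field and method names are hypothetical, not actual SimpleQueues members:

// Hypothetical bookkeeping: dispatch a message only if it is not already
// in flight, and release it when processing ends (success or failure).
private readonly HashSet<string> _inFlight = new();

private bool TryBeginProcessing(string messageId)
{
    lock (this._inFlight) { return this._inFlight.Add(messageId); }
}

private void EndProcessing(string messageId)
{
    lock (this._inFlight) { this._inFlight.Remove(messageId); }
}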

Provide a service method to list existing indexes in the Semantic Memory

It's OK for now that providing an index that doesn't exist (when uploading a file) creates the index, but if I want to create an 'admin' app that lets me see and maintain my indexes, I have to hardcode my valid index names somewhere so I can see/select them in my app, or roll my own code in the dotnet service to list the indexes that currently exist in whatever vector DB I am set up to use.

For the record, I really like this project and am looking forward to its rapid growth. It really makes it simple to implement low-cost POCs to demonstrate value as we move towards more investment in AI.

Suggestion: There is a dotnet Faiss Index Wrapper available for a more efficient local vector store

https://github.com/fwaris/FaissNet (and nuget package)

It uses P/Invoke over a C++/CMake intermediate wrapper that internally wraps the Meta FAISS code.

Currently it supports win64 only, but it can be extended to Linux and macOS (because of CMake).

Searches can leverage the efficient Hierarchical Navigable Small Worlds (HNSW) index algorithm.

It's purely the vector index, so another DB (e.g. SQLite) is needed for additional functionality, e.g. for text and metadata.

It does not compete with cloud services, e.g. Azure / Qdrant / Pinecone (although some of them may internally use the same FAISS code referenced here), but may be useful for local deployment scenarios where higher performance is required.

I would appreciate it if you could consider adopting this project. Currently, I don't have much time to devote to it.

SearchClient overwrites Citation state in SearchAsync

Lines 121-124 overwrite existing citation state. For Link, SourceContentType, and SourceName this will assign identical state, except in the case of multiple document imports... in which case Link will be last-write-wins.

For Tags, however, divergent states are assigned (reserved tags will be last-write-wins).

https://github.com/microsoft/semantic-memory/blob/main/dotnet/CoreLib/Search/SearchClient.cs

            // If the file is already in the list of citations, only add the partition
            var citation = result.Results.FirstOrDefault(x => x.Link == linkToFile);
            if (citation == null)
            {
                citation = new Citation();
                result.Results.Add(citation);
            }

            // Add the partition to the list of citations
            citation.Link = linkToFile;
            citation.SourceContentType = fileContentType;
            citation.SourceName = fileName;
            citation.Tags = memory.Tags;

#pragma warning disable CA1806 // it's ok if parsing fails
            DateTimeOffset.TryParse(memory.Payload[Constants.ReservedPayloadLastUpdateField].ToString(), out var lastUpdate);
#pragma warning restore CA1806

            citation.Partitions.Add(new Citation.Partition
            {
                Text = partitionText,
                Relevance = (float)relevance,
                LastUpdate = lastUpdate,
            });

Support ChatCompletion streaming over Completions

The OpenAI/AzureOpenAI TextGeneration classes use the legacy Completions API (deprecated by OpenAI; see the announcement and Azure model compatibility).

This poses a challenge when working with Azure OpenAI Service in particular, as you are required to deploy a gpt-35-turbo model of version 0301, which is deprecated; due to quota limits on standard accounts, you are unlikely to be able to deploy that plus the embeddings model and a 0613 model of gpt-35-turbo for the application to use.

Having Semantic Memory move off the Completions API to ChatCompletions would unblock usage in AOAI applications and ensure that applications aren't caught in the upcoming deprecations.

Reading a document from a nested directory causes an error while saving memory on disk

When trying to load a document from a subdirectory, I get the error "System.IO.DirectoryNotFoundException":


Repo for reproduce:
https://github.com/TomaszGrzmilas/AIAssistant.git

Detail error message:

fail: Microsoft.SemanticMemory.Pipeline.BaseOrchestrator[0]
Pipeline start failed
System.IO.DirectoryNotFoundException: Could not find a part of the path 'D:\Programowanie\AI DEVS\Semantic Kernel\AIAssistant\tmp-memory-files\default\2021.txt\Documents\2021.txt'.
at Microsoft.Win32.SafeHandles.SafeFileHandle.CreateFile(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options)
at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
at System.IO.Strategies.FileStreamHelpers.ChooseStrategyCore(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
at System.IO.File.Create(String path)
at Microsoft.SemanticMemory.ContentStorage.DevTools.SimpleFileStorage.WriteStreamAsync(String index, String documentId, String fileName, Stream contentStream, CancellationToken cancellationToken)
at Microsoft.SemanticMemory.Pipeline.BaseOrchestrator.UploadFormFilesAsync(DataPipeline pipeline, CancellationToken cancellationToken)
at Microsoft.SemanticMemory.Pipeline.BaseOrchestrator.UploadFilesAsync(DataPipeline currentPipeline, CancellationToken cancellationToken)
at Microsoft.SemanticMemory.Pipeline.InProcessPipelineOrchestrator.RunPipelineAsync(DataPipeline pipeline, CancellationToken cancellationToken)
at Microsoft.SemanticMemory.Pipeline.BaseOrchestrator.ImportDocumentAsync(String index, DocumentUploadRequest uploadRequest, CancellationToken cancellationToken)

Clearly define public API and differentiate internal contracts

As part of the transition from preview to production, it may make sense to reduce the public surface area. Even though this is an open source project, people taking dependencies (for whatever reason) on internal contracts that are incorrectly scoped will not be delighted as changes progress.

Goal: Scope classes that do not have public intent as internal:

Example: AzureCognitiveSearchMemoryRecord - microsoft/semantic-kernel#2678

IContentStorage Delete methods fail when DocumentId contains invalid char

The different ImportDocument methods work with a Document class. When setting the DocumentId, that class does some replacements to avoid special chars:

public string Id
    {
        get { return this._id; }
        set
        {
            this._id = string.IsNullOrWhiteSpace(value)
                ? FsNameToId(RandomId())
                : FsNameToId(value);
        }
    }

Basically, it applies this regular expression:

private static readonly Regex s_replaceSymbolsRegex = new(@"[\s|\||\\|/|\0|'|\`|""|:|;|,|~|!|?|*|+|\-|=|_|^|@|#|$|%|&]");

So, if your documentId is "hello-world", it changes to "hello_world", and all the temp files are created within the "hello_world" folder.

However, that regular expression is not applied when the DocumentId is passed as a string to the different Delete methods in the IContentStorage implementations. So, if you try to delete the document, the temp folder is not deleted: the code tries to delete the "hello-world" folder, which does not exist, as it was created as "hello_world". Even worse, you end up with 2 different folders: "hello-world" (created by the Delete handler, leaving just the pipeline_status.json) and "hello_world", created by the extraction handler.

I think the easiest solution is, in the SimpleFileStorage class:

    private string GetDocumentPath(string index, string documentId)
    {
        return Path.Join(this.GetIndexPath(index), Document.FsNameToId(documentId));
    }

And in the AzureBlobStorage class:

    private static string JoinPaths(string index, string documentId)
    {
        return $"{index}/{Document.FsNameToId(documentId)}";
    }

However, I'm not sure whether we shouldn't refactor these classes to pass a Document object directly...

Happy to do a PR if my approach sounds good @dluc

Thanks!

add a method to import content as a string instead of a document

Would it be possible to add an overload, ImportDocumentContentAsync, that accepts the string representation of a document as input instead of the doc file? This would help integration with our solution, which pre-processes the doc and has the string value available.

Instead of

await memory.ImportDocumentAsync("meeting-transcript.docx", tags: new() { { "user", "Blake" } });

you could:

await memory.ImportDocumentContentAsync("DocumentContent", tags: new() { { "user", "Blake" } });
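
As a side note, the serverless client shown elsewhere on this page already exposes ImportTextAsync (see the Azure content-filtering issue below), which appears to cover this scenario; whether it accepts the same tags parameter is an assumption:

// ImportTextAsync appears later on this page; the tags parameter is assumed
await memory.ImportTextAsync("DocumentContent", tags: new() { { "user", "Blake" } });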

Simple* services not reading directory from configuration

Simple* services use hardcoded config values.

appsettings.json

      "FileSystemContentStorage": {
        "Directory": "C:/Memory/Cache"
      },
      "FileBasedQueue": {
        "Directory": "C:/Memory/Queue"
      }

MemoryClientBuilder.cs

        this.WithSimpleFileStorage(new SimpleFileStorageConfig { Directory = "tmp-memory-files" });
        this.WithSimpleVectorDb(new SimpleVectorDbConfig { Directory = "tmp-memory-vectors" });

Sample 008-dotnet-nl2sql Issues

A couple of issues to help new downloaders.

  • The Semantic functions are in the wrong folders, as they are reversed.
  • minRelevanceScore: 0.75 found nothing; I had to change it to 0.5. It might be good to lower the default for the sample.
  • Path (I just saw this has been fixed)

[QUESTION] Should temp files be deleted after ImportDocumentAsync completes?

I've been doing some tests with memory.ImportDocumentAsync (using either the local file system or Azure Blob storage). In both cases, the temp files are not deleted when the pipeline finishes successfully.
As they are temp files, shouldn't they be deleted by default after the pipeline completes?

According to the source code, the BaseOrchestrator class has a CleanUpAfterCompletionAsync method, but that method is only executed if you add the delete_document handler to the pipeline. Besides, the code in the orchestrators is misleading:

if (pipeline.Complete)
        {
            this.Log.LogInformation("Pipeline '{0}/{1}' complete", pipeline.Index, pipeline.DocumentId);

            // Save the pipeline status. If this fails, the system should retry the current step.
            await this.UpdatePipelineStatusAsync(pipeline, cancellationToken).ConfigureAwait(false);

            await this.CleanUpAfterCompletionAsync(pipeline, cancellationToken).ConfigureAwait(false);
        }

It says that when the pipeline is complete, it will do a cleanup, which is exactly what I'd want and expect. However, as said before, that method only runs when the DeleteDocument or DeleteIndex handlers are in the pipeline:

if (pipeline.IsDocumentDeletionPipeline())
// stuff

if (pipeline.IsIndexDeletionPipeline())
// stuff

Is there any way to really clean up those temp files?

Thanks!

Pipeline unable to process large documents

Observation

Loading documents greater than a certain size leaves them stuck in the pipeline, resulting in the following pipeline failure.

Microsoft.SemanticMemory.Pipeline.Queue.AzureQueues.AzureQueue[0]
      Message '598285c6-5d10-40f2-b740-64f763091f45' processing failed with exception, putting message back in the queue
      Azure.RequestFailedException: The request body is too large and exceeds the maximum permissible limit.
RequestId:22cde34c-2003-0047-38e2-e4b1e5000000
Time:2023-09-11T19:05:33.9016314Z
      Status: 413 (The request body is too large and exceeds the maximum permissible limit.)
      ErrorCode: RequestBodyTooLarge

      Additional Information:
      MaxLimit: 65536

      Content:
      ?<?xml version="1.0" encoding="utf-8"?><Error><Code>RequestBodyTooLarge</Code><Message>The request body is too large and exceeds the maximum permissible limit.
RequestId:22cde34c-2003-0047-38e2-e4b1e5000000
Time:2023-09-11T19:05:33.9016314Z</Message><MaxLimit>65536</MaxLimit></Error>

      Headers:
      Server: Windows-Azure-Queue/1.0 Microsoft-HTTPAPI/2.0
      x-ms-request-id: 22cde34c-2003-0047-38e2-e4b1e5000000
      x-ms-version: 2018-11-09
      x-ms-error-code: RequestBodyTooLarge
      Date: Mon, 11 Sep 2023 19:05:33 GMT
      Content-Length: 286
      Content-Type: application/xml

         at Azure.Storage.Queues.MessagesRestClient.EnqueueAsync(QueueMessage queueMessage, Nullable`1 visibilitytimeout, Nullable`1 messageTimeToLive, Nullable`1 timeout, CancellationToken cancellationToken)
         at Azure.Storage.Queues.QueueClient.SendMessageInternal(BinaryData message, Nullable`1 visibilityTimeout, Nullable`1 timeToLive, Boolean async, CancellationToken cancellationToken, String operationName)
         at Azure.Storage.Queues.QueueClient.SendMessageAsync(String messageText, Nullable`1 visibilityTimeout, Nullable`1 timeToLive, CancellationToken cancellationToken)
         at Azure.Storage.Queues.QueueClient.SendMessageAsync(String messageText, CancellationToken cancellationToken)
         at Microsoft.SemanticMemory.Pipeline.Queue.AzureQueues.AzureQueue.EnqueueAsync(String message, CancellationToken cancellationToken) in C:\Users\crickman\source\repos\semantic-memory-sync\dotnet\CoreLib\Pipeline\Queue\AzureQueues\AzureQueue.cs:line 184
         at Microsoft.SemanticMemory.Pipeline.DistributedPipelineOrchestrator.MoveForwardAsync(DataPipeline pipeline, CancellationToken cancellationToken) in C:\Users\crickman\source\repos\semantic-memory-sync\dotnet\CoreLib\Pipeline\DistributedPipelineOrchestrator.cs:line 188
         at Microsoft.SemanticMemory.Pipeline.DistributedPipelineOrchestrator.RunPipelineStepAsync(DataPipeline pipeline, IPipelineStepHandler handler, CancellationToken cancellationToken) in C:\Users\crickman\source\repos\semantic-memory-sync\dotnet\CoreLib\Pipeline\DistributedPipelineOrchestrator.cs:line 156
         at Microsoft.SemanticMemory.Pipeline.DistributedPipelineOrchestrator.<>c__DisplayClass3_0.<<AddHandlerAsync>b__0>d.MoveNext() in C:\Users\crickman\source\repos\semantic-memory-sync\dotnet\CoreLib\Pipeline\DistributedPipelineOrchestrator.cs:line 80

Analysis

The queue message is the entire pipeline state, which includes a list of document partitions. This list can only grow so large before it exceeds the maximum allowed queue payload size. (attached doc and pipeline.json)

A scalable design might be to manage state outside of the queue and limit the queue message to a state reference.
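
A claim-check sketch of that idea: enqueue only a small reference and rehydrate the full pipeline state from content storage on dequeue. The storage read call is hypothetical:

using System.Text.Json;

// Enqueue a fixed-size reference instead of the whole pipeline state
var reference = JsonSerializer.Serialize(new { pipeline.Index, pipeline.DocumentId });
await queue.EnqueueAsync(reference, cancellationToken);

// Consumer side (hypothetical storage API): load the full state by reference
// var state = await storage.ReadPipelineStatusAsync(index, documentId, cancellationToken);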

Source Document

WebScraper should send User-Agent header

When calling the ImportWebPageAsync method, the WebScraper class makes the request with no User-Agent header. That header should be included, as per the HTTP RFC (https://www.rfc-editor.org/rfc/rfc2616?ref=blog.elmah.io#section-14.43). Besides, most servers reject the request if that header is not included (some of them even reject a request if the user-agent contains the word "curl/xxxx"... to avoid curl requests XD).

I've seen in the source code that there are plans to improve the HttpClient in the WebScraper class (I guess the idea is to inject it using an HttpClientFactory), but in the meantime, I'm going to do a pull request just adding the header with the value defined in the Telemetry class. Something like:

// TODO: perf/TCP ports/reuse client
using var client = new HttpClient();
client.DefaultRequestHeaders.UserAgent.ParseAdd(Telemetry.HttpUserAgent);
HttpResponseMessage? response = await RetryLogic()
    .ExecuteAsync(async cancellationToken => await client.GetAsync(url, cancellationToken).ConfigureAwait(false))
    .ConfigureAwait(false);

This should fix the issue with many web servers rejecting the request and preventing the web page from being imported.

Cheers!

Deleting documents requires knowing document IDs

Currently, deleting documents or memories requires knowing the document IDs. A lot of the time, apps don't track document IDs. Instead, they may track something more relevant to the app; for example, a chat app may track documents or memories by chat session IDs, which can be a tag in the Semantic Memory land.

Unfortunately, results returned by the search/ and ask/ endpoints do not contain the document IDs. For apps to delete documents or memories, these endpoints need to return the document IDs.

Proposed fix: add the document ID as a property in the Citation object.
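
With that property in place, an app could delete memories found via search without tracking IDs itself. A sketch; the DocumentId property is the proposed addition, and the exact DeleteDocumentAsync signature is an assumption:

// Find memories by a tag the app does track, then delete via the proposed DocumentId
var results = await memory.SearchAsync("", filter: new MemoryFilter().ByTag("chatSessionId", "session-42"));
foreach (var citation in results.Results)
{
    await memory.DeleteDocumentAsync(citation.DocumentId); // proposed property
}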

Getting Azure.RequestFailedException due to content filtering

Trying to set up a very simple serverless example and receiving the following error:

Azure.RequestFailedException: 'The response was filtered due to the prompt triggering Azure OpenAI’s content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation:
https://go.microsoft.com/fwlink/?linkid=2198766

The code:

var embeddingConfig = new AzureOpenAIConfig
{
    APIType = AzureOpenAIConfig.APITypes.EmbeddingGeneration,
    Auth = AzureOpenAIConfig.AuthTypes.APIKey,
    APIKey = _openaikey,
    Endpoint = _openaiurl,
    Deployment = "text-embedding-ada-002",
};

var textCompletionConfig = new AzureOpenAIConfig
{
    APIType = AzureOpenAIConfig.APITypes.ChatCompletion, // should I use TextCompletion?
    Auth = AzureOpenAIConfig.AuthTypes.APIKey,
    APIKey = _openaikey,
    Endpoint = _openaiurl,
    Deployment = "gpt-35-turbo-16k", // should I use "gpt-35-turbo-instruct" with TextCompletion?
};

var memory = new MemoryClientBuilder()
    .WithAzureOpenAIEmbeddingGeneration(embeddingConfig)
    .WithAzureOpenAITextCompletion(textCompletionConfig)
    .BuildServerlessClient();
await memory.ImportTextAsync("This software is called Visual Studio.");
await memory.ImportTextAsync("My name is Roger.");
var answer = await memory.AskAsync("What's the name of this software?");
Console.WriteLine(answer.Result);

Not clear if I'm doing something wrong with the configuration or if there's a library issue. Would appreciate any help. Thanks!
