SemanticAssertions - Testing Library for Large Language Models (LLMs)

SemanticAssert is a testing library designed to address the unique challenges that arise when working with Large Language Models (LLMs) like OpenAI's GPT-3. When developing applications and systems that utilize LLMs, one of the primary hurdles is dealing with the non-deterministic nature of the model's responses. LLMs can generate different responses for the same input, making it difficult to predict their exact output. This leads to a fundamental question: How can we effectively test LLM-based applications?

Imagine you are developing a Question Answering (QA) system. The model receives user questions and is expected to answer them by searching for relevant information in documents, such as PDF user manuals for washing machines. The process involves generating embeddings from these documents, finding the extract that best answers the user's query, and using the LLM to craft a natural language response from that extract.

The challenge arises when you try to write meaningful tests for the final step of this process, or for the entire QA cycle. Because LLM responses are not deterministic, the same input may yield responses that differ in verbosity and phrasing even when the core content is the same. You nevertheless have a clear idea of what content the response should contain. SemanticAssert addresses this by letting you define and validate the core content of LLM-generated responses regardless of their specific wording, which gives you a structured way to check that your system keeps providing accurate and relevant information to users.

How Does SemanticAssert Work?

The concept behind SemanticAssert is straightforward. SemanticAssert leverages a Large Language Model (LLM) to determine the correctness of responses. Let's take a look at a simple example of a test case:

string expected = "Mount Teide has 3718 meters";
string actual = "Mount Teide, located on the island of Tenerife in Spain, has an elevation of approximately 3718 meters above sea level. It is the highest peak in Spain and one of the tallest volcanoes in the world when measured from its base on the ocean floor.";

await Async.Assert.AreSimilar(expected, actual);

In this snippet, Async.Assert.AreSimilar compares the 'expected' and 'actual' responses. It checks that the core content of 'actual' matches what is expected while allowing for variations in wording, which is especially useful with non-deterministic LLM responses.

Additional Asserts for Your Daily Use

In addition to the 'AreSimilar' Assert, SemanticAssert provides several other Asserts that can simplify your daily testing routines. Here are a couple of examples:

Asserting Similarity with a Threshold

You can use the 'AreSimilar' Assert with a similarity threshold, which specifies the minimum similarity value you consider valid between the 'expected' and 'actual' responses. This is especially useful when you want to allow for some variation in the responses generated by the Large Language Model (LLM). Here's an example:

string expected = "Mount Teide has 3718 meters";
string actual = "Mount Teide, located on the island of Tenerife in Spain, has an elevation of approximately 3718 meters above sea level. It is the highest peak in Spain and one of the tallest volcanoes in the world when measured from its base on the ocean floor.";

await Async.Assert.AreSimilar(expected, actual, similarityThreshold: 0.8);

Here, 'similarityThreshold: 0.8' sets the minimum similarity between the 'expected' and 'actual' responses that the Assert accepts.

Language Consistency Assertion

SemanticAssert also provides an Assert for verifying that your texts are in the same language. This can be crucial when working with multilingual content or LLMs that may produce responses in various languages. Ensuring language consistency in your responses is essential for providing accurate information to users.

string expected = "This is a text in English";
string actual = "This is another text that should not raise an exception because it's in the same language";

await Async.Assert.AreInSameLanguage(expected, actual);

In this snippet, the 'AreInSameLanguage' Assert confirms that 'expected' and 'actual' are in the same language. As the sample text indicates, no exception is raised when the languages match.

Configuration

To use SemanticAssert, you need to configure it to work with a Large Language Model (LLM). Currently, SemanticAssert is compatible with Azure OpenAI. Configuration is straightforward and can be done as follows:

Azure Text Completion Configuration

Configuration.Completion.AddAzureTextCompletion(
    "<Your_AzureOpenAI_ChatDeploymentName>",
    "<Your_AzureOpenAI_Endpoint>",
    "<Your_AzureOpenAI_ApiKey>"
);

Azure Text Embedding Generation Configuration

Configuration.Embeddings.AddAzureTextEmbeddingGeneration(
    "<Your_AzureOpenAI_TextEmbeddingsDeploymentName>",
    "<Your_AzureOpenAI_Endpoint>",
    "<Your_AzureOpenAI_ApiKey>"
);

By using these configurations, you can set up SemanticAssert to work seamlessly with your Azure OpenAI deployment.
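The placeholders above suggest passing the endpoint and API key as string literals. To keep secrets out of test code, one option is to read them from environment variables first. The following is a minimal sketch under stated assumptions: the AzureOpenAISettings type and the environment variable names are hypothetical and are not part of SemanticAssert.

```csharp
using System;

// Hypothetical settings holder; not part of SemanticAssert.
public sealed record AzureOpenAISettings(string DeploymentName, string Endpoint, string ApiKey)
{
    // Reads the three values from environment variables.
    // The endpoint and key variable names are assumptions chosen for this example.
    public static AzureOpenAISettings FromEnvironment(string deploymentVariable)
    {
        return new AzureOpenAISettings(
            Require(deploymentVariable),
            Require("AZURE_OPENAI_ENDPOINT"),
            Require("AZURE_OPENAI_API_KEY"));
    }

    // Fails fast with a clear message when a variable is missing or blank.
    private static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
            throw new InvalidOperationException($"Environment variable '{name}' is not set.");
        return value;
    }
}
```

The resulting DeploymentName, Endpoint, and ApiKey values could then be passed, in that order, to the AddAzureTextCompletion and AddAzureTextEmbeddingGeneration calls shown above in place of the placeholder strings.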

If you wish to customize the default configurations of SemanticAssert, you can do so through some static methods:

Configuration.AssertProvider.AddAssertProvider(new SKCosineAssertProvider());

The above snippet registers SKCosineAssertProvider, so that similarity Asserts with a threshold generate and compare embeddings. This offers an alternative to using a 'Prompt' for comparisons.
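How SKCosineAssertProvider computes similarity internally is not documented here. As a rough illustration of the general idea of comparing two embeddings against a threshold, the sketch below computes the cosine similarity of two vectors and checks it against a minimum value. The class and method names are hypothetical and not part of SemanticAssert, and embeddings are assumed to be plain float arrays of equal length.

```csharp
using System;

// Hypothetical helper for illustration only; not part of SemanticAssert.
public static class CosineSimilaritySketch
{
    // Cosine similarity: dot(a, b) / (|a| * |b|).
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        if (normA == 0 || normB == 0)
            throw new ArgumentException("Vectors must be non-zero.");

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Mirrors the idea of 'similarityThreshold': true when the similarity
    // is at least the given minimum.
    public static bool IsSimilar(float[] expected, float[] actual, double similarityThreshold)
        => CosineSimilarity(expected, actual) >= similarityThreshold;
}
```

With this definition, vectors pointing in the same direction score 1 and orthogonal vectors score 0, so a higher threshold demands closer agreement between the two embeddings.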

Important

Please note that this documentation is a work in progress, and more details will be added in the future to help you make the most of SemanticAssert.

Contributors

lmarcos000, luism000
