microsoft / sammo

A library for prompt engineering and optimization (SAMMO = Structure-aware Multi-Objective Metaprompt Optimization)

License: MIT License

Languages: Python 100.0%
Topics: llms, prompt-engineering, prompt-tuning

sammo's Introduction


A flexible, easy-to-use library for running and optimizing prompts for Large Language Models (LLMs).

Overview

How to Get Started

Go to the user guide for examples, how-tos, and API reference.

Just want to have a quick look? Try the live demo on Binder.

Install library only

pip install sammo

Install and run tutorials

Prerequisites

  • Python 3.11+

The following commands install sammo and jupyter, then launch Jupyter Notebook in the tutorials directory. It's recommended that you create and activate a virtualenv before installing packages, as shown in the first two commands.

# create and activate a virtualenv (recommended)
python -m venv .venv
source .venv/bin/activate

pip install sammo jupyter

# clone sammo to a local directory
git clone https://github.com/microsoft/sammo.git
cd sammo

# launch jupyter notebook and open tutorials directory
jupyter notebook --notebook-dir docs/tutorials

Use Cases

SAMMO is designed to support

  • Efficient data labeling: Supports minibatching by packing and parsing multiple datapoints into a single prompt (see the sketch at the end of this section).
  • Prompt prototyping and engineering: Re-usable components and prompt structures to quickly build and test new prompts.
  • Instruction optimization: Optimize instructions to do better on a given task.
  • Prompt compression: Compress prompts while maintaining performance.
  • Large-scale prompt execution: Parallelization and rate-limiting out of the box, so you can run many queries in parallel and at scale without overwhelming the LLM API.

It is less useful if you want to build

  • Interactive, agent-based LLM applications (→ check out AutoGen)
  • Interactive, production-ready LLM applications (→ check out LangChain)
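
As a sketch of the minibatching idea, here is a minimal labeling pipeline. The component names are taken from the issue code later on this page; exact signatures may differ across SAMMO versions, and the labels and minibatch size are placeholders.

# Hedged sketch: pack several datapoints into one prompt via minibatching.
from sammo.components import Output
from sammo.dataformatters import PlainFormatter
from sammo.instructions import InputData, MetaPrompt, Paragraph

formatter = PlainFormatter(all_labels=["yes", "no"], orient="item")
labeler = MetaPrompt(
    [
        Paragraph("Instructions: label each input as yes or no.\n"),
        Paragraph(InputData()),
        Paragraph("Output: "),
    ],
    render_as="raw",
    data_formatter=formatter,
)
# minibatch_size=5 packs five datapoints into a single LLM call; the data
# formatter parses the individual answers back out of the response.
labeling_job = Output(labeler.with_extractor("raise"), minibatch_size=5, on_error="empty_result")
# results = labeling_job.run(runner, my_datatable)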

Example

This example extends the chat dialog example from Guidance by running the queries in parallel.

# Imports follow the module layout used elsewhere on this page; API_CONFIG
# holds your credentials, e.g. {"api_key": "YOUR_KEY"}.
from sammo.base import Template
from sammo.components import GenerateText, Output
from sammo.runners import OpenAIChat

runner = OpenAIChat(model_id="gpt-3.5-turbo", api_config=API_CONFIG)
expert_names = GenerateText(
    Template(
        "I want a response to the following question:\n"
        "{{input}}\n"
        "Name 3 world-class experts (past or present) who would be great at answering this? Don't answer the question yet."
    ),
    system_prompt="You are a helpful and terse assistant.",
    randomness=0,
    max_tokens=300,
)

joint_answer = GenerateText(
    "Great, now please answer the question as if these experts had collaborated in writing a joint anonymous answer.",
    history=expert_names,
    randomness=0,
    max_tokens=500,
)

questions = [
    "How can I be more productive?",
    "What will AI look like in 10 years?",
    "How do we end world hunger?",
]
print(Output(joint_answer).run(runner, questions))
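
Output wraps the final step of the pipeline; calling .run(runner, questions) executes it once per input question, with the parallelization and rate limiting described under Use Cases handled by the runner.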

License

This project is licensed under MIT.

Authors

SAMMO was written by Tobias Schnabel.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

sammo's People

Contributors

pbourke, sammym1982, t-schn


sammo's Issues

Change from 4-number versions to 3-number versions

Currently our version numbers look like "0.1.0.6". This proposal is to make the next version "0.1.1". Three-number versions are conventional for Python and are a bit easier to work with in some tooling, such as poetry version (not a major issue).

I think for a project of this size, we'll be okay to just use semantic versioning with conventional major, minor and patch releases.

Add guidelines/workflow for contributors

Request to add contributor guidelines/workflow somewhere in the README or CONTRIBUTING.md: for example, how to replicate the environment, and which pre-commit hooks and formatting/linting tools (e.g., black/flake8) are expected for a PR to be accepted.

How do I modify the number of concurrent requests to the AI (perhaps the number of threads)?

Question

I want to modify the number of concurrent requests to the AI. What should I do?

Background

I want to connect to the Zhipu glm-3-turbo model. When I execute the following script, it reports that the API concurrency is too high. Zhipu limits my account's concurrency to 5, so I want to lower the number of concurrent requests in this project (perhaps the number of threads) to stay within that limit.

Phenomenon

09:01:20,43: sammo.runners.RetriableError: Server error: 429 {'error': {'code': '1302', 'message': '您当前使用该API的并发数过高,请降低并发,或联系客服增加限额。'}}
09:01:20,120: sammo.runners.RetriableError: Server error: 429 {'error': {'code': '1302', 'message': '您当前使用该API的并发数过高,请降低并发,或联系客服增加限额。'}}
(The same 429 response repeats several more times. The Chinese message translates to: "Your current concurrency on this API is too high; please reduce concurrency or contact customer support to increase the quota.")

Code

from sammo.search import BeamSearch

from sammo.mutators import BagOfMutators, InduceInstructions, Paraphrase
import pathlib
import sammo
from sammo.runners import OpenAIChat
from sammo.base import LLMResult, Costs
from sammo.base import Template, EvaluationScore
from sammo.components import Output, GenerateText, ForEach, Union
from sammo.extractors import ExtractRegex
from sammo.data import DataTable
import json
import requests
import os
from sammo.utils import serialize_json

from sammo.instructions import MetaPrompt, Section, Paragraph, InputData
from sammo.dataformatters import PlainFormatter
from sammo.search_op import one_of
from sammo import PROMPT_LOGGER_NAME
import logging

prompt_logger = logging.getLogger(PROMPT_LOGGER_NAME)


class ZhiPuAIChat(OpenAIChat):
    # full endpoint: https://open.bigmodel.cn/api/paas/v4/chat/completions
    BASE_URL = "https://open.bigmodel.cn/api/paas/v4"
    SUFFIX = "/chat/completions"

    async def generate_text(
        self,
        prompt: str,
        max_tokens: int | None = None,
        randomness: float | None = 0.01,
        seed: int = 0,
        priority: int = 0,
        system_prompt: str | None = None,
        history: list[dict] | None = None,
        json_mode: bool = False,
    ) -> LLMResult:
        """Calls the chat endpoint of the OAI model.

        Args:
            prompt: The user prompt.
            max_tokens: The maximum number of tokens to generate. If not set, corresponds to maximum
            available tokens.
            randomness: The randomness to use when generating tokens.
            seed: When using randomness, use this seed for local reproducibility (achieved by caching).
            priority: The priority of the request (used for throttling).

        Returns:
            Dictionary with keys "data" (the generated text), "cost" (the number of tokens used),
            and "retries" (the number of retries).
        """
        messages = []
        if system_prompt is not None:
            messages = [{"role": "system", "content": system_prompt}]
            if history:
                history = [x for x in history if x["role"] != "system"]
        if history is not None:
            messages = messages + history

        # check for images in prompt
        revised_prompt = self._post_process_prompt(prompt)
        messages += [{"role": "user", "content": revised_prompt}]

        # use the caller-supplied randomness rather than a hardcoded temperature
        request = dict(messages=messages, max_tokens=self._max_context_window or max_tokens, temperature=randomness)
        if json_mode:
            request["response_format"] = {"type": "json_object"}
        fingerprint = serialize_json({"seed": seed, "generative_model_id": self._equivalence_class, **request})

        return await self._execute_request(request, fingerprint, priority)

    def _to_llm_result(self, request: dict, json_data: dict, fingerprint: str | bytes) -> LLMResult:
        request_text = request["messages"][-1]["content"]
        prompt_logger.debug(f"\n\n\nAPI call:\n{request_text}\n->\n\n{json_data['choices'][0]['message']['content']}")
        return LLMResult(
            json_data["choices"][0]["message"]["content"],
            history=request["messages"] + [json_data["choices"][0]["message"]],
            costs=self._extract_costs(json_data),
            request_text=request["messages"][-1]["content"],
        )

    def _post_process_prompt(self, prompt: str):
        return prompt

    @staticmethod
    def _extract_costs(json_data: dict) -> dict:
        return Costs(
            input_costs=json_data["usage"].get("prompt_tokens", 0),
            output_costs=json_data["usage"].get("completion_tokens", 0),
        )

class InitialCandidates:
    def __init__(self, dtrain):
        self.dtrain = dtrain

    def __call__(self):
        example_formatter = PlainFormatter(
            all_labels=self.dtrain.outputs.unique(), orient="item"
        )

        labels = self.dtrain.outputs.unique()
        instructions = MetaPrompt(
            [
                Paragraph("Instructions: "),
                Paragraph(
                    one_of(
                        [
                            self.dtrain.constants["instructions"],
                            "",
                            "Find the best output label given the input.",
                            self.dtrain.constants["instructions"] * 2,
                        ]
                    ),
                    id="instructions",
                ),
                Paragraph("\n"),
                Paragraph(
                    f"Output labels: {', '.join(labels)}\n" if len(labels) <= 10 else ""
                ),
                Paragraph(InputData()),
                Paragraph("Output: "),
            ],
            render_as="raw",
            data_formatter=example_formatter,
        )

        return Output(
            instructions.with_extractor("raise"),
            minibatch_size=1,
            on_error="empty_result",
        )

API_CONFIG_FILE = pathlib.Path().cwd() / "config" / "personal.openai"
API_CONFIG = ""
if API_CONFIG_FILE.exists():
    API_CONFIG = API_CONFIG_FILE
if not API_CONFIG:
    raise ValueError('Please set API_CONFIG to {"api_key": "YOUR_KEY"}')

_ = sammo.setup_logger("WARNING")  # we're only interested in warnings for now

runner = ZhiPuAIChat(
    model_id="glm-3-turbo",
    api_config=API_CONFIG,
    cache=os.getenv("CACHE_FILE", "cache.tsv"),
    timeout=30
)

# %load -s load_data,accuracy _init.py
def load_data(
    url="https://raw.githubusercontent.com/SinMu-L/BIG-bench/main/bigbench/benchmark_tasks/implicatures/task.json",
):
    task = json.loads(requests.get(url).content)

    # convert label to single string
    # for x in task["examples"]:
    #     x["output"] = max(x["target_scores"], key=x["target_scores"].get)

    return DataTable.from_records(
        task["examples"],
        input_fields="input",
        constants={"instructions": task["task_prefix"]},
    )


def accuracy(y_true: DataTable, y_pred: DataTable) -> EvaluationScore:
    y_true = y_true.outputs.normalized_values()
    y_pred = y_pred.outputs.normalized_values()
    n_correct = sum(y_p == y_t for y_p, y_t in zip(y_pred, y_true))

    return EvaluationScore(n_correct / len(y_true))

mydata = load_data()
d_train = mydata.sample(8, seed=42)


mutation_operators = BagOfMutators(
    InitialCandidates(d_train),
    InduceInstructions({"id": "instructions"}, d_train),
    Paraphrase({"id": "instructions"}),
    sample_for_init_candidates=False,
)

prompt_optimizer = BeamSearch(
    runner,
    mutation_operators,
    accuracy,
    maximize=True,
    depth=3,
    mutations_per_beam=2,
    n_initial_candidates=4,
    beam_width=4,
    add_previous=True,
)
prompt_optimizer.fit(d_train)
prompt_optimizer.show_report()
print(prompt_optimizer.best_prompt)

PS: I hope everything goes well.
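
One possible direction (a sketch, not a verified fix): SAMMO advertises built-in rate limiting, so the runner may accept a rate_limit argument that caps concurrency. The AtMost constraint below is an assumption based on sammo.throttler; check the installed version's docs or source for the exact spelling.

# Hypothetical sketch: cap in-flight requests at 5 to match Zhipu's account limit.
# `rate_limit` and `AtMost` are assumptions to verify against your SAMMO version.
from sammo.throttler import AtMost

runner = ZhiPuAIChat(
    model_id="glm-3-turbo",
    api_config=API_CONFIG,
    cache=os.getenv("CACHE_FILE", "cache.tsv"),
    timeout=30,
    rate_limit=[AtMost(5, "running")],  # at most 5 concurrent requests
)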

Weird import errors.

I've tried to install sammo on my M1 MacBook using
poetry install
and using
pip install sammo jupyter
but when I try to run any of the notebooks I get

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[4], line 3
      1 # %load -r :27 _init.py
      2 import pathlib
----> 3 import sammo
      4 from sammo.runners import OpenAIChat
      5 from sammo.base import Template, EvaluationScore

File ~/PycharmProjects/github/sammo/sammo/__init__.py:4
      1 # Copyright (c) Microsoft Corporation.
      2 # Licensed under the MIT License.
      3 import logging
----> 4 import beartype
      5 import sammo.utils as utils
      6 from pathlib import Path

even though:

(sammo-py3.11) datascience@headsmac sammo % pip show beartype
Name: beartype
Version: 0.15.0
Summary: Unbearably fast runtime type checking in pure Python.
Home-page: https://beartype.readthedocs.io
Author: Cecil Curry, et al.
Author-email: [email protected]
License: MIT
Location: /Users/datascience/Library/Caches/pypoetry/virtualenvs/sammo-I0aNzNSL-py3.11/lib/python3.11/site-packages
Requires: 
Required-by: sammo
(sammo-py3.11) datascience@headsmac sammo % 
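
A generic first check (plain Python, not SAMMO-specific): confirm the notebook kernel is running inside the same poetry virtualenv that pip show beartype inspected.

# Run this in the failing notebook; if the path is not inside the
# sammo-...-py3.11 virtualenv, the kernel is using a different interpreter.
import sys
print(sys.executable)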

Enable build pipeline and require PRs for merge to main

Enable build pipeline via GitHub Actions to automate quality checks and artifact assembly:

  • add branch protection rule for main requiring PR to merge
  • run pre-commit hooks
  • run pytest tests
  • run mypy type checks
  • fail build if any of above steps fail
  • build (but do not publish) documentation
  • build (but do not publish) PyPI package

Require a PR to merge to main (disable direct push), and require a green build pipeline before merge is allowed.

Subsequent issues will cover the release process wherein a new version is created and the code and documentation artifacts are published (to PyPI, GH Pages and readthedocs). Additional checks to consider in the future include test and documentation coverage reports.

Right now the SAMMO docs site hosted on GH Pages is built from a separate gh-pages branch. This issue moves towards building docs and code from the same branch (i.e., moving from the "publishing from a branch" style to "publishing with a custom GitHub Actions workflow").
