oughtinc / ice

Interactive Composition Explorer: a debugger for compositional language model programs

Home Page: https://ice.ought.org

License: MIT License

Shell 0.07% Python 91.21% TypeScript 8.30% JavaScript 0.18% CSS 0.21% HTML 0.04%
debugging gpt-3 python language-model

ice's Introduction

Interactive Composition Explorer 🧊

ICE is a Python library and trace visualizer for language model programs.

Screenshot

Execution trace visualized in ICE

Features

  • Run language model recipes in different modes: humans, human+LM, LM
  • Inspect the execution traces in your browser for debugging
  • Define and use new language model agents, e.g. chain-of-thought agents
  • Run recipes quickly by parallelizing language model calls
  • Reuse component recipes such as question-answering, ranking, and verification

ICE is pre-1.0

⚠️ The ICE interface is under active development and the API may change at any point, including removing functionality, renaming methods, splitting ICE into multiple projects, and other similarly disruptive changes. Use at your own risk.

Requirements

ICE requires Python 3.9, 3.10, or 3.11. If you don't have a supported version of Python installed, we recommend using pyenv to install a supported Python version and manage multiple Python versions.

If you use Windows, you'll need to run ICE inside of WSL.

Getting started

  1. As part of general good Python practice, consider first creating and activating a virtual environment to avoid installing ICE 'globally'. For example:

    python -m venv venv
    source venv/bin/activate
  2. Install ICE:

    pip install ought-ice
  3. Run the Hello World recipe in the Primer to see the trace rendered. (A minimal version of that recipe is sketched after this list.)

  4. Optionally, set secrets (like your OpenAI API key) in ~/.ought-ice/.env. See .env.example for the format. If these are not set, you'll be prompted for them when you run recipes that need them.
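
For reference, the Primer's Hello World recipe is only a few lines (the same snippet appears verbatim in the "Unclear how to use tracing within applications" issue further down). Save it as hello.py and run python hello.py; ICE should print a trace link to open in your browser:

    from ice.recipe import recipe

    async def say_hello():
        return "Hello world!"

    recipe.main(say_hello)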

Developing ICE

  1. If you want to make changes to ICE itself, clone the repository, then install it in editable mode:

    python -m venv venv
    source venv/bin/activate
    pip install --upgrade pip
    pip install -e '.[dev]' --config-settings editable_mode=compat
    pre-commit install
    npm --prefix ui ci
    npm --prefix ui run dev
  2. If you're working on the backend, you might find it helpful to remove the cache of language model calls:

    rm -r ~/.ought-ice/cache
  3. pre-commit complains if your code doesn't pass certain checks. It runs when you commit and may reject the commit, in which case you'll need to fix the problem(s) and commit again (you can reuse the same commit message).

Note that you don't technically need to run pre-commit install, but skipping it means the same checks will run (and possibly fail) in CI instead, which can be noisy, e.g. by generating automated commits that fix formatting.

Storybook

We use Storybook for UI tests. You can run them locally:

npm --prefix ui run storybook

Note that build-storybook is only for CI and shouldn't be run locally.

Terminology

  • Recipes are decompositions of a task into subtasks.

    The meaning of a recipe is: If a human executed these steps and did a good job at each workspace in isolation, the overall answer would be good. This decomposition may be informed by what we think ML can do at this point, but the recipe itself (as an abstraction) doesn’t know about specific agents.

  • Agents perform atomic subtasks of predefined shapes, like completion, scoring, or classification.

    Agents don't know which recipe is calling them. Agents don’t maintain state between subtasks. Agents generally try to complete all subtasks they're asked to complete (however badly), but some will not have implementations for certain task types.

  • The mode in which a recipe runs is a global setting that can affect every agent call, for instance whether to use humans or automated agents. Recipes can also run with RecipeSettings that map a task type to a specific agent_name, overriding which agent is used for that type of task. (A short sketch of how these pieces fit together follows below.)
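
To make these terms concrete, here is a hedged sketch of a recipe that delegates a single completion subtask to whichever agent the current mode provides. The recipe.agent() and agent.complete(prompt=...) calls follow the ICE Primer's examples; treat the exact signatures as assumptions and check the Primer if they don't match your installed version.

    # Hedged sketch: a recipe delegating one atomic subtask to an agent.
    # Which agent recipe.agent() returns depends on the mode (human, human+LM, LM)
    # and on any RecipeSettings overrides for this task type.
    from ice.recipe import recipe

    async def classify_sentiment(text: str = "ICE makes debugging fun"):
        agent = recipe.agent()
        prompt = f"Is the sentiment of the following text positive or negative?\n\n{text}\n\nAnswer:"
        return await agent.complete(prompt=prompt)

    recipe.main(classify_sentiment)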

Additional resources

  1. Join the ICE Slack channel to collaborate with other people composing language model tasks. You can also use it to ask questions about using ICE.

  2. Watch the recording of Ought's Lab Meeting to understand the high-level goals for ICE, how it interacts with Ought's other work, and how it contributes to alignment research.

  3. Read the ICE announcement post for another introduction.

Contributions

ICE is an open-source project by Ought. We're an applied ML lab building the AI research assistant Elicit.

We welcome community contributions:

  • If you're a developer, you can dive into the codebase and help us fix bugs, improve code quality and performance, or add new features.
  • If you're a language model researcher, you can help us add new agents or improve existing ones, and refine or create new recipes and recipe components.

For larger contributions, make an issue for discussion before submitting a PR.

And for even larger contributions, join us - we're hiring!

How to cite

If you use ICE, please cite:

Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes. Justin Reppert, Ben Rachbach, Charlie George, Luke Stebbing, Jungwon Byun, Maggie Appleton, Andreas Stuhlmüller (2023). Ought Technical Report. arXiv:2301.01751 [cs.CL]

Bibtex:

@article{reppert2023iterated,
  author = {Justin Reppert and Ben Rachbach and Charlie George and Luke Stebbing and Jungwon Byun and Maggie Appleton and Andreas Stuhlm\"{u}ller},
  archivePrefix = {arXiv},
  eprint = {2301.01751},
  primaryClass = {cs.CL},
  title = {Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes},
  year = 2023,
  keywords = {language models, decomposition, workflow, debugging},
  url = {https://arxiv.org/abs/2301.01751}
}

ice's People

Contributors

alexmojaki, amosjyng, brachbach, cdreetz, cg80499, dependabot[bot], eltociear, goodgravy, jgzuke, jungofthewon, kruckenberg, lslunis, peterroelants, phylomatx, poppingtonic, pre-commit-ci[bot], reppertj, rickycheah, smithjessk, stuhlmueller, thesophiaxu, tommybark


ice's Issues

ICE UI stops working

Hi! I'm really excited about the project and the UI was working for a while but it seems like some state got corrupted somewhere, because all of a sudden I get this error message instead of the UI:

ui/index.html not found. Run `npm run build` in the ui directory to create it, or run `npm run dev` to run the dev server and access localhost:5173 instead.

I'm not building ICE from source so I'm pretty confused as to how this could have happened.

I see that this error has happened to someone else, but it wasn't resolved there.
#154 (comment)

Any help would be really appreciated, thanks!

hello world program not running

Hello.

I followed the instructions from the README and the Primer to run the hello world program and I am facing some issues.

I get the following when I try to run a simple program:

from ice.recipe import recipe
  File "/opt/homebrew/lib/python3.10/site-packages/ice/recipe.py", line 24, in <module>
    from pydantic import BaseSettings
  File "/opt/homebrew/lib/python3.10/site-packages/pydantic/__init__.py", line 210, in __getattr__
    return _getattr_migration(attr_name)
  File "/opt/homebrew/lib/python3.10/site-packages/pydantic/_migration.py", line 289, in wrapper
    raise PydanticImportError(
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.1.1/migration/#basesettings-has-moved-to-pydantic-settings for more details.

Which version of pydantic should I use?

Running test recipes in README

The README suggests in https://github.com/oughtinc/ice#running-tests:

Cheap integration tests:

./scripts/run-recipe.sh --mode test

This had me confused for a while.

  1. This prompts me select a recipe, which is already surprising. I expected it to automatically pick parameters for testing, I don't know which recipe I should select.
  2. The first few choices in the list all require papers, so selecting any of them gives a long traceback ending in TypeError: missing a required argument: 'paper' which looks more like a fundamental bug than incorrect user inputs.
  3. Once I figured out that I needed to specify other arguments to pass papers, it wasn't clear what those arguments should be. The CLI help doesn't offer any example values. Maybe the README could suggest something like:
./scripts/run-recipe.sh --mode test --recipe-name AdherenceKeywordBaseline --input-files ./papers/keenan-2018-tiny.txt
  4. I see this relevant-looking comment:

ice/main.py, lines 151-153 (commit 5ee4161):

    # If user doesn't specify papers via CLI args, we could prompt them
    # but this makes it harder to run recipes that don't take papers as
    # arguments, so we won't do that here.

Possible approaches for dealing with this, which I could do if you want:

  • Prompt for papers if the recipe takes them.
  • Filter the list of recipes to choose from based on whether or not papers are specified via CLI args.
  • Show a more helpful error message without the long traceback.
  5. Running with those arguments gives output ending abruptly in a title saying Final result for keenan-2018-tiny.txt with nothing underneath. Where are these results of which it speaks? Did the test pass? Should I use different arguments?
  6. Running the unit tests with ./scripts/run-tests.sh runs all recipes with and without papers automatically.
    1. Why do the 'unit tests' apparently include 'integration tests'?
    2. Why suggest run-recipe.sh if run-tests.sh takes care of this already?

npm error when running ./scripts/run-local.sh

I recently merged from main, after which the ./scripts/run-local.sh script rebuilds the container image, failing at the npm build step.

Docker version 20.10.20, build 9fdeb9c

 => ERROR [10/11] RUN npm --prefix ui ci                                                    
------                                                                                      
 > [10/11] RUN npm --prefix ui ci:                                                          
#0 39.86 npm ERR! code ERR_SOCKET_TIMEOUT                                                   
#0 39.86 npm ERR! network Socket timeout                                                    
#0 39.86 npm ERR! network This is a problem related to network connectivity.                
#0 39.86 npm ERR! network In most cases you are behind a proxy or have bad network settings.
#0 39.86 npm ERR! network 
#0 39.86 npm ERR! network If you are behind a proxy, please make sure that the
#0 39.86 npm ERR! network 'proxy' config is set properly.  See: 'npm help config'
#0 39.86 
#0 39.86 npm ERR! A complete log of this run can be found in:
#0 39.86 npm ERR!     /root/.npm/_logs/2022-11-20T21_19_50_106Z-debug-0.log
------
failed to solve: executor failed running [/bin/sh -c npm --prefix ui ci]: exit code: 1

Suggestion: optional rich tracebacks

Currently an unhandled exception just shows the standard plain Python traceback. It can be helpful for debugging to show more information, and the dependencies are already there. One method is this:

    import sys

    # get_logger here refers to the structlog-based logger ICE already configures
    log = get_logger()

    def excepthook(*exc_info):
        log.exception("Uncaught exception", exc_info=exc_info)

    sys.excepthook = excepthook

However this doesn't work on its own, and a helpful warning explains why:

/code/.venv/lib/python3.10/site-packages/structlog/dev.py:421: UserWarning: Remove `format_exc_info` from your processor chain if you want pretty exceptions.

Indeed, removing that processor gives nice rich tracebacks:

[Screenshot of the resulting rich traceback]

I'm not familiar with structlog, maybe there's a better official way to do this. I also don't know the motivation for having format_exc_info and the consequences of removing it.

An alternative, more direct approach:

rich.traceback.install(show_locals=True)

This has pretty much the same effect, but it's not logging, and maybe that's a problem.

None of this should be the default behaviour as the tracebacks are huge and annoying when you don't want them, but maybe it could be enabled by a new env var in settings.py or some kind of verbose/debug CLI argument.
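
As a concrete version of that last suggestion, here is a minimal sketch of opt-in rich tracebacks gated behind an environment variable; the variable name ICE_RICH_TRACEBACKS is hypothetical, not an existing ICE setting:

    import os

    import rich.traceback

    # Hypothetical opt-in flag; ICE does not currently define this setting.
    if os.environ.get("ICE_RICH_TRACEBACKS"):
        # Replaces the default excepthook with rich's pretty traceback renderer.
        rich.traceback.install(show_locals=True)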

Error when running Primer: PydanticImportError

While I trying to run the example from the hello world chapter in the Ought Primer I consistently get this error:

Traceback (most recent call last):
  File "/Users/ptmalmgren/src/ice-demo/hello.py", line 3, in <module>
    from ice.recipe import recipe
  File "/Users/ptmalmgren/Library/Caches/pypoetry/virtualenvs/ice-demo-25sS_li7-py3.10/lib/python3.10/site-packages/ice/recipe.py", line 24, in <module>
    from pydantic import BaseSettings
  File "/Users/ptmalmgren/Library/Caches/pypoetry/virtualenvs/ice-demo-25sS_li7-py3.10/lib/python3.10/site-packages/pydantic/__init__.py", line 210, in __getattr__
    return _getattr_migration(attr_name)
  File "/Users/ptmalmgren/Library/Caches/pypoetry/virtualenvs/ice-demo-25sS_li7-py3.10/lib/python3.10/site-packages/pydantic/_migration.py", line 289, in wrapper
    raise PydanticImportError(
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.1.1/migration/#basesettings-has-moved-to-pydantic-settings for more details.

For further information visit https://errors.pydantic.dev/2.1.1/u/import-error

Because setup.cfg has no pinned version for Pydantic, I believe it is defaulting to the latest Pydantic 2.1.1 release which contains backwards-incompatible changes.

ICE server not accessible through jupyter proxy

Hi, I am trying to use the Ought ICE server on a SageMaker notebook, which only allows access to the local server through the Jupyter server proxy. When I go to the proxy path <notebook_path>/proxy/8935/, I get a blank page because the assets are not accessible: they are loaded from <notebook_path>/assets/ instead of <notebook_path>/proxy/8935/assets. I tried manually changing index.html to point at the correct path and wrote a small Flask server to make the assets available, but now I am greeted with another internal server error, which I cannot debug because there are no logs to be found anywhere (I was hoping to find logs under ~/.ought-ice/, but none were there).
[Screenshot of the internal server error]

I was wondering: any chance you could provide proper support for running the server behind the Jupyter proxy, to enable working from inside SageMaker Jupyter notebooks? If that is not on your roadmap, any thoughts on how I can go about debugging this issue to get it working?

Thank you very much for your attention to the issue.

Unclear how to use tracing within applications

I tried running an example recipe with python hello.py - and it works perfectly.

However, I can not find any instructions on how to use tracing functionality not as a standalone demonstration, but inside other applications.

A simplest example would be:

hello.py:

from ice.recipe import recipe

async def say_hello():
    return "Hello world!"

recipe.main(say_hello)

test.py:

from hello import say_hello
import asyncio

if __name__ == '__main__':
    print(asyncio.run(say_hello()))

This does not work when run with python test.py: no traces are recorded.

Could anyone please point me towards a way to achieve this?

New release?

Hey,

I see that ICE seems to be compatible with Python 3.9 and 3.11 now on the master branch. Will there be a new release soon to make this official?

Thanks!

Potential bug in `getaddrinfo` on MacOS

Sometimes openai_complete fails when it tries to use socket.getaddrinfo to look up api.openai.com. The suspicion is that this is a threading bug in the underlying syscall, because adding @functools.cache to the relevant function prevents the bug from happening.

Talked with @lslunis about this. It's probably not worth doing until we see the issue again, and I can't reproduce it locally on my MacBook.
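
For illustration, a minimal sketch of the caching workaround described above; cached_getaddrinfo is a hypothetical stand-in for whatever function in ICE actually performs the lookup:

    # The (possibly thread-unsafe) getaddrinfo call runs at most once per
    # (host, port) pair; cached_getaddrinfo is a made-up name, not ICE code.
    import functools
    import socket

    @functools.cache
    def cached_getaddrinfo(host: str, port: int):
        return socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)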

Skipping slow tests

When I run ./scripts/run-tests.sh, all tests show PASSED except two:

tests/test_metrics.py::test_nubia FAILED
tests/test_metrics.py::test_gold_paragraphs SKIPPED (Parses all PDFs - very slow)
  1. test_nubia fails with an obscure pydantic validation error; I had to add a print to understand the problem. How about adding @mark.skipif(not settings.OUGHT_INFERENCE_API_KEY, reason='...')? (A sketch follows after this list.)
  2. Both tests have @mark.slow but this doesn't do anything by default. I assume this is why test_gold_paragraphs has a hard @mark.skip. This doesn't seem great - what do you do when you actually want to run the test? Should a way to skip slow tests by default (with or without using @mark.slow) be investigated?
  3. -m 'not slow' correctly deselects the slow tests, but only if I directly run docker compose and pytest. ./scripts/run-tests.sh -m 'not slow' doesn't work, and the shell (tested both bash and zsh) shows it running with -m 'not\' slow. Looks like this isn't working:

# https://superuser.com/questions/403263/how-to-pass-bash-script-arguments-to-a-subshell
extra_pytest_args="$(printf "${1+ %q}" "$@")" # Note: this will have a leading space before the first arg
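
A minimal sketch of the skipif suggestion from point 1; the import path for settings is an assumption based on the issue's mention of settings.py and may differ in the actual repo:

    # Hedged sketch of the suggested skip condition.
    from pytest import mark

    from ice.settings import settings  # assumed import path

    @mark.skipif(not settings.OUGHT_INFERENCE_API_KEY, reason="requires OUGHT_INFERENCE_API_KEY")
    def test_nubia():
        ...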

Unable to parse PDFs, "Failed to resolve 'test.elicit.org'"

When trying to run the "Loading paper text" chapter from the Primer, I run into an error indicating that it can't find "test.elicit.org". Since paper.parse_pdf depends on this remote resource to parse the PDF, it can't proceed at all.

Here's a full trace of what I see:

Full trace
python recipes/paper_hello.py --paper papers/keenan-2018.pdf
/home/cass/src/ice/venv/lib/python3.11/site-packages/pydantic/_migration.py:283: UserWarning: `pydantic.generics:GenericModel` has been moved to `pydantic.BaseModel`.
  warnings.warn(f'`{import_path}` has been moved to `{new_location}`.')
/home/cass/src/ice/venv/lib/python3.11/site-packages/pydantic/_internal/_config.py:334: UserWarning: Valid config keys have changed in V2:
* 'keep_untouched' has been renamed to 'ignored_types'
* 'fields' has been removed
  warnings.warn(message, UserWarning)
Traceback (most recent call last):
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/urllib3/connection.py", line 198, in _new_conn
    sock = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/.pyenv/versions/3.11.0/lib/python3.11/socket.py", line 961, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -2] Name or service not known

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 793, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 491, in _make_request
    raise new_e
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1099, in _validate_conn
    conn.connect()
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/urllib3/connection.py", line 616, in connect
    self.sock = sock = self._new_conn()
                       ^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/urllib3/connection.py", line 205, in _new_conn
    raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPSConnection object at 0x751bb09bb4d0>: Failed to resolve 'test.elicit.org' ([Errno -2] Name or service not known)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/requests/adapters.py", line 589, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 847, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='test.elicit.org', port=443): Max retries exceeded with url: /elicit-previews/james/oug-3083-support-parsing-arbitrary-pdfs-using/parse_pdf (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x751bb09bb4d0>: Failed to resolve 'test.elicit.org' ([Errno -2] Name or service not known)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cass/src/ice/recipes/paper_hello.py", line 10, in <module>
    recipe.main(answer_for_paper)
  File "/home/cass/src/ice/ice/recipe.py", line 176, in main
    defopt.run(
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/defopt.py", line 348, in run
    call = bind(
           ^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/defopt.py", line 255, in bind
    call, rest = _bind_or_bind_known(*args, _known=False, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/defopt.py", line 203, in _bind_or_bind_known
    args, rest = parser.parse_args(argv), []
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/.pyenv/versions/3.11.0/lib/python3.11/argparse.py", line 1862, in parse_args
    args, argv = self.parse_known_args(args, namespace)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/.pyenv/versions/3.11.0/lib/python3.11/argparse.py", line 1895, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/.pyenv/versions/3.11.0/lib/python3.11/argparse.py", line 2103, in _parse_known_args
    start_index = consume_optional(start_index)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/.pyenv/versions/3.11.0/lib/python3.11/argparse.py", line 2043, in consume_optional
    take_action(action, args, option_string)
  File "/home/cass/.pyenv/versions/3.11.0/lib/python3.11/argparse.py", line 1955, in take_action
    argument_values = self._get_values(action, argument_strings)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/.pyenv/versions/3.11.0/lib/python3.11/argparse.py", line 2485, in _get_values
    value = self._get_value(action, arg_string)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/.pyenv/versions/3.11.0/lib/python3.11/argparse.py", line 2518, in _get_value
    result = type_func(arg_string)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/ice/recipe.py", line 181, in <lambda>
    Paper: lambda path: Paper.load(Path(path)),
                        ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/ice/paper.py", line 158, in load
    paragraph_dicts = parse_pdf(file)
                      ^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/ice/cache.py", line 28, in sync_wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/ice/paper.py", line 119, in parse_pdf
    r = requests.post(PDF_PARSER_URL, files=files)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/requests/api.py", line 115, in post
    return request("post", url, data=data, json=json, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cass/src/ice/venv/lib/python3.11/site-packages/requests/adapters.py", line 622, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='test.elicit.org', port=443): Max retries exceeded with url: /elicit-previews/james/oug-3083-support-parsing-arbitrary-pdfs-using/parse_pdf (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x751bb09bb4d0>: Failed to resolve 'test.elicit.org' ([Errno -2] Name or service not known)"))

❓ Is there an alternative that folks recommend for PDF parsing here?

Use safer eval

I would suggest using a safer version of eval by default in the Interpreters page (e.g. worry about generated code making web requests).

asteval looks simple to use for mathematical expressions, which is all the tutorial requires.
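
For example, a minimal sketch of evaluating a generated arithmetic expression with asteval instead of the builtin eval (assumes asteval is installed; how it would be wired into the Interpreters chapter is left open):

    # asteval's Interpreter evaluates expressions in a restricted environment,
    # with no access to imports, file I/O, or the network.
    from asteval import Interpreter

    aeval = Interpreter()
    result = aeval("2 * (3 + 4)")  # -> 14
    print(result)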

Trace bug when using Pandas objects in recipes

The trace visualizer doesn't handle Pandas DataFrames and Series well.

Repro:

  1. pull placebo_qa_trace_broken in ice-ought-private
  2. run python ought/eval/eval_paper_qa_vs_gs.py --recipe-to-run ought/placebo/placebo_keyword_baseline.py:placebo_keyword --gs-df /Users/benjaminrachbach/ought/ice/gold_standards/gold_standards.csv --splits validation --question-short-name placebo

The recipe runs, but the trace will error:

... 
  File "/Users/benjaminrachbach/ought/ice/ice/trace.py", line 351, in get_strings
    result = _get_first_descendant(value)
  File "/Users/benjaminrachbach/ought/ice/ice/trace.py", line 374, in _get_first_descendant
    if value:
  File "/Users/benjaminrachbach/ought/ice-ought-private/.venv/lib/python3.10/site-packages/pandas/core/generic.py", line 1527, in __nonzero__
    raise ValueError(
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

This seems to be because _get_first_descendant at some point checks the truth value of one of the DataFrames used or output by the recipe.

Make it clear when a response is from cache

I think the interface should do either or both of:
a) Make it clear when a response is from cache
b) Show the correct cost ($0.00) for responses from cache

This would help users to track their costs correctly.

[Screenshot of a cached response in the UI]

Increase in temperature does not seem to induce randomness

Most likely I have missed something obvious, but I can't seem to induce non-deterministic samples when increasing the temperature parameter.

Reproduction steps:

  1. Change this line to hardcode temperature = 1 (you can also create a new OpenAI Agent type with a higher temperature and call it; this is just more direct)
  2. Run any simple recipe (for example question-answering from the primer).

Expected outcome:

Run 1 and Run 2 return different completions. [Screenshots of the two runs omitted]

Actual outcome:

root@164d19ecd881:/code# python ./ice/recipes/primer/qa.py 
Trace: http://localhost:3000/traces/01GF5472H56ZJHHFMWJSMZPXCD
A hackathon is happening on 9/9/2022.
root@164d19ecd881:/code# python ./ice/recipes/primer/qa.py 
Trace: http://localhost:3000/traces/01GF5477SGFZ3FTP5KFK0AGH6G
A hackathon is happening on 9/9/2022.

Issues with loading traces

Commit: be5e53d, which is just documentation ahead of 10ca67e

[Screenshot of the trace-loading error]

Specify tracing file format/API

ICE is a great tool with a nice UI, and it would be awesome if it could integrate with popular tools like langchain. To do this, we need to describe the data structures/types of the JSON file formats in trace.py. One way would be to make the Trace class inherit from the pydantic BaseModel and then, rather than using a static file handler with FastAPI for traces, set response_model=List[Trace], which would autogenerate OpenAPI docs that include the trace data structure. But it looks like there's a lot of custom code in trace.py that I don't fully understand, so that may be tough.

Even just running ICE and getting an example trace, and putting it in the repo/documentation, and maybe passing it through Quicktype to generate a JSON schema from the JSON would help other people use this tool. Thanks!
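
A rough sketch of the response_model idea, using an entirely hypothetical Trace shape (the real structure lives in trace.py and is certainly different):

    # The Trace fields below are made up for illustration and do not reflect
    # ICE's actual trace format; only the FastAPI/pydantic pattern is the point.
    from typing import List, Optional

    from fastapi import FastAPI
    from pydantic import BaseModel

    class Trace(BaseModel):
        id: str
        name: str
        inputs: dict
        result: Optional[str] = None
        children: List[str] = []  # ids of child calls

    app = FastAPI()

    @app.get("/traces", response_model=List[Trace])
    async def list_traces() -> List[Trace]:
        # Loading traces from disk is out of scope here; response_model is what
        # makes the trace structure appear in the autogenerated OpenAPI docs.
        return []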

Tests don't run on fork PRs

In PRs such as #208 (cc @smithjessk) from external contributors, the Tests check gets stuck on yellow saying:

Expected — Waiting for status to be reported

[Screenshot of the stuck "Expected" check]

Looking at https://github.com/oughtinc/ice/actions/workflows/tests.yml, the action isn't running at all, so I take it `on: [push]` doesn't apply to forks.

Based on https://github.com/orgs/community/discussions/26698, it seems the check is 'Expected' because of the main branch protection rule, and of course it remains expected forever since it never runs.

If we do run tests on fork PRs, then currently some tests will fail as secrets such as OPENAI_API_KEY won't be accessible. #193 will partly solve this by running primer recipes in test mode. A few non-primer recipes call openai_complete directly, but they will probably be moved out of the repo soonish anyway.

Pydantic version must be <2 and >=1.10.0 (Python 3.11 and ICE 0.5.0)

Context

I ran into some dependency issues when going through the Ought ICE primer. I thought I'd share the fix in case anyone else is in the same situation. For context, I'm using Python 3.11, and ought-ice == 0.5.0 installed via pip.

Problem

Due to #312, I needed to use pydantic < 2. I had Pydantic 1.9.x in my virtual environment, so I thought it would be compatible, but then I eventually got this error when importing ice:

  File "/Users/miloknowles/envs/ought-primer-SDTjr-vl/lib/python3.11/site-packages/pydantic/utils.py", line 258, in generate_model_signature
    merged_params[param_name] = Parameter(
                                ^^^^^^^^^^
  File "/Users/miloknowles/.pyenv/versions/3.11.7/lib/python3.11/inspect.py", line 2725, in __init__
    raise ValueError('{!r} is not a valid parameter name'.format(name))
ValueError: 'not' is not a valid parameter name

Solution

Due to this issue with FastAPI, I needed to upgrade to pydantic >= 1.10.0. This fixed the above error.

Maybe it would be helpful to pin a more precise version of Pydantic?

Recipe trace page returns {"detail": "Not found"}

I just updated to the latest version (without docker)

After installing the new version in a fresh virtualenv and checking that my recipes still work with a successful query, I go to the trace page to investigate the answers and process; however, I get a blank page.

Parsed de97451f-8789-4b75-9b39-a1cda44fcae3-InterpretabilityInTheWild.pdf.
Trace: http://localhost:8935/traces/01GM0RQ60TW1QPZA0G1B244N0W            
...

[Screenshot of the trace page error]
