
aspuru-guzik-group / olympus

Olympus: a benchmarking framework for noisy optimization and experiment planning

Home Page: https://aspuru-guzik-group.github.io/olympus/

License: MIT License

Python 2.34% Shell 0.32% Jupyter Notebook 97.34%
optimization machine-learning chemistry materials-science experimental-design

olympus's Introduction

Olympus: a benchmarking framework for noisy optimization and experiment planning


Olympus provides a consistent and easy-to-use framework for benchmarking optimization algorithms. With Olympus you can:

  • Build optimization domains using continuous, discrete, and categorical parameter types.
  • Access a suite of 23 experiment planning algorithms via a simple and consistent interface.
  • Access 33 experimentally derived benchmarks and 33 analytical test functions for benchmarking.
  • Easily integrate custom optimization algorithms
  • Easily integrate custom datasets, which can be used to train models for custom benchmarks
  • Enjoy extensive plotting and analysis options for visualizing your benchmark experiments

You can find more details in the documentation.
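
As a quick taste of the interface, the snippet below is a minimal sketch of a typical run, pieced together from calls that appear in the issues further down this page; the planner name is illustrative, and the top-level imports are assumed.

from olympus import Planner, Surface  # assumed to be exposed at the top level

surface = Surface('HyperEllipsoid')              # analytical test function
planner = Planner('Hyperopt', goal='minimize')   # any supported planner name
planner.optimize(emulator=surface, num_iter=100, verbose=False)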

Installation

Olympus can be installed with pip:

pip install olymp

The package can also be installed via conda:

conda install -c conda-forge olymp

Finally, the package can be built from source:

git clone https://github.com/aspuru-guzik-group/olympus.git
cd olympus
python setup.py develop

You can explore Olympus using the following Colab notebook:

Open In Colab

Dependencies

The installation only requires:

  • python >= 3.6
  • numpy
  • pandas

Additional libraries are required to use specific modules and objects. Olympus will alert you about these requirements as you try to access the related functionality.

Use cases

The following projects have used Olympus to streamline the benchmarking of optimization algorithms.

Citation

Olympus is academic research software. If you make use of it in scientific publications, please cite the following articles:

@article{hase_olympus_2021,
      author = {H{\"a}se, Florian and Aldeghi, Matteo and Hickman, Riley J. and Roch, Lo{\"\i}c M. and Christensen, Melodie and Liles, Elena and Hein, Jason E. and Aspuru-Guzik, Al{\'a}n},
      doi = {10.1088/2632-2153/abedc8},
      issn = {2632-2153},
      journal = {Machine Learning: Science and Technology},
      month = jul,
      number = {3},
      pages = {035021},
      title = {Olympus: a benchmarking framework for noisy optimization and experiment planning},
      volume = {2},
      year = {2021}
}

@misc{hickman_olympus_2023,
      author = {Hickman, Riley and Parakh, Priyansh and Cheng, Austin and Ai, Qianxiang and Schrier, Joshua and Aldeghi, Matteo and Aspuru-Guzik, Al{\'a}n},
      doi = {10.26434/chemrxiv-2023-74w8d},
      language = {en},
      month = may,
      publisher = {ChemRxiv},
      shorttitle = {Olympus, enhanced},
      title = {Olympus, enhanced: benchmarking mixed-parameter and multi-objective optimization in chemistry and materials science},
      urldate = {2023-06-21},
      year = {2023},
}

The preprint is also available at https://arxiv.org/abs/2010.04153.

License

Olympus is distributed under an MIT License.

olympus's People

Contributors

florianhase, matteoaldeghi, priyansh-1902, rileyhickman, sgbaird


olympus's Issues

SteepestDescent has no attribute eta

Code snippet:

surface = Surface('HyperEllipsoid')
planner = Planner('SteepestDescent', goal='minimize')
planner.optimize(emulator=surface, num_iter=100, verbose=False)

Error:

Exception in thread Thread-6:
Traceback (most recent call last):
  File "/Users/Matteo/anaconda2/envs/olympus/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/Users/Matteo/anaconda2/envs/olympus/lib/python3.7/threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/Matteo/github_projects/olympus/src/olympus/planners/planner_steepest_descent/wrapper_steepest_descent.py", line 65, in start_optimizer
    guess = guess - self.eta * dy
  File "/Users/Matteo/github_projects/olympus/src/olympus/objects/abstract_object.py", line 85, in __getattr__
    raise AttributeError(f"Object has no attribute {prop}")
AttributeError: Object has no attribute eta

Support for "oracle" models (i.e. categorical/discrete optimization)

Hi! I have a quick question regarding the future prospects for the package.

A common use case for optimization is to have a dataset that fully contains your set of choices, as opposed to an analytical, continuous one. In this case, instead of training a model to emulate the experimental surface, you might want an "oracle" that just retrieves the value at any point in the dataset.
Your algorithm would then be restricted to selecting points in the dataset, and would query this oracle for the value.
Is this use case in the (future) scope of Olympus?
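
(For context, a lookup oracle of this kind is easy to sketch outside the package; the helper below is hypothetical and not part of Olympus, just an illustration of the behavior being requested.)

import pandas as pd

class TableOracle:
    """Hypothetical lookup oracle: returns the measured value for a
    parameter setting that exists verbatim in the dataset."""

    def __init__(self, df: pd.DataFrame, param_cols, target_col):
        self.df = df.set_index(list(param_cols))
        self.target_col = target_col

    def candidates(self):
        # the optimizer may only pick from these parameter settings
        return list(self.df.index)

    def query(self, params):
        # params must be one of the entries returned by candidates()
        return self.df.loc[params, self.target_col]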

[Dev] Results from `Olympus().run()` contain results from previous runs

An issue with Olympus's run (and thus benchmark) method:

oly = Olympus()

for dataset in datasets:
    oly.run(dataset, ...)
    values = oly.campaign.observations.get_values().flatten()

When looking at the values for dataset[i], they also contain the values obtained from dataset[0..i-1], so the returned value list grows with every run.

I managed to circumvent this by doing the following, without using Olympus().benchmark:

for dataset in datasets:
    campaign = Campaign()  # fresh campaign per dataset
    Olympus().run(dataset, ..., campaign=campaign)
    values = campaign.observations.get_values().flatten()

Dataset contribution: autonomous discovery of battery electrolytes

Data source: https://data.matr.io/5/

251 experiments across 7 feeder solutions (design variables). Several measured quantities: pH, ionic conductivity, and current density vs. voltage array.

Article:

Dave, A.; Mitchell, J.; Kandasamy, K.; Wang, H.; Burke, S.; Paria, B.; Póczos, B.; Whitacre, J.; Viswanathan, V. Autonomous Discovery of Battery Electrolytes with Robotic Experimentation and Machine Learning. Cell Reports Physical Science 2020, 1 (12), 100264. https://doi.org/10.1016/j.xcrp.2020.100264.

Mentioned by @CompRhys in sparks-baird/self-driving-lab-demo#53 (reply in thread)

missing packages

Hi there,

I read the paper "Olympus, enhanced: benchmarking mixed-parameter and multi-objective optimization in chemistry and materials science" and found the multi-objective optimization in the paper pretty interesting. However, several packages or modules referenced by the installation files, such as scalarizers, are missing. I'm wondering where I can find the full version of Olympus? Thank you so much for your help!

Using Olympus `Planner`-s to optimize SDL-Demo

I plan to adapt this notebook to one using the supported Planner-s in Olympus. Any interest in making a Jupyter notebook implementation PR in the notebooks directory using the simulator described in 2.2-sensor-simulator.ipynb? I'd then re-run this using the experimental demo and add co-authors where appropriate and if of interest. If that's asking a bit much, no worries, but I figured I'd float the idea 👍 I'm planning to implement it one way or another, though it seems it would be a lot more efficient for someone already familiar with the internals to work on it. #16

It would be interesting to make an interactive Plotly figure with legend groups or similar defined according to the categories in Planners. See sparks-baird/self-driving-lab-demo#23 for additional context.

After installing numpy and pandas in Python 3.9.* on Windows, `ERROR: No matching distribution found for tensorflow==1.15`

(base) PS C:\Users\sterg\Documents\GitHub\sparks-baird\self-driving-lab-demo> conda create -n olymp python==3.9.*
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: C:\Users\sterg\Miniconda3\envs\olymp

  added / updated specs:
    - python=3.9


The following NEW packages will be INSTALLED:

  ca-certificates    pkgs/main/win-64::ca-certificates-2022.10.11-haa95532_0 None
  certifi            pkgs/main/win-64::certifi-2022.9.24-py39haa95532_0 None
  openssl            pkgs/main/win-64::openssl-1.1.1s-h2bbff1b_0 None
  pip                pkgs/main/win-64::pip-22.2.2-py39haa95532_0 None
  python             pkgs/main/win-64::python-3.9.13-h6244533_2 None
  setuptools         pkgs/main/win-64::setuptools-65.5.0-py39haa95532_0 None
  sqlite             pkgs/main/win-64::sqlite-3.39.3-h2bbff1b_0 None
  tzdata             pkgs/main/noarch::tzdata-2022f-h04d1e81_0 None
  vc                 pkgs/main/win-64::vc-14.2-h21ff451_1 None
  vs2015_runtime     pkgs/main/win-64::vs2015_runtime-14.27.29016-h5e58377_2 None
  wheel              pkgs/main/noarch::wheel-0.37.1-pyhd3eb1b0_0 None
  wincertstore       pkgs/main/win-64::wincertstore-0.2-py39haa95532_2 None


Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate olymp
#
# To deactivate an active environment, use   
#
#     $ conda deactivate

Retrieving notices: ...working... done       
(base) PS C:\Users\sterg\Documents\GitHub\sparks-baird\self-driving-lab-demo> conda activate olymp
(olymp) PS C:\Users\sterg\Documents\GitHub\sparks-baird\self-driving-lab-demo> pip install olymp[all]
Collecting olymp[all]
  Using cached olymp-0.0.1b0-py3-none-any.whl (4.8 MB)
Collecting numpy
  Downloading numpy-1.23.4-cp39-cp39-win_amd64.whl (14.7 MB)
     ━━━━━━━━━ 14.7/14.7 MB 5.6 MB/s eta 0:00:00
Collecting pandas
  Downloading pandas-1.5.1-cp39-cp39-win_amd64.whl (10.9 MB)
     ━━━━━━━━━ 10.9/10.9 MB 6.2 MB/s eta 0:00:00
Collecting deap
  Downloading deap-1.3.3-cp39-cp39-win_amd64.whl (114 kB)
     ━━━━━━━━━ 114.3/114.3 kB 6.5 MB/s eta 0:00:00
Collecting seaborn
  Using cached seaborn-0.12.1-py3-none-any.whl (288 kB)
Collecting tensorflow-probability==0.8
  Downloading tensorflow_probability-0.8.0-py2.py3-none-any.whl (2.5 MB)
     ━━━━━━━━━ 2.5/2.5 MB 5.4 MB/s eta 0:00:00
Collecting pyswarms
  Downloading pyswarms-1.3.0-py2.py3-none-any.whl (104 kB)
     ━━━━━━━━━━━━ 104.1/104.1 kB ? eta 0:00:00
Collecting sqlalchemy
  Downloading SQLAlchemy-1.4.44-cp39-cp39-win_amd64.whl (1.6 MB)
     ━━━━━━━━━ 1.6/1.6 MB 5.6 MB/s eta 0:00:00
Collecting hyperopt
  Downloading hyperopt-0.2.7-py2.py3-none-any.whl (1.6 MB)
     ━━━━━━━━━ 1.6/1.6 MB 5.9 MB/s eta 0:00:00
Collecting phoenics
  Downloading phoenics-0.2.0.tar.gz (177 kB)
     ━━━━━━━━━ 177.7/177.7 kB 5.4 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error        

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Traceback (most recent call last):     
        File "<string>", line 2, in <module> 
        File "<pip-setuptools-caller>", line 34, in <module>
        File "C:\Users\sterg\AppData\Local\Temp\pip-install-j1gm7ni6\phoenics_cd91286c19644e1382e889fb4f980ef6\setup.py", line 5, in <module>
          import numpy as np
      ModuleNotFoundError: No module named 'numpy'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.    
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details
(olymp) PS C:\Users\sterg\Documents\GitHub\sparks-baird\self-driving-lab-demo> pip install numpy pandas
Collecting numpy
  Using cached numpy-1.23.4-cp39-cp39-win_amd64.whl (14.7 MB)
Collecting pandas
  Using cached pandas-1.5.1-cp39-cp39-win_amd64.whl (10.9 MB)
Collecting pytz>=2020.1
  Downloading pytz-2022.6-py2.py3-none-any.whl (498 kB)
     ━━━━━━━━━ 498.1/498.1 kB 2.6 MB/s eta 0:00:00
Collecting python-dateutil>=2.8.1
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting six>=1.5
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: pytz, six, numpy, python-dateutil, pandas
Successfully installed numpy-1.23.4 pandas-1.5.1 python-dateutil-2.8.2 pytz-2022.6 six-1.16.0
(olymp) PS C:\Users\sterg\Documents\GitHub\sparks-baird\self-driving-lab-demo> pip install olymp[all]  
Collecting olymp[all]
  Using cached olymp-0.0.1b0-py3-none-any.whl (4.8 MB)
Requirement already satisfied: pandas in c:\users\sterg\miniconda3\envs\olymp\lib\site-packages (from olymp[all]) (1.5.1)
Requirement already satisfied: numpy in c:\users\sterg\miniconda3\envs\olymp\lib\site-packages (from olymp[all]) (1.23.4)
Collecting SQSnobFit
  Downloading SQSnobFit-0.4.5.tar.gz (29 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting sqlalchemy
  Using cached SQLAlchemy-1.4.44-cp39-cp39-win_amd64.whl (1.6 MB)
ERROR: Could not find a version that satisfies the requirement tensorflow==1.15; extra == "all" (from olymp[all]) (from versions: 2.5.0, 2.5.1, 2.5.2, 2.5.3, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.7.1, 2.7.2, 2.7.3, 2.7.4, 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1, 2.9.2, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.11.0rc0, 2.11.0rc1, 2.11.0rc2)
ERROR: No matching distribution found for tensorflow==1.15; extra == "all"
(olymp) PS C:\Users\sterg\Documents\GitHub\sparks-baird\self-driving-lab-demo>

recursion error when generating docs for databases

When trying to generate the docs for the databases subpackage, sphinx returns an error about a maximum recursion depth:

RuntimeError: maximum recursion depth exceeded while calling a Python object

This may be related/similar to this: sphinx-doc/sphinx#1612

It is not urgent or critical, but because of this databases are not in the docs at the moment.

Cleanup files written by planners

Cma and Phoenics (and possibly others) create folders/files when they run (called outcmaes and SearchProgress, respectively). We could clean up these folders/files after the planners terminate.
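
A minimal sketch of such a cleanup step (assuming the folder names above and that they are created in the current working directory):

import shutil
from pathlib import Path

def cleanup_planner_artifacts(workdir="."):
    # hypothetical helper, not part of Olympus
    for name in ("outcmaes", "SearchProgress"):
        target = Path(workdir) / name
        if target.is_dir():
            shutil.rmtree(target)   # planners write scratch folders here
        elif target.is_file():
            target.unlink()         # in case the artifact is a single file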

Cannot compile the documentation

Trying to compile the documentation: I created a virtual environment for Olympus, cloned the olympus repo, and installed olympus.
I then installed the requirements for the documentation, but several modules are missing from the requirements:

  • No module named 'sklearn' (installing manually)
  • Sqlite databases require sqlalchemy (missing module) (installing manually)
  • Plotter requires matplotlib, seaborn (installing manually)
  • No module named m2r (installing manually)

Error after installing these modules:

sphinx/registry.py", line 266, in add_source_parser
    for filetype in parser.supported:
AttributeError: 'str' object has no attribute 'supported'
make: *** [Makefile:21: html] Error 2

Plans for multi-objective optimization benchmarks? (`target_ids`)

I'm noticing the following convention for Dataset:

target_ids=None,

(i.e., plural target_ids)
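
The failing setup was roughly of this shape (a hypothetical reconstruction for illustration; the exact snippet was not part of the report, and the dataset name and target ids are placeholders):

# hypothetical reconstruction; names are placeholders
dataset = Dataset(kind='some_dataset', target_ids=['target_0', 'target_1'])
emulator = Emulator(dataset=dataset, model='BayesNeuralNet')
emulator.train()   # raises the ValueError below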

When trying to pass multiple list entries to target_ids, I get:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[44], line 1
----> 1 emulator.train()

File c:\Users\sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\emulators\emulator.py:354, in Emulator.train(self, plot, retrain)
    347 # Train
    348 Logger.log(
    349     ">>> Training model on {0:.0%} of the dataset, testing on {1:.0%}...".format(
    350         (1 - self.dataset.test_frac), self.dataset.test_frac
    351     ),
    352     "INFO",
    353 )
--> 354 mdl_train_r2, mdl_test_r2, mdl_train_rmsd, mdl_test_rmsd = self.model.train(
    355     train_features=train_features_scaled,
    356     train_targets=train_targets_scaled,
    357     valid_features=test_features_scaled,
    358     valid_targets=test_targets_scaled,
    359     model_path=model_path,
    360     plot=plot,
    361 )
    363 # write file to indicate training is complete and add R2 in there
    364 with open(f"{model_path}/training_completed.info", "w") as content:

File c:\Users\sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\models\wrapper_tensorflow_model\wrapper_tensorflow_model.py:159, in WrapperTensorflowModel.train(self, train_features, train_targets, valid_features, valid_targets, model_path, plot)
    154 losses.append(loss)
    156 if epoch % self.pred_int == 0:
    157 
    158     # make a prediction on the validation set
--> 159     valid_pred = self.predict(
    160         features=valid_features[valid_indices], num_samples=10
    161     )
    162     valid_r2 = r2_score(valid_targets[valid_indices], valid_pred)
    163     valid_rmsd = np.sqrt(
    164         mean_squared_error(valid_targets[valid_indices], valid_pred)
    165     )

File c:\Users\sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\models\wrapper_tensorflow_model\wrapper_tensorflow_model.py:282, in WrapperTensorflowModel.predict(self, features, num_samples)
    278     for _ in range(num_samples):
    279         predic = self.sess.run(
    280             self.y_pred, feed_dict={self.tf_x: X_test_batch}
    281         )
--> 282         pred[_, start:stop] = predic[:size]
    284 pred = np.mean(pred, axis=0)
    285 return pred

ValueError: could not broadcast input array from shape (50,8) into shape (50,1)

Local locations of datasets and models contain absolute directory paths and are prepended with "dataset_" and "model"

Running on Windows.

# pip install olymp sqlalchemy matplotlib
# initialize the Olympus orchestrator
from olympus import Olympus, Database, Campaign
from time import time
from os import path

olymp = Olympus()
# we declare a local database to which we store our campaign results
database = Database()

DATASET = "photo_pce10"
NUM_REPETITIONS = 10
PLANNERS = ["Grid"]

elapsed_times = {"planner": [], "time": []}
for PLANNER in PLANNERS:
    for repetition in range(NUM_REPETITIONS):
        print(f"Algorithm: {PLANNER} [repetition {repetition+1}]")

        start_time = time()
        olymp.run(
            planner=PLANNER,  # run simulation with <PLANNER>,
            dataset=DATASET,  # on emulator trained on dataset <DATASET>;
            campaign=Campaign(),  # store results in a new campaign,
            database=database,  # but use the same database to store campaign;
            num_iter=100,  # run benchmark for num_iter iterations
        )
        elapsed_time = time() - start_time
        elapsed_times["planner"].append(PLANNER)
        elapsed_times["time"].append(elapsed_time)
[FATAL] Could not find dataset ` \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_photo_pce10`. Please choose from one of the available datasets: \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_alkox, \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_benzylation, \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_colors_bob, \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_colors_n9, \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_fullerenes, \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_hplc, \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_photo_pce10, \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_photo_wf3, \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_snar, \sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\datasets\dataset_suzuki.

Changing to the following (based on error message) worked for the dataset:

from os import path
...
path.join(
    "\sterg",
    "Miniconda3",
    "envs",
    "sdl-demo",
    "lib",
    "site-packages",
    "olympus",
    "datasets",
    "dataset_photo_pce10",
)
...

But then I get the following error about the model:

[FATAL] Model "BayesNeuralNet" not available in Olympus. Please choose from one of the available models: Rs\sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\models\modelBayesNeuralNet, Rs\sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\models\modelNeuralNe

and it's not apparent what I need to change. What would you suggest?

Grid search produces `IndexError: pop from empty list`

Algorithm: Grid [repetition 1]
[INFO] Loading emulator using a BayesNeuralNet model for the dataset photo_pce10...
[INFO] Last parameter being provided - there will not be any more available samples in the grid.
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-7-f36bc3f1915d> in <module>
     36         campaign=Campaign(),  # store results in a new campaign,
     37         database=database,    # but use the same database to store campaign;
---> 38         num_iter=64,         # run benchmark for num_iter iterations
     39     )
     40     elapsed_time = time() - start_time

4 frames
/usr/local/lib/python3.7/dist-packages/olympus/olympus.py in run(self, planner, dataset, model, goal, campaign, database, num_iter)
     55             planner=planner_, emulator=emulator, campaign=campaign, database=database,
     56         )
---> 57         self.evaluator.optimize(num_iter=num_iter)
     58 
     59     def run_analytic(

/usr/local/lib/python3.7/dist-packages/olympus/evaluators/evaluator.py in optimize(self, num_iter)
     66             # NOTE: now we get 1 param at a time, a possible future expansion is
     67             #       to return batches
---> 68             params = self.planner.recommend(observations=self.campaign.observations)
     69 
     70             # get measurement from emulator/surface

/usr/local/lib/python3.7/dist-packages/olympus/planners/abstract_planner.py in recommend(self, observations, return_as)
    130         """
    131         self.tell(observations)
--> 132         return self.ask(return_as=return_as)
    133 
    134     def optimize(self, emulator, num_iter=1, verbose=False):

/usr/local/lib/python3.7/dist-packages/olympus/planners/abstract_planner.py in ask(self, return_as)
     95 
     96         self.num_generated += 1
---> 97         param_vector = self._ask()
     98 
     99         # check that the parameters suggested are within the bounds of our param_space

/usr/local/lib/python3.7/dist-packages/olympus/planners/planner_grid/wrapper_grid.py in _ask(self)
    126         if self.grid_created is False:
    127             self._create_grid()
--> 128         param = self.samples.pop(0)
    129 
    130         if len(self.samples) == 0:

IndexError: pop from empty list

Reproducer Colab Notebook

Dataset Import in Windows OS

For Windows users of the Python library, I found that the datasets did not import correctly.

[screenshot of the dataset import error]

It appears the problem is related to the use of string-splitting functions on the file path.
Within the __init__.py for the datasets section, the following code is executed. How the path is split depends on the OS being used, and this creates the issue for me:

datasets_list = []
for dir_name in glob.glob(f"{home}/dataset_*"):
    dir_name = dir_name.split("/")[-1][8:]
    datasets_list.append(dir_name)

I changed the code to use only os

datasets_list = []
for dir_name in os.scandir(home):
    if 'dataset_' in os.path.split(dir_name)[-1][:8]:
        dir_name = os.path.split(dir_name)[-1][8:]
        datasets_list.append(dir_name)

I think this problem could occur in other locations, and I'll update the bug as I see it.
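
A pathlib-based variant (a sketch, untested against the package; home is the datasets directory used in the snippets above) would sidestep the separator issue entirely:

from pathlib import Path

datasets_list = [
    p.name[len("dataset_"):]          # strip the "dataset_" prefix
    for p in Path(home).glob("dataset_*")
    if p.is_dir()
]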

In the example code, there is an issue (emulator.ipynb)

There is a problem with the example code in emulator.ipynb:
once the Dataset is loaded, the name keyword itself does not exist. Is there anything I missed?
I'm using Python 3.6.13 and installed the rest as provided by Olympus.

dataset = Dataset(
    name='colormix_bob',
    test_frac=0.2, num_folds=5)
model = BayesNeuralNet(max_epochs=150)


TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>
      1 dataset = Dataset(
      2     name='colormix_bob',
----> 3     test_frac=0.2, num_folds=5)
      4 model = BayesNeuralNet(max_epochs=150)
      5 dataset.set

TypeError: __init__() got an unexpected keyword argument 'name'
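
Judging from the Dataset usage elsewhere on this page (e.g. Dataset(kind='colors_bob')), the keyword appears to be kind rather than name, so something like the following may work:

dataset = Dataset(kind='colormix_bob', test_frac=0.2, num_folds=5)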

Cma is noisy

I wonder if we could silence the printing from Cma when verbose=False.

At the moment:

surface = Surface('HyperEllipsoid')
planner = Planner('Cma', goal='minimize')
planner.optimize(emulator=surface, num_iter=100, verbose=False)

>>> (3_w,6)-aCMA-ES (mu_w=2.0,w_1=63%) in dimension 2 (seed=628832, Mon Feb 24 11:52:44 2020)
>>> Iterat #Fevals   function value  axis ratio  sigma  min&max std  t[m:s]
>>>     1      6 5.941832627437431e+00 1.0e+00 4.89e-01  5e-01  5e-01 0:00.6
>>>     2     12 1.364266645555701e+00 1.1e+00 3.68e-01  3e-01  3e-01 0:01.3
>>>     3     18 2.228430411909851e+00 1.2e+00 3.48e-01  3e-01  3e-01 0:01.9
>>>     8     48 1.128310867965480e-01 1.2e+00 1.08e-01  5e-02  5e-02 0:05.0
>>>    15     90 2.983828351449138e-03 1.4e+00 2.95e-02  5e-03  7e-03 0:09.3
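
Until a verbose flag is plumbed through, one workaround (a sketch using only the standard library; it works as long as Cma prints via Python's stdout) is to redirect stdout around the call:

import contextlib
import io

# swallow anything the planner prints to stdout during the run
with contextlib.redirect_stdout(io.StringIO()):
    planner.optimize(emulator=surface, num_iter=100, verbose=False)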

Running in debug mode on VS Code doesn't stop on [FATAL] message

Running in VS Code on Windows.

[FATAL] Model "BayesNeuralNet" not available in Olympus. Please choose from one of the available models: Rs\sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\models\modelBayesNeuralNet, Rs\sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\models\modelNeuralNet

using the following launch configuration:

            {
                "name": "Python: Module Code",
                "type": "python",
                "request": "launch",
                "program": "${file}",
                "console": "integratedTerminal",
                "justMyCode": false,
                "env": {
                    "PYDEVD_WARN_SLOW_RESOLVE_TIMEOUT": "2",
                },
                "redirectOutput": true,
            }

Tried with other launch configurations, too.

Data out of bounds for alkox dataset

The alkox parameter space, defined in its config.json, does not match the data in the csv file. The temperature is expected to be between 60 and 140:

dataset = Dataset('alkox')
for param in dataset.param_space.parameters:
    print(param.name, param.low, param.high)
>>> residence_time 0.5 2.0
>>> ratio 1.0 5.0
>>> concentration 0.1 0.5
>>> temperature 60.0 140.0

But the csv file (dataset.data) contains temperatures between 6 and 8.

This results in the following error when the method set_param_space is called:

[ERROR] Lower bound of 0.5 provided for parameter `residence_time` is higher than minimum found in the data!
[ERROR] Lower bound of 1.0 provided for parameter `ratio` is higher than minimum found in the data!
[ERROR] Upper bound of 0.5 provided for parameter `concentration` is lower than maximum found in the data!
[ERROR] Lower bound of 60.0 provided for parameter `temperature` is higher than minimum found in the data!
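
A quick consistency check along these lines (a sketch; it assumes Dataset is importable from the package top level and that the columns of dataset.data share the parameter names):

from olympus import Dataset

dataset = Dataset('alkox')
for param in dataset.param_space.parameters:
    lo = dataset.data[param.name].min()
    hi = dataset.data[param.name].max()
    if lo < param.low or hi > param.high:
        print(f"{param.name}: config bounds [{param.low}, {param.high}] "
              f"vs data range [{lo}, {hi}]")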

Add synthesis dataset

Hey, I'm working on adding a set of perovskite synthesis reactions to src/olympus/datasets as a benchmark dataset (following a previous discussion with @jschrier earlier this month). A brief description of this dataset can be found here.
Questions:

  1. It appears all of the benchmark datasets currently in src/olympus/datasets have continuous targets, but our synthesis dataset has a categorical target. Is this a problem?
  2. What is the best practice to include descriptors for categorical parameters? (I see you have a descriptor.csv in this folder, but would it be better to just include the descriptors directly like this?)

ParticleSwarms: ValueError: operands could not be broadcast together

Code:

surface = Surface('HyperEllipsoid')
planner = Planner('ParticleSwarms', goal='minimize')
planner.optimize(emulator=surface, num_iter=100, verbose=True)

Error:

Exception in thread Thread-4:
Traceback (most recent call last):
  File "/Users/Matteo/anaconda2/envs/olympus/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/Users/Matteo/anaconda2/envs/olympus/lib/python3.7/threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/Matteo/github_projects/olympus/src/olympus/planners/planner_particle_swarms/wrapper_particle_swarms.py", line 52, in create_optimizer
    cost, pos = self.optimizer.optimize(self._priv_evaluator, iters=self.max_iters)
  File "/Users/Matteo/anaconda2/envs/olympus/lib/python3.7/site-packages/pyswarms/single/global_best.py", line 185, in optimize
    self.swarm.pbest_pos, self.swarm.pbest_cost = compute_pbest(self.swarm)
  File "/Users/Matteo/anaconda2/envs/olympus/lib/python3.7/site-packages/pyswarms/backend/operators.py", line 69, in compute_pbest
    new_pbest_pos = np.where(~mask_pos, swarm.pbest_pos, swarm.position)
  File "<__array_function__ internals>", line 6, in where
ValueError: operands could not be broadcast together with shapes (10,2,10) (10,2) (10,2) 

[Main & Dev] Specific dependency versions required

Because of the dependency on TensorFlow v1, Python 3.7 is a must. Moreover, SQLAlchemy==1.4.47 is also required (the newest version of SQLAlchemy is v2.x).

It would probably be easiest to provide a conda environment.yml.

ATTR _ipython_canary_method_should_not_exist_

I get this message many times if I instantiate a Dataset object without assigning it to a variable (in a Jupyter notebook), e.g.:

Dataset(kind='colors_bob')
>>> ATTR _ipython_canary_method_should_not_exist_
>>> ATTR _ipython_canary_method_should_not_exist_
>>> ATTR _ipython_canary_method_should_not_exist_
>>> ...
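
For background, IPython's rich-display machinery probes objects for optional methods (including the deliberately nonexistent _ipython_canary_method_should_not_exist_) before rendering them, so a permissive __getattr__ that logs every missed lookup, like the one in abstract_object.py, prints once per probe. A minimal repro independent of Olympus:

class Chatty:
    """Mimics a __getattr__ that logs every missed attribute lookup."""
    def __getattr__(self, prop):
        print(f"ATTR {prop}")
        raise AttributeError(f"Object has no attribute {prop}")

# In a Jupyter cell, evaluating Chatty() without assigning it triggers the
# display hooks, which look up _ipython_canary_method_should_not_exist_,
# _repr_html_, and friends -- each probe prints one ATTR line.
Chatty()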
