parmoo / parmoo

Python library for parallel multiobjective simulation optimization

Home Page: https://parmoo.readthedocs.io

License: BSD 3-Clause "New" or "Revised" License

Languages: Python 99.70%, Shell 0.30%
Topics: blackbox-optimization, mathematical-software, multicriteria-optimization, multiobjective, multiobjective-optimization, numerical-optimization, python3, response-surface-methodology, simulation-based-optimization, simulation-optimization, surrogate-based-optimization

parmoo's Introduction

ParMOO


ParMOO: Python library for parallel multiobjective simulation optimization

ParMOO is a parallel multiobjective optimization solver that seeks to exploit simulation-based structure in objective and constraint functions.

To exploit structure, ParMOO models simulations separately from objectives and constraints. In our language:

  • a design variable is an input to the problem, which we can directly control;
  • a simulation is an expensive or time-consuming process, including real-world experimentation, which is treated as a blackbox function of the design variables and evaluated sparingly;
  • an objective is an algebraic function of the design variables and/or simulation outputs, which we would like to optimize; and
  • a constraint is an algebraic function of the design variables and/or simulation outputs, which cannot exceed a specified bound.

To solve a multiobjective optimization problem (MOOP), we use surrogate models of the simulation outputs, together with the algebraic definition of the objectives and constraints.

ParMOO is implemented in Python. In order to achieve scalable parallelism, we use libEnsemble to distribute batches of simulation evaluations across parallel resources.

Dependencies

ParMOO has been tested on Unix/Linux and MacOS systems.

ParMOO's base has the following dependencies:

  • Python 3.8+
  • numpy -- for data structures and performant numerical linear algebra
  • scipy -- for scientific calculations needed for specific modules
  • pyDOE -- for generating experimental designs
  • pandas -- for exporting the resulting databases

Additional dependencies are needed to use the features in parmoo.extras:

  • libEnsemble -- for managing parallel simulation evaluations

And for using the Pareto front visualization library in parmoo.viz:

  • plotly -- for generating interactive plots
  • dash -- for hosting interactive plots in your browser
  • kaleido -- for exporting static plots post-interaction
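
These visualization dependencies can be installed together with pip, for example:

pip install plotly dash kaleido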

Installation

The easiest way to install ParMOO is via the Python package index, PyPI (commonly called pip):

pip install < --user > parmoo

where the angle brackets around < --user > indicate that the --user flag is optional.

To install all dependencies (including libEnsemble) use:

pip install < --user > "parmoo[extras]"

You can also clone this project from our GitHub and pip install it in-place, so that you can easily pull the latest version or check out the develop branch for pre-release features. On Debian-based systems with a bash shell, this looks like:

git clone https://github.com/parmoo/parmoo
cd parmoo
pip install -e .

Alternatively, the latest release of ParMOO (including all required and optional dependencies) can be installed from the conda-forge channel using:

conda install --channel=conda-forge parmoo

Before doing so, it is recommended to create a new conda environment using:

conda create --name parmoo-env
conda activate parmoo-env

Testing

If you have pytest with the pytest-cov plugin and flake8 installed, then you can test your installation.

python3 setup.py test

These tests are run regularly using GitHub Actions.
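
If setup.py test is unavailable with your setuptools version, the same tools can typically be run directly from the repository root (a sketch, assuming pytest's default discovery picks up ParMOO's test suite and that pytest-cov and flake8 are installed):

python3 -m pytest --cov=parmoo
python3 -m flake8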

Basic Usage

ParMOO uses numpy in an object-oriented design, based around the MOOP class. To get started, create a MOOP object.

from parmoo import MOOP
from parmoo.optimizers import LocalGPS

my_moop = MOOP(LocalGPS)

To summarize the framework: in each iteration, ParMOO models each simulation using a computationally cheap surrogate, then solves one or more scalarizations of the objectives, which are specified by acquisition functions. Read more about this framework at our ReadTheDocs page. In the above example, LocalGPS is the class of optimizers that my_moop will use to solve the scalarized surrogate problems.

Next, add design variables to the problem using the MOOP.addDesign(*args) method, as shown below. In this example, we define one continuous and one categorical design variable. Other options include integer, custom, and raw (using raw variables is not recommended except for expert users).

# Add a single continuous design variable in the range [0.0, 1.0]
my_moop.addDesign({'name': "x1", # optional, name
                   'des_type': "continuous", # optional, type of variable
                   'lb': 0.0, # required, lower bound
                   'ub': 1.0, # required, upper bound
                   'tol': 1.0e-8 # optional tolerance
                  })
# Add a second categorical design variable with 3 levels
my_moop.addDesign({'name': "x2", # optional, name
                   'des_type': "categorical", # required, type of variable
                   'levels': ["good", "bad"] # required, category names
                  })

Next, add simulations to the problem using the MOOP.addSimulation method, as shown below. In this example, we define a toy simulation sim_func(x).

import numpy as np
from parmoo.searches import LatinHypercube
from parmoo.surrogates import GaussRBF

# Define a toy simulation for the problem, whose outputs are quadratic
def sim_func(x):
   if x["x2"] == "good":
      return np.array([(x["x1"] - 0.2) ** 2, (x["x1"] - 0.8) ** 2])
   else:
      return np.array([99.9, 99.9])
# Add the simulation to the problem
my_moop.addSimulation({'name': "MySim", # Optional name for this simulation
                       'm': 2, # This simulation has 2 outputs
                       'sim_func': sim_func, # Our sample sim from above
                       'search': LatinHypercube, # Use a LH search
                       'surrogate': GaussRBF, # Use a Gaussian RBF surrogate
                       'hyperparams': {}, # Hyperparams passed to internals
                       'sim_db': { # Optional dict of precomputed points
                                  'search_budget': 10 # Set search budget
                                 },
                      })

Now we can add objectives and constraints using MOOP.addObjective(*args) and MOOP.addConstraint(*args). In this example, there are 2 objectives (each corresponding to a single simulation output) and one constraint.

# First objective just returns the first simulation output
def f1(x, s): return s["MySim"][0]
my_moop.addObjective({'name': "f1", 'obj_func': f1})
# Second objective just returns the second simulation output
def f2(x, s): return s["MySim"][1]
my_moop.addObjective({'name': "f2", 'obj_func': f2})
# Add a single constraint, requiring x["x1"] >= 0.1
def c1(x, s): return 0.1 - x["x1"]
my_moop.addConstraint({'name': "c1", 'constraint': c1})

Finally, we must add one or more acquisition functions using MOOP.addAcquisition(*args). These are used to scalarize the surrogate problems. The number of acquisition functions typically determines the number of simulation evaluations per batch. This is useful to know if you are using a parallel solver.

from parmoo.acquisitions import RandomConstraint

# Add 3 acquisition functions
for i in range(3):
   my_moop.addAcquisition({'acquisition': RandomConstraint,
                           'hyperparams': {}})

The MOOP is then solved using the MOOP.solve(budget) method, and the results can be viewed using the MOOP.getPF() method.

import pandas as pd

my_moop.solve(5) # Solve with 5 iterations of ParMOO algorithm
results = my_moop.getPF(format="pandas") # Extract the results as pandas df

After executing the above block of code, the results variable points to a pandas dataframe, each of whose rows corresponds to a nondominated objective value in the my_moop object's final database. You can reference individual columns in the results dataframe by using the name keys that were assigned during my_moop's construction, or plot the results by using the viz library.
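
For example, individual columns can be selected by name and the whole front exported to CSV (a minimal sketch; the output filename is arbitrary):

print(results.columns)                 # column names match the 'name' keys above
print(results[["f1", "f2"]])           # nondominated objective values
results.to_csv("pareto_front.csv", index=False)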

Congratulations, you now know enough to get started solving MOOPs with ParMOO!

Next steps:

  • Learn more about all that ParMOO has to offer (including saving and checkpointing, INFO-level logging, advanced problem definitions, and different surrogate and solver options) at our ReadTheDocs page.
  • Explore the advanced examples (including a libEnsemble example) in the examples directory.
  • Install libEnsemble and get started solving MOOPs in parallel.
  • See some of our pre-built solvers in the parmoo_solver_farm.
  • To interactively explore your solutions, install the extra visualization dependencies and use our built-in viz tool.
  • For more advice, consult our FAQs.

Resources

To seek support or report issues, e-mail the ParMOO developers.

Our full documentation is hosted at https://parmoo.readthedocs.io.

Please read our LICENSE and CONTRIBUTING files.

Citing ParMOO

Please use one of the following to cite ParMOO.

Our JOSS paper:

@article{parmoo,
    author={Chang, Tyler H. and Wild, Stefan M.},
    title={{ParMOO}: A {P}ython library for parallel multiobjective simulation optimization},
    journal = {Journal of Open Source Software},
    volume = {8},
    number = {82},
    pages = {4468},
    year = {2023},
    doi = {10.21105/joss.04468}
}

Our online documentation:

@techreport{parmoo-docs,
    title       = {{ParMOO}: {P}ython library for parallel multiobjective simulation optimization},
    author      = {Chang, Tyler H. and Wild, Stefan M. and Dickinson, Hyrum},
    institution = {Argonne National Laboratory},
    number      = {Version 0.3.1},
    year        = {2023},
    url         = {https://parmoo.readthedocs.io/en/latest}
}

parmoo's People

Contributors

hyrumdickinson, thchang, wildsm


parmoo's Issues

Reduce Duplicate Code in Unit Tests

Due to rapid development and interface changes, the unit tests contain significant duplicate code.

This could be reduced by

  1. Using built-in utilities from parmoo.util instead of checking things by hand
  2. Using built-in functions from the objectives/simulations/constraints libraries
  3. Making better usage of ParMOO's new methods, which may not have existed in previous development cycles
  4. Adding testing utilities that are used over and over again (either by creating setup.py files in the test directories, or by adding to parmoo.util)

Should support unscaled input

Should support design variable inputs that are not embedded/rescaled at all, but instead passed directly to surrogates.

This is necessary to support certain custom surrogate models.

Add more viz plot types

Plot types we should add include:

  • heatmap
  • 3D scatterplot
  • radviz (designed for visualizing MOOP results)
  • petal diagram
  • star coordinates plot

Plot types should be based on Plotly. Graph generating code should be written in viz/graph.py. Wrappers can be easily added in viz/plot.py. We may need some tweaking in viz/utilities.py. Ideally, addition of this feature won't require any changes to viz/dashboard.py (dashboard should be graph-agnostic).
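
As a rough illustration of the kind of wrapper intended here, a 3D scatter plot of Pareto-front data can already be produced directly with Plotly Express (a sketch independent of the viz module layout; the objective names and toy data are hypothetical):

import pandas as pd
import plotly.express as px

# Hypothetical Pareto-front data with three objectives; in practice this would
# come from something like MOOP.getPF(format="pandas").
df = pd.DataFrame({"f1": [0.1, 0.3, 0.6],
                   "f2": [0.5, 0.3, 0.2],
                   "f3": [0.9, 0.4, 0.1]})
fig = px.scatter_3d(df, x="f1", y="f2", z="f3", title="Pareto front (3 objectives)")
fig.show()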

Tracking iteration count / function evaluation count in a simulation

I have a simulation

my_moop.addSimulation({ 'name': "my_opt", 'm': 2, 'sim_func': evaluate_iteration, ... })

where evaluate_iteration() needs information about its iteration ID / evaluation ID in order to do things like create a subdirectory within which to work (e.g., to avoid temporary file name-clashing):

import os

test_name = f"evaluation_{eval_id}"
subdir = os.path.join(top_wd, test_name)
if not os.path.exists(subdir):
    os.makedirs(subdir)
os.chdir(subdir)

I was wondering how I might do that with ParMOO... did I miss something in the documentation?

Non-descriptive exceptions when reloading checkpoints

Hi,

I am attempting to load a previous checkpoint with ParMOO (version 0.2.0), but I am getting this obscure exception instead. This is what happens when I call moop.load(file):

Traceback (most recent call last):
  File "/home/waldo/Documents/Documentos/CBandV2/Hairpin/BetterOptim/parmoo-refms.py", line 296, in <module>
    parmoo_optimize(filter)
  File "/home/waldo/Documents/Documentos/CBandV2/Hairpin/BetterOptim/parmoo-refms.py", line 283, in parmoo_optimize
    moop.load(FILTER_NAME)
  File "/usr/local/lib/python3.10/dist-packages/parmoo/moop.py", line 2615, in load
    obj_ptr = getattr(mod, obj_name)
AttributeError: module '__main__' has no attribute '<lambda>'

The refmicrostrip.moop and refmicrostrip.surrogate.1 files exist (see attachments):

refmicrostrip.moop.txt
refmicrostrip.surrogate.1.txt

And this is how I am initializing the MOOP object:

    moop = MOOP(LocalGPS)

    params = filter.get_param_list()
    bounds = filter.get_full_bounds()

    # Add all params
    for i in range(len(params)):
        moop.addDesign(
            {
                'name':     params[i],
                'des_type': 'continuous',
                'lb':       bounds.lb[i],
                'ub':       bounds.ub[i]
            })

    moop.addSimulation(
        {
            'name': "FilterSim",
            'm': 2,
            'sim_func': chrosen,
            'search': LatinHypercube,
            'surrogate': GaussRBF,
            'hyperparams': {'search_budget': 20}
        })
    
    moop.addObjective(
        {'name': 'Score', 'obj_func': lambda x, s: sum(s['FilterSim'])}
    )

    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s %(levelname)-8s %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S')

    moop.load(FILTER_NAME)
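
A likely workaround (an assumption based on the traceback, not a confirmed fix): MOOP.load() appears to look objective functions up by name in their defining module via getattr, which cannot resolve a <lambda>; defining the objective as a named, module-level function avoids that lookup failure.

# Hypothetical workaround: a named, module-level objective instead of a lambda,
# so that checkpoint reloading can re-import it by name.
def score(x, s):
    return sum(s['FilterSim'])

moop.addObjective({'name': 'Score', 'obj_func': score})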

Post-run visualization

Simply exporting solution data to a numpy structured array does not help with decision making.

We should add tools for visualizing Pareto front data, such as

  • 2D/3D scatter plots
  • spider plots
  • projections
  • etc.

Similarities on direct optimization

Hi fellows,

During my master's, I developed, together with my supervisor, a Python library called pybary. It is similar to parmoo in the sense that it is a data-driven optimization method; we call it the "Barycenter Method": given an "unknown" oracle map $f: \mathbb{R}^n \to \mathbb{R}$ and design variables $x \in \mathbb{R}^n$, the method applies an arithmetic weighted average to these coordinates with weights $w_i \coloneqq e^{-\nu f(x_i)}$. This yields a coordinate close to the smallest evaluation of $f$ (the smaller the value $f_i \coloneqq f(x_i)$, the greater the weight $w_i$) for hyperparameter $\nu > 0$, or close to the greatest evaluation for $\nu < 0$.

It would be a great opportunity to apply this method in a multiobjective scope, since it also allows a parallel procedure. Are there PhD opportunities in your laboratory?

Add predicter interface and standalone module

When you solve a MOOP using ParMOO, a surrogate model is trained for each of the underlying simulation outputs.

In certain use cases, this surrogate may be the desired product more so than the Pareto front itself.

We should provide a separate interface for exporting trained surrogate models/parameters/weights, and re-loading them into a standalone "predictor" class, so that they can be deployed in other settings, without requiring the overhead of loading the full MOOP class.

Ideally, this should follow the conventions of standardized AI/ML tools such as torch or tensorflow, for interoperability.

For example, if a torch model were trained as a surrogate and exported, then its save file should be usable either to load a ParMOO predictor, or it should be loadable as a standalone torch model using PyTorch's model load method. This requires us to also export ParMOO's design variable embedding matrix in a portable way.

MOOP.solve() should support additional stopping criteria

Currently, the MOOP.solve() method can only be terminated based on the number of iterations.

Additional optional arguments should be added, supporting stopping due to:

  • total simulation budget reached (perhaps even separate budgets for each simulation when dealing with multiple simulations)
  • total walltime exceeded
  • convergence criteria triggered

Should be able to maximize objectives and support lower-bound constraints natively

Currently, ParMOO minimizes all objectives and treats all constraints as upper bounds (<= 0), which is the convention in multiobjective optimization.

For problems involving maximization and lower bounds, users are advised to multiply their true objective by -1.

This is the standard recommendation, and a reasonable workaround, but is still undesirable for many reasons.

Ideally, ParMOO would offer an option when adding objectives/constraints, to specify whether they are min/max and upper/lower bounds respectively (with the default being to minimize all objectives and treat all constraints as upper bounds).

This should be done by quietly inverting all maximum objectives/lower-bound constraints on input, then transforming back on output.
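
For example, under the current convention, a user maximizes a simulation output by minimizing its negation, and writes a lower bound g(x) >= b as b - g(x) <= 0 (a small sketch reusing the toy problem names from the Basic Usage section above):

# Maximize the first output of "MySim" by minimizing its negation
def f1_max(x, s): return -s["MySim"][0]
my_moop.addObjective({'name': "f1_max", 'obj_func': f1_max})

# Enforce the lower bound x1 >= 0.1 by requiring 0.1 - x1 <= 0
def c_lb(x, s): return 0.1 - x["x1"]
my_moop.addConstraint({'name': "c_lb", 'constraint': c_lb})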

Restructure Unit Tests

The unit test suite is disorganized, and many unit tests are difficult to read.

This is because many of the unit tests were written early on in the development cycle, before several major refactors, and have since been updated to support newer interfaces with minimal effort.

Additionally, many unit tests are extremely long and combine multiple checks into a single test function. These should be broken apart into separate tests.

Many unit test files are extremely long, and should be divided up into separate files, perhaps in subdirectories of the unit test directory.

Curse of dimensionality?

My problem (number of design variables = 600) solves within MOOP, but gives me design suggestions that are far from the (pre-determined) global optimum.

Is this the curse of dimensionality, or should I revisit my setup?

Basics / important parts of the MOOP setup:


self.my_moop = libE_MOOP(TR_LBFGSB, hyperparams={})

self.my_moop.addSimulation(
    {
        'name': self.name,
        'm': self.m,
        'sim_func': self.Simulation,
        'search': LatinHypercube,
        'surrogate': LocalGaussRBF,
        'hyperparams': {'search_budget': 2000},
    }
)

for i in range(9):
    self.my_moop.addAcquisition({'acquisition': RandomConstraint,
                                 'hyperparams': {}})
self.my_moop.addAcquisition({'acquisition': FixedWeights,
                             'hyperparams': {'weights': np.ones(2) / 2}})

self.my_moop.solve(50)

Missing functions from DTLZ libraries

Some of the DTLZ functions have not been implemented in the objectives library:

  • DTLZ4
  • DTLZ5
  • DTLZ6
  • DTLZ7
  • DTLZ8
  • DTLZ9

Additionally, the problems

  • DTLZ8
  • DTLZ9

are based on constraints, which are not implemented in the parmoo.simulations.dtlz_sim library or anywhere else.

Design variables should be a class

Should implement design variables as a class, so that additional design variable types can be added in the future.

  • must define a way for design variables to be embedded and extracted (to/from float types with upper/lower bounds)
  • must support embeddings that require multiple design variables to be compressed into one dimension
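
A rough sketch of what such a class interface might look like (names and signatures are hypothetical, not existing ParMOO API):

from abc import ABC, abstractmethod
import numpy as np

class DesignVariable(ABC):
    """Hypothetical base class for an embeddable design variable."""

    @abstractmethod
    def embed(self, value) -> np.ndarray:
        """Map a raw design value into one or more floats with lower/upper bounds."""

    @abstractmethod
    def extract(self, embedding: np.ndarray):
        """Map the embedded float(s) back to the raw design value."""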

Gaussian_Proc - index out of bounds

[Screenshot of the "index out of bounds" traceback attached.]

Forgive me if this is obvious, but this error can randomly occur and seems to have no dependence upon ParMOO input parameter choices.

Has anyone else encountered and resolved such an issue?

Thank you in advance!

Misleading exception when running a libE_MOOP script without mpirun/arguments

When I first tried to run the libe_basic_ex.py example, I thought that there was a bug:

% PYTHONPATH=. python examples/libe_basic_ex.py 
Traceback (most recent call last):
  File "…/parmoo/examples/libe_basic_ex.py", line 60, in <module>
    my_moop.solve(sim_max=30)
  File "…/parmoo/parmoo/extras/libe.py", line 715, in solve
    raise ValueError("Cannot run ParMOO + libE with less than 2 " +
ValueError: Cannot run ParMOO + libE with less than 2 workers -- aborting...

I then learned that the following works:

% PYTHONPATH=. python examples/libe_basic_ex.py --comms local --nworkers 32

The ValueError does not really explain what went wrong. I suggest either raising a more descriptive exception (e.g., RuntimeError: You need to specify the number of workers on the command line.) or defaulting to as many workers as there are CPUs. If possible, I would personally prefer the latter.

LibEnsemble problem

Hi Tyler H Chang,

I'm trying to run the example found in your documentation here, and I get this error:
pydantic_core._pydantic_core.ValidationError: 1 validation error for libE
5.final_fields
Extra inputs are not permitted [type=extra_forbidden, input_value=['x1', 'x2', 'MySim', 'sim_name'], input_type=list]
For further information visit https://errors.pydantic.dev/2.6/v/extra_forbidden

The example was written 2 years ago, so there might be a version compatibility problem with libEnsemble.
Any insights on this parallel computing topic?

feature/MDML needs to be merged

MDML provides an interface to support problems involving real-world experiments, automatically tagging, logging, and timestamping experimental data in the cloud:
https://github.com/anl-mdml/MDML_Client

ParMOO has already been integrated with MDML for past projects, but the changes have not yet been merged into develop. This is because:

  1. A couple of the inputs have been hard-coded to a specific problem, and need to be changed
  2. It is not clear how we will be able to run continuous-integration/automated testing with this interface, due to the need for an MDML server: https://github.com/anl-mdml/MDML

A GUI for creating MOOPs

Not all users are comfortable working in Python, and ParMOO's interface is necessarily complicated.

We should build a GUI interface (for example, using tkinter or a web-interface), which allows users to:

  • define a MOOP and add problem components graphically (and guided by pop-ups/instructions from the GUI)
  • define their solver and solver options graphically by browsing options in the current library/global scope,
  • choose parallel or serial options and activate checkpointing/logging, and then
  • export a Python script for running their problem.

libE_MOOP hangs on MacOS when run using Python MP option

When libEnsemble runs are started on MacOS and Windows (although Windows is not currently supported) using Python's multiprocessing library, the default is to use spawn instead of fork.

This means that all calls to the libE_MOOP.solve() method should be enclosed in an if __name__ == "__main__": block for safety when using Python MP; otherwise, libE_MOOP.solve() can hang.
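
A minimal sketch of the recommended guard (script structure only; the problem setup is elided, and the import path is an assumption based on the parmoo.extras module layout):

from parmoo.extras.libe import libE_MOOP
from parmoo.optimizers import LocalGPS

def main():
    my_moop = libE_MOOP(LocalGPS)
    # ... add design variables, simulations, objectives, and acquisitions here ...
    my_moop.solve(5)

if __name__ == "__main__":
    # Required when Python MP uses "spawn": worker processes re-import this
    # module, and the guard prevents them from re-running the solve.
    main()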

The ParMOO docs should be updated accordingly.

Toggle named variables

We should allow users to toggle whether they will use named variables, instead of choosing for them based on whether they name all design variables.

ParMOO is already set up to do this; we just need to add a MOOP.use_names() method, or similar.

How many iterations?

Do we have an answer to this / a built-in automatic stopping point, or is it a case of trying 100, then 1000, and seeing if there are any better solutions found on the Pareto front?

Thanks!
