industrial-optimization-group / desdeo
An open source framework for interactive multiobjective optimization methods
Home Page: https://desdeo.it.jyu.fi
This includes docs and possible API improvements/adjustments
What is the current behavior?
We need systematic way to allow DESDEO to be connected to simulators that can, e.g., generate solutions to multiobjective optimization problems. This way, we can implement support for simulator-based problems.
Describe the solution you'd like
This solution should be simple enough to be implemented on both sides. We cannot expect DESDEO to be able to connect universally to any software, remotely or locally. Instead, we need to require the server-side to implement a simple interface to their simulator to be able to connect to DESDEO. Then, DESDEO can expect this kind of interface to be available, which allows DESDEO to communicate with it.
What is the motivation/use case for changing the behavior?
We currently have no way to support simulator-based problems, unless we first run the simulator, generate data, and then model a multiobjective optimization problem offline, based on the data.
Describe alternatives you've considered
Because we want users to be able to supply a simulator even through a graphical user interface in the future, it should be enough to supply a (web) URL to a simulator's API. For this, the simulator side needs to implement a simple web-based API. We could have one such URL for each objective function separately, if the values come from different simulators, or just one for all the objective functions.
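As a rough illustration of the single-URL variant, the sketch below uses only the Python standard library. The `/evaluate` path, the `{"x": [...]}` request body, the `{"objectives": [...]}` response body, and the toy `simulate` function are all assumptions made for this example; none of them is an agreed DESDEO interface.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


def simulate(x: list[float]) -> list[float]:
    """Stand-in for a real simulator: two toy objective values."""
    return [sum(v * v for v in x), sum((v - 1.0) ** 2 for v in x)]


class SimulatorHandler(BaseHTTPRequestHandler):
    """Simulator side: one POST endpoint returning all objective values."""

    def do_POST(self):
        if self.path != "/evaluate":  # hypothetical endpoint name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"objectives": simulate(payload["x"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def evaluate_remote(url: str, x: list[float]) -> list[float]:
    """Client side: all the optimizer needs to know is the URL."""
    request = Request(
        url,
        data=json.dumps({"x": x}).encode(),  # data present, so this is a POST
        headers={"Content-Type": "application/json"},
    )
    opener = build_opener(ProxyHandler({}))  # bypass any system proxy locally
    with opener.open(request) as response:
        return json.loads(response.read())["objectives"]


def demo() -> list[float]:
    """Start the toy simulator on a free local port and query it once."""
    server = HTTPServer(("127.0.0.1", 0), SimulatorHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/evaluate"
        return evaluate_remote(url, [0.0, 1.0])
    finally:
        server.shutdown()
        server.server_close()
```

With one URL per objective, the client would simply call `evaluate_remote` once per URL and collect the results.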
There exists software that supports remote procedure calls in Python. The options vary in their features and come with built-in security considerations as well. Some of these are:
gRPC: gRPC generates server and client (Python) files based on .proto service definitions. An example of such a file can be found here. There is also a quick start guide for Python here. gRPC is not purely Python based.
pyro: pyro is a Python-based solution for remote objects. This allows Python objects to remotely communicate with each other. Pyro offers a quick intro and example here.
ZeroMQ: ZeroMQ seems to be a popular option as well, and it offers a Python implementation. A brief example of it is available here.
FastAPI: since we are already using FastAPI, we could also require users to implement a simple FastAPI server that provides an interface to their simulator. With careful documentation and a cookie-cutter template, we can clearly indicate which HTTP endpoints we require from the API implemented by the client. Apart from gRPC, the other two options seem to be just specialized web APIs, although I may be mistaken; please correct me if I am wrong.
There are probably many more similar solutions, so if I missed any, feel free to add them. I do not have much experience in this, so any input from a more experienced individual is welcome.
Additional context
I believe that whatever option we choose, documenting the steps users must take to connect their simulator(s) to DESDEO will be crucial. Therefore, when choosing an option, we should keep in mind the ease of its implementation for both us and, especially, users.
There was some mention of a Thrift interface. If there's any partially working code, it should be added to the repository as a starting point.
(Moved from TODO in code)
I'm not sure which ones are missing.
I don't know if anyone is still tracking bugs in other DESDEO repos, so I added an issue here as well.
Original link to the issue: industrial-optimization-group/desdeo-mcdm#47
If the current checks are okay (we can remove some if you like) you could install the git pre-commit hooks. This means code that doesn't pass the style guide will never get committed, meaning we get better git bisect behaviour in the future.
The guide is here:
https://github.com/industrial-optimization-group/DESDEO#set-up
DESDEO is a framework for interactively solving multi-objective optimization problems.
What is the current behavior?
Problems are formulated using the MOProblem class.
Describe the solution you'd like
Problem formulations should be represented as JSON objects. They can be read into Python as dataclasses and stored in databases without changes. This requires analytical formulations of objectives to be stored as MathJSON objects instead of NumPy expressions. The MathJSON objects can be converted to Polars expressions for evaluation with currently implemented methods. Alternatively, we can implement other converters that convert the problem formulation to, for example, NumPy/pandas expressions, PuLP expressions, or Gurobipy expressions. We can even convert the MathJSON objects to industry-standard file formats (only for single objective optimization). Yan's work is the first step towards this idea.
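A minimal sketch of the idea, under stated assumptions: the `Problem` and `Objective` dataclasses, their field names, and the JSON layout are hypothetical and are not the actual DESDEO schema. In place of a Polars converter, the example uses a tiny recursive evaluator that handles only a handful of MathJSON operators (`Add`, `Subtract`, `Multiply`, `Power`, `Negate`).

```python
import json
import math
from dataclasses import asdict, dataclass


@dataclass
class Objective:
    name: str
    func: list  # MathJSON-style expression, e.g. ["Add", "x_1", 2]
    maximize: bool = False


@dataclass
class Problem:
    name: str
    variables: list[str]
    objectives: list[Objective]


# Only a small subset of MathJSON operators, enough for the example.
OPERATORS = {
    "Add": lambda *args: sum(args),
    "Subtract": lambda a, b: a - b,
    "Multiply": lambda *args: math.prod(args),
    "Power": lambda a, b: a**b,
    "Negate": lambda a: -a,
}


def evaluate(expr, values: dict[str, float]) -> float:
    """Recursively evaluate a MathJSON-style expression."""
    if isinstance(expr, (int, float)):
        return expr  # numeric literal
    if isinstance(expr, str):
        return values[expr]  # variable symbol
    operator, *operands = expr
    return OPERATORS[operator](*(evaluate(o, values) for o in operands))


def problem_from_json(text: str) -> Problem:
    """Read a JSON problem description into dataclasses."""
    raw = json.loads(text)
    objectives = [Objective(**o) for o in raw["objectives"]]
    return Problem(raw["name"], raw["variables"], objectives)


PROBLEM_JSON = """
{
  "name": "toy problem",
  "variables": ["x_1", "x_2"],
  "objectives": [
    {"name": "f_1", "func": ["Add", ["Power", "x_1", 2], "x_2"], "maximize": false},
    {"name": "f_2", "func": ["Multiply", 2, ["Subtract", "x_1", "x_2"]], "maximize": false}
  ]
}
"""
```

Because the dataclasses hold only JSON-compatible values, `json.dumps(asdict(problem))` gives back a document that can be stored in a database as-is, with no arbitrary Python objects involved.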
What is the motivation/use case for changing the behavior?
Currently, arbitrary python objects have to be stored into the database. This is bad behaviour and prevents complicated use cases such as updating/changing problem formulation.
Additional context
Insight into how to handle surrogate modelling, external simulators, arbitrary binaries, and scenario based optimization needs further discussion.
The polars Python package should probably be the default version that we use. There is probably a way to specify alternative dependencies based on CPU type. I will look into this if this is unresolved by the time I come back.
Constraints don't seem to be supported for MOProblems yet.
This module appears to be missing
We need to add support for scenario-based multiobjective optimization problems. The general formulation of such problems is described, e.g., here Eq. 1. The challenge is how to extend the current problem schema to seamlessly support scenarios. One possible approach is to model all the objectives from each scenario separately, and to use an optional additional index in the Objective models to identify which scenario an objective function belongs to. Naturally, the new features will have to be documented and tested.
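The optional-index approach could look roughly like the sketch below. The `Objective` dataclass, its `scenario_key` field, and the helper `objectives_by_scenario` are hypothetical names chosen for illustration. The example also assumes that an objective without a scenario key is shared by every scenario.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    symbol: str
    func: list  # MathJSON-style expression
    scenario_key: str | None = None  # None: shared by all scenarios (assumption)


def objectives_by_scenario(objectives: list[Objective]) -> dict[str, list[str]]:
    """Group objective symbols by scenario, copying shared ones to each."""
    scenarios = sorted(
        {o.scenario_key for o in objectives if o.scenario_key is not None}
    )
    grouped: dict[str, list[str]] = {key: [] for key in scenarios}
    for objective in objectives:
        if objective.scenario_key is None:
            targets = scenarios
        else:
            targets = [objective.scenario_key]
        for key in targets:
            grouped[key].append(objective.symbol)
    return grouped
```

Keeping the index optional means existing single-scenario problem definitions stay valid without changes.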
This issue is related to the re-structuring of DESDEO, see the branch desdeo2.
Should it be called desdeo (lowercase) everywhere?
What is the current behavior?
Currently uses Flask.
Describe the solution you'd like
Is it better/easier to use alternatives such as Django/FastAPI?
What is the motivation/use case for changing the behavior?
Giovanni, can you comment on the usability, verbosity, and features of Django? FastAPI is generally thought to be less verbose. It auto-generates the API docs and handles request validation and JSON-to-Python conversion automatically. We can use dataclasses/Pydantic for structuring the requests.
Describe alternatives you've considered
Flask, Django, FastAPI
DESDEO/desdeo/problem/Problem.py
Line 111 in 0c3be08
Why not set the number of objectives explicitly?
The existence of a nadir point is very method-specific, so there is no need to rely on it.
Even the ideal point does not always need to be estimated.
The number of objectives is a fundamental property, so why not set it explicitly?
It doesn't seem to make much sense at the moment to have two separate documentations, especially since much of the DESDEO documentation refers to desdeo-vis stuff. A short term solution is just to make desdeo's docs script clone desdeo-vis (use submodules?) and symlink all the docs into desdeo's docs.
(Moved from TODO in comment)
These classes need to either be documented or possibly refactored away. They seem to be the most confusing part of reading the code at the moment.
e.g. it could look something like Django's ORM's models.
There is currently a mix of lists of floats and Numpy arrays in the signatures. It might make sense to standardise on one to avoid having to convert back and forth all the time.
What is the current behavior?
The scalarization functions are implemented as Python classes. One of the methods of the classes can accept objective values (and additionally, preferences, if they have not been set) and returns the scalarized value.
Describe the solution you'd like
A wrapper function that takes the MathJSON implementation of the MOP (as a dataclass) and the preferences and adds an additional key to the dataclass with the scalarized version of the problem.
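A sketch of such a wrapper, under stated assumptions: the `Problem` dataclass and its `scalarizations` field are hypothetical, objective symbols are referenced by name inside the MathJSON expression, and the scalarization shown is one common augmented achievement scalarizing function, max_i w_i (f_i - q_i) + rho * sum_i w_i (f_i - q_i), not necessarily the form DESDEO uses.

```python
from __future__ import annotations

from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Problem:
    objectives: dict[str, list]  # objective symbol -> MathJSON expression
    scalarizations: dict[str, list] = field(default_factory=dict)


def add_asf(
    problem: Problem,
    reference_point: dict[str, float],
    weights: dict[str, float],
    rho: float = 1e-6,
    symbol: str = "asf",
) -> Problem:
    """Return a copy of the problem with an ASF stored in MathJSON form."""
    # One weighted term w_i * (f_i - q_i) per objective.
    terms = [
        ["Multiply", weights[s], ["Subtract", s, reference_point[s]]]
        for s in problem.objectives
    ]
    augmentation = ["Multiply", rho, ["Add", *terms]]
    expression = ["Add", ["Max", *terms], augmentation]
    return replace(
        problem,
        scalarizations={**problem.scalarizations, symbol: expression},
    )
```

Returning a modified copy rather than mutating the input is a design choice of this sketch; it keeps the original formulation available if the preferences change in the next iteration.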
What is the motivation/use case for changing the behavior?
Creating a scalarized version of the problem enables optimizers (such as Gurobi) to be aware of the problem formulation. This enables a large number of optimizers to be used.
Describe alternatives you've considered
None.
Additional context
Based on a discussion with Giovanni, it would be nice to have a generic GLIDE-II based wrapper. Wrappers for individual scalarization functions can be based on the generic wrapper.
Example behaviour:
What is the current behavior?
We are currently utilizing Sphinx for documentation. Each sub-project in DESDEO currently has its own implementation of Sphinx-based documentation. Each of these always needs to be modified manually. Moreover, our documentation style should be more clearly structured.
Describe the solution you'd like
We should move to a simpler tool and adopt a clear (and tested) documentation structure. We have previously already discussed MkDocs, which is a promising candidate for an alternative tool.
As for the structure of the documentation, we should follow the diataxis philosophy.
What is the motivation/use case for changing the behavior?
Sphinx is complicated. With a simpler tool, the bar to write and contribute to the documentation will be lowered. With a clear structure for the documentation, we and our users will also have a clearer picture of where to find relevant content. Diataxis is followed by other projects as well; it is tried and tested, and will meet the expectations of many users when it comes to documentation.
Describe alternatives you've considered
Sphinx has been the only real alternative we have considered and used. It works, but the features it offers at the cost of added complexity are not justified for our use-case.
Additional context
This is related to the structure of the project. With a monolithic structure (discussed in #73 ), we will also end up with a single documentation to work on, which facilitates things a lot.
Given the code:
from examples.NarulaWeistroffer import RiverPollution
from desdeo.method.NIMBUS import NIMBUS
from desdeo.optimization import SciPyDE
problem = RiverPollution()
method = NIMBUS(problem, SciPyDE)
results = method.initIteration()
print(results)
print(problem.ideal)
print(problem.nadir)
We get:
[[-5.9613644441387486, -2.848061903858047, -6.416062450560499, -0.07926222306801822]]
[-6.34, -3.44, -7.5, 0.1]
[-4.07, -2.87, -0.32, 9.71]
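But -2.848061903858047 > -2.87 and -0.07926222306801822 < 0.1, so the second and fourth objective values lie outside the range spanned by the ideal and nadir points.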
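The comparison can be checked mechanically. The helper below is a hypothetical illustration, assuming that the ideal point is the componentwise lower bound and the nadir point the componentwise upper bound, which holds for the values printed above.

```python
def outside_range(point, ideal, nadir):
    """Return indices of components not inside [ideal_i, nadir_i]."""
    return [
        i
        for i, (value, low, high) in enumerate(zip(point, ideal, nadir))
        if not low <= value <= high
    ]


result = [-5.9613644441387486, -2.848061903858047,
          -6.416062450560499, -0.07926222306801822]
ideal = [-6.34, -3.44, -7.5, 0.1]
nadir = [-4.07, -2.87, -0.32, 9.71]

print(outside_range(result, ideal, nadir))  # → [1, 3]
```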
e.g. like this paper https://www.iitk.ac.in/kangal/papers/k2005009.pdf
What is the current behavior?
Currently, DESDEO is scattered across multiple repositories (the packages): desdeo-problem, desdeo-emo, desdeo-tools, desdeo-mcdm, etc. This was a nice idea on paper, but in reality, it makes developing DESDEO a nightmare. For instance, when developing features that either depend on, or affect, code from multiple packages, one is often forced to modify a local version of the package (monkey patching), which then leads to dangling changes (i.e., they are out of version control). Moreover, having separate repositories like this also leads to separate documentation, which, again, needs to be maintained separately. Not a good time.
Describe the solution you'd like
We should move to a monolithic repository and package for DESDEO. This package would contain all the packages previously mentioned, plus others, like desdeo-api, and packages for other features, like explainability and group decision-making. Here, package just means a directory containing files with code that fall under the category specified by the package. E.g., problem formulation related code goes to the problem folder. Below is the proposed folder structure of the project:
DESDEO (project root)
├── desdeo
│ ├── api
│ ├── emo
│ ├── mcdm
│ ├── problem
│ └── tools
├── docs
│ ├── explanation
│ ├── howtoguides
│ ├── reference
│ └── tutorials
└── tests
- The above shows only the main directory structure.
- All configuration files are found at the root level of the structure.
- The code for the current "packages" is all located in the _desdeo_ folder.
- Documentation and tests are in their own folders as well (_docs_ and _tests_, respectively).
What is the motivation/use case for changing the behavior?
As mentioned, working on and developing DESDEO in its current format is a nightmare. The data footprint of DESDEO will be minuscule compared to some popular Python packages, e.g., numpy, scipy, pandas. There really is no reason to provide separate Python packages in DESDEO. Having the described monolithic structure is also justified because there are very few features in DESDEO that would be truly contained in only one package and not dependent on others. DESDEO has evolved, and a monolithic project structure is needed.
Describe alternatives you've considered
We already tried the alternative of having the packages in their own repositories, and it does not serve our purposes.
Additional context
This is also related to how documentation should be structured discussed in #72
For an in situ example of this structure, see the desdeo2 branch in this repo.
We should use if TYPE_CHECKING: for these -- but this is blocking on https://github.com/agronholm/sphinx-autodoc-typehints/issues/22
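For reference, the general pattern looks like the sketch below. It presumably applies to imports that are needed only for annotations; `Fraction` and `describe` are stand-ins chosen for illustration, not names from the DESDEO code base.

```python
from __future__ import annotations  # annotations are stored as strings

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by static type checkers only; this import never runs at runtime.
    from fractions import Fraction


def describe(value: Fraction) -> str:
    """Format a fraction; the annotation needs no runtime import."""
    return f"{value.numerator}/{value.denominator}"
```

Since the guarded name does not exist at runtime, any tool that evaluates the annotations when the module is imported has to cope with the name being missing.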
Canonise details such as Python version (3.12), choice of linters and related tools, etc.
Some tests are unstable: e.g. https://travis-ci.com/industrial-optimization-group/DESDEO/builds/84383265
This is because SciPyDE may not always converge. The tests should either use a method guaranteed to always converge or a known good seed should be given to begin with.
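The known-good-seed option can be illustrated with a toy stochastic optimizer. The `random_search` function below is a stand-in written for this example and is not SciPyDE; the point is only that a fixed seed makes a stochastic run repeatable. For the real thing, `scipy.optimize.differential_evolution` accepts a `seed` argument that serves the same purpose.

```python
import random


def random_search(objective, bounds, iterations=200, seed=None):
    """Toy stochastic minimizer: keep the best uniformly sampled point."""
    rng = random.Random(seed)  # local generator, no global state touched
    best_x, best_value = None, float("inf")
    for _ in range(iterations):
        x = [rng.uniform(low, high) for low, high in bounds]
        value = objective(x)
        if value < best_value:
            best_x, best_value = x, value
    return best_x, best_value


def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return sum(v * v for v in x)


def test_seeded_run_is_repeatable():
    bounds = [(-1.0, 1.0), (-1.0, 1.0)]
    first = random_search(sphere, bounds, seed=42)
    second = random_search(sphere, bounds, seed=42)
    assert first == second
```

A fixed seed removes the flakiness but not the underlying non-convergence, so the chosen seed should be one for which the run is known to reach an acceptable solution.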
Related: #21
What is the current behavior?
In the past, we have assumed that the problems we solve in DESDEO have been defined, and are available as, Python objects. This makes it hard to utilize solvers that expect a more algebraic format of the problem, limiting our access to basic MILP and non-linear MIP solvers.
Describe the solution you'd like
Since we have moved to a new way to define the problems in DESDEO (in json format), this format can be readily parsed into various other formats. I would like to discuss a potential format that would be understood by most of the popular solvers.
What is the motivation/use case for changing the behavior?
In the past, we have lacked support for (mixed-)integer problems.
Describe alternatives you've considered
We could utilize AMPL, which basically allows us to parse problems into an .nl-format. This format is understood by most solvers. Alternatively, we could just model (and solve!) the problems (based on the JSON representation) as Pyomo models, which also gives us access to a plethora of solvers, and also outputs .nl-files, if needed.
Additional context
Pyomo works well with problems that depend a lot on data, e.g., many data-based parameters. While AMPL seems to be widely accepted, it is not fully open source. Pyomo allows redistribution and is as permissive as the MIT license. However, I am not exactly sure what part of AMPL is proprietary and what is open source; perhaps I have misunderstood this.
What is the current behavior?
We have made some choices regarding the software development tools utilized across the packages found in DESDEO. Some of these choices might need updating to tools better suited to our needs.
Describe the solution you'd like
Based on our earlier discussion, we have the following tools on the table:
What is the motivation/use case for changing the behavior?
Since we are rethinking the structure of the project (see #73 ), it is a good time to rethink the software development tools we intend to use so that we may choose the tools that best suit our needs.
Describe alternatives you've considered
Especially for linters and formatters, there are many alternatives. ruff seems a good candidate because it is very fast and therefore does not slow down the development process due to, e.g., excess processing times. The discussion about moving to MkDocs from Sphinx is found in another issue (#72). There really are no better alternatives to poetry right now, and for testing, which will consist mostly of unit tests, pytest is an industry standard. As for the Python version, we should update it as frequently as possible.
Additional context
This would probably help actually using the solutions!
Given:
Output:
INFEASIBLE 0.001383
Expected:
Nothing.