AutoAspen is a tool for stochastic techno-economic analysis (TEA) using machine learning. It supports dataset creation, model training, parameter optimization, and Monte Carlo sampling to estimate the distributions of critical parameters, such as the minimum fuel selling price (MFSP), for chemical engineering pathways. Using the trained models, AutoAspen supports:
- Univariate Uncertainty Analysis, which evaluates the impact of an individual variable on the MFSP while holding all other factors constant.
- Bivariate Uncertainty Analysis, which shows how the MFSP responds to variations in pairs of variables, visualized as contour plots.
- Multivariate Uncertainty Analysis, which allows users to vary a set of input variables simultaneously during a simulation.

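
The three modes differ only in how many inputs vary at once. The sketch below illustrates that idea and is not AutoAspen's implementation: `predict_mfsp` is a hypothetical stand-in for a trained model, and the variable names, baseline values, and formula are made up for illustration.

```python
import numpy as np

# Hypothetical baseline values; names and numbers are illustrative only.
BASELINE = {"feedstock_price": 70.0, "conversion": 0.85, "capex_factor": 1.0}
NAMES = list(BASELINE)


def predict_mfsp(X):
    """Stand-in for a trained regression model (an assumption, not AutoAspen's)."""
    X = np.atleast_2d(X)
    price, conversion, capex = X[:, 0], X[:, 1], X[:, 2]
    return 0.02 * price / conversion + 1.5 * capex


def baseline_matrix(n_rows):
    """Return n_rows copies of the baseline input vector."""
    return np.tile([BASELINE[name] for name in NAMES], (n_rows, 1))


def univariate(name, values):
    """Vary one input while all others stay at their baseline."""
    X = baseline_matrix(len(values))
    X[:, NAMES.index(name)] = values
    return predict_mfsp(X)


def bivariate(name_a, values_a, name_b, values_b):
    """Evaluate a grid over two inputs; the 2D result suits a contour plot."""
    A, B = np.meshgrid(values_a, values_b)
    X = baseline_matrix(A.size)
    X[:, NAMES.index(name_a)] = A.ravel()
    X[:, NAMES.index(name_b)] = B.ravel()
    return predict_mfsp(X).reshape(A.shape)


def multivariate(samples):
    """Vary several inputs at once; `samples` maps name -> equal-length arrays."""
    n_rows = len(next(iter(samples.values())))
    X = baseline_matrix(n_rows)
    for name, values in samples.items():
        X[:, NAMES.index(name)] = values
    return predict_mfsp(X)
```

For a contour plot, the array returned by `bivariate` could be passed to `matplotlib.pyplot.contourf` together with the two value ranges.
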
AutoAspen has been developed and tested on Python 3.8 with the following dependencies: numpy 1.20.2, pandas 1.0.5, scipy 1.5.0, scikit-learn 0.23.1, matplotlib 3.2.2, seaborn 0.12.0, pillow 7.2.0, xlrd 1.2.0, pythoncom, and win32com.

- generate_dataset_template.py: This script creates a dataset template for training by generating random values for the input variables from the distributions specified in the var_info file. Supported distributions are normal, alpha, beta, gamma, triangular, pareto, and bernoulli. A sample var_info file is provided.
- generate_dataset.py: This script generates a training dataset by invoking the Aspen model and the .xlsm calculator. This step automates traditional stochastic TEA, which usually involves iterative calls to Aspen Plus.
- train_regression_model.py: This script trains and fine-tunes four machine learning models: polynomial ridge regression, linear SVM, random forest, and gradient tree boosting. Users can select the best-performing model for Monte Carlo simulation.
- predict_and_simulate.py: This script predicts the MFSP using a trained regression model and conducts univariate, bivariate, or multivariate uncertainty analysis. The distributions and baseline values of the input variables are specified in the config file.
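
The workflow these scripts describe (sample inputs, label them, train candidate models, then run Monte Carlo on the best one) can be sketched end to end. This is a minimal illustration under stated assumptions, not the scripts' actual code: the `VAR_INFO` dictionary, the variable names and distribution parameters, and `fake_process_model` (a placeholder for the Aspen and calculator call) are all hypothetical, and mapping the distributions to `scipy.stats` objects is an assumption about how they might be sampled.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVR

rng = np.random.default_rng(0)

# Hypothetical input specification standing in for a var_info file.
VAR_INFO = {
    "feedstock_price": stats.norm(loc=70.0, scale=5.0),
    # Triangular on [0.75, 0.95] with its mode at 0.85.
    "conversion": stats.triang(c=0.5, loc=0.75, scale=0.2),
    "capex_factor": stats.gamma(a=20.0, scale=0.05),
}


def sample_inputs(n):
    """Draw n samples per variable; one column per variable."""
    return np.column_stack(
        [dist.rvs(size=n, random_state=rng) for dist in VAR_INFO.values()]
    )


def fake_process_model(X):
    """Placeholder for the Aspen + calculator evaluation (made-up formula)."""
    return 0.02 * X[:, 0] / X[:, 1] + 1.5 * X[:, 2]


# Steps 1 and 2: build the training set.
X_train = sample_inputs(200)
y_train = fake_process_model(X_train)

# Step 3: compare the four model families by cross-validated R^2.
candidates = {
    "polynomial ridge": make_pipeline(
        StandardScaler(), PolynomialFeatures(degree=2), Ridge(alpha=1.0)
    ),
    "linear SVM": make_pipeline(
        StandardScaler(), LinearSVR(max_iter=10000, random_state=0)
    ),
    "random forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "gradient tree boosting": GradientBoostingRegressor(random_state=0),
}
scores = {
    name: cross_val_score(model, X_train, y_train, cv=5, scoring="r2").mean()
    for name, model in candidates.items()
}
best_name = max(scores, key=scores.get)
best_model = candidates[best_name].fit(X_train, y_train)

# Step 4: Monte Carlo on the surrogate instead of the process simulator.
X_mc = sample_inputs(5000)
mfsp = best_model.predict(X_mc)
p5, p50, p95 = np.percentile(mfsp, [5, 50, 95])
```

Once trained, the surrogate evaluates thousands of samples in a single `predict` call, so the expensive simulator is only needed to label the training set.
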