symreg_methods_comparison

This repository contains the code for the evaluation of five symbolic regression methods, as performed in the paper Probabilistic grammars for modeling dynamical systems from coarse, noisy, and partial data (in submission). The repository is still being updated.

Part 0 - Preparation

  1. Download or clone the repository to the preferred location.

  2. Pull the Singularity container from the Sylabs container library. To do that, first install Singularity (if you have not already), go to the repository location, and pull the container from the library with the command singularity pull library://nomejc/symreg/symreg:latest. If the command fails, you will have to change Singularity's default remote server by following these steps:

    • Run the command singularity remote add --no-login SylabsCloud cloud.sylabs.io
    • Run the command singularity remote use SylabsCloud
    • Run the same singularity pull ... command as above.
  3. Download the Dynobench benchmark from the Zenodo platform (https://zenodo.org/records/10041312). Save the dataset folder .\dynobench\data\* inside the symreg_methods_comparison folder.

  4. Download the L-ODEfind software from https://github.com/agussomacal/L-ODEfind.

  5. GPoM requires R to run. You need to install R yourself (the RStudio IDE is also recommended); the GPoM package is downloaded automatically from the CRAN repository inside the R script.

  6. Modify the data files for DSO using the script .\utils\dso_prepare_data.py. Similarly, modify the data for L-ODEfind and GPoM using the script .\utils\lodefind_gpom_prepare_data.py. The modified data files will be saved inside the .\data folder.

  7. Create candidate structures for ProGED using .\src\proged_generate_structures.py. (An example sequence of the commands for steps 2, 6, and 7 is sketched right after this list.)
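
As a minimal sketch, assuming Singularity and Python are installed and all commands are run from the repository root on a Linux machine (forward slashes instead of backslashes; the preparation scripts are assumed to take no command-line arguments), steps 2, 6, and 7 could look as follows:

    # step 2: pull the container from the Sylabs library
    singularity pull library://nomejc/symreg/symreg:latest
    # only if the pull fails: switch the default remote and pull again
    singularity remote add --no-login SylabsCloud cloud.sylabs.io
    singularity remote use SylabsCloud
    singularity pull library://nomejc/symreg/symreg:latest

    # step 6: prepare the data files for DSO, L-ODEfind, and GPoM (written to ./data)
    python utils/dso_prepare_data.py
    python utils/lodefind_gpom_prepare_data.py

    # step 7: generate candidate structures for ProGED
    python src/proged_generate_structures.py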

Part I - System identification using training datasets

We ran system identification with the methods ProGED, DSO and SINDy on the high-performance computing cluster. To repeat the experiments, follow the steps below.

Copy the symreg.sif container as well as the .\src, .\data, and .\results folders to the local node on the cluster, as sketched below. Note that for ProGED, we first create the candidate structures locally using grammars, with the script .\src\check_proged\proged_generate_structures.py. For the full-observability scenario, the structures are saved under .\symreg_methods_comparison\results\sysident_num_full\proged\structures. This structures folder must also be copied to the cluster for ProGED to run properly.
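
A minimal sketch of the copy step, assuming SSH access to the cluster; the login node name hpc-login, the user name, and the target directory are placeholders to adapt to your environment:

    # copy the container and the required folders to the cluster
    scp symreg.sif user@hpc-login:~/symreg_methods_comparison/
    scp -r src data results user@hpc-login:~/symreg_methods_comparison/
    # the ProGED structures under results/sysident_num_full/proged/structures
    # are copied along with the results folder above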

Run the bash script that corresponds to the method you want to run (e.g., for SINDy, run run_sindy.sh); we submit the scripts with Slurm's sbatch command, as in the sketch below. For ProGED, run run_proged_outer.sh, which in turn calls run_proged.sh; this two-level setup allows more than one thousand jobs to be submitted to Slurm. Make sure that you also manually create the ./slurm folder for the log files, otherwise the jobs will fail.
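
A minimal sketch of the job submission on the cluster, assuming you are inside the copied repository directory and Slurm is available; run_dso.sh is an assumed name by analogy with run_sindy.sh, the other script names are taken from the text:

    mkdir -p slurm                # log folder; jobs fail if it is missing
    sbatch run_sindy.sh           # SINDy
    sbatch run_dso.sh             # DSO (assumed script name)
    sbatch run_proged_outer.sh    # ProGED; submits run_proged.sh jobs internally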

We ran the other two methods for partial observability, GPoM and L-ODEfind, locally, using the scripts .\src\check_gpom\gpom_system_identification.R and .\src\check_lodefind\lodefind_system_identification.py (see the sketch below). Importantly, lodefind_system_identification.py has to be run from inside the L-ODEfind-master root directory. The system-identification results are then saved under the symreg_methods_comparison\results... path, as for the other methods.
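
A minimal sketch of the local runs, assuming Rscript and python are on the path and the scripts take no command-line arguments; the absolute paths are placeholders:

    # GPoM: the GPoM package is installed from CRAN inside the script
    Rscript src/check_gpom/gpom_system_identification.R
    # L-ODEfind: must be run from the L-ODEfind-master root directory
    cd /path/to/L-ODEfind-master
    python /path/to/symreg_methods_comparison/src/check_lodefind/lodefind_system_identification.py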

Part II - Validation datasets

In this part, we evaluate all the models returned by the methods on the validation datasets. We ran the validation of the full-observability results and of the ProGED partial-observability results on the cluster using the command sbatch run_validation.sh; this bash script runs the Python script common1_validation_hpc.py. Validation of the GPoM and L-ODEfind results was done locally, using the scripts ./check_gpom/gpom_validation.py and ./check_lodefind/lodefind_validation.py, respectively (see the sketch below). Note that the evaluation on the test set is done within the same script.
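
A minimal sketch of this step, assuming the same layout as above; the script paths follow the text, and the local scripts are assumed to take no command-line arguments:

    # on the cluster: validates full-observability and ProGED partial-observability results
    sbatch run_validation.sh                 # runs common1_validation_hpc.py
    # locally: validation (and test-set evaluation) of GPoM and L-ODEfind results
    python ./check_gpom/gpom_validation.py
    python ./check_lodefind/lodefind_validation.py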

Part III - Evaluation of the best model per method using three metrics: trajectory error on test data, term difference, and complexity

To perform the final evaluation of the full-observability results, first run common2_testing.py and then common3_TD_complexity.py, as sketched below. The figures for the paper were created with the common4_make_figures.py script.
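
A minimal sketch of the final evaluation, assuming the three scripts are run with python from their location in the repository and take no command-line arguments:

    python common2_testing.py         # trajectory error on test data
    python common3_TD_complexity.py   # term difference and complexity
    python common4_make_figures.py    # figures for the paper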
