The notebook in this repo demonstrates automated machine learning (AutoML) using TPOT on a sample of credit card fraud data. AutoML is compute-intensive, so to reduce wall-clock time we distribute the computation over a Dask cluster, which we create ad hoc using the CML Workers API.
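As a rough illustration of that setup, the sketch below starts a Dask scheduler and workers through the CML Workers API and then points TPOT at the resulting cluster. It is a minimal sketch under stated assumptions, not the notebook's actual code: the helper names (`scheduler_url`, `worker_code`, `launch_dask_cluster`, `fit_tpot`), the resource sizes, the fixed sleep, and the TPOT settings are all illustrative, and `use_dask=True` assumes a classic TPOT (0.x) release. The `cdsw` module is only importable inside a CML session.

```python
import time


def scheduler_url(ip_address, port=8786):
    """Build the TCP address of a Dask scheduler (8786 is Dask's default port)."""
    return f"tcp://{ip_address}:{port}"


def worker_code(url):
    """Code for a CML worker to run; the leading "!" makes it a shell command."""
    return f"!dask-worker {url}"


def launch_dask_cluster(n_workers=3, cpu=2, memory=4):
    """Start one scheduler and n_workers Dask workers as CML workers.

    Hypothetical helper: sizes and the fixed sleep are illustrative choices.
    Returns the scheduler URL.
    """
    import cdsw  # available only inside a CML session

    scheduler = cdsw.launch_workers(
        n=1, cpu=cpu, memory=memory, code="!dask-scheduler --host 0.0.0.0"
    )
    time.sleep(10)  # crude wait until the scheduler worker has an IP address
    scheduler_id = scheduler[0]["id"]
    ip_address = next(
        w["ip_address"] for w in cdsw.list_workers() if w["id"] == scheduler_id
    )
    url = scheduler_url(ip_address)
    cdsw.launch_workers(n=n_workers, cpu=cpu, memory=memory, code=worker_code(url))
    return url


def fit_tpot(X, y, url):
    """Fit a TPOT classifier, evaluating candidate pipelines on the Dask cluster."""
    from dask.distributed import Client
    from tpot import TPOTClassifier

    Client(url)  # registers itself as the default Dask client
    model = TPOTClassifier(
        generations=5,        # illustrative values, not tuned
        population_size=20,
        use_dask=True,        # classic TPOT (0.x) option
        n_jobs=-1,
        random_state=42,
    )
    model.fit(X, y)
    return model
```

Inside a CML session, this would be used roughly as `url = launch_dask_cluster()` followed by `model = fit_tpot(X_train, y_train, url)`, where `X_train` and `y_train` stand in for the prepared fraud data.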
There are three ways to launch this project on CML:
- From AMP Catalog - Navigate to the AMP Catalog in a CML workspace, select the "AutoML with TPOT" tile, click "Launch as Project", click "Configure Project".
- As ML Prototype - In a CML workspace, click "New Project", add a Project Name, select "ML Prototype" as the Initial Setup option, copy in the repo URL, click "Create Project", click "Configure Project".
- Manual Setup - In a CML workspace, click "New Project", add a Project Name, select "Git" as the Initial Setup option, copy in the repo URL, click "Create Project". In this case, the dependencies in `requirements.txt` must be installed manually. Open a session and run `pip install -r requirements.txt` in the terminal. (For legacy engines, replace `pip` with `pip3`.)
Once the project has been initialized in a CML workspace, run the `automl.ipynb` notebook by starting a Python 3 JupyterLab or notebook session.