kaixhin / fglab

Future Gadget Laboratory

Home Page: https://kaixhin.github.io/FGLab/

License: MIT License

JavaScript 30.01% HTML 57.96% CSS 0.23% RAML 11.79%

machine-learning reproducible-research reproducible-science

fglab's Introduction

Quickstart: https://kaixhin.github.io/FGLab/

FGLab is a machine learning dashboard, designed to make prototyping experiments easier. Experiment details and results are sent to a database, which allows analytics to be performed after their completion. The server is FGLab, and the clients are FGMachines.

Installation
Overview
Examples
API

Installation

FGLab tries to follow the SemVer standard whenever possible. Releases can be found here. There are three ways to run FGLab: installing locally, via Docker, or hosted on Heroku.

Option 1: Local

Install Node.js from the website or your package manager.
Install MongoDB from the website or your package manager.
Make a database directory for MongoDB. For example, mkdir -p <working directory>/db.
Run the MongoDB daemon. From the previous example, run mongod --dbpath <working directory>/db.
Either clone this repository or download and extract a zip/tar.
Move inside the FGLab folder.
Run npm install. npm install also runs bower install to install additional required packages.
FGLab requires a .env file in this directory. For most installations, it should be possible to copy example.env to .env, but it may require customisation for non-standard MongoDB ports, or setting a different port for FGLab. An alternative is to set the following environment variables:

MONGODB_URI (MongoDB database URI)
FGLAB_PORT (port)

Run node lab (or npm start) to start FGLab. You can now access the user interface from a browser on the current machine at http://localhost:<FGLAB_PORT>, where <FGLAB_PORT> is 5080 by default. For remote access, you need to be able to access the machine FGLab is running on from your remote machine via a local network or the internet. Given the default port, you would replace http://localhost:5080 with http://lan-hostname:5080 or http://public-address.com:5080, respectively.
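As a rough illustration of the two environment variables named in the steps above, the following sketch reads them and falls back to the default port of 5080. It is not FGLab's actual startup code: loadConfig is a hypothetical helper, and the MongoDB URI in the example is only a placeholder value.

```javascript
// Sketch only: FGLab's real startup code may differ.
// MONGODB_URI and FGLAB_PORT are the variable names from the steps above;
// loadConfig is a hypothetical helper, not part of FGLab.
function loadConfig(env) {
  if (!env.MONGODB_URI) {
    throw new Error("MONGODB_URI is not set");
  }
  // 5080 is the default port stated above.
  const port = env.FGLAB_PORT ? Number(env.FGLAB_PORT) : 5080;
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("FGLAB_PORT must be a valid port number, got: " + env.FGLAB_PORT);
  }
  return { mongoUri: env.MONGODB_URI, port: port };
}

// Example with an explicit object (placeholder URI); pass process.env in a real script.
const config = loadConfig({ MONGODB_URI: "mongodb://localhost:27017/FGLab" });
console.log("http://localhost:" + config.port);
```

Checking the variables before starting makes a missing database URI fail early with a readable message.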

Please read the overview to understand how FGLab and FGMachine cooperate - both are needed in order to run experiments. Afterwards, you should set up instances of FGMachine.

To update, run npm run update.

Option 2: Docker

Start a MongoDB container and link it to the FGLab container:

sudo docker run -d --name mongodb mongo
sudo docker run -d --name fglab --link mongodb:mongo -p 5080:5080 kaixhin/fglab

Although not recommended, it is possible to adjust project schema and other parts of the database. This can be accomplished either by connecting directly to MongoDB or via a GUI such as mongo-express.

sudo docker run -d --name mongo-express --link mongodb:mongo -p 8081:8081 mongo-express

Option 3: Heroku

The deploy button provisions a free dyno running FGLab on Heroku, with a free 500MB MongoDB database from MongoLab.

Overview

FGLab is based on several classes of object. One begins with a project, which involves adjusting variables to achieve the desired results. In machine learning, these variables are hyperparameters, which are set for the project. In a more general setting, the variables are simply options, which may therefore include implementation-dependent details. A project then comprises a set of experiments derived from adjusting options.

Projects

A project is created by uploading a JSON schema. JSON is a human-readable data-interchange format that is widely used and has mature libraries available for most programming languages.

The JSON schema represents a map/associative array (without nesting), where each value is an object comprising several fields:

type:
- int
- float
- bool
- string
- enum
default: Default value
values: An array of strings comprising the enum

See mnist.json as an example schema for a project. Each schema should be uploaded with the filename corresponding to the desired name for the project e.g. mnist.json.
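To make the shape of a schema more concrete, here is a small sketch with one option of each type and a check against the fields listed above. The option names are invented for illustration and are not taken from mnist.json; checkSchema is a hypothetical helper, not FGLab's own validation.

```javascript
// Hypothetical project schema; the option names are made up for illustration.
const exampleSchema = {
  learningRate: { type: "float", default: 0.01 },
  batchSize: { type: "int", default: 32 },
  shuffle: { type: "bool", default: true },
  model: { type: "string", default: "cnn.v2" },
  optimiser: { type: "enum", default: "sgd", values: ["sgd", "adam"] }
};

const ALLOWED_TYPES = ["int", "float", "bool", "string", "enum"];

// Returns a list of problems found; an empty list means the schema fits
// the shape described above.
function checkSchema(schema) {
  const problems = [];
  for (const [name, field] of Object.entries(schema)) {
    if (field === null || typeof field !== "object" || Array.isArray(field)) {
      problems.push(name + ": value must be an object");
      continue;
    }
    if (!ALLOWED_TYPES.includes(field.type)) {
      problems.push(name + ": unknown type " + JSON.stringify(field.type));
    }
    if (field.type === "enum") {
      const ok = Array.isArray(field.values) &&
        field.values.length > 0 &&
        field.values.every((v) => typeof v === "string");
      if (!ok) {
        problems.push(name + ": enum needs a non-empty array of strings in values");
      }
    }
  }
  return problems;
}

console.log(checkSchema(exampleSchema)); // prints []
```

Saving JSON.stringify(exampleSchema) to a file named after the project would give a file of the kind described above.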

Often it is hard to specify some options in advance e.g. the type or structure of the machine learning model. Sometimes code may change, which would influence the results. The string type can be used to address changing options and versioning manually e.g. cnn.v2.

This is stored by FGLab, and is used to construct a form which lets one choose options and submit an experiment to an available machine. The options are sent to your machine learning program via the FGMachine client. Your machine learning program then accepts the different fields via command-line options, the details of which are in the FGMachine documentation. Note that the _id field is reserved, as this will store the experiment ID as a string.

FGMachine will spawn your machine learning program, which should produce output files to be sent from FGMachine to FGLab. The details of this are available in the FGMachine documentation.

Grid and random search optimisers have also been implemented in FGLab, to allow searching over a range of hyperparameter space. Multiple string values are delimited by commas (,).
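The idea behind a grid search can be sketched as a Cartesian product over the candidate values of each option. This is an illustration of the general technique, not FGLab's optimiser code; splitValues and expandGrid are hypothetical names, and the option names are invented.

```javascript
// Sketch of a grid expansion, not FGLab's optimiser code.
// Splits a comma-delimited string such as "sgd,adam" into ["sgd", "adam"].
function splitValues(text) {
  return text.split(",").map((s) => s.trim()).filter((s) => s.length > 0);
}

// Builds every combination of the candidate values (Cartesian product).
// `candidates` maps an option name to an array of values to try.
function expandGrid(candidates) {
  let combos = [{}];
  for (const [name, values] of Object.entries(candidates)) {
    const next = [];
    for (const combo of combos) {
      for (const value of values) {
        next.push(Object.assign({}, combo, { [name]: value }));
      }
    }
    combos = next;
  }
  return combos;
}

const grid = expandGrid({
  learningRate: [0.1, 0.01],
  optimiser: splitValues("sgd, adam")
});
console.log(grid.length); // 4
```

A random search would instead sample a fixed number of option sets from the same candidates, which scales better when many options are varied at once.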

Experiments

An experiment is one complete training and testing run with a specific set of options. Depending on the experiment, it may be impossible to control for every source of randomness, so experiments with the same set of options are still assigned unique IDs. Each experiment stores its unique ID, a project ID, a machine ID, the chosen options, the current status (running/success/fail), timestamps, results, and custom data; together these provide a comprehensive record of the experiment.

The experiment page contains a "Logs" window, which uses WebSockets to display the experiment's stdout and stderr live. There is also an editable "Notes" text box that is automatically saved (at an interval of 0.5s), displaying on both the experiment page itself and the table of experiment results.

The current format for results is documented with FGMachine.

Machines

An FGMachine client registers itself with FGLab, providing hardware details as well as an address for interaction between FGLab and the machine. A machine (FGMachine) stores its own details, as well as a list of supported projects. Before a new experiment is chosen to be run, FGLab queries all machines in order to determine a machine with the capacity to run the experiment.

Note that machines are implementation-independent, and may well store their own (large) data on experiments, for example learnt parameters and logs. As mentioned before, these can be uploaded to FGLab's database.

Examples

Examples utilising the range of abilities of FGLab/FGMachine can be found in the examples folder.

Password protection

Set the PASSWORD variable, without any quotes, in the .env file to protect FGLab with a password. When prompted, type that password into the password field. Example:

PASSWORD=friend

API

The API is largely undocumented due to ongoing development introducing breaking changes. Ongoing documentation is available in RAML: api.raml. The following are noted for convenience:

Submit a new experiment with a set of options

POST /api/v1/projects/{projectId}/experiment

e.g. curl -X POST -H "Content-Type: application/json" -d '{projectOptions}' http://{FGLab address}/api/v1/projects/{projectId}/experiment

If the project does not exist, returns 400 {"error": "Project ID <projectId> does not exist"}. If the projectOptions are invalid, returns 400 {"error": "<validation message>"}. If no machines are available to run the job, returns 501 {"error": "No machine capacity available"}. If the machine fails to run the experiment for some reason, returns 500 {"error": "Experiment failed to run"}. If successful, returns 201 {"_id": "<experimentId>"}.
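The same request can be made from a script. The sketch below builds the request and maps the status codes listed above to short descriptions. The helper names and the project ID "abc123" are hypothetical, and submitExperiment assumes a runtime with a global fetch (for example Node.js 18 or later).

```javascript
// Sketch of a client for the endpoint above; the helper names are hypothetical.
function buildExperimentRequest(labAddress, projectId, options) {
  return {
    url: "http://" + labAddress + "/api/v1/projects/" +
      encodeURIComponent(projectId) + "/experiment",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(options)
    }
  };
}

// Maps the status codes documented above to a short description.
function describeStatus(status) {
  switch (status) {
    case 201: return "experiment created";
    case 400: return "unknown project or invalid options";
    case 501: return "no machine capacity available";
    case 500: return "machine failed to run the experiment";
    default: return "undocumented status " + status;
  }
}

// Sends the request; needs a global fetch and a reachable FGLab instance.
async function submitExperiment(labAddress, projectId, options) {
  const request = buildExperimentRequest(labAddress, projectId, options);
  const response = await fetch(request.url, request.init);
  const body = await response.json();
  return { status: response.status, meaning: describeStatus(response.status), body: body };
}

// "abc123" is a placeholder project ID.
const request = buildExperimentRequest("localhost:5080", "abc123", { learningRate: 0.01 });
console.log(request.url);
```

Keeping request construction separate from the network call makes it possible to inspect the request without a running server.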

Start a batch job with a list of option sets

POST /api/v1/projects/{projectId}/batch?retry={retryTimeout (optional)}

e.g. curl -X POST -H "Content-Type: application/json" -d '[{projectOptions}]' http://{FGLab address}/api/v1/projects/{projectId}/batch?retry={retryTimeout (optional)}

The optional retry parameter specifies the maximum time in seconds to wait before trying to run a queued job again after capacity has been reached (the interval is randomly picked from a uniform distribution). If the project does not exist, returns 400 {"error": "Project ID <projectId> does not exist"}. If any of the projectOptions are invalid, returns 400 {"error": "<validation message>"} for the first set of options that are wrong. If successful, returns 201 {"status": "Started"}. Future work aims to create a proper "optimiser" object that can be queried and have its work queue adjusted appropriately (hence differentiating it from a simple batch job queue).
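A batch request differs from a single submission only in its path, its array body, and the optional retry query parameter. The sketch below builds such a request and also illustrates drawing a uniformly random wait. Both helpers are hypothetical, "abc123" is a placeholder project ID, and the lower bound of zero for the wait is an assumption, since only the maximum is stated above.

```javascript
// Sketch only; helper names are hypothetical and not part of FGLab.
function buildBatchRequest(labAddress, projectId, optionSets, retrySeconds) {
  const url = new URL(
    "http://" + labAddress + "/api/v1/projects/" +
    encodeURIComponent(projectId) + "/batch"
  );
  // The retry parameter is optional, so only add it when given.
  if (retrySeconds !== undefined) {
    url.searchParams.set("retry", String(retrySeconds));
  }
  return {
    url: url.toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(optionSets)
    }
  };
}

// Picks a wait uniformly between 0 and maxSeconds (lower bound assumed).
// `random` can be replaced to make the result predictable in tests.
function pickRetryDelay(maxSeconds, random) {
  return (random || Math.random)() * maxSeconds;
}

const batch = buildBatchRequest(
  "localhost:5080",
  "abc123", // placeholder project ID
  [{ learningRate: 0.1 }, { learningRate: 0.01 }],
  60
);
console.log(batch.url);
```

Randomising the wait spreads retries out, so queued jobs do not all contact the machines at the same moment.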

Register a webhook for an event

POST /api/v1/webhooks

e.g. curl -X POST -H "Content-Type: application/json" -d '{webhookOptions}' http://{FGLab address}/api/v1/webhooks

webhookOptions expects the following options

{
  "url": "<URL to POST to>",
  "objects": "<object collection to listen to (currently only 'experiments')>",
  "object_id": "<object ID>",
  "event": "<event to listen to (currently only 'started' or 'finished')>"
}

If a valid URL is not provided, returns 400 {"error": "Invalid or empty URL"}. If a valid object collection is not provided, returns 400 {"error": "Object is not 'experiments'"}. If a valid event is not provided, returns 400 {"error": "Event is not 'started' or 'finished'"}. If an object ID is not provided, returns 400 {"error": "No object ID provided"}. If successful, returns 201 {"status": "Registered", "options": <webhookOptions>}. When the event occurs, the JSON data used to register the webhook is returned.
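Checking webhookOptions before sending them can save a round trip. The sketch below mirrors the error messages listed above. The order of the checks and the rule for what counts as a valid URL (an absolute http or https URL) are assumptions, not FGLab's server code, and checkWebhookOptions is a hypothetical name.

```javascript
// Client-side sketch; returns the matching error message, or null if the
// options pass every check.
function checkWebhookOptions(options) {
  let validUrl = false;
  try {
    // new URL throws for empty, missing, or relative values.
    const parsed = new URL(options.url);
    validUrl = parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch (err) {
    validUrl = false;
  }
  if (!validUrl) {
    return "Invalid or empty URL";
  }
  if (options.objects !== "experiments") {
    return "Object is not 'experiments'";
  }
  if (options.event !== "started" && options.event !== "finished") {
    return "Event is not 'started' or 'finished'";
  }
  if (!options.object_id) {
    return "No object ID provided";
  }
  return null;
}

// The URL and object ID here are placeholders.
const webhook = {
  url: "http://example.com/hook",
  objects: "experiments",
  object_id: "abc123",
  event: "finished"
};
console.log(checkWebhookOptions(webhook)); // prints null
```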

fglab's Issues

"add project to machine" button does not work cross-domain

Hi,

Issue #15 implements a button to add a project to an FGMachine, but the way it is implemented only works when both FGLab and FGMachine have the same domain name.

This is the way the HTTP request is currently made:

$.ajax({
    url: address + "/projects",
    type: "PUT",
    contentType: "application/json",
    data: JSON.stringify({project_id: id})
})

However, according to the documentation for contentType in http://api.jquery.com/jquery.ajax/,

Note: For cross-domain requests, setting the content type to anything other than application/x-www-form-urlencoded, multipart/form-data, or text/plain will trigger the browser to send a preflight OPTIONS request to the server.

This means that currently if address is different from the address you're using to access FGLab, an HTTP OPTIONS request will be sent (which as far as I understood FGMachine doesn't know how to deal with), and no PUT request is ever sent.

I ran into this problem by running FGLab and FGMachine in docker containers, and accessing FGLab's UI at http://localhost:5080.
I temporarily fixed this by changing the contentType to text/plain, but this is not a good solution.

Add notes field for experiment

Make editable notepad on experiment screen for adding notes. A large text field that can be also edited as a field in the MongoDB entry, but primarily for GUI interaction.

Create batch job object

A batch job should be its own object stored in the database, with a reference to its parent project. This should allow querying of the status of jobs in the batch.

Add query params to collection GET to expand API

Self-explanatory - filtering on the database side is far more efficient than having to query all the information.

Cannot add project to host: Access-Control-Allow-Origin

Hi!

I do not know where to ask about FGLab, so I am asking here.

I deployed FGLab with two FGMachine instances via Docker on two separate nodes (one running lab and machine, the second running only a machine). I created a sample project schema and uploaded it to the FGLab web UI.

Then, when I try to add the project to a machine, the page throws an alert with the message 'undefined'. In the console, there is a message about cross-site scripting security. Please find the screenshot attached.

Any ideas, where to look and what to do?

Best regards,
Roman.

remote access

How can I access the UI from a browser on a remote machine?

how to specify mongodb instance, error starting FGLab

Hi and thanks for FGLab,
I was wondering how exactly to start FGLab?
I've followed the steps of installation but when I try to start the lab I get the following errors:

> [email protected] start /home/user/FGLab
> node lab.js

Error: No MongoDB instance specified
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] start: `node lab.js`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the [email protected] start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /home/user/.npm/_logs/2019-02-13T19_29_46_962Z-debug.log

Also how do you specify if your db is in a different directory than the working one?

Use WebSockets to enable live querying of experiment logs

Linking stdout from the experiment running on FGMachine to a page on FGLab via WebSockets would allow real-time tracking of running experiments - very useful.

Heroku App/Docker: cannot use any FGMachines ({"error":"No machine capacity available"})

I've read all documentation and was trying to run the Bayesian Optimisation example through the Heroku app, but at Step 5 of the "Quickstart" of the github.io documentation, when I click the add button/drop-down menu nothing happens (no options, messages, errors...). Indeed, when I try and Submit a new experiment I get {"error":"No machine capacity available"}

Thus I tried to deploy FGLab and FGMachine locally via docker, but I'm still having problems with FGMachines:

$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
acf5d5ea7a73        kaixhin/fgmachine   "node machine"           10 minutes ago      Up 10 minutes       0.0.0.0:5081->5081/tcp   fgmachine
7523ba457713        kaixhin/fglab       "node lab"               2 hours ago         Up 2 hours          0.0.0.0:5080->5080/tcp   fglab
4e01d03b0ae2        mongo               "/entrypoint.sh mongo"   2 hours ago         Up 2 hours          27017/tcp                mongodb

$ docker logs fgmachine
Server listening on port null
{ RequestError: Error: Invalid protocol: <MY LAN IP>:
    at new RequestError (/root/FGMachine/node_modules/request-promise-core/lib/errors.js:14:15)
    at Request.plumbing.callback (/root/FGMachine/node_modules/request-promise-core/lib/plumbing.js:87:29)
    at Request.RP$callback [as _callback] (/root/FGMachine/node_modules/request-promise-core/lib/plumbing.js:46:31)
    at self.callback (/root/FGMachine/node_modules/request/request.js:187:22)
    at emitOne (events.js:96:13)
    at Request.emit (events.js:188:7)
    at Request.init (/root/FGMachine/node_modules/request/request.js:460:17)
    at Request.RP$initInterceptor [as init] (/root/FGMachine/node_modules/request-promise-core/configure/request2.js:45:29)
    at new Request (/root/FGMachine/node_modules/request/request.js:129:8)
    at request (/root/FGMachine/node_modules/request/index.js:55:10)
    at fs.readFile.then.catch (/root/FGMachine/machine.js:80:3)
  name: 'RequestError',
  message: 'Error: Invalid protocol: <MY LAN IP>:',
  cause: 
   Error: Invalid protocol: <MY LAN IP>:
       at Request.init (/root/FGMachine/node_modules/request/request.js:460:31)
       at Request.RP$initInterceptor [as init] (/root/FGMachine/node_modules/request-promise-core/configure/request2.js:45:29)
       at new Request (/root/FGMachine/node_modules/request/request.js:129:8)
       at request (/root/FGMachine/node_modules/request/index.js:55:10)
       at fs.readFile.then.catch (/root/FGMachine/machine.js:80:3),
  error: 
   Error: Invalid protocol: <MY LAN IP>:
       at Request.init (/root/FGMachine/node_modules/request/request.js:460:31)
       at Request.RP$initInterceptor [as init] (/root/FGMachine/node_modules/request-promise-core/configure/request2.js:45:29)
       at new Request (/root/FGMachine/node_modules/request/request.js:129:8)
       at request (/root/FGMachine/node_modules/request/index.js:55:10)
       at fs.readFile.then.catch (/root/FGMachine/machine.js:80:3),
  options: 
   { uri: '<MY LAN IP>:5080/api/v1/machines',
     method: 'POST',
     json: 
      { address: 'localhost:5081',
        hostname: 'asmith26',
        os: [Object],
        cpus: [Object],
        mem: '7.72GB',
        gpus: [] },
     gzip: true,
     callback: [Function: RP$callback],
     transform: undefined,
     simple: true,
     resolveWithFullResponse: false,
     transform2xxOnly: false },
  response: undefined }

I've tried:

 docker run -d --name fgmachine -h $(hostname) -v /var/run/docker.sock:/var/run/docker.sock -e FGLAB_URL=<MY LAN IP>:5080 -e FGMACHINE_URL=<MY LAN IP>:5081 -p 5081:5081 kaixhin/fgmachine

and (difference is FGMACHINE_URL=localhost:5081)

docker run -d --name fgmachine -h $(hostname) -v /var/run/docker.sock:/var/run/docker.sock -e FGLAB_URL=<MY LAN IP>:5080 -e FGMACHINE_URL=localhost:5081 -p 5081:5081 kaixhin/fgmachine

Thank you for any help in advance!

Display machine capacities

The machine page should indicate capacity, plus gpu_capacity per GPU. This will require WebSocket communication between FGLab and each FGMachine.

Merge client-side and server-side form validation

The new experiment and optimisation pages currently use submit and on-change validation respectively. In addition there is validation code server-side (which is necessary for an API). To abide by the DRY principle, perhaps the server-side code can be sent as a function to the client-side, and both forms can utilise on-change validation.

Create GitHub Pages documentation with Hexo

Hexo is one of the most popular static site generators for Node.js. It has the ability to deploy via git, so it does seem possible to use it to create a static site for GitHub Pages.

Package as an Electron app

Packaging as a binary via Electron would help increase the ease with which FGLab can be run.

Create a proper test suite

This project should have a test suite with code coverage. Coveralls can handle the feedback.

Add "add project to machine" button

This should:

Read in the machine's projects.json
Append a template for the new project (or do nothing if already existing)
Save back to machine's projects.json
Return status message

Bug: Ignoring default values for parameters with same name

If parameters in two projects share the same name, adding the second project results in the first project's default values being displayed for those parameters.

Ignoring User Input for Booleans

When attempting to run an experiment with optional boolean flags, FGLab ignores user input and instead sends True for all boolean arguments.

Document API with RAML

The API currently has a default spec set up on Apiary. The API Blueprint spec should be followed to update this page properly.

Implement grid search

The optimiser's interface has been implemented with grid search in mind, although random search is implemented. Grid search should also be implemented as an option.

Debug hyperparameter optimisation

Some error cropped up that was missed by validation - need to investigate further and fix asap.

Allow uploading small files

MongoDB has the ability to store small files as binary data as part of a document. As opposed to code versions like git commit hashes, uploading source code would be a much more reliable way of maintaining code provenance during development.

Migrate from Bower

Bower is no longer being actively developed (although it is being maintained). It should be replaced with something else e.g. Browserify, Webpack, Yarn etc.

Tensorboard Integration

FGLab has some built-in graph UI.

TensorFlow's TensorBoard has a full-featured, interactive web GUI for visualizing training output.

Would it be possible to store the TensorBoard summary files in FGLab's database and then link from the FGLab web UI to open the TensorBoard UI? Ideally, FGLab would even be able to read stored values (train loss, val loss, etc.) directly from the TensorBoard summary files.

This would be an awesome integration for TensorFlow users.

Add authentication

One use case with a lot of potential is read-only access. This would allow other research group members/other collaborators/anyone with access to view experiment results.

Compare charts across experiments

Since plots can be uploaded, the major benefit of _charts would be the ability to display charts across experiments.