samsung / qaboard

Experiment tracker: organize, visualize, compare and share runs. Removes toil from algorithm/performance R&D and tuning.

Home Page: https://samsung.github.io/qaboard

License: Apache License 2.0

JavaScript 57.29% HTML 0.19% CSS 0.80% Shell 0.58% Dockerfile 0.81% Python 40.19% Mako 0.04% Makefile 0.10%
visualizations algorithms continuous-quality performance-engineering mlops

qaboard's Introduction

QA-Board logo

Experiment tracking framework with advanced viewers.
Helps algo/ml/perf engineers share results, collaborate, and build better products.


Features

  • Organize, view, and compare results and tuning/optimization runs
  • Web-based, with sharable URLs.
  • Visualizations: support for quantitative metrics and many file formats: an advanced image viewer, support for videos, plotly graphs, flame graphs, text, point clouds, HTML...
  • Integrations: direct access from Git and CI tools, easily exportable results, an API, links to the code, triggers for GitLab CI/Jenkins/webhooks...
  • Agnostic to your language/framework: run your existing code, write files, view them.

For screenshots, check our website.

Benefits

Using QA-Board across many projects enables us to:

  • Scale R&D: enable engineers to achieve more and be more productive.
  • Faster Time-to-Market: collaboration across teams, workflow integration...
  • Quality: uncover issues earlier, KPIs, tuning, reporting...

Getting Started

Read the docs! You will learn how to:

  • Start a QA-Board server
  • Wrap your code with QA-Board (see the sketch below)
  • View output files and KPIs
  • ...and set up parameter tuning, integrations with 3rd-party tools, etc.
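
To give a feel for the wrapping step, here is a minimal sketch based on the run() entrypoint described in the docs; treat the exact context attributes and metric names as illustrative:

# qa/main.py — a minimal QA-Board wrapper (illustrative sketch).
# QA-Board calls run(context) for every input of a batch; input_path and
# output_dir are context attributes described in the docs, the rest is
# placeholder.
def run(context):
    # Run your existing code on context.input_path, and write any output
    # files (images, plots, HTML...) under context.output_dir.
    (context.output_dir / "result.txt").write_text("OK")
    # Return quantitative metrics for QA-Board to track and compare.
    return {"is_failed": False, "some_metric": 42}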

If you want to learn about the code's organization or how to contribute, read CONTRIBUTING.md.

Feedback? Questions? Need Help? Found a bug?

Don't hesitate to get in touch! Contact [email protected]; we'll be delighted to hear your insights.

If you've got questions about setup, deploying, want to develop new features, or just want to chat with the developers, please feel free to start a thread in our Spectrum community!

Found a bug with QA-Board? Go ahead and submit an issue. And, of course, feel free to submit pull requests with bug fixes or changes to the master branch.

Contributors

QA-Board was started at Samsung SIRC by Arthur Flam.

Thanks to the following people for their contributions, testing, feedback or bug reports: Amir Fruchtman, Avi Schori, Yochay Doutsh, Itamar Persi, Amichay Amitay, Lena Grechikhin, Shai Shamir, Matan Danino, Roy Shaul, Gal Hai, Rivka Emanuel, Ela Shahar, Nadav Ofer, David Nukrai. Thanks also to Sebastien Derhy, Elad Rozin, Nathan Levy, Shahaf Duenyas, Yotam Ater, Asaf Jazcilevich and Yoel Yaffe for supporting the project.

You don't see your name? Get in touch to be added to the list!

Credits

  • The logo is the Poodle twemoji 🐩, recolored in Samsung Blue 🔵. Copyright 2019 Twitter, Inc and other contributors. Code licensed under the MIT License. Graphics licensed under CC-BY 4.0.

qaboard's People

Contributors

arthur-flam, dependabot[bot], itamar87


qaboard's Issues

Choice of reference per batch

There are some usability improvements we could make for projects that have multiple batches of results (e.g. denoise, hdr...):

  • Currently, if users select, say, denoise, then hdr, the reference is still set to denoise. Let's change that! (unless maybe users explicitly selected the reference batch and we can find comparisons...?)
  • There is a request for a feature where we can set different references for different batches. E.g. instead of comparing by default vs the latest commit in master, we could compare hdr vs some milestone, and denoise versus another one... Now, would we specify those preferences in batches.yaml or qaboard.yaml? The main concern is remaining faithful to the users' likely intent and doing it, UI-wise, without surprising them...

In terms of implementation:

  • in the app: we want to make sure selected.reference_batch and selected.reference_commit_id can be undefined, and get the reference we want from batch.data or the project config.
  • in the CLI: we need to decide where to read the info and save it (a sketch of the fallback logic follows).
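
A minimal sketch of that fallback logic on the CLI side; the "reference" config keys are assumptions, nothing is decided yet:

# Hypothetical sketch: resolve the reference for a batch, falling back to
# the project-wide default. The "reference" keys are assumed, not an
# agreed-on API (batches.yaml vs qaboard.yaml is still an open question).
def resolve_reference(batch_config: dict, project_config: dict) -> str:
    return batch_config.get("reference") or project_config.get("reference", "master")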

Bye!
Arthur

Pipelines / DAG

Currently, QA-Board lacks expressiveness for our common use case of:

  1. Run on some images
  2. Calibration
  3. Validation

Likewise, we can't easily express pipelines like training/evaluation.

We need a way to express running series of steps / pipelines / tasks organized as a directed acyclic graph (DAG).

We're looking for feedback or alternative ideas. Especially if you have experience with various flow engines, e.g. DVC. Thanks!

Workarounds

Users have worked around this by:

  • wrapping qa batch with a scripted pipeline
  • writing complicated run() functions with lots of logic

Status

  • Implement user-side support for sequential pipelines
  • Support pipelines officially in QA-Board
  • Support DAGs

Possible API

batch1:
  inputs:
  - A.jpg
  - B.jpg
  configurations:
  - base

batch2:
  needs: batch1
  type: script
  configurations:
  - python my_script.py ${o.output_directory for o in needs["batch1"]}

More complex:

my-calibration-images:
  configurations:
  - base
  inputs:
  - DL50.raw
  - DL55.raw
  - DL65.raw
  - DL75.raw

my-calibration:
  needs:
    calibration_images: my-calibration-images
  type: script
  configurations:
  - python calibration.py ${o.output_directory for o in needs["calibration_images"]}

my-evaluation-batch:
  needs:
    calibration: my-calibration
  inputs:
  - test_image_1.raw
  - test_image_2.raw
  - test_image_3.raw
  configurations:
  - base
  - ${needs["calibration"].output_directory}/calibration.cde
$ qa batch my-evaluation-batch
#=> qa batch my-calibration-images
#=> qa batch my-calibration
#=> qa batch my-evaluation-batch

Thoughts

  • We should add built-in support for a script input type that just executes its configurations as commands. It goes well with DAGs:
my-script:
  needs: batch1
  type: script
  configurations:
  - echo OK
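
On the runner side, executing such a script type could be as simple as this sketch (purely illustrative, not an agreed implementation):

# Hypothetical sketch: a "script" input type executes each configuration
# entry as a shell command; one failing command fails the whole run.
import subprocess

def run_script_batch(configurations, env=None):
    for command in configurations:
        subprocess.run(command, shell=True, check=True, env=env)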

Expected

  • Easy API
  • Cache friendly
  • Can be used in a non-blocking way

Create a batch from filtered outputs

Users often do lots of parameter tuning. Some would like to create a new batch from filtered results (e.g. my_param: best_value), so that they can save it as a milestone with a descriptive name.

TODO

  • Add a "Move filtered outputs" menu item, next to the "Rename batch" item, and much like it.
  • Send a request with the commit, batch, and filter.
  • Get the params in the API.
  • Add the necessary logic to backend.models.Batch.rename(): get_or_create a batch with the target name and move results there (see the sketch below).
  • Fix: currently, when renaming, we don't check whether the target name is already in use. Let's implement a "move" in this case; the logic for the normal/filtered cases can be unified while doing this.
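
A possible sketch of the backend side, assuming SQLAlchemy-style models; the commit_id/label/outputs attributes and the helper below are assumptions about backend.models:

# Hypothetical sketch of "move filtered outputs". The batch is assumed to
# be a SQLAlchemy model with commit_id, label, and an outputs relationship.
def get_or_create(session, model, **kwargs):
    instance = session.query(model).filter_by(**kwargs).one_or_none()
    if instance is None:
        instance = model(**kwargs)
        session.add(instance)
    return instance

def move_filtered_outputs(session, batch, target_label, output_filter):
    # Find or create the target batch on the same commit...
    target = get_or_create(session, type(batch),
                           commit_id=batch.commit_id, label=target_label)
    # ...then re-parent the outputs that match the user's filter.
    for output in batch.outputs:
        if output_filter(output):
            output.batch = target
    session.commit()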

Nginx Misconfig

Hello Team,

In the deploy files there is a bad configuration of an nginx alias that makes it possible to perform a path traversal and access files on the server running QA-Board. An attacker can use this to read files on the server that could compromise QA-Board users/customers.

For the technique to be applicable, the following conditions must be met:

  • The location directive must not have a trailing slash in its path;
  • An alias directive must be present within the location context, and it must end with a slash.

In that case, nginx replaces the location prefix with the alias path, so a request like /docs../etc/passwd resolves to <alias>../etc/passwd and escapes the intended directory.

(screenshots: the relevant location/alias directives in the nginx configuration)

Following the procedures of the deploy steps, I was able to carry out a proof of concept:

git clone https://github.com/Samsung/qaboard.git
cd qaboard

docker-compose pull
docker-compose up -d

Steps to Reproduce

  1. curl "http://localhost:5151/docs../etc/passwd" | head -n 50

(screenshot: the curl output, exposing the contents of /etc/passwd)

I apologize if this is of no use to you.

Best Regards,
dk4trin.

Experiment-focused UI

For interactive workflows, it would be nice to have less emphasis on commits and more on "experiments":

  • show, per user, the latest experiments (qa batch --share + tuning from the UI)
  • easily organize those experiments: better labels, groups, move, rename, delete, cross-commit labels...
  • locally, qa --share should just create a new experiment each time (maybe unless qa --share --reuse)...

For better comparisons we'd need:

  • show N batches together
  • plots with all their data at the same time (e.g. compare convergence curves...)

We'd need to tweak the UI a bit!

cc @DavidHuji

Open-Source Release Status

Go!

Welcome to QA-Board! We are working on open-sourcing the project and are targeting an official release for Q3 2020.

What needs to be done before the release

  • Pre-release, once all items marked 🌱 are done.

Open-source process

  • Authorization to release as OSS 🌱
  • License (Apache 2) 🌱

Docs

Technical changes

  • qa init creates a functional project
  • Merge the qatools and qaboard internal repositories 🌱
  • Make sure nothing is hardcoded for our infra: pass secrets at least as environment variables or config files, and handle mount naming conventions with a centralized config... 🌱
  • Decide on a default location for storing output data. It needs to be mounted by the container and easily writable.
  • Ensure 1min server bringup with e.g. docker-compose
  • Remove USER arthurf from Dockerfile, maybe make it configurable to avoid NFS+squash_root issues.

API changes:

  • qaboard-backend.slamvizapp → backend.backend, qaboard-webapp → webapp
  • qatools.yaml → qaboard.yaml
  • Import as qaboard not qatools
  • Output folder naming conventions with hashes
  • Friendlier run() API (ctx.obj → ctx: RunContext)
  • sed -i s/qatools/qaboard/g *

Features

  • Make it easy to support more async task queues - Wiki
  • Support for at least one common task queue beyond what we have internally at Samsung, and explain how to implement support for others.
  • Create a public CI
  • Support more git servers (e.g. GitHub). Don't even require read rights.
  • Make it easier to debug visualizations and metrics with per-batch settings

PR

  • Public demo server
  • All good for a release blog post

Compare results from different projects

Is your feature request related to a problem? Please describe.
When working with subprojects, users often want to compare their results versus a different project. For instance, if a subproject is the v1 of a project, and another is the v2, users will want to select v1 as a reference.

Describe alternatives you've considered
So far, users get around the issue by either:

  • exporting results and comparing with a third-party tool: it's annoying, and those third-party tools often lack features...
  • re-running what they want to compare to as part of their project: it requires care to do right, and in the end it's still confusing.

TODO

  • Review the text field that selects a commit: maybe use an icon or a "select" button that hints it's possible to hover and select a different project in a menu (BTW, currently we show the available milestones on hover; it's hard to discover. Let's change this too).
  • Offer a select menu with all the related subprojects
  • Change state.selected so that it's possible to specify a project... Keep in mind it should still be possible to switch new/ref.
  • Milestones should save non-default projects, and the non-default project should appear in the milestone selection menu.
  • The backend/api/export_to_folder.py has to do the same matching as the web application, so it could require some tweaking... Same goes for qaboard/optimization.py when optimizing improvement versus a reference.

Renaming batches of results

First, thanks for open-sourcing this great project.
I would like to suggest a useful feature: editing batch names (really useful when you have many batches and you want to rename them to meaningful, concise names).

Missing hover tooltip on the "Image Diffs"

Describe the bug
When hovering over the diff image, the x-y coordinates and the RGB values of the pixel in the images being compared are not updated.

To Reproduce
Steps to reproduce the behavior:

  1. Click on the Visualizations tab
  2. Click on the Image Diff button
  3. Hover over a diff image
  4. See error - x-y and rgb values are "stuck"

Expected behavior
The x-y coordinates and RGB values of the images being compared should update and show correct values.

Desktop

  • OS: Windows 10
  • Browser: Chrome

QA-Board for performance engineering

Right now QA-Board focuses on algorithm engineering. Another big area is software performance.

How do people track software performance?

Unit tests are not enough to judge software performance. Some organizations:

  • track their test suite runtime over time: it helps get a trend, but comparisons are hard because the tests keep changing.
  • use acceptance tests that check runtime/memory thresholds, and monitor regressions.

On the ops side, if we're talking about applications/services:

  • there are many great products: monitoring like Datadog/New Relic, crash analytics like Sentry...
  • smart monitoring solutions correlate anomalies with commits and feature flags.
  • the "future" is likely tooling based on canary deploys to identify perf regressions on real workflows.

For libraries, or for products used as dependencies by others, it's not possible to set up those tools. Could QA-Board help "shift left" and identify issues before releases?

Development workflows for performance engineering

  • Engineers doing optimization have a hard time keeping track of all their versions and microbenchmarks. The tooling focuses on the live experience (debugger-like tools, inspecting the assembly) and investigates one version at a time.
  • To keep track, the best tool I've seen to identify issues ahead of time and help during coding is https://perf.rust-lang.org

Software engineers have the same need for "run tracking" as algorithm engineers.

Features needed

  • Examples of integrations with tools such as perf (see the sketch after this list).
  • Visualizations: examples for metrics like binary size, IPC, time, page faults, gas...
  • Anomaly detection on top of the metrics, to warn about regressions early.
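
As an example of what an integration could look like, here is a hedged sketch of a run() wrapper around perf stat; ./my_benchmark is a placeholder binary and the CSV parsing is simplified:

# Hypothetical sketch: run a binary under `perf stat` and report a few
# counters as QA-Board metrics. ./my_benchmark is a placeholder.
import json
import subprocess

def run(context):
    result = subprocess.run(
        ["perf", "stat", "-x,", "-e", "instructions,cycles", "./my_benchmark"],
        capture_output=True, text=True,
    )
    # With -x, perf stat writes CSV lines "<value>,<unit>,<event>,..." to stderr.
    counters = {}
    for line in result.stderr.splitlines():
        fields = line.split(",")
        if len(fields) >= 3:
            try:
                counters[fields[2]] = float(fields[0])
            except ValueError:
                pass  # skip empty lines and "<not counted>" values
    (context.output_dir / "perf.json").write_text(json.dumps(counters))
    return {"instructions": counters.get("instructions"),
            "cycles": counters.get("cycles")}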

Reference: perf/profiling tools

Slimmer docker images

The images we build are big.

Let's do a round of applying best practices and slim them down:

  • remove unneeded build steps.
  • merge stable steps to create fewer layers.
  • remove caches.
  • consider multi-stage builds, while keeping them easily cacheable.
