romain-jacob / triscale

TriScale software

License: GNU General Public License v3.0

Python 0.29% Jupyter Notebook 99.71%
networking replicability experimental-design data-analysis

triscale's Issues

KPI calculated even if too little data supplied

This might be an issue in TriScale, or me misunderstanding a use-case.

TL;DR: analysis_kpi() returns a valid value even when too few data points are supplied, provided the "unintuitive" bound is selected (upper for percentile < 50, lower for percentile > 50).

Background: The intuitive way to calculate a KPI is to specify the bound that gives the "worst case" (upper when percentile > 50, lower when percentile < 50). This allows us to make "performance is at least X" statements. However, I thought the other bound carries information as well: together, the two bounds show the width of the CI, which tells us whether the metric varies a lot between runs. The first example that comes to mind is industrial scenarios, where not only the maximum latency is of interest, but also its variability.

With this in mind, I was routinely calling analysis_kpi() twice, once with bound set to upper and once with lower. Doing so, I noticed that I got a valid value when the "unintuitive" bound was selected (upper for percentile < 50, lower for percentile > 50), even if I had too little data.

Example with too little data:

import numpy as np
import triscale

# Only 5 data points: far too few for a 95% confidence bound
# on the 99th percentile
data = np.random.randint(0, 10, size=5)

settings = {"bound": "lower", "percentile": 99,
            "confidence": 95, "bounds": [min(data), max(data)]}

independent, kpi = triscale.analysis_kpi(
    data,
    settings,
    verbose=False)
print("KPI: " + str(kpi))

With bound set to "upper", the KPI correctly returns NaN. With bound set to "lower", a number is returned.
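A possible explanation, offered as a sketch and not as a statement about how TriScale is implemented: if the KPI is a one-sided, order-statistic confidence bound on a percentile, then the number of samples needed depends on which side the bound is on. An upper bound on the 99th percentile needs the sample maximum to exceed that percentile with 95% probability, which takes hundreds of samples. A lower bound only needs the sample minimum to fall below it, which a single sample already achieves with 99% probability. The helper `min_samples` below is hypothetical (it is not part of the TriScale API), and the exact inequality TriScale uses (strict or non-strict) is an assumption.

```python
def min_samples(percentile, confidence, bound):
    """Smallest N for which a one-sided order-statistic bound exists.

    Hypothetical helper for illustration; not part of TriScale.
    percentile and confidence are given in percent (e.g. 99 and 95).
    """
    if not 0 < percentile < 100 or not 0 < confidence < 100:
        raise ValueError("percentile and confidence must be in (0, 100)")
    p = percentile / 100
    alpha = 1 - confidence / 100

    # Upper bound: P(max sample >= percentile) = 1 - p**N
    # Lower bound: P(min sample <= percentile) = 1 - (1 - p)**N
    q = p if bound == "upper" else 1 - p

    # Find the smallest N with q**N <= alpha (assumed non-strict inequality)
    n = 1
    while q ** n > alpha:
        n += 1
    return n


print(min_samples(99, 95, "upper"))  # samples needed for the upper bound
print(min_samples(99, 95, "lower"))  # samples needed for the lower bound
```

Under these assumptions, 5 data points are far fewer than the upper bound requires, but enough for the lower bound, which would be consistent with the NaN for "upper" and the numeric value for "lower" reported above.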

Division by zero in convergence test if yMax == yMin

In analysis_metric(), if all elements in the "y-axis" data are the same (i.e. min and max are identical), the convergence test divides by zero; see https://github.com/romain-jacob/triscale/blob/master/helpers.py#L66 and two lines above.

The issue can easily be reproduced:

import numpy as np
import pandas as pd
import triscale

# Constant y data: min and max are identical
x = np.arange(0, 100)
y_same_value = np.full(len(x), 100)
df = pd.DataFrame(
    {'x': x,
     'y': y_same_value})

triscale.analysis_metric(
    df,
    metric={'measure': 50},
    convergence={'expected': True})

>  triscale/helpers.py:66: RuntimeWarning: invalid value encountered in true_divide

An intuitive solution is to simply declare the data converged when all elements are identical, but perhaps I am missing something about the statistics, so I'll leave the PR to someone else 😅
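The guard suggested above could look like the following minimal sketch. It assumes the convergence test rescales the data by (max - min), which is what the division by zero implies but which is not confirmed against helpers.py. The function `minmax_scale` and its return convention are hypothetical and are not TriScale code.

```python
import numpy as np


def minmax_scale(y):
    """Scale y to [0, 1], guarding against a zero range.

    Hypothetical helper for illustration; not part of TriScale.
    Returns (scaled, is_constant). When all values are identical,
    the scaled data is all zeros and is_constant is True, so the
    caller can treat the series as trivially converged.
    """
    y = np.asarray(y, dtype=float)
    y_min, y_max = y.min(), y.max()
    span = y_max - y_min
    if span == 0:
        # Identical elements: skip the division entirely
        return np.zeros_like(y), True
    return (y - y_min) / span, False


scaled, is_constant = minmax_scale(np.full(100, 100))
print(is_constant, np.isnan(scaled).any())
```

Whether a constant series should count as converged is a statistical decision for the maintainers; the sketch only shows how to avoid the NaN values.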