mtreinish / ciml

A machine learning pipeline for analyzing CI results.

License: Apache License 2.0

Python 1.13% Makefile 0.01% Shell 0.15% Smarty 0.01% Dockerfile 0.04% Jupyter Notebook 98.65%

ciml's People

Contributors

afrittoli, jlousada315, kwulffert, manjeetbhati, mtreinish

ciml's Issues

db uri

Hi!

Can you please clarify where I can get the db uri? I am not able to cache the raw data.

Thank you in advance!

README.MD Example

Hi!

The latest example that was committed doesn't work:

raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='openstack.fortnebula.com', port=13808): Max retries exceeded with url: /v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/logs_18/679218/1/gate/tempest-full/88fdd79/controller/logs/dstat-csv_log.txt.gz (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x11d085690>: Failed to establish a new connection: [Errno 60] Operation timed out'))

Any idea why?

Thank you
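
One way to narrow down a timeout like the one above is to check whether the host and port accept TCP connections at all, independently of ciml. The following is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper name, not part of ciml.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike.
        return False
```

Calling `can_connect("openstack.fortnebula.com", 13808)` with the host and port from the error message would show whether the log server is reachable from your machine. If it returns False, the problem is likely on the network or server side rather than in the example itself.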

Connection time out

Cache data fails in case of local data (not S3)

Running

ciml-cache-data --build-name tempest-full --db-uri mysql+pymysql://query:[email protected]/subunit2sql

returns

Traceback (most recent call last):
  File "/git/github.com/mtreinish/ciml/.venv-mpl/bin/ciml-cache-data", line 10, in <module>
    sys.exit(cache_data())
  File "/System/Volumes/Data/git/github.com/mtreinish/ciml/.venv-mpl/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/System/Volumes/Data/git/github.com/mtreinish/ciml/.venv-mpl/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/System/Volumes/Data/git/github.com/mtreinish/ciml/.venv-mpl/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/System/Volumes/Data/git/github.com/mtreinish/ciml/.venv-mpl/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/git/github.com/mtreinish/ciml/ciml/gather_results.py", line 734, in cache_data
    s3_url)
  File "/git/github.com/mtreinish/ciml/ciml/gather_results.py", line 746, in cache_data_function
    runs, build_name, limit, '1s', db_uri, data_path=data_path, s3=s3)
  File "/git/github.com/mtreinish/ciml/ciml/gather_results.py", line 424, in gather_and_cache_results_for_runs
    data_path=data_path, s3=s3)
  File "/git/github.com/mtreinish/ciml/ciml/gather_results.py", line 335, in _get_data_for_run
    use_http=use_remote, data_path=data_path, s3=s3)
  File "/git/github.com/mtreinish/ciml/ciml/gather_results.py", line 171, in _get_dstat_file
    run_uuid, use_s3, s3)
TypeError: 'NoneType' object is not iterable
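
The traceback does not show which value was None. A common way to get this message is looping over a value that was never set, which could happen here if an S3-related argument is left as None when working with local data. The sketch below reproduces the message and shows a guard. `candidate_paths` and its `roots` parameter are hypothetical and are not taken from gather_results.py.

```python
from typing import Iterable, List, Optional


def candidate_paths(run_uuid: str, roots: Optional[Iterable[str]]) -> List[str]:
    """Hypothetical helper: build one dstat path per storage root."""
    # Guard: treat a missing list of roots as "no roots" instead of iterating None.
    if roots is None:
        roots = []
    return [f"{root}/{run_uuid}/dstat-csv_log.txt.gz" for root in roots]


def reproduce_error() -> str:
    """Iterate over None on purpose and return the resulting error message."""
    try:
        for _ in None:  # deliberately wrong, to trigger the TypeError
            pass
    except TypeError as exc:
        return str(exc)
    return ""
```

Here `reproduce_error()` returns the same message as the last line of the traceback, while `candidate_paths("abc", None)` returns an empty list instead of raising.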

Collecting Cache Data

On logstash.openstack.org I found an available build_name, for example tempest-full-py3.

After running:
ciml-cache-data --build-name tempest-full-py3 --db-uri mysql+pymysql://query:[email protected]/subunit2sql

I get:
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

Is there a command I need to run before this one, or what am I doing wrong?

I understand that I need to connect to the DB, but from the README I thought running ciml-cache-data would be enough to get the .zip file.

Dstat data

Hello!

Why did you choose to use dstat data? Why do you think statistics about system resources are useful for predicting whether a test will fail? What is the intuition?

In other work, I have seen features such as lines of code added or deleted, similarity between tests (Hamming distance), pass/fail history, etc.
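
For readers unfamiliar with the Hamming-distance feature mentioned above, here is a minimal sketch of how it could be computed between the pass/fail histories of two tests. The histories are made up for illustration and do not come from the ciml dataset.

```python
from typing import Sequence


def hamming_distance(a: Sequence[bool], b: Sequence[bool]) -> int:
    """Number of positions at which two equal-length pass/fail histories differ."""
    if len(a) != len(b):
        raise ValueError("histories must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical histories over the same five CI runs (True = passed).
test_a = [True, True, False, True, False]
test_b = [True, False, False, True, True]
print(hamming_distance(test_a, test_b))  # 2
```

A small distance means the two tests tend to pass and fail together on the same runs.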

collecting data

Can you send me the .zip file of the tempest-full data you used to obtain your results? I can't get past that step and I don't know why :)

My email is joao.lousada1_at_gmail.com

Thank you. I will still try to work out why I can't collect the data myself, but right now I really need to look at the dataset, see which features you use, and test my own algorithms.

Argument must be string

I now have access to the tempest-full dataset and the corresponding .json files.

I tried to visualize them by running:
ciml-build-dataset --dataset tempest-full --visualize

and I get the following exception:
raise TypeError("first argument must be string or compiled pattern")
TypeError: first argument must be string or compiled pattern

cache-data: number of entries in data

Hi!

When caching the tempest-full data, I get around 500 test result entries, and very few of them (around 10) are failures.

How can I find out how many test results a dataset has, and how can I collect a bigger dataset?

Thank you

Find other build names

How can I find data for other projects that is available to cache, like tempest-full?

Thank you!

How to connect to the subunit2sql DB

> logstash.openstack.org/subunit2sql is the URL of a MySQL database, so you need to use an SQL client if you want to connect directly to it. See more details on https://docs.openstack.org/infra/system-config/logstash.html#subunit2sql .
>
> The subunit2sql project defines the DB schema and also exposes a Python API to access the data in the DB.

Originally posted by @afrittoli in #43 (comment)

> to use an SQL client if you want to connect directly to it.

How do I connect? I don't have MySQL experience, so I'm having trouble working out the steps.

Thank you
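
As a rough illustration of what "use an SQL client" can mean in practice, the sketch below translates a SQLAlchemy-style URI (the format passed to --db-uri) into arguments for the standard `mysql` command-line client. The helper name `mysql_cli_args`, the host `db.example.org`, and the USER and PASSWORD values are placeholders, not the real connection details.

```python
import shlex
from typing import List
from urllib.parse import urlsplit


def mysql_cli_args(db_uri: str) -> List[str]:
    """Translate a SQLAlchemy-style MySQL URI into arguments for the mysql client."""
    parts = urlsplit(db_uri)
    args = ["mysql", "--host", parts.hostname or "localhost"]
    if parts.port:
        args += ["--port", str(parts.port)]
    if parts.username:
        args += ["--user", parts.username]
    if parts.password:
        # The mysql client expects the password attached to the option itself.
        args.append(f"--password={parts.password}")
    database = parts.path.lstrip("/")
    if database:
        args.append(database)
    return args


# Placeholder credentials and host, not the real ones.
uri = "mysql+pymysql://USER:PASSWORD@db.example.org/subunit2sql"
print(shlex.join(mysql_cli_args(uri)))
```

The printed command can be pasted into a terminal where the `mysql` client is installed. Once connected, `SHOW TABLES;` lists the tables defined by the subunit2sql schema.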

re.compile(feature_regex)

Hello,

In trainer.py, line 167:
col_regex = re.compile(features_regex)

It seems that features_regex is empty. I added a condition: if features_regex is None, set features_regex = ' ' (an empty string). Now I have .npz files, but I don't know what I lost along the way. Probably no labels?
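
Passing None to re.compile raises the "first argument must be string or compiled pattern" TypeError, so a fallback is needed when no pattern is given. Note that `' '` is a single space rather than an empty string. If the compiled pattern is used to select column names, a single space would only match names that contain a space, which could explain missing data. The sketch below assumes that kind of usage. `compile_feature_regex`, `select_columns`, and the column names are hypothetical and not taken from trainer.py.

```python
import re
from typing import List, Optional


def compile_feature_regex(features_regex: Optional[str]) -> re.Pattern:
    """Compile the column filter, treating None as "keep every column"."""
    # re.compile(None) raises TypeError, so substitute a match-everything pattern.
    return re.compile(features_regex if features_regex is not None else ".*")


def select_columns(columns: List[str], features_regex: Optional[str]) -> List[str]:
    """Keep the columns whose name matches the pattern (hypothetical usage)."""
    col_regex = compile_feature_regex(features_regex)
    return [c for c in columns if col_regex.search(c)]


columns = ["usr", "sys", "idl"]  # hypothetical column names
print(select_columns(columns, None))  # ['usr', 'sys', 'idl']
print(select_columns(columns, " "))   # []
```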
