
mozilla / telemetry-analysis-service


Telemetry Analysis Service

Home Page: https://analysis.telemetry.mozilla.org/

License: Mozilla Public License 2.0

Python 82.47% CSS 1.30% HTML 11.08% Shell 0.81% JavaScript 3.68% Makefile 0.25% Dockerfile 0.41%
django telemetry mozilla data analysis spark python aws emr hadoop

telemetry-analysis-service's Introduction

atmo - The code for the Telemetry Analysis Service


The full documentation can be found on Read The Docs:

https://atmo.readthedocs.io/

telemetry-analysis-service's People

Contributors

comloo, dependabot[bot], fbertsch, greenkeeper[bot], haroldwoo, harterrt, jezdez, lonnen, maurodoglio, mozilla-github-standards, psiinon, pyup-bot, renovate[bot], robhudson, robotblake, sunahsuh, vitillo, wlach

telemetry-analysis-service's Issues

AttributeError: 'SparkJobProvisioner' object has no attribute 'stop'

I bumped into this while testing locally in a branch. I didn't test it on master, but looking at the code, it seems likely to be a problem there too.

SparkJob.terminate calls self.provisioner.stop(self.current_run_jobflow_id), where self.provisioner is a SparkJobProvisioner. Currently neither SparkJobProvisioner nor its parent class has a stop method; it is only defined on ClusterProvisioner. In my branch I changed the terminate method to call self.cluster_provisioner.stop instead, and that fixed the issue locally.
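The failure and the local fix described above can be reproduced with a minimal sketch. The classes below are stand-ins written for illustration only, not the actual atmo code: the constructor, the `stopped` list, and the `terminate_broken` name are assumptions, while `provisioner`, `cluster_provisioner`, `current_run_jobflow_id`, `stop`, and `terminate` come from the report.

```python
class ClusterProvisioner:
    """Stand-in for the provisioner that does define stop()."""

    def __init__(self):
        # Hypothetical bookkeeping so the sketch can show what was stopped.
        self.stopped = []

    def stop(self, jobflow_id):
        self.stopped.append(jobflow_id)


class SparkJobProvisioner:
    """Stand-in for the job provisioner, which has no stop() method."""


class SparkJob:
    def __init__(self, current_run_jobflow_id):
        self.current_run_jobflow_id = current_run_jobflow_id
        self.provisioner = SparkJobProvisioner()
        self.cluster_provisioner = ClusterProvisioner()

    def terminate_broken(self):
        # Mirrors the reported call: raises AttributeError because
        # SparkJobProvisioner does not define stop().
        self.provisioner.stop(self.current_run_jobflow_id)

    def terminate(self):
        # Mirrors the local fix: delegate to the cluster provisioner.
        self.cluster_provisioner.stop(self.current_run_jobflow_id)


job = SparkJob("j-EXAMPLE")  # hypothetical jobflow id
try:
    job.terminate_broken()
except AttributeError as exc:
    print("broken call failed:", exc)

job.terminate()
print("stopped jobflows:", job.cluster_provisioner.stopped)
```

Whether delegating to the cluster provisioner is the right long-term design, or whether SparkJobProvisioner should gain its own stop method, is left open by the report.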

TemplateSyntaxError on atmo.dev.mozaws.net

I tried to go to the dev cluster to kill a cluster I spun up earlier and got the Django debug error page instead. It looks like there's an unclosed parenthesis left over from the recent PR that switched the template backend, @jezdez.

TemplateSyntaxError at /

Invalid block tag on line 6: 'url('clusters-new''. Did you forget to register or load this tag?

Request Method:     GET
Request URL:    https://atmo.dev.mozaws.net/
Django Version:     1.9.1
Exception Type:     TemplateSyntaxError
Exception Value:    Invalid block tag on line 6: 'url('clusters-new''. Did you forget to register or load this tag?

Exception Location:     /usr/local/lib/python2.7/dist-packages/django/template/base.py in invalid_block_tag, line 568

Server time:    Thu, 22 Sep 2016 03:28:02 +0000

Template error:
In template /app/atmo/templates/atmo/new-cluster.html, error at line 6
   Invalid block tag on line 6: 'url('clusters-new''. Did you forget to register or load this tag?
   1 : <button type="button" class="btn btn-md btn-primary" data-toggle="modal" data-target="#new-cluster-dialog" title="Launch Spark cluster">
   2 :   <span class="glyphicon glyphicon-play" aria-hidden="true"></span> Launch
   3 : </button>
   4 : <div class="modal fade" id="new-cluster-dialog" tabindex="-1" role="dialog" aria-labelledby="new-cluster-dialog">
   5 :   <div class="modal-dialog" role="document">
   6 :     <form class="modal-content" action=" {% url('clusters-new' %} " method="POST" enctype="multipart/form-data">
   7 :       {% csrf_token %}
   8 :       <div class="modal-header">
   9 :         <button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">&times;</span></button>
   10 :         <h4 class="modal-title">Launch a Spark cluster</h4>
   11 :       </div>
   12 :       <div class="modal-body">
   13 :         <div class="form-group">
   14 :           <label>Cluster identifier</label>
   15 :           <div class="control-group">
   16 :             {{ new_cluster_form.identifier }}
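
Line 6 of the excerpt mixes call-style syntax with a Django block tag and never closes the parenthesis. In the Django template language the tag would normally be written {% url 'clusters-new' %}. A template backend with call-style syntax would need a different form, and the report does not say which one was intended. As a rough aid, the sketch below flags template tags whose parentheses do not balance. The helper name and the regular expression are hypothetical and not part of atmo or Django, and the check is deliberately crude: it only counts parentheses inside each tag.

```python
import re

# Matches {% ... %} block tags and {{ ... }} expressions on a single line.
TAG_RE = re.compile(r"\{%(.*?)%\}|\{\{(.*?)\}\}")


def unbalanced_tags(source):
    """Return (line_number, tag) pairs whose parentheses do not balance."""
    problems = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in TAG_RE.finditer(line):
            # Exactly one of the two groups is set, depending on the tag kind.
            body = match.group(1) if match.group(1) is not None else match.group(2)
            if body.count("(") != body.count(")"):
                problems.append((line_no, match.group(0)))
    return problems


broken = """<form action=" {% url('clusters-new' %} " method="POST">"""
fixed = """<form action="{% url 'clusters-new' %}" method="POST">"""

print(unbalanced_tags(broken))
print(unbalanced_tags(fixed))
```

A check like this would not catch every malformed tag (parentheses inside string literals are counted too), but it would have flagged the tag reported here.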

Add other types of spot instances

@vitillo: Analysis jobs have different requirements in terms of hardware resources; some might benefit from more cores while others might benefit from more memory. Our users would like to select the instance type so that they can use machines that fit their job well.

Note that our Spark configuration and ETL jobs have been heavily tuned over time to run on c3.4xlarge instances. Introducing new instance types would require a significant amount of manual QA.

Refactor EMR scripts

Quoting from https://bugzilla.mozilla.org/show_bug.cgi?id=1312747:

Currently airflow and atmo are using two different EMR steps [1] [2] for almost the same logic. We should refactor those into a single script and add that directly to the telemetry-analysis-service repository so that we can have different steps in different environments, like staging and production.
The bootstrap script [3] and the Spark configuration [4] should also be moved to telemetry-analysis-service.
[1] https://github.com/mozilla/emr-bootstrap-spark/blob/master/ansible/files/batch.sh
[2] https://github.com/mozilla/telemetry-airflow/blob/master/ansible/files/spark/airflow.sh
[3] https://github.com/mozilla/emr-bootstrap-spark/blob/master/ansible/files/telemetry.sh
[4] https://github.com/mozilla/emr-bootstrap-spark/blob/master/ansible/files/configuration.json

Substitute SQLite with PostgreSQL for Heroku deployment

We want to deploy a stage instance of the analysis service on Heroku to verify that it works properly. Because Heroku's nodes are ephemeral, a SQLite database file stored on a node would not persist, so we need to move to the PostgreSQL service that Heroku provides.
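
One way to wire this up is sketched below, assuming Python 3 and a Django version that ships the django.db.backends.postgresql engine (1.9 or later). Heroku exposes the database connection string in the DATABASE_URL environment variable. The helper name database_config_from_url is hypothetical, and a real deployment might prefer an existing package for this parsing.

```python
from urllib.parse import unquote, urlparse


def database_config_from_url(url):
    """Turn a postgres:// URL into a dict shaped like a Django DATABASES entry."""
    parsed = urlparse(url)
    if parsed.scheme not in ("postgres", "postgresql"):
        raise ValueError("unsupported database scheme: %r" % parsed.scheme)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parsed.path.lstrip("/"),
        # Credentials may be percent-encoded in the URL.
        "USER": unquote(parsed.username or ""),
        "PASSWORD": unquote(parsed.password or ""),
        "HOST": parsed.hostname or "",
        "PORT": str(parsed.port or ""),
    }


# In a settings module this might be used as (assumption, not atmo's code):
#   DATABASES = {"default": database_config_from_url(os.environ["DATABASE_URL"])}
example = database_config_from_url("postgres://user:p%40ss@db.example.com:5432/atmo")
print(example["HOST"], example["NAME"])
```

Reading the URL from the environment keeps credentials out of the repository and lets local development, staging, and production point at different databases without code changes.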
