atmo - The code for the Telemetry Analysis Service
The full documentation can be found on Read The Docs:
https://atmo.readthedocs.io/
Telemetry Analysis Service
Home Page: https://analysis.telemetry.mozilla.org/
License: Mozilla Public License 2.0
It's a security risk
See the cluster at [0]. It spent >70 minutes provisioning. We need to add the ability to turn off spot instances, so that we can guarantee clusters to people.
Longer-term, we need to expand our use of spot instances. Supporting different instance types, different regions, and more would give us a wider selection of machines and reduce this problem.
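A minimal sketch of what a spot-instance switch could look like, assuming clusters are launched through the EMR `run_job_flow` call in boto3, whose `InstanceGroups` entries accept a `Market` of `SPOT` or `ON_DEMAND`. The function name `build_instance_groups` and its parameters are hypothetical and not taken from the atmo code base; the function only builds the request data and does not call AWS.

```python
def build_instance_groups(instance_type, core_count, use_spot=True, bid_price=None):
    """Build the InstanceGroups list for an EMR run_job_flow request.

    Hypothetical helper: the master node is always on-demand, and the
    core nodes use the spot market only when use_spot is True.
    """
    master_group = {
        "Name": "master",
        "InstanceRole": "MASTER",
        "InstanceType": instance_type,
        "InstanceCount": 1,
        "Market": "ON_DEMAND",
    }
    core_group = {
        "Name": "core",
        "InstanceRole": "CORE",
        "InstanceType": instance_type,
        "InstanceCount": core_count,
        "Market": "SPOT" if use_spot else "ON_DEMAND",
    }
    # A bid price only makes sense for spot requests; EMR expects a string.
    if use_spot and bid_price is not None:
        core_group["BidPrice"] = str(bid_price)
    return [master_group, core_group]


# Guaranteed capacity: every group is requested on-demand.
guaranteed = build_instance_groups("c3.4xlarge", 4, use_spot=False)
# Cheaper but possibly slow to provision: core nodes come from the spot market.
spot = build_instance_groups("c3.4xlarge", 4, use_spot=True, bid_price=0.84)
```

The returned list would be passed as `Instances={"InstanceGroups": ...}`; the instance type and bid price above are illustrative values only.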
Originally reported in https://bugzilla.mozilla.org/show_bug.cgi?id=1253675
While testing locally in a branch I bumped into this. I didn't test it on master but looking at the code this seems like it may also be a problem.
The SparkJob.terminate method calls self.provisioner.stop(self.current_run_jobflow_id), where self.provisioner is a SparkJobProvisioner. But currently the SparkJobProvisioner has no stop method and neither does the parent class; it is only defined on the ClusterProvisioner. In my branch I changed the terminate method to call self.cluster_provisioner.stop instead and it fixed my issue locally.
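The problem described above can be reproduced with simplified stand-in classes. This is a sketch under assumptions: only the class, attribute, and method names mentioned in the report are reused, the parent class name `Provisioner` and the method name `terminate_broken` are hypothetical, and the bodies are invented for illustration rather than copied from atmo.

```python
class Provisioner:
    """Stand-in for the shared parent class; it defines no stop()."""


class ClusterProvisioner(Provisioner):
    def __init__(self):
        self.stopped = []

    def stop(self, jobflow_id):
        # The real method would ask EMR to terminate the job flow;
        # here we only record the call.
        self.stopped.append(jobflow_id)


class SparkJobProvisioner(Provisioner):
    """Like the class in the report, this has no stop() method."""


class SparkJob:
    def __init__(self, jobflow_id):
        self.current_run_jobflow_id = jobflow_id
        self.provisioner = SparkJobProvisioner()
        self.cluster_provisioner = ClusterProvisioner()

    def terminate_broken(self):
        # Reported behaviour: raises AttributeError because
        # SparkJobProvisioner has no stop().
        self.provisioner.stop(self.current_run_jobflow_id)

    def terminate(self):
        # Workaround from the report: delegate to the cluster provisioner.
        self.cluster_provisioner.stop(self.current_run_jobflow_id)


job = SparkJob("j-EXAMPLE")
try:
    job.terminate_broken()
    raised = False
except AttributeError:
    raised = True

job.terminate()
```

An alternative fix would be to move `stop` into the shared parent class so both provisioners inherit it; the report does not say which approach was eventually taken.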
I tried to go to the dev cluster to kill a cluster I spun up earlier and received the Django debug error page. Looks like there's a hanging paren from the recent PR switching the template backend @jezdez
TemplateSyntaxError at /
Invalid block tag on line 6: 'url('clusters-new''. Did you forget to register or load this tag?
Request Method: GET
Request URL: https://atmo.dev.mozaws.net/
Django Version: 1.9.1
Exception Type: TemplateSyntaxError
Exception Value:
Invalid block tag on line 6: 'url('clusters-new''. Did you forget to register or load this tag?
Exception Location: /usr/local/lib/python2.7/dist-packages/django/template/base.py in invalid_block_tag, line 568
Server time: Thu, 22 Sep 2016 03:28:02 +0000
Template error:
In template /app/atmo/templates/atmo/new-cluster.html, error at line 6
Invalid block tag on line 6: 'url('clusters-new''. Did you forget to register or load this tag?
1 : <button type="button" class="btn btn-md btn-primary" data-toggle="modal" data-target="#new-cluster-dialog" title="Launch Spark cluster">
2 : <span class="glyphicon glyphicon-play" aria-hidden="true"></span> Launch
3 : </button>
4 : <div class="modal fade" id="new-cluster-dialog" tabindex="-1" role="dialog" aria-labelledby="new-cluster-dialog">
5 : <div class="modal-dialog" role="document">
6 : <form class="modal-content" action=" {% url('clusters-new' %} " method="POST" enctype="multipart/form-data">
7 : {% csrf_token %}
8 : <div class="modal-header">
9 : <button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">×</span></button>
10 : <h4 class="modal-title">Launch a Spark cluster</h4>
11 : </div>
12 : <div class="modal-body">
13 : <div class="form-group">
14 : <label>Cluster identifier</label>
15 : <div class="control-group">
16 : {{ new_cluster_form.identifier }}
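The unbalanced parenthesis on template line 6 is the kind of slip a small pre-deploy check could catch. Below is a rough sketch using only the standard library; the function name `unbalanced_tags` is hypothetical, and the check is deliberately naive (it counts parentheses inside `{% ... %}` and `{{ ... }}` tags and would misreport parentheses that appear inside string literals).

```python
import re

# Matches Django/Jinja2-style block tags and variable tags.
TAG_RE = re.compile(r"\{%(.*?)%\}|\{\{(.*?)\}\}", re.DOTALL)


def unbalanced_tags(template_source):
    """Return (line_number, tag_text) for each tag with mismatched parentheses."""
    problems = []
    for match in TAG_RE.finditer(template_source):
        body = match.group(1) if match.group(1) is not None else match.group(2)
        depth = 0
        for ch in body:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    break  # a closing paren with no opener
        if depth != 0:
            line_number = template_source.count("\n", 0, match.start()) + 1
            problems.append((line_number, match.group(0)))
    return problems


sample = (
    "<form action=\"{% url('clusters-new' %}\" method=\"POST\">\n"
    "{% csrf_token %}\n"
    "{{ new_cluster_form.identifier }}\n"
)
found = unbalanced_tags(sample)
```

Which form of the tag is correct depends on the template backend in use, so the sketch only flags the mismatch and does not attempt a repair.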
@vitillo: Analysis jobs have different requirements in terms of hardware resources; some might benefit from more cores while others might benefit from more memory. Our users would like to select the instance type so that they can use machines that fit their job well.
Note that our Spark configuration and ETL jobs have been heavily tuned over time to run on c3.4xlarge instances. Introducing new instance types would require a significant amount of manual QA.
Quoting from https://bugzilla.mozilla.org/show_bug.cgi?id=1312747:
Currently airflow and atmo are using two different EMR steps [1] [2] for almost the same logic. We should refactor those into a single script and add that directly to the telemetry-analysis-service repository so that we can have different steps in different environments, like staging and production.
The bootstrap script [3] and the Spark configuration [4] should also be moved to telemetry-analysis-service.
[1] https://github.com/mozilla/emr-bootstrap-spark/blob/master/ansible/files/batch.sh
[2] https://github.com/mozilla/telemetry-airflow/blob/master/ansible/files/spark/airflow.sh
[3] https://github.com/mozilla/emr-bootstrap-spark/blob/master/ansible/files/telemetry.sh
[4] https://github.com/mozilla/emr-bootstrap-spark/blob/master/ansible/files/configuration.json
Here's a quick link to file a bug. Thanks!
(Issues aren't disabled so that the archive of already closed ones isn't lost.)
We want to deploy a stage instance of the analysis service on Heroku to verify it's working properly. Due to the ephemeral nature of Heroku's nodes, we need to move away from SQLite and use the Heroku-provided PostgreSQL service.
This view would allow setting up the Google credentials; we would redirect to it if no credentials are found in the database. That way we can simplify the setup procedure.
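Heroku exposes its PostgreSQL service to the application through a `DATABASE_URL` environment variable. As a rough sketch of how Django settings could consume it, the hypothetical helper below parses such a URL into a `DATABASES` entry using only the standard library. It assumes Python 3 and Django 1.9 or later (for the `django.db.backends.postgresql` engine name); in practice a package such as dj-database-url does the same job.

```python
import os
from urllib.parse import unquote, urlparse


def database_config_from_url(url):
    """Turn a postgres:// URL into a Django DATABASES entry (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("postgres", "postgresql"):
        raise ValueError("unsupported database scheme: %r" % parsed.scheme)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": unquote(parsed.path.lstrip("/")),
        "USER": unquote(parsed.username or ""),
        "PASSWORD": unquote(parsed.password or ""),
        "HOST": parsed.hostname or "",
        "PORT": str(parsed.port or ""),
    }


# In settings.py this would read the variable set by the platform;
# the fallback URL here is a made-up local example.
DATABASES = {
    "default": database_config_from_url(
        os.environ.get("DATABASE_URL", "postgres://atmo:secret@localhost:5432/atmo")
    )
}
```

Reading the connection details from the environment keeps credentials out of the repository and lets stage and production point at different databases without code changes.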
Originally reported in https://bugzilla.mozilla.org/show_bug.cgi?id=1312426
For jobs that I have scheduled: are they currently running? Did they ever run?
User story: Fixing bugs in the IPython notebook code.
Currently: 'edit job' doesn't allow changing the attached notebook. The only way to edit it is to make a new job.
Originally reported in https://bugzilla.mozilla.org/show_bug.cgi?id=1173429
Originally reported in https://bugzilla.mozilla.org/show_bug.cgi?id=1312427