
areweslimyet's Introduction

MozAreWeSlimYet

Code behind areweslimyet.com

Please note: this project has been moved in-tree

This page has been left up for posterity.

Regression tracking

Regressions seen on areweslimyet.com should be filed on Mozilla's Bugzilla instance, blocking bug 1120576.

How it works

BenchTester

This provides BenchTester.py, a framework for running a bench test module and providing it an add_test_results callback that inserts test results into the sqlite databases it manages.
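
As a rough illustration, a test module plugged into this framework might look like the following sketch; the entry point and the add_test_results signature shown here are assumptions for illustration, not the real BenchTester interface:

# Hypothetical BenchTester module. run_test and the add_test_results
# signature are illustrative assumptions; see BenchTester.py for the
# real interface.
def run_test(build, add_test_results):
    datapoints = {'resident-memory': 123456789}  # made-up value, in bytes
    # The callback persists results into the sqlite database that
    # BenchTester manages for this run.
    add_test_results('MyExampleTest', datapoints)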

MarionetteTest.py is one such module: it launches a Marionette test and waits for it to finish.

BuildGetter.py is a helper with functions for scanning archive.mozilla.org for available builds and fetching them.

BatchTester.py is a long-lived daemon that drives BenchTester, running multiple tests side by side. It requires a 'hook' file that provides functions to turn test objects, represented as JSON blobs, into actual commands that invoke a test.

BatchTester.py can read test requests from a status directory and write out a status.json file. The areweslimyet.com/status/ page uses this to both queue and monitor running tests.
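
For example, a quick check of what the daemon is doing might look like this sketch (the status.json field names here are assumptions, not the real schema):

import json

# Print a rough summary of what the BatchTester daemon is up to.
# The 'running' and 'pending' fields are hypothetical.
with open('status/status.json') as f:
    status = json.load(f)
print('%d running, %d pending' % (len(status.get('running', [])),
                                  len(status.get('pending', []))))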

The AreWeSlimYet test

The benchtester folder has a marionette test that is fairly simple (a rough sketch follows the list):

  • Open all 100 pages of TP5 into 30 tabs (re-using tabs round-robin style), on a timer.
  • Close all the tabs.
  • Repeat.
  • At various points, call the memory reporter subsystem and fire an event carrying a memory snapshot, which the MarionetteTest.py module forwards to the database.
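
In rough Python pseudocode (the helper names below are hypothetical stand-ins for the real Marionette-driven code):

# Hypothetical sketch of one endurance iteration; open_in_tab,
# close_all_tabs and snapshot_memory stand in for the real helpers.
import time

TP5_PAGES = ['http://localhost:%d/' % p for p in range(8001, 8101)]
MAX_TABS = 30
PER_PAGE_DELAY = 10  # seconds; made-up value

def run_iteration(open_in_tab, close_all_tabs, snapshot_memory):
    snapshot_memory('start')
    for i, url in enumerate(TP5_PAGES):
        open_in_tab(url, tab=i % MAX_TABS)  # round-robin tab reuse
        time.sleep(PER_PAGE_DELAY)
    snapshot_memory('tabs-open')
    close_all_tabs()
    snapshot_memory('tabs-closed')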

slimtest_config.py holds the configuration values for the endurance test. It is sourced by run_slimtest.py and slimtest_batchtester_hook.py.

run_slimtest.py uses BenchTester to load the MarionetteTest module with our endurance test and run it against a specific Firefox build.

slimtest_batchtester_hook.py is a hook that the BatchTester.py daemon requires to schedule our tests. It provides a function that takes the requested tests -- JSON objects generated by e.g. /html/status/request.cgi -- and sets up a BenchTester run against them. This is effectively the daemonized version of run_slimtest.py, used by the dedicated test machine. See tester_scripts/launch_tester.sh for an example of usage.
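
A skeletal hook might look something like the sketch below; the entry-point name and request fields are illustrative assumptions, not the real interface (see slimtest_batchtester_hook.py for that):

# Hypothetical BatchTester hook: turn a JSON test request into a
# BenchTester invocation. Field and function names are illustrative.
def process_request(request):
    build = request.get('firstbuild', 'unknown')
    return {
        'test_module': 'MarionetteTest',
        'binary': './builds/%s/firefox/firefox' % build,
        'buildname': build,
    }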

slimtest_linux.sh is a wrapper around run_slimtest.py that serves the TP5 pageset and spawns a VNC session, then runs a test in that session. Specifically, it:

  • Creates a VNC session
  • Launches nginx against the $PWD/nginx/ prefix, assumed to hold the TP5 pageset needed by the endurance test. (See tester_scripts/tp5.nginx.conf for an example of setting this up)
  • Invokes run_slimtest.py
  • Cleans up VNC and nginx

(See "Running a SlimTest" below for a usage example.)

create_graph_json.py takes a BenchTester sqlite database that has results from our endurance test(s), and generates a set of datapoints suitable for graphing. The configuration for what datapoints to export is embedded at the beginning of this script.

merge_graph_json.py takes a series of json files output by create_graph_json.py of the form seriesname-a, seriesname-b, etc., and creates a master 'seriesname.json' which holds a condensed view of the subseries, as well as references to the subseries files. This is used by the website to store tests in per-month databases, and then create a much smaller "master" file. The website will then request the sub-series when the graph is zoomed in sufficiently on one region.
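
Conceptually, the condensing step might resemble this sketch (written under the assumption that each subseries is a list of (timestamp, value) points; the real logic lives in merge_graph_json.py):

import statistics

# Hypothetical condensing: keep one median datapoint per time bucket,
# so the zoomed-out master file stays small.
def condense(points, bucket_seconds=7 * 24 * 3600):
    buckets = {}
    for timestamp, value in points:
        buckets.setdefault(timestamp // bucket_seconds, []).append(value)
    return [(b * bucket_seconds, statistics.median(vals))
            for b, vals in sorted(buckets.items())]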

The website

The html folder holds the website currently hosted at https://areweslimyet.com/. It expects the master file created by merge_graph_json.py to be at html/data/areweslimyet.json, and the relevant create_graph_json.py output to live alongside it.

html/status/ reads the output of the BatchTester.py daemon and shows you what it's up to.

html/status/request.cgi allows you to write to /status/batch/ to send requests to the daemon. This script is not active on the public mirror for obvious reasons.

html/status/slimyet.js holds most of the magic. Note that the configuration in this file for what graphs to show must match the datapoints configured for export in create_graph_json.py. The annotations that appear on the graph with question marks are defined in this file.

Running a SlimTest

  • Obtain the TP5 pageset, or a similar set of pages to use (though you'll need TP5 for results comparable to the official areweslimyet.com test)
  • Install marionette-client from pip (pip install 'marionette-client')
  • Install mercurial from pip (pip install 'mercurial')
  • The test takes almost two hours by default, so let's stuff it in a VNC session
    • vncserver :9
  • Start a local webserver for the TP5 pageset, which AWSY expects to be on localhost:8001 through localhost:8100
    • nginx -C my_tp5_thing/nginx.conf
    • To use a different (more public) pageset, edit benchtester/test_memory_usage.py's TEST_SITES array to target the desired pages
  • Get a Firefox build to test, let's say it's ./firefox/
  • Pick a database to put this data in, let's say mytests.sqlite (it doesn't have to exist, BenchTester will create it)
  • Run it! ./run_slimtest.py --binary ./firefox/firefox --sqlite ./mytests.sqlite -l foo.log --buildname mytestbuild --buildtime $(date +%s --date="Jan 1 2014")
    • buildname is the name of this build in the database
    • buildtime is its unix timestamp, used by the website as the x axis

Your results are now in mytests.sqlite; use e.g. sqliteman to examine them, or see "Generating the Website Data" below for using the areweslimyet website to visualize them.

Generating the Website Data

For the official test box we split up test databases by month into files named db/areweslimyet-YYYY-MM.sqlite, which are fed to create_graph_json.py to create html/data/areweslimyet-YYYY-MM.json.gz

merge_graph_json.py then creates html/data/areweslimyet.json.gz, the 'zoomed out' master file. Note that this master file is required even if you only have one sub-series of data (and the subseries do not need to be split by month; you're welcome to have areweslimyet-all.sqlite as the only subseries).

This means that if you have a database named mytests.sqlite from "Running a SlimTest" above, you would need to do the following:

# Create html/data/mytests-1.json.gz with the full graph data for this series
./create_graph_json.py ./mytests.sqlite mytests-1 html/data/
# (and optionally create mytests-2, mytests-3, etc.)
# Merge series into overview file mytests.json.gz (required even if you only
# have one series)
./merge_graph_json.py mytests html/data/

That's it! Your data now lives in html/data/. Note that you need a webserver capable of serving .json.gz files transparently in order for the JavaScript to request them from e.g. /html/data/foo.json. (Alternatively, simply run gzip -d on the produced files, though be warned that they get quite large.)

areweslimyet's People

Contributors

amccreight, ashwin02, eggpi, ericrahm, jlebar, johnp, nephyrin, staktrace, swarnava, yfdyh000


areweslimyet's Issues

Use pulse to watch for m-i tinderbox builds

Currently we use a cronjob to check for new mozilla-inbound builds. This has the downside that it can find a build dir where the linux64 build isn't done yet, which leads to a failed run.

With pulse we can get notified when the build actually finishes and avoid this issue. It would also consolidate logic into trywatcher.
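
A pulse consumer callback for this might look roughly like the sketch below; the message fields and the batch-request path shown are assumptions for illustration, not the actual pulse schema:

import json

# Hypothetical pulse callback: only queue a run once the linux64 build
# has actually finished. Payload fields are illustrative.
def on_build_message(data, message):
    payload = data.get('payload', {})
    if payload.get('platform') == 'linux64' and payload.get('status') == 0:
        request = {'mode': 'tinderbox', 'firstbuild': payload.get('revision')}
        with open('status/batch/request.json', 'w') as f:
            json.dump(request, f)
    message.ack()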

Timestamps on collapsed data points are outside the range of the data points

I'm mousing over a data point on the graph, and it shows:

build d9a88189be9c .. ecd327272240 (pushlog)
Mon, 15 Apr 2013 17:18:42 GMT

Then I click on the data point, and in the top left it shows:

build d9a88189be9c .. ecd327272240 (pushlog)
Mon, 15 Apr 2013 17:42:31 GMT -- Mon, 15 Apr 2013 18:25:37 GMT

Why is the timestamp for the collapsed data point outside the range of the exploded data point? (Also, I only noticed this because I was trying to figure out why the annotations don't seem to line up exactly with the right dots on the graph, and I suspect that it might be because of the difference in time scales being used here.)

"Compile specific revisions" no longer works

Per discussion with @Nephyrin there are a couple of issues:

  • We're using an outdated .mozconfig from before mozconfigs were in tree
  • Even if building worked, it would throw off the numbers for currently running tests

The preferred solution is to trigger buildbot builds (presumably some sort of JSON post) w/ an awsy account. Then we could pull those builds down when they complete and use the same behavior as "tinderbox" or "try" builds.

Email on completion

For manually triggered builds it would be nice to get a notification upon completion. For trywatcher we can get the email that was used in the try commit.

Graph ARMv7 build data

@staktrace started pushing ARMv7 data (as opposed to ARMv6), but we're not doing anything with it yet. We should make sure we display both sets of data.

@staktrace said:

Ok, I'm updating the harness to use the ARMv7 builds. :johns, The data file we upload to arcus has "Android-ARMv6" as the testname, I'm going to change that to Android-ARM for the new data files. That might require a change on your end, not sure. Let me know when you're ready with me sending over data and I'll start up the harness again.

@Nephyrin responded:

...you should be able to submit builds with any test name and they'll be recorded, so you can start that now. For the front-end, though, we'll need a change to create_graph_json [1] to export a graph line that includes tests from both series. Adding the newer test to the gTests array and then adding some kind of "mergewith" property that merges the output data before writing it is probably the five-line fix to that. I can probably do that at some point this week, unless you want to take a shot at it.

[1] https://github.com/Nephyrin/MozAreWeSlimYet/blob/master/create_graph_json.py#L36
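
That fix might conceptually look like this sketch (hypothetical; the real gTests structure lives in create_graph_json.py):

# Hypothetical 'mergewith' handling: fold the ARMv7 series into the
# ARMv6 graph line before writing output.
gTests = {
    'Android-ARMv6': {},
    'Android-ARM': {'mergewith': 'Android-ARMv6'},
}

def merge_series(results):
    # results maps test name -> list of datapoints
    for name, config in gTests.items():
        target = config.get('mergewith')
        if target and name in results:
            results.setdefault(target, []).extend(results.pop(name))
    return results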

Account for shared libraries from resident graph line

The resident graph gets noisy when multiple tests are running on one machine, due to shared library page sharing. We should either subtract this from the line as plotted, or see if we can disable this page sharing at the OS level to better isolate tests.
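
For the first option the correction is conceptually simple (a sketch, assuming each snapshot records both a resident and a shared-library figure):

# Hypothetical adjustment: subtract shared-library pages from the
# resident figure so concurrent tests don't skew each other.
def adjusted_resident(snapshots):
    # snapshots: list of (timestamp, resident_bytes, shared_bytes)
    return [(ts, res - shared) for ts, res, shared in snapshots]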

Rework database to include process info

We need to include a process name when adding values to the database and update the JSON generation to handle the presence of a process name.

One option is to just shove it in as an additional :metadata field. Preferably we'd add a process name column / table to the DB instead. This might be a good time to split out other metadata fields and fold in the heap kind from #39.

Support e10s

We currently just disable e10s to work around mozmill's lack of support. The next steps are to:

  • switch over to marionette
  • start recording multiprocess data
  • figure out how we want to display it.

Use mozinstall to extract builds

mozinstall handles installation on all platforms. This gets us one step closer to x-plat support by not hard coding the extraction logic to .tar.bz2 and not hard coding the binary path to firefox/firefox.
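
Usage would be roughly the following (assuming mozinstall's install/get_binary helpers; check the mozbase docs for the exact API):

import mozinstall

# Install any supported archive format and locate the binary without
# hard-coding .tar.bz2 extraction or the firefox/firefox path.
install_dir = mozinstall.install('firefox.tar.bz2', 'build/')
binary = mozinstall.get_binary(install_dir, 'firefox')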

Track each memory report entry 'kind'

  • Update our DB schema to track the report kind (HEAP, NON_HEAP, OTHER).
  • Update scripts to include a _kind attribute; we can probably default to heap
  • Update about_memory_worker.js to use the _kind attribute.

Add a bootstrap server setup script

For setting up new servers it would be useful to add a setup script. This would do such things as:

  • Install system package requirements
  • Create awsy user, should be sys user < 500, preferably home under /var
  • Checkout / update awsy code
  • Setup crontab w/ proper paths, create all required paths
  • Create virtualenv, install requirements from pip (detect proxy and use right params)
  • Setup tightvnc password
  • Verify all required network connections are available
  • Configure nginx / httpd

Optional stuff:

  • Send user to pulse guardian to setup an awsy account

Try integration

Use case: Push to try w/ some sort of '-awsy:series_name' flag and it will automatically queue up an awsy run if the try build succeeds

Display multiprocess data

Update the website to include graphs for all processes. This should be flexible enough to handle more than 2 processes (particularly for eventual b2g support).

Capture gecko log

Particularly for custom runs it would be useful to be able to get the gecko log. For m-i builds it's probably less important and would take up a ton of disk space.

Inline all javascript

There's some issue in marionette that causes about a 75% test failure rate due to load_script failing to actually load the script. Instead we'll just hand the script over directly.

Dates in tooltips should be clearer

When looking at merged tooltips, it should be clearer that the date given is just a median of the runs that the tooltip represents. Maybe some judicious use of the tilde.

In the list view, color-code changesets by push

When clicking on a condensed datapoint, you get a list of changesets and their test results -- we should somehow group or color-code these by push (which we can do based on the timestamp).

Switch over to marionette

Mozmill does not support e10s so we need to switch to marionette. This step is just to switch frameworks, after that we'll deal with enabling e10s in #54.

Hover dialog should be a little farther from the cursor

Often I find that I accidentally hover over a point I don't mean to and then can't get to the point I do care about because the hover dialog is now hiding it. I'd prefer if the dialog were a little farther away from the mouse cursor so that it doesn't interfere with hovering over the points.

Sanity check input when adding a try build

If you input an invalid try hash we just get a cryptic error in the recent batch requests log:

An exception occured while processing batch -- <type 'exceptions.AttributeError'>: FTPBuild instance has no attribute '_revision'

There are probably two improvements here:

  1. Sanity check the input (must be 12 alnum chars); see the sketch below
  2. Propagate more detailed exceptions in the BuildGetter (in this case something like "build w/ hash blah not found")
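
The first check is a one-liner, e.g. (a sketch):

import re

def looks_like_try_hash(rev):
    # A try revision should be exactly 12 alphanumeric characters.
    return bool(re.match(r'[A-Za-z0-9]{12}\Z', rev))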

Add ability to annotate builds

It would be nice to be able to annotate specific builds with information. For example you could annotate cset b784ce7fd90f with the note "added new fonts" so that the memory jump on AWSY has some sort of explanation. These annotations should be visible (maybe with a button to toggle them) on the graph.

Provide some easy way to see recent data on the default view

[njn] johns: something that I would like: because we default to showing all the historical data, you have to zoom in several times to see recent changes
[njn] johns: it would be nice if the default view was more recent, and you could zoom out

Add "try" mode

Rather than shoving try build requests into the "ftp" category, we should just add a proper try mode. This will affect status.js, request.cgi, try_watcher.py, and BatchTester.py.

Add tests for trywatcher

Trywatcher has some basic tests for utility functions. It would be useful to also test on_event by verifying the batch request it writes out.

Visually distinguish between merged points and unmerged points

In bug 842756, we had an issue where the data looked different once we zoomed in enough that we stopped coalescing points.

ISTM that it would be helpful if we'd visually distinguish between coalesced and not-coalesced points. For example, we could use + for not-coalesced points, or we could color (non-)coalesced points black.

Integration test for BatchTester

Add test that exercises the batch tester logic. For this test we probably just want to verify that for each build type we go through all steps up to testing and that the proper artifacts are created (status is updated, queue state is persisted, etc).

We can use a mock test type in place of MarionetteTest and the mock http server for the build downloads.

"Datapoint is the median of n tests" view should still include pushlog links

I think that pretty much any link to hg.mozilla.org for a changeset should be in the form of a pushlog link, or at least have an easily-accessible pushlog link, since there could always be untested changes between the tested ones. When I click on an aggregate data point and it pops up the window listing csets with the heading "Datapoint is the median of n tests", the csets listed there don't have individual pushlog links. Sometimes just by looking down the memory deltas in the list it is obvious which "cset" caused the jump, but clicking on that link takes you to an individual cset rather than the pushlog between that entry and the preceding one, which is misleading.

enable TLS >1.0

Please enable TLS version 1.1 and 1.2.

If you set "security.tls.version.min" in "about:config" to 2 or 3, you get an error about "no-cipher-overlap".
